Self-Supervised Learning for Visual Obstacle Avoidance


Abstract

With a growing number of drones, the risk of collision with other air traffic or fixed obstacles increases. New safety measures are required to keep the operation of Unmanned Aerial Vehicles (UAVs) safe. One of these measures is the use of a Collision Avoidance System (CAS), a system that helps the drone autonomously detect and avoid obstacles. The design of a Collision Avoidance System is a complex task with many smaller subproblems, as illustrated by Albaker and Rahim [1]. How should the drone sense nearby obstacles? When is there a risk of collision? What should the drone do when a conflict is detected? All of these questions need to be answered to develop a functional Collision Avoidance System. However, all of these subproblems – except the sensing of obstacles – only concern the behavior of the vehicle. They can be solved independently of the target platform as long as it can perform the required maneuvers; it does not matter whether it is a UAV or a larger vehicle. The sensing of the environment, on the other hand, is the only subproblem that places requirements on the hardware, specifically the sensors that should be carried by the UAV. It is the hardware that sets UAVs apart from other vehicles. Unlike autonomous cars, other ground-based vehicles or larger aircraft, UAVs have only a small payload capacity. It is therefore not practical to carry large or heavy sensors such as LIDAR or radar for obstacle avoidance. Instead, obstacle avoidance on UAVs requires clever use of lightweight sensors: cameras, microphones or antennae. This research will therefore focus on the sensing of the environment. Out of the sensors mentioned above – cameras, microphones and antennae – cameras are the only ones that can detect nearly all ground-based obstacles and other air traffic; microphones and antennae are limited to the detection of sources of noise or radio signals. Therefore, this research will focus on the visual detection of obstacles.
The field of computer vision is well-developed; it may already be possible to find an adequate solution for visual obstacle detection using existing stereo vision methods like Semi-Global Matching (SGM) [23]. These methods, however, only use a fraction of the information present in the images to estimate depth – the disparity. Other cues such as the apparent size of known objects are completely ignored. The use of appearance cues for depth estimation is a relatively new development driven largely by the advent of Deep Learning, which allows these cues to be learned from large, labeled datasets. As long as the UAV's operational environment is similar to this training dataset, it should be possible to use appearance cues in a CAS. However, this is difficult to guarantee and may require a prohibitively large training set. Self-Supervised Learning may provide a solution to this problem. After training on an initial dataset, the UAV continues to collect new training samples during operation. This allows it to 'adapt' to its operational environment and to learn new depth cues that are relevant in that environment. Self-Supervised Learning for depth map estimation is a young field; the first practical examples started to appear around 2016 (e.g. [17]). Most of the current literature is focused on automotive applications.
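The disparity cue that stereo methods such as SGM rely on maps to metric depth through the standard rectified-stereo relation Z = f·B/d. The sketch below illustrates that conversion; the focal length and baseline values are illustrative assumptions, not parameters from this work.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (meters)
    for a rectified stereo pair: Z = f * B / d.

    Pixels with non-positive disparity carry no depth information
    and are marked invalid (infinite depth).
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Example with assumed intrinsics: f = 400 px, baseline B = 0.1 m.
d = np.array([[40.0, 20.0],
              [0.0, 8.0]])
print(disparity_to_depth(d, focal_px=400.0, baseline_m=0.1))
# 40 px -> 1.0 m, 20 px -> 2.0 m, 8 px -> 5.0 m; 0 px -> invalid (inf)
```

In a self-supervised setup of the kind described above, depth maps obtained this way from the onboard stereo pair could serve as training targets for a monocular, appearance-based estimator collected during flight.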