Robust Monocular Depth Estimation For UAVs

Master Thesis (2024)
Author(s)

I. Hassan (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

R. T. Rajan – Mentor (TU Delft - Signal Processing Systems)

R. Sabzevari – Graduation committee member (TU Delft - Group Sabzevari)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
18-12-2024
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering | Signals and Systems
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This thesis presents an approach to monocular depth estimation for Unmanned Aerial Vehicles (UAVs). Monocular depth estimation is a critical perception task for UAVs, enabling them to infer depth from visual data without relying on heavy or power-hungry sensors such as LiDAR or stereo cameras. Given the operational constraints of UAVs, such as limited payload and energy budgets, robust and efficient depth estimation methods are required for safe navigation and environmental interaction.

The proposed methodology fuses visual data from a monocular camera with inertial measurements from an Inertial Measurement Unit (IMU). This combination addresses two challenges common in aerial operations: scale ambiguity in the depth estimates and inaccuracies in dynamic environments. Integrating the IMU data through a differentiable, camera-centric Extended Kalman Filter (EKF) improves ego-motion estimation, effectively aligning the visual information with the drone's dynamics. The method further incorporates depth map frame prediction, leveraging an initial depth estimate together with the estimated temporal dynamics to predict future depth maps. This predictive capability improves efficiency by reducing how often a full depth estimate must be computed, and allows robotic agents to anticipate environmental changes.

Evaluation on simulated and real-world datasets shows that the algorithm performs well over short forecast horizons, but accumulating IMU errors and the static-environment assumption limit its long-term accuracy. The future depth map prediction algorithm reduced the number of full DynaDepth inferences needed from 10 per second to 2, and on the Mid-Air dataset from 25 per second to 5. This study also provides a foundation for future work, including the integration of an object-oriented frame prediction algorithm.
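To make the prediction step concrete, the sketch below shows one way a current depth map can be forward-warped into a future camera pose under the static-scene assumption described above. This is a minimal illustration, not the thesis implementation: the function name predict_depth_map, the NumPy pinhole formulation, and the nearest-depth splatting rule are all assumptions made for this example, and the relative pose T_rel stands in for the ego-motion produced by the IMU-driven EKF.

```python
import numpy as np

def predict_depth_map(depth, K, T_rel):
    """Forward-warp a depth map into a future camera pose (illustrative sketch).

    depth : (H, W) current metric depth estimate
    K     : (3, 3) camera intrinsics
    T_rel : (4, 4) relative motion from the current to the future camera
            frame, e.g. taken from an IMU-driven EKF ego-motion estimate

    Returns the predicted (H, W) depth map; pixels that receive no
    reprojected source point remain +inf (unknown).
    """
    H, W = depth.shape
    # Back-project every pixel to a 3-D point in the current camera frame.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, H*W)
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)                # (3, H*W)

    # Move the points into the predicted future camera frame.
    # Static-scene assumption: only the camera moves, not the world.
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])
    pts_fut = (T_rel @ pts_h)[:3]

    # Project into the future image and keep the nearest depth per pixel.
    proj = K @ pts_fut
    z = proj[2]
    valid = z > 1e-6
    u_f = np.round(proj[0, valid] / z[valid]).astype(int)
    v_f = np.round(proj[1, valid] / z[valid]).astype(int)
    z = z[valid]
    inside = (u_f >= 0) & (u_f < W) & (v_f >= 0) & (v_f < H)
    pred = np.full((H, W), np.inf)
    np.minimum.at(pred, (v_f[inside], u_f[inside]), z[inside])
    return pred
```

In a formulation of this kind, long-horizon error growth comes directly from drift in T_rel and from scene points that violate the static-world assumption, which is consistent with the limitations reported in the abstract.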
