Public safety and emergency response agencies increasingly consider deploying mobile robots as mounting climate-related disasters and security challenges place human personnel under higher risk and stress. Mobile robots, such as drones, are a promising means of responding to these challenges: they can navigate difficult, hazardous terrain, gather real-time situational data, and conduct search or reconnaissance tasks without putting humans at direct risk. However, teleoperating robots, as currently practiced, is challenging in such complex missions, since the simultaneous navigation, situation assessment, and search tasks can overload human cognitive abilities. Therefore, autonomous planning and decision-making algorithms are required to enable robots to explore and search unknown environments for targets such as missing persons or hazardous materials.
Moving towards this goal, this thesis addresses two core problems. First, local motion planning must carefully account for the information gained from sensor observations as well as collision avoidance and the robot's dynamics while moving through cluttered, unknown areas. Second, global exploration planning must strategically select where in the environment to explore to find the target quickly, especially when the environment is large or complex. Given that human operators often possess semantic knowledge about likely target locations, we hypothesize that incorporating such guidance, through observed semantic features (e.g., object or room types), into exploration planning is crucial for time-efficient autonomous search. We address these two core problems by making the following contributions.
The first contribution of the thesis is an informative local motion planning approach
that generates safe, collision-free trajectories around obstacles while minimizing uncertainty about the target locations. The critical challenge is to achieve computationally efficient planning of trajectories that maximize information gain under the robot’s kinodynamic constraints. In the proposed approach, a model predictive control (MPC) motion planner is guided by a learned viewpoint policy. The policy is trained via deep reinforcement learning (DRL) to maximize long-term information gain by providing a local subgoal to the MPC. The MPC follows the subgoal and ensures that the motion plan remains feasible and collision-free. Therefore, the robot can rapidly replan safe and informative local trajectories online. Simulation experiments demonstrate that the method achieves competitive performance in locating targets compared to a computationally expensive state-of-the-art planner using Monte Carlo Tree Search (MCTS), while allowing for significantly faster execution and replanning.
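The interplay between the learned viewpoint policy and the MPC can be illustrated with a minimal sketch. All quantities below are illustrative assumptions, not the thesis implementation: the learned DRL policy is stubbed as a fixed function returning a subgoal, the robot is a single integrator, and a short-horizon optimizer tracks the subgoal while penalizing proximity to one circular obstacle.

```python
import numpy as np
from scipy.optimize import minimize

H = 8          # planning horizon (steps); illustrative value
DT = 0.2       # time step in seconds; illustrative value
OBSTACLE = np.array([1.0, 0.5])  # one circular obstacle (hypothetical)
RADIUS = 0.4                     # obstacle radius

def viewpoint_policy(state):
    """Stand-in for the learned DRL policy: returns a local subgoal."""
    return np.array([2.0, 0.0])

def rollout(state, controls):
    """Single-integrator dynamics: position advances by control * DT."""
    traj = [state]
    for u in controls.reshape(H, 2):
        traj.append(traj[-1] + u * DT)
    return np.array(traj[1:])

def cost(controls, state, subgoal):
    traj = rollout(state, controls)
    goal_cost = np.sum((traj - subgoal) ** 2)        # track the subgoal
    dists = np.linalg.norm(traj - OBSTACLE, axis=1)
    # soft penalty when the trajectory enters a safety margin around the obstacle
    collision_cost = np.sum(np.maximum(0.0, RADIUS + 0.2 - dists) ** 2)
    return goal_cost + 100.0 * collision_cost

def mpc_step(state):
    subgoal = viewpoint_policy(state)
    u0 = np.zeros(H * 2)
    res = minimize(cost, u0, args=(state, subgoal), method="L-BFGS-B",
                   bounds=[(-1.0, 1.0)] * (H * 2))
    # receding horizon: apply only the first optimized control
    return res.x.reshape(H, 2)[0]

state = np.array([0.0, 0.0])
for _ in range(20):
    state = state + mpc_step(state) * DT
```

In the thesis, the subgoal is produced by the trained policy at every replanning step, so the MPC only ever solves a short, cheap tracking problem while the long-term information-gathering behavior lives in the policy.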
While local informative planning is crucial for exploring cluttered spaces, it often behaves myopically and inefficiently in large, complex environments. Therefore, the second contribution introduces a global target search planner that balances directed search towards semantically promising areas with complete coverage of the search space. This planner extends the idea of frontier exploration, which focuses observations on the boundaries between explored and unexplored regions, to target search, where each frontier is assigned a semantic priority. This priority represents the semantic relationship between the target and nearby objects. To minimize target search time, the planner schedules high-priority frontiers earlier by solving a custom combinatorial optimization problem that determines the visitation order. By integrating coverage gains into the frontier priorities, the planner ensures that the robot explores the environment efficiently while focusing on semantically relevant areas. We demonstrate this approach in the two studies outlined below.
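The scheduling idea can be sketched with a toy objective: minimize the sum of priority-weighted arrival times over all frontiers, so that high-priority frontiers are visited early. This brute-force enumeration and the positions, priorities, and weighting below are illustrative stand-ins for the custom combinatorial solver described above.

```python
import itertools
import numpy as np

def best_order(robot_pos, frontiers, priorities, speed=1.0):
    """Brute-force search for the visitation order minimizing the sum of
    priority-weighted arrival times (illustrative objective)."""
    n = len(frontiers)
    best, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        t, pos, cost = 0.0, np.asarray(robot_pos, float), 0.0
        for i in perm:
            t += np.linalg.norm(frontiers[i] - pos) / speed  # arrival time
            cost += priorities[i] * t  # late arrival at high priority is costly
            pos = frontiers[i]
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

# Hypothetical frontiers and semantic priorities (higher = more relevant).
frontiers = np.array([[5.0, 0.0], [1.0, 0.0], [0.0, 4.0]])
priorities = np.array([0.2, 1.0, 0.5])
order, _ = best_order([0.0, 0.0], frontiers, priorities)
```

The weighted-latency structure is what makes the problem combinatorial rather than a plain shortest tour: swapping two frontiers changes not only travel distance but also when each priority is "collected".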
Large, high-quality datasets for learning target-specific semantic relationships are scarce in many real-world scenarios, especially in search and rescue. The third contribution addresses this limitation by proposing a method that learns semantic priority models from expert feedback. Rather than collecting massive amounts of labeled data, the approach exploits an expert operator's sparse guidance inputs in a few target search scenarios. The expert selects the frontier to explore next, and this choice is stored in a training dataset together with the frontier's semantic features. An expert model is then trained to approximate a priority function that predicts how relevant each frontier is to the expert. By incorporating this learned priority function into the global target search planner, the robot can autonomously prioritize semantically relevant areas according to the expert's semantic knowledge. Experiments show that a small number of expert demonstrations is sufficient for the robot to significantly improve its search efficiency and reduce the travel distance until the target is found.
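One simple way to realize such learning from sparse picks is a softmax (discrete-choice) loss: each demonstration is a set of candidate frontiers with semantic feature vectors plus the index the expert chose, and a linear scorer is trained so the chosen frontier receives the highest score. This is a minimal sketch under that assumption; the features, data, and linear model below are illustrative, not the thesis's expert model.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_priority(demos, dim, lr=0.5, epochs=200):
    """Fit linear weights w so that feats @ w ranks the expert's pick first."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for feats, chosen in demos:          # feats: (n_frontiers, dim)
            scores = feats @ w
            p = np.exp(scores - scores.max())
            p /= p.sum()                     # softmax over candidate frontiers
            grad = feats.T @ p - feats[chosen]  # gradient of -log p[chosen]
            w -= lr * grad
    return w

# Toy demos: feature 0 = "near a doorway", feature 1 = "near a window";
# this hypothetical expert consistently prefers doorway frontiers.
demos = []
for _ in range(5):
    feats = rng.random((4, 2))
    chosen = int(np.argmax(feats[:, 0]))     # expert picks highest doorway score
    demos.append((feats, chosen))

w = train_priority(demos, dim=2)
```

At deployment, the learned scorer can be evaluated on every current frontier, and the resulting priorities feed directly into the global visitation-order optimization.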
Lastly, the thesis extends semantic target search to three-dimensional environments by integrating it into a 3D planning pipeline for micro aerial vehicles (MAVs). The pipeline first detects objects in the environment using onboard vision and associates them with priority values computed from pre-trained large language model (LLM) embeddings. These priorities are then propagated to frontiers in a 3D voxel map, indicating the frontier regions most likely to contain the target. This enables evaluating frontier viewpoints with an information gain that accounts for both semantic priority and volumetric coverage. The viewpoint gains are then used in the combinatorial target search planner to prioritize viewpoints that most likely lead to the target while ensuring efficient coverage of the environment. By integrating the MAV's kinodynamic constraints into the planning costs, the system generates smooth, feasible trajectories in real time. Simulation studies reveal that semantically guided exploration leads to faster and more reliable target discovery than several purely coverage-based exploration baselines. Experiments with a real MAV in the lab confirm the approach's ability to autonomously navigate through complex 3D environments to a target, exploiting semantic cues, maximizing coverage, and avoiding collisions.
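The embedding-based priority computation can be sketched as follows, assuming cosine similarity between the target's embedding and the embeddings of objects detected near a frontier. The random stand-in vectors below replace real LLM embeddings, and all object names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in embedding table: in the pipeline these would come from a
# pre-trained LLM embedding model, not random vectors.
EMBED = {name: rng.standard_normal(16)
         for name in ["backpack", "bed", "oven", "desk"]}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def frontier_priority(target, nearby_objects):
    """Priority of a frontier: best semantic match between the target and
    the objects detected near that frontier (0.0 if none were seen)."""
    if not nearby_objects:
        return 0.0
    t = EMBED[target]
    return max(cosine(t, EMBED[o]) for o in nearby_objects)

# A frontier next to an instance of the target class should score highest.
p_near = frontier_priority("backpack", ["backpack", "desk"])
p_far = frontier_priority("backpack", ["oven"])
```

These per-frontier priorities are then combined with volumetric coverage gain before the viewpoints enter the combinatorial visitation-order planner.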
In summary, this thesis demonstrates how planning and learning techniques can be combined for autonomous target search and exploration. These techniques enable mobile robots to navigate unknown environments efficiently and safely while searching for targets and collecting the required information. Crucially, our proposed method for semantically guided frontier planning bridges the gap between recent learning-based navigation approaches and established planning-based approaches suitable for real-world robotic systems. By integrating semantic knowledge into robotic exploration, the proposed methods can reduce human operators' cognitive load and thereby facilitate robot deployment in scenarios such as search and rescue or reconnaissance missions.