Learning to search & track dynamic targets with graph representations

Abstract

Autonomous robots are widely used in search and rescue missions to gather information about target locations. The robot's plan must be continuously revised as new observations of the environment arrive. For dynamic targets, the robot must not only discover them but also keep tracking their positions. Previous work focuses either on searching for static targets or on tracking dynamic targets when the number of targets and their initial positions are given. In practice, however, such prior information (that targets do not move, or what their initial states are) can be difficult to obtain. Some efforts address search and tracking jointly, either by switching between a search mode and a tracking mode or by designing hybrid heuristics. However, these methods cannot account for the effect of target movement during the search process, and the resulting trade-off between search and tracking is sensitive to the choice of heuristics.

To overcome these limitations, this thesis proposes a graph formulation of the search for and tracking of an unknown number of dynamic targets. The problem is decoupled into two parts: searching for undiscovered targets and tracking discovered ones. The search objective is modeled as minimizing the uncertainty in the environment, which evolves according to a diffusion mechanism, and the tracking objective is formulated as minimizing the entropy of the target belief distributions. Based on this formulation, we design a novel graph neural network architecture, trained via reinforcement learning, that outputs the next motion primitive for the robot to collect information in the environment. We first evaluate this framework on the pure search and pure tracking tasks. The results show that our method outperforms a variety of baselines both when searching in small- and medium-scale environments and when tracking multiple dynamic targets in medium-scale environments. Experiments on the joint search and tracking task then show that, at comparable search or tracking performance, our method achieves a better trade-off between the two and scales to a large number of targets.
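
As a rough illustration of the two objectives described above, the following minimal sketch computes one diffusion step of per-node search uncertainty on a graph and the Shannon entropy of a discrete target belief. It is not the thesis implementation: the function names, the neighbor-averaging update rule, and the `rate` parameter are all assumptions chosen for illustration, and the abstract does not specify the actual diffusion model.

```python
import math


def diffuse_uncertainty(uncertainty, neighbors, rate=0.1):
    """One illustrative diffusion step (assumed update rule, not from the thesis).

    Each node's uncertainty moves toward the mean uncertainty of its
    neighbors, so a recently observed node gradually becomes uncertain
    again as unseen targets may move into it.
    """
    updated = {}
    for node, value in uncertainty.items():
        nbrs = neighbors[node]
        if not nbrs:
            # Isolated node: nothing can flow in or out.
            updated[node] = value
            continue
        mean_nbr = sum(uncertainty[n] for n in nbrs) / len(nbrs)
        updated[node] = value + rate * (mean_nbr - value)
    return updated


def belief_entropy(belief):
    """Shannon entropy (in nats) of a discrete belief over graph nodes."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0)


def combined_cost(uncertainty, beliefs, weight=1.0):
    """Hypothetical scalar cost: total search uncertainty plus weighted
    tracking entropy summed over all discovered targets."""
    tracking = sum(belief_entropy(b) for b in beliefs)
    return sum(uncertainty.values()) + weight * tracking
```

For example, on a three-node path graph where node `a` has just been observed (uncertainty 0) and nodes `b` and `c` have not (uncertainty 1), one step with `rate=0.5` raises the uncertainty of `a` and lowers that of `b`, which is the qualitative effect a diffusion model is meant to capture when targets can move during search.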