Adaptive Risk-Tendency

Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning

Conference Paper (2023)

Author(s)

C. Liu (TU Delft - Control & Simulation)

EJ van Kampen (TU Delft - Control & Simulation)

G. C. H. E. de Croon (TU Delft - Control & Simulation)

Research Group

Control & Simulation

DOI related publication

https://doi.org/10.1109/ICRA48891.2023.10160324

To reference this document use:

https://resolver.tudelft.nl/uuid:f53b6aea-4799-4404-9cbf-dcf6d5a85242

Publication Year

2023

Language

English

Bibliographical Note

Green Open Access added to TU Delft Institutional Repository, ‘You share, we take care!’ Taverne project: https://www.openaccess.nl/en/you-share-we-take-care. Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Pages (from-to)

7198-7204

ISBN (electronic)

9798350323658

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The ability to assess risk and make risk-aware decisions is essential to applying reinforcement learning to safety-critical robots such as drones. In this paper, we investigate a specific case in which a nano quadcopter robot learns to navigate an a priori unknown cluttered environment under partial observability. We present a distributional reinforcement learning framework to generate adaptive risk-tendency policies. Specifically, we propose to use the lower tail conditional variance of the learnt return distribution as an intrinsic uncertainty estimate, and to use exponentially weighted average forecasting (EWAF) to adapt the risk-tendency in accordance with the estimated uncertainty. In simulation and real-world empirical results, we show that (1) the most effective risk-tendency varies across states, and (2) the agent with adaptive risk-tendency achieves superior performance compared to risk-neutral or risk-averse policy baselines. Code and video can be found in this repository: https://github.com/tudelft/risk-sensitive-rl.git
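
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the linked repository for that), and everything beyond the terms "lower tail conditional variance" and "EWAF" is an assumption made for illustration: the return distribution is represented by a plain array of quantile samples, risk-tendency is expressed as a CVaR level `alpha`, and the names `lower_tail_variance`, `cvar`, `EwafRiskAdapter`, `eta`, and `threshold`, as well as the loss assigned to each expert, are hypothetical.

```python
import numpy as np


def lower_tail_variance(quantiles, tail=0.5):
    """Variance of the return samples in the lowest `tail` fraction (assumed definition)."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(tail * q.size)))  # number of samples in the lower tail
    return float(np.var(q[:k]))


def cvar(quantiles, alpha):
    """Mean of the lowest `alpha` fraction of return samples; alpha=1 is the plain mean."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * q.size)))
    return float(np.mean(q[:k]))


class EwafRiskAdapter:
    """Exponentially weighted average forecaster over fixed risk levels ("experts").

    Each expert always proposes one CVaR level. The hypothetical loss penalizes
    experts far from the most cautious level when uncertainty is high, and far
    from the least cautious level when uncertainty is low.
    """

    def __init__(self, alphas=(0.1, 1.0), eta=1.0, threshold=1.0):
        self.alphas = np.asarray(alphas, dtype=float)
        self.eta = eta              # learning rate of the forecaster (assumed value)
        self.threshold = threshold  # uncertainty level treated as "high" (assumed value)
        self.cum_loss = np.zeros(self.alphas.size)

    def alpha(self):
        # Weights are proportional to exp(-eta * cumulative loss); subtracting
        # the minimum loss first avoids underflow without changing the weights.
        w = np.exp(-self.eta * (self.cum_loss - self.cum_loss.min()))
        w /= w.sum()
        return float(w @ self.alphas)

    def update(self, uncertainty):
        high = uncertainty > self.threshold
        target = self.alphas.min() if high else self.alphas.max()
        self.cum_loss += np.abs(self.alphas - target)
        return self.alpha()
```

Under these assumptions, an agent would call `update(lower_tail_variance(samples))` at each step and then rank actions by `cvar(samples_for_action, alpha)`, so that a wide lower tail pushes `alpha` toward the risk-averse end and a narrow one pushes it back toward risk-neutral.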

Files

Adaptive_Risk_Tendency_Nano_Dr... (pdf)

(pdf | 6.1 Mb)

- Embargo expired on 05-02-2024

License info not available