Qisong Yang
11 records found
Authored
General Optimal Trajectory Planning
Enabling Autonomous Vehicles with the Principle of Least Action
This study presents a general optimal trajectory planning (GOTP) framework for autonomous vehicles (AVs) that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently. Firstly, we employ the fifth-order Bezier curve to generate and smooth ...
In traditional reinforcement learning (RL) problems, agents can explore environments to learn optimal policies through trial and error that is sometimes unsafe. However, unsafe interactions with environments are unacceptable in many safety-critical problems, for instance in ro
...
Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks.
An under-explored aspect of reinforcement learning is how to achieve safe efficient explor
...
Unmanned Aerial Vehicle (UAV) maneuver strategy learning remains a challenge when using Reinforcement Learning (RL) in this sparse reward task. In this paper, we propose Subtask-Masked curriculum learning for RL (SUBMAS-RL), an efficient RL paradigm that implements curriculum ...
Safety is critical to broadening the application of reinforcement learning (RL). Often, we train RL agents in a controlled environment, such as a laboratory, before deploying them in the real world. However, the real-world target task might be unknown prior to deployment. Reward-
...
Safety is critical to broadening the real-world use of reinforcement learning. Modeling the safety aspects using a safety-cost signal separate from the reward and bounding the expected safety-cost is becoming standard practice, since it avoids the problem of finding a good balanc
...
Safety is critical to broadening the real-world use of reinforcement learning (RL). Modeling the safety aspects using a safety-cost signal separate from the reward is becoming standard practice, since it avoids the problem of finding a good balance between safety and performance.
...
Safety is critical to broadening the application of reinforcement learning (RL). Often, RL agents are trained in a controlled environment, such as a laboratory, before being deployed in the real world. However, the target reward might be unknown prior to deployment. Reward-free R
...
The use of reinforcement learning (RL) in real-world domains often requires extensive effort to ensure safe behavior. While this compromises the autonomy of the system, it might still be too risky to allow a learning agent to freely explore its environment. These strict impositio
...
Safe exploration is regarded as a key priority area for reinforcement learning research. With separate reward and safety signals, it is natural to cast it as constrained reinforcement learning, where expected long-term costs of policies are constrained. However, it can be hazardo
...
Deep reinforcement learning went through an unprecedented development in the last decade, resulting in agents defeating world champion human players in complex board games like Go and chess. With few exceptions, deep reinforcement learning research focuses on fully observable env
...
Contributed
Deep reinforcement learning went through an unprecedented development in the last decade, resulting in agents defeating world champion human players in complex board games like Go and chess. With few exceptions, deep reinforcement learning research focuses on fully observable env
...
Deep reinforcement learning went through an unprecedented development in the last decade, resulting in agents defeating world champion human players in complex board games like Go and chess. With few exceptions, deep reinforcement learning research focuses on fully observable env
...
Deep reinforcement learning went through an unprecedented development in the last decade, resulting in agents defeating world champion human players in complex board games like Go and chess. With few exceptions, deep reinforcement learning research focuses on fully observable env
...