Search results | TU Delft Repositories

Generalized Model and Deep Reinforcement Learning-Based Evolutionary Method for Multitype Satellite Observation Scheduling

Song, Yanjie (author), Ou, Junwei (author), Pedrycz, Witold (author), Suganthan, Ponnuthurai Nagaratnam (author), Wang, X. (author), Xing, Lining (author), Zhang, Yue (author)

Multitype satellite observation, including optical observation satellites, synthetic aperture radar (SAR) satellites, and electromagnetic satellites, has become an important direction in integrated satellite applications due to its ability to cope with various complex situations. In the multitype satellite observation scheduling problem ...

journal article 2024

Hierarchize Pareto Dominance in Multi-Objective Stochastic Linear Bandits

Cheng, Ji (author), Xue, Bo (author), Jiaxiang, Y. (author), Zhang, Qingfu (author)

The multi-objective stochastic linear bandit (MOSLB) plays a critical role in the sequential decision-making paradigm; however, most existing methods focus on the Pareto dominance among different objectives without considering any priority. In this paper, we study bandit algorithms under mixed Pareto-lexicographic orders, which can reflect...

journal article 2024

DACOOP-A: Decentralized Adaptive Cooperative Pursuit via Attention

Zhang, Zheng (author), Zhang, Dengyu (author), Zhang, Qingrui (author), Pan, W. (author), Hu, Tianjiang (author)

Integrating rule-based policies into reinforcement learning promises to improve data efficiency and generalization in cooperative pursuit problems. However, most implementations do not properly distinguish the influence of neighboring robots in observation embedding or inter-robot interaction rules, leading to information loss and inefficient...

journal article 2024

Hierarchical Motion Planning and Tracking for Autonomous Vehicles Using Global Heuristic Based Potential Field and Reinforcement Learning Based Predictive Control

Du, Guodong (author), Zou, Yuan (author), Zhang, Xudong (author), Li, Z. (author), Liu, Qi (author)

The autonomous vehicle is widely applied in various ground operations, in which motion planning and tracking control are becoming the key technologies to achieve autonomous driving. In order to further improve the performance of motion planning and tracking control, an efficient hierarchical framework containing motion planning and tracking...

journal article 2023

The first AI4TSP competition: Learning to solve stochastic routing problems

Zhang, Yingqian (author), Bliek, Laurens (author), da Costa, Paulo (author), Refaei Afshar, Reza (author), Reijnen, Robbert (author), Catshoek, T. (author), Vos, D.A. (author), Verwer, S.E. (author), Schmitt-Ulms, Fynn (author)

This paper reports on the first international competition on AI for the traveling salesman problem (TSP) at the International Joint Conference on Artificial Intelligence 2021 (IJCAI-21). The TSP is one of the classical combinatorial optimization problems, with many variants inspired by real-world applications. This first competition asked the...

journal article 2023

Teacher-apprentices RL (TARL): leveraging complex policy distribution through generative adversarial hypernetwork in reinforcement learning

Tang, Shi Yuan (author), Irissappane, Athirai A. (author), Oliehoek, F.A. (author), Zhang, Jie (author)

Typically, a Reinforcement Learning (RL) algorithm focuses on learning a single deployable policy as the end product. Depending on the initialization methods and seed randomization, learning a single policy could possibly lead to convergence to different local optima across different runs, especially when the algorithm is sensitive to hyper...

journal article 2023

Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO

Chen, Yangkun (author), Yu, Chenghui (author), Zhu, Hengman (author), Liu, Shuai (author), Zhang, Yibing (author), Suarez, Joseph (author), Zhao, Liang (author), He, J. (author), Chen, Jiaxin (author)

We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponents not seen during training. We summarize the competition design...

journal article 2023

Synchromodal freight transport re-planning under service time uncertainty: An online model-assisted reinforcement learning

Zhang, Y. (author), Negenborn, R.R. (author), Atasoy, B. (author)

The objective of this study is to address the issue of service time uncertainty in synchromodal freight transport, which can cause delays, inefficiencies, and reduced satisfaction for shippers. The proposed solution is an online deep Reinforcement Learning (RL) approach that takes into account the service time uncertainty, assisted by an...

journal article 2023

Learning to Solve Multiple-TSP With Time Window and Rejections via Deep Reinforcement Learning

Zhang, Rongkai (author), Zhang, Cong (author), Cao, Zhiguang (author), Song, Wen (author), Tan, Puay Siew (author), Zhang, Jie (author), Wen, Bihan (author), Dauwels, J.H.G. (author)

We propose a manager-worker framework (the implementation of our model is publicly available at: https://github.com/zcaicaros/manager-worker-mtsptwr) based on deep reinforcement learning to tackle a hard yet nontrivial variant of the Travelling Salesman Problem (TSP), i.e. multiple-vehicle TSP with time window and rejections (mTSPTWR), where...

journal article 2023

Eligibility traces and forgetting factor in recursive least-squares-based temporal difference

Baldi, S. (author), Zhang, Z. (author), Liu, Di (author)

We propose a new reinforcement learning method in the framework of Recursive Least Squares-Temporal Difference (RLS-TD). Instead of using the standard mechanism of eligibility traces (resulting in RLS-TD(λ)), we propose to use the forgetting factor commonly used in gradient-based or least-squares estimation, and we show that...

journal article 2022

Barrier Function-based Safe Reinforcement Learning for Formation Control of Mobile Robots

Zhang, Xinglong (author), Peng, Yaoqian (author), Pan, W. (author), Xu, Xin (author), Xie, Haibin (author)

Distributed model predictive control (DMPC) concerns how to effectively control multiple constrained robotic systems online. However, the nonlinearity, nonconvexity, and strong interconnections of dynamic system models and constraints can make the real-time and real-world DMPC implementations nontrivial. Reinforcement learning (RL)...

conference paper 2022

A new reinforcement learning-based variable speed limit control approach to improve traffic efficiency against freeway jam waves

Han, Yu (author), Hegyi, A. (author), Zhang, Le (author), He, Zhengbing (author), Chung, Edward (author), Liu, Pan (author)

Conventional reinforcement learning (RL) models of variable speed limit (VSL) control systems (and traffic control systems in general) cannot be trained in a real traffic process because new control actions are usually explored randomly, which may result in high costs (delays) due to exploration and learning. For this reason, existing RL-based...

journal article 2022

Learning Complex Policy Distribution with CEM Guided Adversarial Hypernetwork

Tang, Shi Yuan (author), Oliehoek, F.A. (author), Irissappane, Athirai A. (author), Zhang, Jie (author)

The Cross-Entropy Method (CEM) is a gradient-free direct policy search method, which has greater stability and is insensitive to hyperparameter tuning. CEM bears similarity to population-based evolutionary methods, but rather than using a population, it uses a distribution over candidate solutions (policies in our case). Usually, a natural...

conference paper 2021

Model-Reference Reinforcement Learning for Collision-Free Tracking Control of Autonomous Surface Vehicles

Zhang, Q. (author), Pan, W. (author), Reppa, V. (author)

This paper presents a novel model-reference reinforcement learning algorithm for the intelligent tracking control of uncertain autonomous surface vehicles with collision avoidance. The proposed control algorithm combines a conventional control method with reinforcement learning to enhance control accuracy and intelligence. In the proposed...

journal article 2021

R3L: Connecting Deep Reinforcement Learning To Recurrent Neural Networks For Image Denoising Via Residual Recovery

Zhang, Rongkai (author), Zhu, Jiang (author), Zha, Zhiyuan (author), Dauwels, J.H.G. (author), Wen, Bihan (author)

State-of-the-art image denoisers exploit various types of deep neural networks via deterministic training. Alternatively, very recent works utilize deep reinforcement learning for restoring images with diverse or unknown corruptions. Though deep reinforcement learning can generate effective policy networks for operator selection or architecture...

conference paper 2021

Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee

Han, Minghao (author), Tian, Yuan (author), Zhang, Lixian (author), Wang, J. (author), Pan, W. (author)

Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. Without using a mathematical model, an optimal controller can be learned from data evaluated by certain performance criteria through trial-and-error. However, the data-based learning approach is notorious for not guaranteeing stability, which is...

journal article 2021