Searched for: subject:"reinforcement\ learning"
(1 - 16 of 16)
document
Song, Yanjie (author), Ou, Junwei (author), Pedrycz, Witold (author), Suganthan, Ponnuthurai Nagaratnam (author), Wang, X. (author), Xing, Lining (author), Zhang, Yue (author)
Multitype satellite observation, including optical observation satellites, synthetic aperture radar (SAR) satellites, and electromagnetic satellites, has become an important direction in integrated satellite applications due to its ability to cope with various complex situations. In the multitype satellite observation scheduling problem ...
journal article 2024
document
Cheng, Ji (author), Xue, Bo (author), Jiaxiang, Y. (author), Zhang, Qingfu (author)
Multi-objective Stochastic Linear bandit (MOSLB) plays a critical role in the sequential decision-making paradigm; however, most existing methods focus on the Pareto dominance among different objectives without considering any priority. In this paper, we study bandit algorithms under mixed Pareto-lexicographic orders, which can reflect...
journal article 2024
document
Zhang, Zheng (author), Zhang, Dengyu (author), Zhang, Qingrui (author), Pan, W. (author), Hu, Tianjiang (author)
Integrating rule-based policies into reinforcement learning promises to improve data efficiency and generalization in cooperative pursuit problems. However, most implementations do not properly distinguish the influence of neighboring robots in observation embedding or inter-robot interaction rules, leading to information loss and inefficient...
journal article 2024
document
Du, Guodong (author), Zou, Yuan (author), Zhang, Xudong (author), Li, Z. (author), Liu, Qi (author)
The autonomous vehicle is widely applied in various ground operations, in which motion planning and tracking control are becoming the key technologies to achieve autonomous driving. In order to further improve the performance of motion planning and tracking control, an efficient hierarchical framework containing motion planning and tracking...
journal article 2023
document
Zhang, Yingqian (author), Bliek, Laurens (author), da Costa, Paulo (author), Refaei Afshar, Reza (author), Reijnen, Robbert (author), Catshoek, T. (author), Vos, D.A. (author), Verwer, S.E. (author), Schmitt-Ulms, Fynn (author)
This paper reports on the first international competition on AI for the traveling salesman problem (TSP) at the International Joint Conference on Artificial Intelligence 2021 (IJCAI-21). The TSP is one of the classical combinatorial optimization problems, with many variants inspired by real-world applications. This first competition asked the...
journal article 2023
document
Tang, Shi Yuan (author), Irissappane, Athirai A. (author), Oliehoek, F.A. (author), Zhang, Jie (author)
Typically, a Reinforcement Learning (RL) algorithm focuses on learning a single deployable policy as the end product. Depending on the initialization methods and seed randomization, learning a single policy could possibly lead to convergence to different local optima across different runs, especially when the algorithm is sensitive to hyper...
journal article 2023
document
Chen, Yangkun (author), Yu, Chenghui (author), Zhu, Hengman (author), Liu, Shuai (author), Zhang, Yibing (author), Suarez, Joseph (author), Zhao, Liang (author), He, J. (author), Chen, Jiaxin (author)
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponents not seen during training. We summarize the competition design...
journal article 2023
document
Zhang, Y. (author), Negenborn, R.R. (author), Atasoy, B. (author)
The objective of this study is to address the issue of service time uncertainty in synchromodal freight transport, which can cause delays, inefficiencies, and reduced satisfaction for shippers. The proposed solution is an online deep Reinforcement Learning (RL) approach that takes into account the service time uncertainty, assisted by an...
journal article 2023
document
Zhang, Rongkai (author), Zhang, Cong (author), Cao, Zhiguang (author), Song, Wen (author), Tan, Puay Siew (author), Zhang, Jie (author), Wen, Bihan (author), Dauwels, J.H.G. (author)
We propose a manager-worker framework (the implementation of our model is publicly available at: https://github.com/zcaicaros/manager-worker-mtsptwr) based on deep reinforcement learning to tackle a hard yet nontrivial variant of Travelling Salesman Problem (TSP), i.e. multiple-vehicle TSP with time window and rejections (mTSPTWR), where...
journal article 2023
document
Baldi, S. (author), Zhang, Z. (author), Liu, Di (author)
We propose a new reinforcement learning method in the framework of Recursive Least Squares-Temporal Difference (RLS-TD). Instead of using the standard mechanism of eligibility traces (resulting in RLS-TD(λ)), we propose to use the forgetting factor commonly used in gradient-based or least-square estimation, and we show that...
journal article 2022
document
Zhang, Xinglong (author), Peng, Yaoqian (author), Pan, W. (author), Xu, Xin (author), Xie, Haibin (author)
Distributed model predictive control (DMPC) concerns how to online control multiple robotic systems with constraints effectively. However, the nonlinearity, nonconvexity, and strong interconnections of dynamic system models and constraints can make the real-time and real-world DMPC implementations nontrivial. Reinforcement learning (RL)...
conference paper 2022
document
Han, Yu (author), Hegyi, A. (author), Zhang, Le (author), He, Zhengbing (author), Chung, Edward (author), Liu, Pan (author)
Conventional reinforcement learning (RL) models of variable speed limit (VSL) control systems (and traffic control systems in general) cannot be trained in a real traffic process because new control actions are usually explored randomly, which may result in high costs (delays) due to exploration and learning. For this reason, existing RL-based...
journal article 2022
document
Tang, Shi Yuan (author), Oliehoek, F.A. (author), Irissappane, Athirai A. (author), Zhang, Jie (author)
Cross-Entropy Method (CEM) is a gradient-free direct policy search method, which has greater stability and is insensitive to hyperparameter tuning. CEM bears similarity to population-based evolutionary methods, but, rather than using a population it uses a distribution over candidate solutions (policies in our case). Usually, a natural...
conference paper 2021
document
Zhang, Q. (author), Pan, W. (author), Reppa, V. (author)
This paper presents a novel model-reference reinforcement learning algorithm for the intelligent tracking control of uncertain autonomous surface vehicles with collision avoidance. The proposed control algorithm combines a conventional control method with reinforcement learning to enhance control accuracy and intelligence. In the proposed...
journal article 2021
document
Zhang, Rongkai (author), Zhu, Jiang (author), Zha, Zhiyuan (author), Dauwels, J.H.G. (author), Wen, Bihan (author)
State-of-the-art image denoisers exploit various types of deep neural networks via deterministic training. Alternatively, very recent works utilize deep reinforcement learning for restoring images with diverse or unknown corruptions. Though deep reinforcement learning can generate effective policy networks for operator selection or architecture...
conference paper 2021
document
Han, Minghao (author), Tian, Yuan (author), Zhang, Lixian (author), Wang, J. (author), Pan, W. (author)
Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. Without using a mathematical model, an optimal controller can be learned from data evaluated by certain performance criteria through trial-and-error. However, the data-based learning approach is notorious for not guaranteeing stability, which is...
journal article 2021