Searched for: subject:"reinforcement learning"
(1 - 20 of 196)


document
van der Spaa, L.F. (author)
Physical human-robot cooperation (pHRC) has the potential to combine human and robot strengths in a team that can achieve more than a human and a robot working on the task separately. However, how much of the potential can be realized depends on the quality of cooperation, in which awareness of the partner’s intention and preferences plays an...
doctoral thesis 2024
document
Murti, Fahri Wisnu (author), Ali, Samad (author), Iosifidis, G. (author), Latva-aho, Matti (author)
Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central...
journal article 2024
document
Bai, Chengchao (author), Yan, Peng (author), Piao, Haiyin (author), Pan, W. (author), Guo, Jifeng (author)
This article explores deep reinforcement learning (DRL) for the flocking control of unmanned aerial vehicle (UAV) swarms. The flocking control policy is trained using a centralized-learning-decentralized-execution (CTDE) paradigm, where a centralized critic network augmented with additional information about the entire UAV swarm is utilized...
journal article 2024
document
Wan, Z. (author), Xu, Y. (author), Chang, Z. (author), Liang, M. (author), Šavija, B. (author)
Vascular self-healing concrete (SHC) has great potential to mitigate the environmental impact of the construction industry by increasing the durability of structures. Designing concrete with high initial mechanical properties by searching a specific arrangement of vascular structure is of great importance. Herein, an automatic optimization...
journal article 2024
document
He, K. (author), Shi, S. (author), van den Boom, A.J.J. (author), De Schutter, B.H.K. (author)
Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies...
journal article 2024
document
Lu, Miaojia (author), Yan, Xinyu (author), Sharif Azadeh, S. (author), Wang, P. (author)
The volume of instant delivery has witnessed a significant growth in recent years. Given the involvement of numerous heterogeneous stakeholders, instant delivery operations are inherently characterized by dynamics and uncertainties. This study introduces two order dispatching strategies, namely task buffering and dynamic batching, as...
journal article 2024
document
Groot, D.J. (author), Ellerbroek, Joost (author), Hoekstra, J.M. (author)
Conventional Air Traffic Control is still predominantly being done by human Air Traffic Controllers; however, as the traffic density increases, the workload of the controllers increases as well. Especially for the area of unmanned aviation, driven by the rise in drones, having human controllers might become unfeasible. One of the methods that...
journal article 2024
document
Tseremoglou, I. (author), Santos, Bruno F. (author)
In the Condition-Based Maintenance (CBM) context, the definition of optimal maintenance plans for an aircraft fleet depends on an efficient integration of: (i) the probabilistic predictions of the health condition of the components and (ii) the stochastic arrival of the corrective maintenance tasks, together with consideration of the...
journal article 2024
document
Dierikx, M. (author), Albers, N. (author), Scheltinga, Bouke (author), Brinkman, W.P. (author)
Goal-setting is commonly used in behavior change applications for physical activity. However, for goals to be effective, they need to be tailored to a user’s situation (e.g., motivation, progress). One way to obtain such goals is a collaborative process in which a healthcare professional and client set a goal together, thus making use of the...
conference paper 2024
document
Li, Siyue (author), Zhou, Shize (author), Xue, Yongqi (author), Fan, Wenjie (author), Cheng, Tong (author), Ji, Jinlun (author), Dai, Chenyang (author), Song, Wenqing (author), Gao, C. (author)
Network-on-Chip (NoC) is a scalable on-chip communication architecture for the NN accelerator, but with the increase in the number of nodes, the communication delay becomes higher. Applications such as machine learning have a certain resilience to noisy/erroneous transmitted data. Therefore, approximate communication becomes a promising solution...
journal article 2024
document
Cheng, Ji (author), Xue, Bo (author), Jiaxiang, Y. (author), Zhang, Qingfu (author)
Multi-objective Stochastic Linear bandit (MOSLB) plays a critical role in the sequential decision-making paradigm; however, most existing methods focus on the Pareto dominance among different objectives without considering any priority. In this paper, we study bandit algorithms under mixed Pareto-lexicographic orders, which can reflect...
journal article 2024
document
Lai, Li (author), Dong, You (author), Andriotis, C. (author), Wang, Aijun (author), Lei, Xiaoming (author)
Effective transportation network management systems should consider safety and sustainability objectives. Existing research on large-scale transportation network management often employs the assumption that bridges can be considered individually under these objectives. However, this simplification misses accurate system-level representations,...
journal article 2024
document
Song, Yanjie (author), Ou, Junwei (author), Pedrycz, Witold (author), Suganthan, Ponnuthurai Nagaratnam (author), Wang, X. (author), Xing, Lining (author), Zhang, Yue (author)
Multitype satellite observation, including optical observation satellites, synthetic aperture radar (SAR) satellites, and electromagnetic satellites, has become an important direction in integrated satellite applications due to its ability to cope with various complex situations. In the multitype satellite observation scheduling problem...
journal article 2024
document
Yao, X. (author), Du, Zhaocheng (author), Sun, Zhanbo (author), Calvert, S.C. (author), Ji, Ang (author)
Deep Reinforcement Learning (DRL) has made remarkable progress in autonomous vehicle decision-making and execution control to improve traffic performance. This paper introduces a DRL-based mechanism for cooperative lane changing in mixed traffic (CLCMT) for connected and automated vehicles (CAVs). The uncertainty of human-driven vehicles (HVs...
journal article 2024
document
Ribeiro, M.J. (author)
Increasing delays and congestion reported in many aviation sectors indicate that the current centralised operational model is rapidly approaching saturation levels. Air Traffic Control (ATC) system is not expected to keep pace with the ever-increasing demand for air transportation. Its capacity is still limited by the available controllers, and...
doctoral thesis 2023
document
Dai, Pengcheng (author), Yu, Wenwu (author), Wang, He (author), Baldi, S. (author)
Actor-critic (AC) cooperative multiagent reinforcement learning (MARL) over directed graphs is studied in this article. The goal of the agents in MARL is to maximize the globally averaged return in a distributed way, i.e., each agent can only exchange information with its neighboring agents. AC methods proposed in the literature require the...
journal article 2023
document
Yang, Q. (author), Spaan, M.T.J. (author)
Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a...
conference paper 2023
document
Liu, Y. (author), Pan, W. (author)
Machine learning can be effectively applied in control loops to make optimal control decisions robustly. There is increasing interest in using spiking neural networks (SNNs) as the apparatus for machine learning in control engineering because SNNs can potentially offer high energy efficiency, and new SNN-enabling neuromorphic hardware is being...
journal article 2023
document
Tseremoglou, I. (author), van Kessel, Paul J. (author), Santos, Bruno F. (author)
Condition-based maintenance (CBM) scheduling of an aircraft fleet in a disruptive environment while considering health prognostics for a set of systems is a very complex combinatorial problem, which is becoming more challenging in light of the uncertainty included in health prognostics. This type of problem falls under the broad category of...
journal article 2023
document
Geursen, Isaak L. (author), Santos, Bruno F. (author), Yorke-Smith, N. (author)
Current state-of-the-art airline planning models face computational limitations, restricting the operational applicability to problems of representative sizes. This is particularly the case when considering the uncertainty necessarily associated with the long-term plan of an aircraft fleet. Considering the growing interest in the application of...
journal article 2023