Search results | TU Delft Repositories

Learning Human Preferences for Physical Human-Robot Cooperation

van der Spaa, L.F. (author)

Physical human-robot cooperation (pHRC) has the potential to combine human and robot strengths in a team that can achieve more than a human and a robot working on the task separately. However, how much of the potential can be realized depends on the quality of cooperation, in which awareness of the partner’s intention and preferences plays an...

doctoral thesis 2024

Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs

Murti, Fahri Wisnu (author), Ali, Samad (author), Iosifidis, G. (author), Latva-aho, Matti (author)

Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central...

journal article 2024

Learning-Based Multi-UAV Flocking Control With Limited Visual Field and Instinctive Repulsion

Bai, Chengchao (author), Yan, Peng (author), Piao, Haiyin (author), Pan, W. (author), Guo, Jifeng (author)

This article explores deep reinforcement learning (DRL) for the flocking control of unmanned aerial vehicle (UAV) swarms. The flocking control policy is trained using a centralized-learning-decentralized-execution (CTDE) paradigm, where a centralized critic network augmented with additional information about the entire UAV swarm is utilized...

journal article 2024

Automatic enhancement of vascular configuration for self-healing concrete through reinforcement learning approach

Wan, Z. (author), Xu, Y. (author), Chang, Z. (author), Liang, M. (author), Šavija, B. (author)

Vascular self-healing concrete (SHC) has great potential to mitigate the environmental impact of the construction industry by increasing the durability of structures. Designing concrete with high initial mechanical properties by searching a specific arrangement of vascular structure is of great importance. Herein, an automatic optimization...

journal article 2024

Approximate dynamic programming for constrained linear systems: A piecewise quadratic approximation approach

He, K. (author), Shi, S. (author), van den Boom, A.J.J. (author), De Schutter, B.H.K. (author)

Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies...

journal article 2024

An adaptive agent-based approach for instant delivery order dispatching: Incorporating task buffering and dynamic batching strategies

Lu, Miaojia (author), Yan, Xinyu (author), Sharif Azadeh, S. (author), Wang, P. (author)

The volume of instant delivery has grown significantly in recent years. Given the involvement of numerous heterogeneous stakeholders, instant delivery operations are inherently characterized by dynamics and uncertainties. This study introduces two order dispatching strategies, namely task buffering and dynamic batching, as...

journal article 2024

Analysis of the impact of traffic density on training of reinforcement learning based conflict resolution methods for drones

Groot, D.J. (author), Ellerbroek, Joost (author), Hoekstra, J.M. (author)

Conventional Air Traffic Control is still predominantly performed by human Air Traffic Controllers; however, as the traffic density increases, the workload of the controllers increases as well. Especially for the area of unmanned aviation, driven by the rise in drones, having human controllers might become infeasible. One of the methods that...

journal article 2024

Condition-Based Maintenance scheduling of an aircraft fleet under partial observability: A Deep Reinforcement Learning approach

Tseremoglou, I. (author), Santos, Bruno F. (author)

In the Condition-Based Maintenance (CBM) context, the definition of optimal maintenance plans for an aircraft fleet depends on an efficient integration of: (i) the probabilistic predictions of the health condition of the components and (ii) the stochastic arrival of the corrective maintenance tasks, together with consideration of the...

journal article 2024

Collaboratively Setting Daily Step Goals with a Virtual Coach: Using Reinforcement Learning to Personalize Initial Proposals

Dierikx, M. (author), Albers, N. (author), Scheltinga, Bouke (author), Brinkman, W.P. (author)

Goal-setting is commonly used in behavior change applications for physical activity. However, for goals to be effective, they need to be tailored to a user’s situation (e.g., motivation, progress). One way to obtain such goals is a collaborative process in which a healthcare professional and client set a goal together, thus making use of the...

conference paper 2024

HAS-RL: A Hierarchical Approximate Scheme Optimized With Reinforcement Learning for NoC-Based NN Accelerators

Li, Siyue (author), Zhou, Shize (author), Xue, Yongqi (author), Fan, Wenjie (author), Cheng, Tong (author), Ji, Jinlun (author), Dai, Chenyang (author), Song, Wenqing (author), Gao, C. (author)

Network-on-Chip (NoC) is a scalable on-chip communication architecture for the NN accelerator, but with the increase in the number of nodes, the communication delay becomes higher. Applications such as machine learning have a certain resilience to noisy/erroneous transmitted data. Therefore, approximate communication becomes a promising solution...

journal article 2024

Hierarchize Pareto Dominance in Multi-Objective Stochastic Linear Bandits

Cheng, Ji (author), Xue, Bo (author), Jiaxiang, Y. (author), Zhang, Qingfu (author)

The multi-objective stochastic linear bandit (MOSLB) plays a critical role in the sequential decision-making paradigm; however, most existing methods focus on the Pareto dominance among different objectives without considering any priority. In this paper, we study bandit algorithms under mixed Pareto-lexicographic orders, which can reflect...

journal article 2024

Synergetic-informed deep reinforcement learning for sustainable management of transportation networks with large action spaces

Lai, Li (author), Dong, You (author), Andriotis, C. (author), Wang, Aijun (author), Lei, Xiaoming (author)

Effective transportation network management systems should consider safety and sustainability objectives. Existing research on large-scale transportation network management often employs the assumption that bridges can be considered individually under these objectives. However, this simplification misses accurate system-level representations,...

journal article 2024

Generalized Model and Deep Reinforcement Learning-Based Evolutionary Method for Multitype Satellite Observation Scheduling

Song, Yanjie (author), Ou, Junwei (author), Pedrycz, Witold (author), Suganthan, Ponnuthurai Nagaratnam (author), Wang, X. (author), Xing, Lining (author), Zhang, Yue (author)

Multitype satellite observation, including optical observation satellites, synthetic aperture radar (SAR) satellites, and electromagnetic satellites, has become an important direction in integrated satellite applications due to its ability to cope with various complex situations. In the multitype satellite observation scheduling problem ...

journal article 2024

Cooperative lane-changing in mixed traffic: a deep reinforcement learning approach

Yao, X. (author), Du, Zhaocheng (author), Sun, Zhanbo (author), Calvert, S.C. (author), Ji, Ang (author)

Deep Reinforcement Learning (DRL) has made remarkable progress in autonomous vehicle decision-making and execution control to improve traffic performance. This paper introduces a DRL-based mechanism for cooperative lane changing in mixed traffic (CLCMT) for connected and automated vehicles (CAVs). The uncertainty of human-driven vehicles (HVs...

journal article 2024

Conflict Resolution at High Traffic Densities with Reinforcement Learning

Ribeiro, M.J. (author)

Increasing delays and congestion reported in many aviation sectors indicate that the current centralised operational model is rapidly approaching saturation levels. The Air Traffic Control (ATC) system is not expected to keep pace with the ever-increasing demand for air transportation. Its capacity is still limited by the available controllers, and...

doctoral thesis 2023

Distributed Actor-Critic Algorithms for Multiagent Reinforcement Learning Over Directed Graphs

Dai, Pengcheng (author), Yu, Wenwu (author), Wang, He (author), Baldi, S. (author)

Actor-critic (AC) cooperative multiagent reinforcement learning (MARL) over directed graphs is studied in this article. The goal of the agents in MARL is to maximize the globally averaged return in a distributed way, i.e., each agent can only exchange information with its neighboring agents. AC methods proposed in the literature require the...

journal article 2023

CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration

Yang, Q. (author), Spaan, M.T.J. (author)

Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a...

conference paper 2023

Spiking Neural-Networks-Based Data-Driven Control

Liu, Y. (author), Pan, W. (author)

Machine learning can be effectively applied in control loops to make optimal control decisions robustly. There is increasing interest in using spiking neural networks (SNNs) as the apparatus for machine learning in control engineering because SNNs can potentially offer high energy efficiency, and new SNN-enabling neuromorphic hardware is being...

journal article 2023

A Comparative Study of Optimization Models for Condition-Based Maintenance Scheduling of an Aircraft Fleet

Tseremoglou, I. (author), van Kessel, Paul J. (author), Santos, Bruno F. (author)

Condition-based maintenance (CBM) scheduling of an aircraft fleet in a disruptive environment while considering health prognostics for a set of systems is a very complex combinatorial problem, which is becoming more challenging in light of the uncertainty included in health prognostics. This type of problem falls under the broad category of...

journal article 2023

Fleet planning under demand and fuel price uncertainty using actor–critic reinforcement learning

Geursen, Isaak L. (author), Santos, Bruno F. (author), Yorke-Smith, N. (author)

Current state-of-the-art airline planning models face computational limitations, restricting the operational applicability to problems of representative sizes. This is particularly the case when considering the uncertainty necessarily associated with the long-term plan of an aircraft fleet. Considering the growing interest in the application of...

journal article 2023
