S. Stroobants | TU Delft Repository

Evolving Behaviour Trees to Control a Swarm of Flapping-Wing Micro Aerial Vehicles for Greenhouse Exploration

Master thesis (2025) - L.H. Uptmoor, S. Stroobants, M. Popovic, G.C.H.E. de Croon

Micro aerial vehicles have recently shown promise for further automating food production in greenhouses. Compared to conventional multirotor drones, flapping-wing drones offer safe and robust operation around plants due to their soft, slowly moving wings. Their limited sensing and computational capabilities, however, prohibit the use of map-based navigation methods. Swarming compensates for these individual shortcomings by providing scalability and redundancy. This work proposes a hardware setup combining time-of-flight (ToF) and ultra-wideband (UWB) sensing and explores the artificial evolution of behaviour trees as a reactive planning strategy. Genetic programming, paired with CMA-ES fine-tuning, improved a human-designed exploration strategy by 50%. Neuroevolution was investigated to encourage emergent swarming behaviours, but requires further experimentation in combination with behaviour trees. The solution obtained in simulation can be readily ported to hardware, but a reality gap in performance persists. These findings contribute to the development of lightweight, scalable aerial systems for autonomous greenhouse monitoring. ...
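
As an illustration of a behaviour tree used as a reactive planner, the following is a minimal sketch and not the thesis implementation. The node classes, the `tof_front` blackboard key, the command names, and the 0.5 m threshold are all hypothetical. The tree structure is the kind of thing genetic programming could rearrange, and the numeric threshold is the kind of parameter a tuner such as CMA-ES could adjust.

```python
from typing import Callable, Dict, List

SUCCESS, FAILURE = "success", "failure"
Blackboard = Dict[str, object]  # shared sensor readings and output command


class Condition:
    """Leaf that checks a predicate on the blackboard."""
    def __init__(self, predicate: Callable[[Blackboard], bool]):
        self.predicate = predicate

    def tick(self, bb: Blackboard) -> str:
        return SUCCESS if self.predicate(bb) else FAILURE


class Action:
    """Leaf that writes a (hypothetical) motion command and succeeds."""
    def __init__(self, command: str):
        self.command = command

    def tick(self, bb: Blackboard) -> str:
        bb["command"] = self.command
        return SUCCESS


class Sequence:
    """Composite that fails as soon as one child fails."""
    def __init__(self, children: List):
        self.children = children

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == FAILURE:
                return FAILURE
        return SUCCESS


class Selector:
    """Composite that succeeds as soon as one child succeeds."""
    def __init__(self, children: List):
        self.children = children

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == SUCCESS:
                return SUCCESS
        return FAILURE


def build_tree(threshold_m: float = 0.5) -> Selector:
    # "If an obstacle is close, turn; otherwise fly forward."
    avoid = Sequence([
        Condition(lambda bb: bb["tof_front"] < threshold_m),
        Action("turn"),
    ])
    return Selector([avoid, Action("forward")])


def decide(tree: Selector, tof_front: float) -> str:
    """Tick the tree once for a single front range reading (metres)."""
    bb: Blackboard = {"tof_front": tof_front}
    tree.tick(bb)
    return str(bb["command"])
```

Calling `decide(build_tree(0.5), 0.2)` selects the turn branch, while a reading of 2.0 m falls through to the forward action. The tree is re-evaluated from the root on every tick, which is what makes the strategy reactive.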

Guidance and Control Implementation with Spiking Neural Networks

A feasibility study

Master thesis (2024) - T. Avarvarei, C. de Wagter, S. Stroobants, R. Ferede

Quadrotors have continuously leveraged artificial intelligence for navigation and decision-making. Moreover, neuromorphic computing, specifically Spiking Neural Networks (SNNs), is considered an energy-efficient solution during inference. This study analyses the effects of implementing SNNs for mimicking energy-optimal guidance and control. To achieve this, population encoding is used, and an equivalent of 7-8 spiking neurons per conventional neuron is found to preserve most of the information. The equivalent controller prefers fast adaptation, which requires small spiking threshold values and minimal reliance on past information. To improve controller performance, dataset selection is of utmost importance, requiring a careful trade-off between excessive race track customisation and generalisability. The results show that learning is feasible and that SNN performance approaches conventional state-of-the-art models trained with multi-layer perceptrons. The current analysis represents an important step towards the rapid guidance and control of ultra-small, energy-efficient quadrotors. ...
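
The abstract does not state which population encoding scheme is used. As a hedged illustration, the sketch below shows one common form, Gaussian tuning curves with evenly spaced centres, using 8 neurons per scalar value. The function names, the value range, and the tuning width are assumptions made for this example, and the outputs are firing rates rather than individual spikes.

```python
import math
from typing import List, Optional


def centres(n: int, lo: float, hi: float) -> List[float]:
    """Evenly spaced preferred values for n neurons over [lo, hi]."""
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]


def encode(x: float, n: int = 8, lo: float = -1.0, hi: float = 1.0,
           sigma: Optional[float] = None) -> List[float]:
    """Map a scalar to n normalised firing rates via Gaussian tuning curves."""
    if sigma is None:
        sigma = (hi - lo) / (n - 1)  # assumed width: one centre spacing
    return [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))
            for c in centres(n, lo, hi)]


def decode(rates: List[float], lo: float = -1.0, hi: float = 1.0) -> float:
    """Recover the scalar as the rate-weighted mean of the centres."""
    cs = centres(len(rates), lo, hi)
    return sum(r * c for r, c in zip(rates, cs)) / sum(rates)
```

For example, `decode(encode(0.3))` returns a value close to 0.3. With fewer neurons the reconstruction becomes coarser, which is the kind of trade-off a neurons-per-value count is meant to capture.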

Reinforcement Learning for Spiking Neural Networks

Recurrent Reinforcement Learning with Surrogate Gradients

Master thesis (2024) - K.C.M. Van den Berghe, G.C.H.E. de Croon, S. Stroobants, C. de Wagter, D. Zarouchas

Enabling embodied intelligence in robotics presents several unique challenges. A first major concern is the need for energy efficiency, low latency, and strong temporal reasoning to facilitate effective real-world interaction. Neuromorphic computing has garnered attention as a potential solution to these problems. Secondly, when using deep neural networks, it is hard to shape a learning signal due to the goal-oriented nature of robotics. Reinforcement learning (RL) presents itself as a framework that leverages goal-directed reward functions to create this learning signal.
A key challenge with recurrent and spiking neural networks trained via RL is achieving a baseline performance stable enough to create sequences long enough to stabilize hidden states. This stabilization is crucial for processing sequences that extend beyond the initial warm-up period of the temporal network. In this article, an online RL approach is proposed that enables temporal training with minimal changes to existing online algorithms by introducing a secondary guiding policy whose sole objective is to prevent episode termination before the warm-up period is complete. This framework is demonstrated to outperform offline RL methods and to significantly improve the wall-clock time of online RL methods adapted to sample sequences rather than single transitions. Next, the effect of surrogate gradients as a technique for translating the learning signal from the RL framework to weight updates is analyzed. It is found that the slope parametrizing the surrogate gradient plays a crucial role in online RL settings and can be exploited as an exploration mechanism. ...
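
To make the role of the slope concrete, the sketch below uses a fast-sigmoid surrogate, one widely used choice. The abstract does not say which surrogate function the thesis adopts, so the functional form, the threshold of 1.0, and the default slope are assumptions. The forward pass emits a hard spike, whose true derivative is zero almost everywhere, and the backward pass substitutes a smooth function of the distance to threshold.

```python
def spike(v: float, threshold: float = 1.0) -> float:
    """Forward pass: Heaviside step on the membrane potential."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v: float, threshold: float = 1.0,
                   slope: float = 10.0) -> float:
    """Backward pass: fast-sigmoid surrogate for d(spike)/dv.

    A larger slope concentrates the gradient near the threshold;
    a smaller slope spreads it to potentials far from the threshold.
    """
    return 1.0 / (1.0 + slope * abs(v - threshold)) ** 2
```

At `v = 0.5`, a slope of 2 gives a surrogate gradient of 0.25, whereas a slope of 20 gives roughly 0.008. A shallow slope therefore lets weight updates reach neurons that are far from firing, while a steep slope restricts updates to neurons near threshold.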

Neuro-evolution learned neuromorphic control for a vision-based 3D landing

Master thesis (2023) - E. Lodder, G.C.H.E. de Croon, S. Stroobants

Evolving Spiking Neural Networks to Mimic PID Control

Applied to Autonomous Blimps

Master thesis (2023) - T. Burgers, G.C.H.E. de Croon, S. Stroobants, C. de Wagter, A. Bombelli

In recent years, Artificial Neural Networks (ANNs) have become a standard in robotic control. However, a significant drawback of large-scale ANNs is their increased power consumption. This becomes a critical concern when designing autonomous aerial vehicles, given the stringent constraints on power and weight. Especially in the case of blimps, known for their extended endurance, power-efficient control methods are essential. Spiking neural networks (SNNs) can provide a solution, facilitating energy-efficient and asynchronous event-driven processing.
In this paper, we evolved SNNs for accurate altitude control of a non-neutrally buoyant indoor blimp, relying solely on onboard sensing and processing power. The blimp's altitude tracking performance significantly improved compared to prior research, showing reduced oscillations and a minimal steady-state error. The parameters of the SNNs were optimized via an evolutionary algorithm, using a Proportional-Integral-Derivative (PID) controller as the target signal. We developed two complementary SNN controllers while examining various hidden layer structures. The first controller responds swiftly to control errors, mitigating overshooting and oscillations, while the second minimizes steady-state errors due to non-neutral buoyancy-induced drift. Despite the blimp's drivetrain limitations, our SNN controllers ensured stable altitude control, employing only 160 spiking neurons. ...
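
The idea of using a PID controller as the target signal can be sketched as follows. This is a minimal illustration and not the authors' code: the discrete PID form, the gains, the time step, and the mean-squared-error fitness are assumptions, and `candidate` stands in for any controller (for example an SNN) that maps an altitude error to a command. An evolutionary algorithm would then search for candidate parameters that minimise this loss.

```python
from typing import Callable, Sequence


class PID:
    """Textbook discrete PID controller acting on an error signal."""
    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error: float) -> float:
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def imitation_loss(candidate: Callable[[float], float],
                   errors: Sequence[float], pid: PID) -> float:
    """Mean squared difference between candidate and PID outputs.

    Lower is better; a value of 0 means the candidate reproduces
    the PID command on this error sequence exactly.
    """
    total = 0.0
    for e in errors:
        target = pid.step(e)
        total += (candidate(e) - target) ** 2
    return total / len(errors)
```

For instance, with a purely proportional target `PID(2.0, 0.0, 0.0, 0.1)`, the candidate `lambda e: 2.0 * e` reaches a loss of zero, while a candidate that always outputs zero does not.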