Thiago D. Simão

Please Note

This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged into a profile.

Conference paper (1)

Journal article (1)

2 records found

Scalable Safe Policy Improvement for Factored Multi-Agent MDPs

Conference paper (2024) - Federico Bianchi, Edoardo Zorzi, Alberto Castellini, Thiago D. Simão, Matthijs T.J. Spaan, Alessandro Farinelli

In this work, we focus on safe policy improvement in multi-agent domains where current state-of-the-art methods cannot be effectively applied because of large state and action spaces. We consider recent results using Monte Carlo Tree Search for Safe Policy Improvement with Baseli ...

Scalable Safe Policy Improvement via Monte Carlo Tree Search

Journal article (2023) - Alberto Castellini, Federico Bianchi, Edoardo Zorzi, Thiago D. Simão, Alessandro Farinelli, Matthijs T.J. Spaan

Algorithms for safely improving policies are important to deploy reinforcement learning approaches in real-world scenarios. In this work, we propose an algorithm, called MCTS-SPIBB, that computes safe policy improvement online using a Monte Carlo Tree Search based strategy. We th ...