Tang, Shi Yuan (author), Irissappane, Athirai A. (author), Oliehoek, F.A. (author), Zhang, Jie (author)
Typically, a Reinforcement Learning (RL) algorithm focuses on learning a single deployable policy as the end product. Depending on the initialization methods and seed randomization, learning a single policy can lead to convergence to different local optima across different runs, especially when the algorithm is sensitive to hyper...
journal article 2023
Tang, Shi Yuan (author), Oliehoek, F.A. (author), Irissappane, Athirai A. (author), Zhang, Jie (author)
The Cross-Entropy Method (CEM) is a gradient-free direct policy search method, which has greater stability and is insensitive to hyperparameter tuning. CEM bears similarity to population-based evolutionary methods, but rather than using a population, it uses a distribution over candidate solutions (policies in our case). Usually, a natural...
conference paper 2021