- document
-
Tang, Shi Yuan (author), Irissappane, Athirai A. (author), Oliehoek, F.A. (author), Zhang, Jie (author)Typically, a Reinforcement Learning (RL) algorithm focuses in learning a single deployable policy as the end product. Depending on the initialization methods and seed randomization, learning a single policy could possibly leads to convergence to different local optima across different runs, especially when the algorithm is sensitive to hyper...journal article 2023