Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. SAHSS+19. https://arxiv.org/abs/1911.08265
Find out more on the Robustly Beneficial Wiki: https://robustlybeneficial.org/wiki/index.php?title=Reinforcement_learning
Next week's paper is: A Roadmap for Robust End-to-End Alignment. LN Hoang 18. https://arxiv.org/abs/1809.01036