AMRL: Aggregated Memory For Reinforcement Learning
In many partially observable scenarios, Reinforcement Learning (RL) agents must rely on long-term memory in order to learn an optimal policy. We demonstrate that using techniques from NLP and supervised learning fails at RL tasks due to stochasticity from the environment and from exploration. Utilizing our insights on the limitations of traditional memory methods in RL, we propose AMRL, a class of models that can learn better policies with greater sample efficiency and are resilient to noisy inputs. Specifically, our models use a standard memory module to summarize short-term context, and then aggregate all prior states from the standard model without respect to order. We show that this provides advantages both in terms of gradient decay and signal-to-noise ratio over time. Evaluating in Minecraft and maze environments that test long-term memory, we find that our model improves average return by 19% over a baseline that has the same number of parameters and by 9% over a stronger baseline that has far more parameters.
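Below is a minimal sketch of the aggregation idea described in the abstract, assuming a PyTorch-style implementation: an LSTM summarizes short-term context, and an order-invariant aggregator (here a running max, one of several possible choices) accumulates all prior LSTM outputs. The class name, layer sizes, and final combination are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch only: LSTM for short-term context plus an order-invariant aggregate
# over all prior steps, as described in the abstract. Details are assumptions.
import torch
import torch.nn as nn


class AMRLSketch(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int, num_actions: int):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(2 * hidden_dim, num_actions)

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (batch, time, obs_dim)
        h, _ = self.lstm(obs_seq)                   # short-term summary per step
        agg = torch.cummax(h, dim=1).values         # order-invariant running max over prior outputs
        return self.head(torch.cat([h, agg], dim=-1))  # combine both signals for the policy head


if __name__ == "__main__":
    model = AMRLSketch(obs_dim=16, hidden_dim=32, num_actions=4)
    logits = model(torch.randn(2, 100, 16))
    print(logits.shape)  # (2, 100, 4)
```

Because the aggregate ignores order, a useful observation seen many steps ago contributes to the current output without its signal being repeatedly transformed by the recurrent dynamics, which is the intuition behind the gradient-decay and signal-to-noise claims in the abstract.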
- Beck, Jacob
- Ciosek, Kamil
- Devlin, Sam
- Tschiatschek, Sebastian
- Zhang, Cheng
- Hofmann, Katja
Category | Paper in Conference Proceedings or in Workshop Proceedings (Poster)
Event Title | Eighth International Conference on Learning Representations (ICLR)
Divisions | Data Mining and Machine Learning
Event Location | Addis Ababa, Ethiopia
Event Type | Conference
Event Dates | 26.-30.04.2020
Date | 26 June 2020
Official URL | https://openreview.net/forum?id=Bkl7bREtDr