MDP Abstraction with Successor Features
Abstraction plays an important role in the generalisation of knowledge and skills, and is key to sample-efficient learning and planning. For many complex, long-horizon planning problems, such as the game Minecraft, abstraction can be particularly useful: an abstract plan can first be formed and then instantiated by filling in the necessary low-level details, and such abstract plans often generalise well to related new problem settings. In this work, we study temporal and state abstraction in reinforcement learning, where temporal abstractions represent temporally extended actions in the form of options, while state abstraction induces abstract MDPs by aggregating similar states into abstract states. Many existing abstraction schemes overlook the relation between state and temporal abstraction; consequently, acquired option policies often cannot be directly transferred to new environments due to changes in the state space and transition dynamics. To address these issues, we propose successor abstraction, a novel abstraction scheme building on successor features. This includes an algorithm for the encoding and instantiation of abstract options across different environments, and a state abstraction mechanism based on the abstract options. Our abstraction scheme allows us to create abstract environment models with semantics that are transferable across different environments through the encoding and instantiation of abstract options. Empirically, we achieve better transfer and improved performance on a set of benchmark tasks compared to state-of-the-art baselines.
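As background for the abstract, the following is a minimal tabular sketch of successor features (SFs), the construct the proposed successor abstraction builds on. It is not the authors' implementation; all names (phi_dim, td_update_sf, q_values) and the grid sizes are illustrative assumptions. It relies on the standard SF premise that rewards decompose linearly as r(s, a) = phi(s, a) . w, so that Q^pi(s, a) = psi^pi(s, a) . w.

```python
import numpy as np

# Illustrative sketch of tabular successor features, not the paper's code.
# Assumption: reward factors as r(s, a) = phi(s, a) . w, hence
# Q^pi(s, a) = psi^pi(s, a) . w, where psi^pi is the expected
# discounted sum of features under policy pi.

n_states, n_actions, phi_dim = 25, 4, 8  # hypothetical sizes
gamma, alpha = 0.95, 0.1

# psi[s, a] estimates E[ sum_t gamma^t * phi(s_t, a_t) | s_0=s, a_0=a, pi ].
psi = np.zeros((n_states, n_actions, phi_dim))

def td_update_sf(s, a, phi_sa, s_next, a_next):
    """One TD(0) step on the successor features:
    psi(s, a) <- psi(s, a) + alpha * (phi + gamma * psi(s', a') - psi(s, a))."""
    target = phi_sa + gamma * psi[s_next, a_next]
    psi[s, a] += alpha * (target - psi[s, a])

def q_values(s, w):
    """Recover Q-values for any task with reward weights w,
    reusing the learned dynamics summary: Q(s, .) = psi(s, .) . w."""
    return psi[s] @ w
```

Because psi captures the environment dynamics separately from the reward weights w, the same learned psi can be re-evaluated against a new w to transfer to a related task; the paper's successor abstraction extends this idea to encode and instantiate abstract options across environments.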
Authors |
- Han, Dongge
- Wooldridge, Michael
- Tschiatschek, Sebastian
Category | Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title | AAAI-22 Workshop on Reinforcement Learning in Games
Divisions | Data Mining and Machine Learning
Subjects | Artificial Intelligence (Kuenstliche Intelligenz)
Event Location | Virtual
Event Type | Workshop
Event Dates | 28.02.2022
Date | 28 February 2022