Standing Still Is Not an Option: Alternative Baselines for Attainable Utility Preservation
Specifying reward functions without causing side effects remains an open challenge in reinforcement learning. Attainable Utility Preservation (AUP) is a promising approach for preserving the ability to optimize a correct reward function and thereby minimizing negative side effects. Current approaches, however, assume the existence of a no-op action in the environment's action space, which restricts AUP to tasks where doing nothing for a single time step is a viable option. Depending on the environment, this cannot always be guaranteed. We introduce four alternative baselines that do not rely on such an action and therefore extend AUP to a broader class of environments. We evaluate all introduced variants on different AI safety gridworlds and show that this approach generalizes AUP to a broader range of tasks with only small performance losses.
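For context, the AUP penalty referenced in the abstract can be sketched as a scaled mean change in auxiliary Q-values relative to a baseline; in prior work that baseline is the no-op action, and the paper's contribution is to substitute alternative baselines. The Python sketch below is illustrative only: the function and parameter names (aup_penalty, aux_q_values, baseline_q_values, lambda_) are our own and not taken from the paper or its code.

```python
# Illustrative sketch of an AUP-style penalty with a pluggable baseline.
# Assumption: auxiliary Q-values for the current state-action pair and for
# the chosen baseline are already available (e.g. from learned Q-functions).
import numpy as np

def aup_penalty(aux_q_values, baseline_q_values, lambda_=0.1):
    """Scaled mean absolute change in attainable utility.

    aux_q_values:      Q_i(s, a) for each auxiliary reward function i
    baseline_q_values: Q_i of the baseline (the no-op action in prior work;
                       the paper substitutes alternative baselines here)
    lambda_:           penalty scaling coefficient
    """
    aux_q_values = np.asarray(aux_q_values, dtype=float)
    baseline_q_values = np.asarray(baseline_q_values, dtype=float)
    return lambda_ * np.mean(np.abs(aux_q_values - baseline_q_values))

def shaped_reward(task_reward, aux_q_values, baseline_q_values, lambda_=0.1):
    """Task reward minus the attainable-utility penalty."""
    return task_reward - aup_penalty(aux_q_values, baseline_q_values, lambda_)

# Example usage with made-up numbers: a small penalty is subtracted when the
# chosen action shifts attainable utility away from the baseline.
print(shaped_reward(1.0, [0.8, 0.2, 0.5], [0.7, 0.3, 0.5]))
```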
Authors:
- Eresheim, Sebastian
- Kovac, Fabian
- Adrowitzer, Alexander
Category: Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title: Cross-Domain Conference for Machine Learning & Knowledge Extraction 2020 (CD-MAKE 2020)
Divisions: Security and Privacy
Subjects: Applied Computer Science (Angewandte Informatik)
Event Location: Dublin, Ireland
Event Type: Conference
Event Dates: 25-28 Aug 2020
Page Range: pp. 239-257
Date: 21 August 2024
Official URL: https://cd-make-2020.archive.sba-research.org/