Multi-objective optimization perspectives on reinforcement learning algorithms using reward vectors

Proceedings article published in 2015 by Madalina M. Drugan
This paper is available in a repository.


Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

Reinforcement learning is an area of machine learning that studies which actions an agent should take to optimize a cumulative reward function. Recently, a new class of reinforcement learning algorithms with multiple, possibly conflicting, reward functions has been proposed. We call this class of algorithms the multi-objective reinforcement learning (MORL) paradigm. We give an overview of the multi-objective optimization techniques imported into MORL and of its simplified theoretical variant with a single state, namely the multi-objective multi-armed bandits (MOMAB) paradigm.
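
As an illustration of what conflicting reward vectors imply in the single-state (MOMAB) setting, the sketch below compares arms by Pareto dominance, a standard multi-objective optimization relation. The abstract does not say which techniques the paper covers, so the choice of Pareto dominance is an assumption, and the function names and arm values are hypothetical.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if reward vector u Pareto-dominates v (maximization).

    u dominates v when u is at least as good in every objective
    and strictly better in at least one.
    """
    at_least_as_good = all(a >= b for a, b in zip(u, v))
    strictly_better = any(a > b for a, b in zip(u, v))
    return at_least_as_good and strictly_better


def pareto_front(mean_rewards: List[Sequence[float]]) -> List[int]:
    """Return indices of arms whose mean reward vector no other arm dominates."""
    return [
        i
        for i, u in enumerate(mean_rewards)
        if not any(dominates(v, u) for j, v in enumerate(mean_rewards) if j != i)
    ]


# Hypothetical mean reward vectors for four arms with two objectives each.
arm_means = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4)]
front = pareto_front(arm_means)
print(front)
```

With these values, the last arm is dominated by the second one, while the first three arms are mutually incomparable. This is why a multi-objective bandit generally has a set of optimal arms rather than a single best arm.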