Reinforcement Learning for Bandits with Continuous Actions and Large Context Spaces

Duckworth, Paul; Vallis, Katherine A.; Lacerda, Bruno; Hawes, Nick

Published in: IOS Press, Frontiers in Artificial Intelligence and Applications, 2023

DOI: 10.3233/faia230320

Book chapter published in 2023 by Paul Duckworth, Katherine A. Vallis, Bruno Lacerda, Nick Hawes

This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving forbidden

Abstract

We consider the challenging scenario of contextual bandits with continuous actions and large context spaces. This is an increasingly important application area in personalised healthcare, where an agent is asked to make dosing decisions based on a single image scan of a patient. In this paper, we first adapt a reinforcement learning (RL) algorithm for continuous control and find that it outperforms contextual bandit algorithms hand-crafted specifically for continuous action spaces, as demonstrated empirically on a suite of standard benchmark datasets with vector contexts. Second, we demonstrate that our RL agent generalises from problems with continuous actions to problems with large context spaces, with results that outperform previous methods on image contexts. Third, we introduce a new contextual bandits test domain with a multi-dimensional continuous action space and image contexts, which existing tree-based methods cannot handle, and provide initial results with our RL agent.
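
To make the setting concrete: a contextual bandit with continuous actions can be viewed as an RL problem whose episodes last a single step. The agent observes a context, picks a real-valued action, receives one reward, and the episode ends. The sketch below is only an illustration of that framing and is not the algorithm from the paper. It swaps in a simple one-step REINFORCE-style update with a Gaussian policy, and the toy reward, the hidden weights `W_TRUE`, and all function names are assumptions made for the example.

```python
import numpy as np

# Hypothetical toy problem: the best dose-like action is a hidden linear
# function of the context. None of these names come from the paper.
W_TRUE = np.array([0.8, -0.5, 0.3])


def reward(context, action):
    """Bandit feedback: only the reward of the chosen action is observed."""
    return -(action - W_TRUE @ context) ** 2


def train(steps=5000, lr=0.01, sigma=0.5, seed=0):
    """One-step REINFORCE with a Gaussian policy and a running-mean baseline."""
    rng = np.random.default_rng(seed)
    w = np.zeros_like(W_TRUE)  # linear policy mean: mu(x) = w @ x
    baseline = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0, size=W_TRUE.shape)  # observe a context
        mu = w @ x
        a = mu + sigma * rng.standard_normal()  # explore around the mean
        r = reward(x, a)  # the episode ends after this single step
        # Score-function gradient of log N(a; mu, sigma^2) with respect to w
        grad_log_prob = (a - mu) / sigma**2 * x
        w += lr * (r - baseline) * grad_log_prob
        baseline += 0.05 * (r - baseline)  # running mean reduces variance
    return w


def mean_loss(w, n=2000, seed=1):
    """Average squared error of the deterministic (mean) policy on fresh contexts."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(-1.0, 1.0, size=(n, W_TRUE.size))
    return float(np.mean((xs @ w - xs @ W_TRUE) ** 2))
```

Comparing `mean_loss(train())` against `mean_loss(np.zeros(3))` gives a rough check that the learned policy has moved towards the hidden optimum. A vector context is used here for brevity; handling image contexts, as the abstract describes, would require replacing the linear policy with a suitable function approximator.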
