Reinforcement learning in continuous spaces

Reinforcement learning in continuous spaces is a mostly unsolved problem at the moment. A simple discretisation of the continuous space results in a combinatorial explosion, since the number of grid cells grows exponentially with the number of dimensions. This difficulty remains whether one uses a model-based approach (i.e. explicitly modelling how the environment works) or a model-free approach (which only estimates how good different actions are in different parts of the environment). To get around this, the following approaches have been suggested:

  1. Fitted Q-iteration
  2. Least Squares Policy Iteration and TD learning
  3. Rollout Sampling Approximate Policy Iteration
  4. Kernel-based methods (i.e. learning an approximate transition model)
  5. Bayesian methods (e.g. Gaussian processes for approximating a value function)
  6. Policy-gradient methods
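To make the combinatorial explosion of naive discretisation concrete, the short sketch below counts the cells of a uniform grid. The function names and the choice of 10 bins per dimension are illustrative assumptions, not part of any of the methods listed above.

```python
def grid_cells(bins_per_dimension: int, dimensions: int) -> int:
    """Number of cells in a uniform grid over a `dimensions`-dimensional box."""
    return bins_per_dimension ** dimensions


def tabular_q_entries(bins_per_dimension: int, state_dims: int, action_dims: int) -> int:
    """Entries of a tabular Q-function when states and actions are both gridded."""
    return grid_cells(bins_per_dimension, state_dims) * grid_cells(
        bins_per_dimension, action_dims
    )


if __name__ == "__main__":
    # Illustrative resolution: 10 bins along every axis.
    for dims in (1, 2, 4, 8):
        print(f"{dims} dimensions: {grid_cells(10, dims)} cells")
```

Even at this coarse resolution, each extra dimension multiplies the table size by ten, which is why the approaches above replace the table with some form of function approximation.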

There seems to be a distinct lack of methods using learning on manifolds, however. Rather than estimating a model of the system over the space of all observations, we may instead first estimate a manifold embedding. This much smaller space can then be used to estimate either a value function or a system model.
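The two-stage idea (embed first, then estimate a value function on the embedding) can be sketched as follows. This is only a minimal illustration under strong assumptions: PCA, a linear embedding, stands in for a genuine manifold learning method; the data are synthetic; and the "returns" are a made-up linear function of a hidden 2-D state rather than returns sampled from a real environment. All function and variable names are hypothetical.

```python
import numpy as np


def pca_embedding(observations, n_components):
    """Project observations onto their top principal directions (a linear embedding)."""
    mean = observations.mean(axis=0)
    centred = observations - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]
    return centred @ components.T, mean, components


def _with_bias(features):
    """Append a constant column so the linear fit can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_linear_value(features, returns):
    """Least-squares fit of V(z) ~ w . [z, 1] to the given returns."""
    weights, *_ = np.linalg.lstsq(_with_bias(features), returns, rcond=None)
    return weights


def predict_value(features, weights):
    return _with_bias(features) @ weights


# Synthetic data (assumption): a 2-D latent state seen through a noisy 50-D linear sensor.
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=(500, 2))
sensor = rng.normal(size=(2, 50))
observations = latent @ sensor + 0.01 * rng.normal(size=(500, 50))
# Stand-in for sampled returns: linear in the latent state, so a linear fit suffices.
returns = 3.0 * latent[:, 0] - 2.0 * latent[:, 1] + 1.0

# Stage 1: estimate the low-dimensional embedding from observations alone.
embedded, mean, components = pca_embedding(observations, n_components=2)
# Stage 2: estimate the value function on the 2-D embedding instead of the 50-D input.
weights = fit_linear_value(embedded, returns)
rmse = np.sqrt(np.mean((predict_value(embedded, weights) - returns) ** 2))
```

Here the value function is fitted with 3 parameters on the embedding rather than 51 on the raw observations. For observations lying on a curved manifold, a nonlinear embedding method would take the place of `pca_embedding`.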

This MSc project focuses on evaluating simple manifold methods for reinforcement learning and comparing them to other continuous-space algorithms. It would require a very high level of motivation from the student.


Status: Open
Location: Universiteit van Amsterdam
Contact: Christos Dimitrakakis