Reference
L. Buşoniu, B. De Schutter, and R. Babuška, "Approximate dynamic programming and reinforcement learning," in Interactive Collaborative Information Systems (R. Babuška and F. C. A. Groen, eds.), vol. 281 of Studies in Computational Intelligence, Berlin, Germany: Springer, ISBN 978-3-642-11687-2, pp. 3–44, 2010.
Abstract
Dynamic Programming (DP) and Reinforcement Learning (RL) can be used to
address problems from a variety of fields, including automatic control,
artificial intelligence, operations research, and economics. Many problems in
these fields are described by continuous variables, whereas DP and RL can find
exact solutions only in the discrete case. Therefore, approximation is
essential in practical DP and RL. This chapter provides an in-depth review of
the literature on approximate DP and RL in large or continuous-space,
infinite-horizon problems. Value iteration, policy iteration, and policy search
approaches are presented in turn. Model-based (DP) as well as online and batch
model-free (RL) algorithms are discussed. We review theoretical guarantees on
the approximate solutions produced by these algorithms. Numerical examples
illustrate the behavior of several representative algorithms in practice.
Techniques to automatically derive value function approximators are discussed,
and a comparison between value iteration, policy iteration, and policy search
is provided. The chapter closes with a discussion of open issues and promising
research directions in approximate DP and RL.
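For readers unfamiliar with the value iteration approach the abstract mentions, a minimal sketch of exact (tabular) value iteration on a tiny discrete MDP is shown below. This is only an illustration of the underlying Bellman backup, not the chapter's approximate algorithms; the two-state MDP, its transition probabilities, and rewards are hypothetical example data.

```python
import numpy as np

# Hypothetical two-state, two-action MDP (illustrative data only).
# P[s, a, s'] = probability of moving to s' when taking action a in state s.
P = np.array([
    [[1.0, 0.0], [0.2, 0.8]],
    [[0.0, 1.0], [0.7, 0.3]],
])
# R[s, a] = expected immediate reward for taking action a in state s.
R = np.array([
    [0.0, 1.0],
    [2.0, 0.0],
])
gamma = 0.9  # discount factor

def value_iteration(P, R, gamma, tol=1e-8):
    """Apply the Bellman optimality backup until the value function converges."""
    V = np.zeros(P.shape[0])
    while True:
        Q = R + gamma * (P @ V)        # Q[s, a] = R[s, a] + gamma * sum_s' P[s,a,s'] V[s']
        V_new = Q.max(axis=1)          # greedy backup over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)  # optimal values and greedy policy
        V = V_new

V_star, policy = value_iteration(P, R, gamma)
```

Because the backup is a contraction with factor `gamma`, the loop converges to the unique optimal value function. The chapter's subject is precisely what happens when the state space is too large or continuous for such a table `V`, so the value function must be approximated instead.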
BibTeX
@incollection{BusDeS:10-028,
author = {Bu{\c{s}}oniu, Lucian and De Schutter, Bart and Babu{\v{s}}ka,
Robert},
title = {Approximate Dynamic Programming and Reinforcement Learning},
booktitle = {Interactive Collaborative Information Systems},
series = {Studies in Computational Intelligence},
volume = {281},
editor = {Babu{\v{s}}ka, Robert and Groen, Frans C. A.},
publisher = {Springer},
address = {Berlin, Germany},
isbn = {978-3-642-11687-2},
pages = {3--44},
year = {2010}
}