Reference
R. R. Negenborn, B. De Schutter,
M. A. Wiering, and H. Hellendoorn,
"Learning-based model predictive control for Markov decision processes,"
Proceedings of the 16th IFAC World Congress, Prague, Czech
Republic, pp. 354-359, July 2005.
Abstract
We propose the use of Model Predictive Control (MPC) for controlling systems
described by Markov decision processes. First, we consider a straightforward
MPC algorithm for Markov decision processes. Then, we propose value functions
as a means to deal with issues arising in conventional MPC, e.g., computational
requirements and sub-optimality of actions. We use reinforcement learning to
let an MPC agent learn a value function incrementally. The agent incorporates
experience from its interaction with the system into its decision making. Our
approach initially relies on pure MPC. Over time, as experience increases, the
learned value function is taken more and more into account. This speeds up the
decision making, allows decisions to be made over an infinite instead of a
finite horizon, and provides adequate control actions, even if the system and
desired performance slowly vary over time.
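The idea sketched in the abstract — finite-horizon lookahead whose terminal values come from an incrementally learned value function, weighted by accumulated experience — can be illustrated with a small toy example. The sketch below is not the paper's algorithm; the MDP (a 5-state chain), the confidence weight, and all names (`N_STATES`, `HORIZON`, `visits`, etc.) are illustrative assumptions. Early on the terminal weight is near zero, so the agent acts by pure MPC; as visit counts grow, the learned values increasingly shape the lookahead.

```python
# Hypothetical toy MDP: a 5-state chain. Action 1 moves right, action 0 moves
# left; reward 1.0 is earned on reaching (or staying in) the rightmost state.
# All constants here are illustrative, not taken from the paper.
N_STATES, GAMMA, HORIZON = 5, 0.9, 4

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

V = [0.0] * N_STATES      # incrementally learned value function
visits = [0] * N_STATES   # experience counts per state

def lookahead(state, depth):
    """Finite-horizon MPC recursion. At the horizon, the learned V is
    blended in, weighted by how much experience the agent has there."""
    if depth == 0:
        w = visits[state] / (visits[state] + 10.0)  # confidence weight
        return w * V[state]                          # pure MPC when w ~ 0
    return max(r + GAMMA * lookahead(s2, depth - 1)
               for a in (0, 1) for s2, r in [step(state, a)])

def mpc_action(s):
    def q(a):
        s2, r = step(s, a)
        return r + GAMMA * lookahead(s2, HORIZON - 1)
    return max((0, 1), key=q)

# Interaction loop: act by lookahead, learn V by a TD(0) update.
s = 0
for _ in range(100):
    a = mpc_action(s)
    s2, r = step(s, a)
    visits[s] += 1
    V[s] += (1.0 / visits[s]) * (r + GAMMA * V[s2] - V[s])  # TD(0)
    s = s2
```

On this chain the agent walks right to the rewarding state and stays there, so the learned `V` becomes largest at the goal while distant states keep relying on the lookahead.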
BibTeX
@inproceedings{NegDeS:04-021,
author = {Negenborn, Rudi R. and De Schutter, Bart and Wiering, Marco A.
and Hellendoorn, Hans},
title = {Learning-Based Model Predictive Control for {Markov} Decision
Processes},
booktitle = {Proceedings of the 16th IFAC World Congress},
address = {Prague, Czech Republic},
pages = {354--359},
month = jul,
year = {2005}
}