Reference
L. Buşoniu, R. Munos, B. De Schutter, and R. Babuška, "Optimistic
planning for sparsely stochastic systems," Proceedings of the 2011 IEEE
Symposium on Adaptive Dynamic Programming and Reinforcement Learning
(ADPRL 2011), Paris, France, pp. 48-55, Apr. 2011.
Abstract
We propose an online planning algorithm for finite-action, sparsely stochastic
Markov decision processes, in which the random state transitions can only end
up in a small number of possible next states. The algorithm builds a planning
tree by iteratively expanding states, where each expansion exploits sparsity to
add all possible successor states. Each state to expand is actively chosen to
improve the knowledge about action quality, and this allows the algorithm to
return a good action after a strictly limited number of expansions. More
specifically, the active selection method is optimistic in that it chooses the
most promising states first, so the novel algorithm is called optimistic
planning for sparsely stochastic systems. We note that the new algorithm can
also be seen as
model-predictive (receding-horizon) control. The algorithm obtains promising
numerical results, including the successful online control of a simulated HIV
infection with stochastic drug effectiveness.
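
The expansion loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it is a simplified reading of the idea, and the toy model, the discount factor GAMMA, the assumption that rewards lie in [0, 1], and all names (model, Node, plan, and so on) are hypothetical choices made for illustration. Each unexpanded leaf is given an optimistic value of 1 / (1 - GAMMA), the planner follows the action with the highest upper bound, expands the leaf that contributes most to that bound, and finally returns the root action with the best lower bound.

```python
from dataclasses import dataclass, field

GAMMA = 0.9        # assumed discount factor
ACTIONS = (0, 1)   # assumed finite action set
TRAP = 3           # hypothetical absorbing state with zero reward


def model(state, action):
    """Hypothetical sparse model: a short list of (probability, reward,
    next_state) triples, with rewards assumed to lie in [0, 1]."""
    if state == TRAP:
        return [(1.0, 0.0, TRAP)]
    if action == 0:
        return [(1.0, 0.2, TRAP)]          # small reward, then stuck
    return [(0.7, 1.0, 2), (0.3, 0.0, 1)]  # stochastic, two successors


@dataclass
class Node:
    state: int
    depth: int
    reach: float  # probability of reaching this node from the root
    children: dict = field(default_factory=dict)  # action -> [(p, r, Node)]


def q_bounds(node, action):
    """Lower and upper bounds on the value of taking action at node."""
    lo = up = 0.0
    for p, r, child in node.children[action]:
        c_lo, c_up = bounds(child)
        lo += p * (r + GAMMA * c_lo)
        up += p * (r + GAMMA * c_up)
    return lo, up


def bounds(node):
    """Lower and upper bounds on the optimal value of node."""
    if not node.children:  # unexpanded leaf: pessimistic 0, optimistic max
        return 0.0, 1.0 / (1.0 - GAMMA)
    qs = [q_bounds(node, a) for a in ACTIONS]
    return max(q[0] for q in qs), max(q[1] for q in qs)


def expand(node):
    """Exploit sparsity: add every possible successor for every action."""
    for a in ACTIONS:
        node.children[a] = [
            (p, r, Node(s, node.depth + 1, node.reach * p))
            for p, r, s in model(node.state, a)
        ]


def optimistic_leaves(node):
    """Leaves reached by always following the best upper-bound action."""
    if not node.children:
        return [node]
    best = max(ACTIONS, key=lambda a: q_bounds(node, a)[1])
    leaves = []
    for _, _, child in node.children[best]:
        leaves.extend(optimistic_leaves(child))
    return leaves


def plan(state, budget):
    """Run at least one expansion, then return the root action with the
    highest lower bound."""
    root = Node(state, 0, 1.0)
    for _ in range(max(1, budget)):
        # Expand the optimistic leaf that contributes most to the bound.
        leaf = max(optimistic_leaves(root),
                   key=lambda n: n.reach * GAMMA ** n.depth)
        expand(leaf)
    return max(ACTIONS, key=lambda a: q_bounds(root, a)[0])
```

In a receding-horizon setting, plan would be called again from each newly observed state and only the returned action applied. The sketch recomputes bounds recursively for clarity, so it is only suitable for small expansion budgets.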
BibTeX
@inproceedings{BusMun:11-007,
author = {Bu{\c{s}}oniu, Lucian and Munos, R{\'{e}}mi and De Schutter,
Bart and Babu{\v{s}}ka, Robert},
title = {Optimistic Planning for Sparsely Stochastic Systems},
booktitle = {Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic
Programming and Reinforcement Learning (ADPRL 2011)},
address = {Paris, France},
pages = {48--55},
month = apr,
year = {2011}
}