Reference
L. Buşoniu, D. Ernst, B. De Schutter, and R. Babuška,
"Cross-entropy optimization of control policies with adaptive basis functions,"
IEEE Transactions on Systems, Man and Cybernetics, Part B:
Cybernetics, vol. 41, no. 1, pp. 196-209, Feb. 2011.
Abstract
This paper introduces an algorithm for direct search of control policies in
continuous-state, discrete-action Markov decision processes. The algorithm
looks for the best closed-loop policy that can be represented using a given
number of basis functions (BFs), where a discrete action is assigned to each
BF. The type of the BFs and their number are specified in advance and determine
the complexity of the representation. Considerable flexibility is achieved by
optimizing the locations and shapes of the BFs, together with the action
assignments. The optimization is carried out with the cross-entropy method and
evaluates the policies by their empirical return from a representative set of
initial states. The return for each representative state is estimated using
Monte Carlo simulations. The resulting algorithm for cross-entropy policy
search with adaptive BFs is extensively evaluated in problems with two to six
state variables, for which it reliably obtains good policies with only a small
number of BFs. In these experiments, cross-entropy policy search requires
vastly fewer BFs than value-function techniques with equidistant BFs, and
outperforms policy search with a competing optimization algorithm called
DIRECT.
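The cross-entropy optimization loop described above can be sketched in a few lines. This is a minimal generic illustration, not the authors' implementation: the Gaussian sampling distribution, the toy objective standing in for the Monte Carlo return estimate, and all names and hyperparameters are illustrative assumptions.

```python
import numpy as np

def cross_entropy_search(score, dim, n_samples=100, n_elite=10, n_iters=50, seed=0):
    """Generic cross-entropy optimization over a real parameter vector.

    In the paper's setting, the parameter vector would encode BF locations,
    shapes, and action assignments, and `score` would be the empirical return
    averaged over a set of representative initial states (here it is any
    scalar objective; higher is better).
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)           # mean of the Gaussian sampling distribution
    std = np.full(dim, 2.0)        # per-dimension standard deviation
    for _ in range(n_iters):
        # Draw candidate parameter vectors and evaluate them.
        samples = rng.normal(mean, std, size=(n_samples, dim))
        scores = np.array([score(s) for s in samples])
        # Keep the elite fraction and refit the sampling distribution to it.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor to avoid collapse
    return mean

# Toy quadratic objective in place of a Monte Carlo return estimate;
# the optimum is at (3, 3).
best = cross_entropy_search(lambda p: -np.sum((p - 3.0) ** 2), dim=2)
```

The key design choice of the cross-entropy method is visible here: instead of following gradients, it iteratively reshapes a sampling distribution around the best-scoring candidates, which is why it handles the mixed continuous (BF locations, shapes) and discrete (action assignments) parameterization in the paper.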
BibTeX
@article{BusErn:10-004,
author = {Bu{\c{s}}oniu, Lucian and Ernst, Damien and De Schutter, Bart and
Babu{\v{s}}ka, Robert},
title = {Cross-Entropy Optimization of Control Policies with Adaptive
Basis Functions},
journal = {IEEE Transactions on Systems, Man and Cybernetics, Part B:
Cybernetics},
volume = {41},
number = {1},
pages = {196--209},
month = feb,
year = {2011}
}