Doya, Kenji; Samejima, Kazuyuki; Katagiri, Ken-ichi; Kawato, Mitsuo
Multiple model-based reinforcement learning. (English) Zbl 0997.93037
Neural Comput. 14, No. 6, 1347-1369 (2002).

A multiple model-based reinforcement learning architecture is designed. It is implemented in the discrete-time and continuous-time cases, including multiple linear quadratic controllers. These controllers learn to decompose a nonlinear and nonstationary task through the competition and cooperation of multiple prediction models.

Reviewer: Angela Slavova (Sofia)

Cited in 18 Documents

MSC:
93B51 Design techniques (robust design, computer-aided design, etc.)
68T05 Learning and adaptive systems in artificial intelligence
49N10 Linear-quadratic optimal control problems

Keywords: multiple model-based reinforcement learning; multiple linear quadratic controller; competition; cooperation; multiple prediction models
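The review's phrase "competition and cooperation of multiple prediction models" can be illustrated with a minimal sketch. It assumes, as an illustration rather than a statement of the paper's exact formulation, that each module's weight is a softmax over its negative squared prediction error (competition) and that the final control is the weighted sum of the modules' linear feedback outputs (cooperation). The function names, the scalar state, the gains, and the `sigma` parameter are all hypothetical.

```python
import math


def responsibilities(errors, sigma=1.0):
    """Softmax over negative scaled squared prediction errors (assumed form)."""
    scores = [-(e * e) / (2.0 * sigma ** 2) for e in errors]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def blended_action(state, gains, predictions, observed, sigma=1.0):
    """Weight each module's linear feedback u_i = -k_i * x by its responsibility."""
    # Competition: modules that predicted the observation better get more weight.
    errors = [observed - p for p in predictions]
    lam = responsibilities(errors, sigma)
    # Cooperation: the output is the responsibility-weighted sum of module actions.
    actions = [-k * state for k in gains]
    return sum(l * u for l, u in zip(lam, actions)), lam


# Two hypothetical modules: the first predicts the observed value more closely.
u, lam = blended_action(state=2.0, gains=[1.0, 3.0],
                        predictions=[1.0, 3.0], observed=1.1)
print(lam, u)
```

In this sketch the better-predicting module dominates the blend without fully excluding the other; a smaller `sigma` would make the selection sharper.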