Evaluating policies for generalized bandits via a notion of duality. (English)
J. Appl. Probab. 37, No. 2, 540-546 (2000).
The authors study a generalization of Gittins's bandit problem in which the one-step return is the product of a reward associated with the active arm and functions of the states of the other arms. This generalization was introduced by P. Nash [J. R. Stat. Soc., Ser. B 42, 165-169 (1980; Zbl 0459.90087)]. The expected total reward criterion is considered. The authors introduce a notion of a dual generalized bandit problem and use it to develop index-based suboptimality bounds for policies.
Reviewer: Eugene A. Feinberg (Stony Brook)