Regret-Regression for Multi-Armed Bandit Problem

Authors: Deyadeen Ali Alshibani

Abstract:

In the literature, the multi-armed bandit problem is treated as a statistical decision model of an agent trying to optimize its decisions while improving its information at the same time. Several algorithms have been proposed for this problem, along with a range of applications. In this paper, we evaluate the regret-regression method by comparing it with the Q-learning method. A simulation on determining the optimal treatment regime is presented in detail.

Keywords: optimal, bandit problem, optimization, dynamic programming
