Policies.RCB module

The RCB, Randomized Confidence Bound, policy for bounded bandits.

  • Reference: [“On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems”, by Baekjin Kim and Ambuj Tewari, arXiv:1902.00610](https://arxiv.org/pdf/1902.00610.pdf)
class Policies.RCB.RCB(nbArms, perturbation='uniform', lower=0.0, amplitude=1.0, *args, **kwargs)

Bases: Policies.RandomizedIndexPolicy.RandomizedIndexPolicy, Policies.UCBalpha.UCBalpha

The RCB, Randomized Confidence Bound, policy for bounded bandits.

  • Reference: [“On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems”, by Baekjin Kim and Ambuj Tewari, arXiv:1902.00610](https://arxiv.org/pdf/1902.00610.pdf)
__module__ = 'Policies.RCB'
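
The idea behind a randomized confidence bound can be illustrated with a small standalone sketch. It does not import or reproduce Policies.RCB; it assumes that the index of an arm is its empirical mean plus a UCB-alpha style exploration width, sqrt(alpha * log(t) / (2 * N)), scaled by a uniform draw in [0, 1), and that rewards are Bernoulli in [0, 1] (that is, lower=0.0 and amplitude=1.0). The names rcb_indices and run_rcb, the default alpha=4.0, and the exact form of the width are hypothetical choices made for this illustration.

```python
import math
import random


def rcb_indices(reward_sums, pulls, t, alpha=4.0, rng=random):
    """Hypothetical helper: one randomized index per arm.

    Assumed form: mean + Z * sqrt(alpha * log(t) / (2 * N)), with Z uniform in [0, 1).
    """
    indices = []
    for total, n in zip(reward_sums, pulls):
        if n == 0:
            # An arm that was never pulled gets an infinite index, so it is tried first.
            indices.append(float("inf"))
            continue
        mean = total / n
        width = math.sqrt(alpha * math.log(t) / (2 * n))
        z = rng.random()  # uniform perturbation (assumption for this sketch)
        indices.append(mean + z * width)
    return indices


def run_rcb(means, horizon, seed=0):
    """Hypothetical driver: play Bernoulli arms and return the pull counts."""
    rng = random.Random(seed)
    nb_arms = len(means)
    reward_sums = [0.0] * nb_arms
    pulls = [0] * nb_arms
    for t in range(1, horizon + 1):
        indices = rcb_indices(reward_sums, pulls, t, rng=rng)
        arm = max(range(nb_arms), key=indices.__getitem__)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        reward_sums[arm] += reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Two Bernoulli arms; the second one has the higher mean.
    print(run_rcb([0.1, 0.9], horizon=2000, seed=0))
```

Compared with a deterministic upper confidence bound, the only change in this sketch is the random factor z in front of the exploration width, which is what the perturbation parameter of the class is assumed to control.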