Arms package

The Arms package contains different types of bandit arms: Constant, UniformArm, Bernoulli, Binomial, Poisson, Gaussian, Exponential, Gamma, DiscreteArm.

Each arm class follows the same interface:

> my_arm = Arm(params)
> my_arm.mean
0.5
> my_arm.draw()  # one random draw
0.0
> my_arm.draw_nparray(20)  # or ((3, 10)), many draws
array([ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  1.,  0.,  0.,
        1.,  0.,  0.,  0.,  1.,  1.,  1.])
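
The interface above can be illustrated with a minimal sketch. The class name BernoulliArmSketch and the method draw_many are hypothetical and only mirror the shape of the interface (a mean attribute, a single draw, many draws); this is not the package's own code, and it returns a plain list rather than a numpy array.

```python
import random


class BernoulliArmSketch:
    """Hypothetical stand-in for an arm class; not the package's own code."""

    def __init__(self, probability, seed=None):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must lie in [0, 1]")
        self.probability = probability
        self.mean = probability  # the mean of a Bernoulli(p) reward is p
        self._rng = random.Random(seed)  # private generator, for reproducibility

    def draw(self):
        """Return one random reward: 1.0 with probability p, else 0.0."""
        return 1.0 if self._rng.random() < self.probability else 0.0

    def draw_many(self, count):
        """Return a list of `count` independent rewards."""
        return [self.draw() for _ in range(count)]


arm = BernoulliArmSketch(0.5, seed=0)
rewards = arm.draw_many(20)
print(arm.mean, arm.draw(), sum(rewards) / len(rewards))
```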

Also contains:

Arms.shuffled(mylist)

Returns a shuffled copy of the input 1D list. Python provides sorted() as a non-mutating counterpart of list.sort(), but no shuffled() counterpart of random.shuffle().

>>> from random import seed; seed(1234)  # reproducible results
>>> mylist = [ 0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9]
>>> shuffled(mylist)
[0.9, 0.4, 0.3, 0.6, 0.5, 0.7, 0.1, 0.2, 0.8]
>>> shuffled(mylist)
[0.4, 0.3, 0.7, 0.5, 0.8, 0.1, 0.9, 0.6, 0.2]
>>> shuffled(mylist)
[0.4, 0.6, 0.9, 0.5, 0.7, 0.2, 0.1, 0.3, 0.8]
>>> shuffled(mylist)
[0.8, 0.7, 0.3, 0.1, 0.9, 0.5, 0.6, 0.2, 0.4]
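
A helper of this kind can be sketched in a few lines with the standard library. The name shuffled_sketch is hypothetical, and the sketch assumes the input must be left untouched, by analogy with sorted().

```python
import random


def shuffled_sketch(values):
    """Return a shuffled copy of `values`, leaving the input unchanged."""
    copy = list(values)   # shallow copy, so the caller's list is not modified
    random.shuffle(copy)  # random.shuffle works in place and returns None
    return copy


original = [0.1, 0.2, 0.3, 0.4, 0.5]
print(shuffled_sketch(original))
print(original)  # still in its initial order
```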

Arms.uniformMeans(nbArms=3, delta=0.05, lower=0.0, amplitude=1.0, isSorted=True)

Return a list of means of arms, well spaced:

  • in [lower, lower + amplitude],
  • sorted in increasing order (when isSorted=True),
  • starting from lower + amplitude * delta, up to lower + amplitude * (1 - delta),
  • and there are nbArms arms in total.
>>> np.array(uniformMeans(2, 0.1))
array([0.1, 0.9])
>>> np.array(uniformMeans(3, 0.1))
array([0.1, 0.5, 0.9])
>>> np.array(uniformMeans(9, 1 / (1. + 9)))
array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
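
The spacing rule described above can be written out as a short sketch. The name uniform_means_sketch is hypothetical, the sketch assumes at least two arms, and it ignores the isSorted option (the result is always increasing).

```python
def uniform_means_sketch(nb_arms=3, delta=0.05, lower=0.0, amplitude=1.0):
    """Evenly spaced means from lower + amplitude * delta
    up to lower + amplitude * (1 - delta)."""
    if nb_arms < 2:
        raise ValueError("this sketch assumes at least two arms")
    # Evenly spaced fractions in [delta, 1 - delta].
    step = (1.0 - 2.0 * delta) / (nb_arms - 1)
    fractions = [delta + i * step for i in range(nb_arms)]
    # Rescale the fractions to [lower, lower + amplitude].
    return [lower + amplitude * f for f in fractions]


print([round(m, 3) for m in uniform_means_sketch(3, 0.1)])
```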

Arms.uniformMeansWithSparsity(nbArms=10, sparsity=3, delta=0.05, lower=0.0, lowerNonZero=0.5, amplitude=1.0, isSorted=True)

Return a list of means of arms, well spaced, in [lower, lower + amplitude].

  • Exactly nbArms-sparsity arms will have a mean = lower and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
  • All means will be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2))  # doctest: +ELLIPSIS
array([ 0.  ,  0.  ,  0.  ,  0.  ,  0.55,  0.95])
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2, lowerNonZero=0.8, delta=0.03))  # doctest: +ELLIPSIS
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.806,  0.994])
>>> np.array(uniformMeansWithSparsity(nbArms=10, sparsity=2))  # doctest: +ELLIPSIS
array([ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.55,  0.95])
>>> np.array(uniformMeansWithSparsity(nbArms=6, sparsity=2, delta=0.05))  # doctest: +ELLIPSIS
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.525,  0.975])
>>> np.array(uniformMeansWithSparsity(nbArms=10, sparsity=4, delta=0.05))  # doctest: +ELLIPSIS
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.525,  0.675,
        0.825,  0.975])
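
The examples above that pass delta explicitly are consistent with the following reading: nbArms - sparsity means sit at lower, and the remaining sparsity means are evenly spaced inside [lowerNonZero, lower + amplitude] with a relative margin delta. This is an inference from the printed outputs, not the package's code; the name uniform_means_with_sparsity_sketch is hypothetical and the sketch assumes sparsity >= 2.

```python
def uniform_means_with_sparsity_sketch(nb_arms=10, sparsity=3, delta=0.05,
                                       lower=0.0, lower_non_zero=0.5,
                                       amplitude=1.0):
    """nb_arms - sparsity means equal to `lower`, then `sparsity` means
    evenly spaced in [lower_non_zero, lower + amplitude]."""
    if not 2 <= sparsity <= nb_arms:
        raise ValueError("this sketch assumes 2 <= sparsity <= nb_arms")
    width = (lower + amplitude) - lower_non_zero
    step = (1.0 - 2.0 * delta) / (sparsity - 1)
    non_zero = [lower_non_zero + width * (delta + i * step)
                for i in range(sparsity)]
    return [lower] * (nb_arms - sparsity) + non_zero


print([round(m, 3) for m in uniform_means_with_sparsity_sketch(6, 2, 0.05)])
```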

Arms.randomMeans(nbArms=3, mingap=None, lower=0.0, amplitude=1.0, isSorted=True)

Return a list of means of arms, randomly sampled uniformly in [lower, lower + amplitude], with a min gap >= mingap.

  • All means will be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> randomMeans(nbArms=3, mingap=0.05)  # doctest: +ELLIPSIS
[0.191..., 0.437..., 0.622...]
>>> randomMeans(nbArms=3, mingap=0.01)  # doctest: +ELLIPSIS
[0.276..., 0.801..., 0.958...]
  • Means are sorted, except if isSorted=False.
>>> import random; random.seed(1234)  # reproducible results
>>> randomMeans(nbArms=5, mingap=0.01, isSorted=True)  # doctest: +ELLIPSIS
[0.006..., 0.229..., 0.416..., 0.535..., 0.899...]
>>> randomMeans(nbArms=5, mingap=0.01, isSorted=False)  # doctest: +ELLIPSIS
[0.419..., 0.932..., 0.072..., 0.755..., 0.650...]
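
One simple way to obtain random means with a minimum gap is rejection sampling: draw a candidate vector and redraw until every pair of neighbouring means is far enough apart. The sketch below illustrates that idea only; the name random_means_sketch, the rng argument and the max_tries guard are assumptions, not part of the package.

```python
import random


def random_means_sketch(nb_arms=3, mingap=None, lower=0.0, amplitude=1.0,
                        is_sorted=True, rng=None, max_tries=10000):
    """Sample means uniformly in [lower, lower + amplitude], redrawing
    until consecutive sorted means differ by at least `mingap`."""
    if rng is None:
        rng = random.Random()
    for _ in range(max_tries):
        means = [lower + amplitude * rng.random() for _ in range(nb_arms)]
        ordered = sorted(means)
        gaps = [b - a for a, b in zip(ordered, ordered[1:])]
        # Accept when no gap is requested, or when the smallest gap is large enough.
        if mingap is None or not gaps or min(gaps) >= mingap:
            return ordered if is_sorted else means
    raise RuntimeError("no valid sample found; mingap may be too large")


print(random_means_sketch(nb_arms=3, mingap=0.05, rng=random.Random(1234)))
```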

Arms.randomMeansWithGapBetweenMbestMworst(nbArms=3, mingap=None, nbPlayers=2, lower=0.0, amplitude=1.0, isSorted=True)

Return a list of means of arms, randomly sampled uniformly in [lower, lower + amplitude], with a min gap >= mingap between the sets Mbest and Mworst.
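
To make the constraint concrete, the sketch below computes the gap in question, assuming that Mbest denotes the nbPlayers arms with the largest means and Mworst the remaining arms. That reading is an assumption, and the helper name gap_between_mbest_mworst is hypothetical.

```python
def gap_between_mbest_mworst(means, nb_players):
    """Smallest mean among the nb_players best arms minus the largest
    mean among the other arms."""
    if not 0 < nb_players < len(means):
        raise ValueError("need 0 < nb_players < number of arms")
    ordered = sorted(means, reverse=True)
    m_best = ordered[:nb_players]   # the nb_players largest means
    m_worst = ordered[nb_players:]  # all the other means
    return min(m_best) - max(m_worst)


# With means [0.1, 0.5, 0.9] and 2 players: Mbest = {0.9, 0.5}, Mworst = {0.1}.
print(gap_between_mbest_mworst([0.1, 0.5, 0.9], 2))
```

A sampler could then redraw candidate means until this gap is at least mingap.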

Arms.randomMeansWithSparsity(nbArms=10, sparsity=3, mingap=0.01, delta=0.05, lower=0.0, lowerNonZero=0.5, amplitude=1.0, isSorted=True)

Return a list of means of arms, in [lower, lower + amplitude], with a min gap >= mingap.

  • Exactly nbArms-sparsity arms will have a mean = lower and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
  • All means will be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.05)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.0, 0.0, 0.595..., 0.811...]
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.0, 0.0, 0.718..., 0.892...]
  • Means are sorted, except if isSorted=False.
>>> import random; random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01, isSorted=True)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.0, 0.0, 0.636..., 0.889...]
>>> randomMeansWithSparsity(nbArms=6, sparsity=2, mingap=0.01, isSorted=False)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.900..., 0.638..., 0.0, 0.0]

Arms.randomMeansWithSparsity2(nbArms=10, sparsity=3, mingap=0.01, lower=-1.0, lowerNonZero=0.0, amplitude=2.0, isSorted=True)

Return a list of means of arms, in [lower, lower + amplitude], with a min gap >= mingap.

  • Exactly nbArms-sparsity arms will have a mean sampled uniformly in [lower, lowerNonZero] and the others are randomly sampled uniformly in [lowerNonZero, lower + amplitude].
  • All means will be different, with a min gap > 0, except if mingap=None.
>>> import numpy as np; np.random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.05)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.0, 0.0, 0.595..., 0.811...]
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.0, 0.0, 0.718..., 0.892...]
  • Means are sorted, except if isSorted=False.
>>> import random; random.seed(1234)  # reproducible results
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01, isSorted=True)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.0, 0.0, 0.636..., 0.889...]
>>> randomMeansWithSparsity2(nbArms=6, sparsity=2, mingap=0.01, isSorted=False)  # doctest: +ELLIPSIS
[0.0, 0.0, 0.900..., 0.638..., 0.0, 0.0]

Arms.array_from_str(my_str)

Convert a string like “[0.1, 0.2, 0.3]” to a numpy array [0.1, 0.2, 0.3], using safe json.loads instead of exec.

>>> array_from_str("[0.1, 0.2, 0.3]")
array([0.1,  0.2,  0.3])
>>> array_from_str("0.1, 0.2, 0.3")
array([0.1,  0.2,  0.3])
>>> array_from_str("0.9")
array([0.9])

Arms.list_from_str(my_str)

Convert a string like “[0.1, 0.2, 0.3]” to a list [0.1, 0.2, 0.3], using safe json.loads instead of exec.

>>> list_from_str("[0.1, 0.2, 0.3]")
[0.1, 0.2, 0.3]
>>> list_from_str("0.1, 0.2, 0.3")
[0.1, 0.2, 0.3]
>>> list_from_str("0.9")
[0.9]

Arms.tuple_from_str(my_str)

Convert a string like “[0.1, 0.2, 0.3]” to a tuple (0.1, 0.2, 0.3), using safe json.loads instead of exec.

>>> tuple_from_str("[0.1, 0.2, 0.3]")
(0.1, 0.2, 0.3)
>>> tuple_from_str("0.1, 0.2, 0.3")
(0.1, 0.2, 0.3)
>>> tuple_from_str("0.9")
(0.9,)
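
The parsing behaviour shown in these examples (brackets are optional, a single number is accepted) can be reproduced with json.loads alone. The sketch below is one possible way to do it; the names list_from_str_sketch and tuple_from_str_sketch are hypothetical, and adding the missing brackets before parsing is an assumption about how bare inputs could be handled.

```python
import json


def list_from_str_sketch(my_str):
    """Parse '[0.1, 0.2]', '0.1, 0.2' or '0.9' into a list of floats."""
    text = my_str.strip()
    if not text.startswith("["):
        # Wrap bare values so that json.loads always sees a JSON array.
        text = "[" + text + "]"
    return [float(value) for value in json.loads(text)]


def tuple_from_str_sketch(my_str):
    """Same parsing, returned as a tuple."""
    return tuple(list_from_str_sketch(my_str))


print(list_from_str_sketch("0.1, 0.2, 0.3"))
print(tuple_from_str_sketch("0.9"))
```

Because json.loads only parses data and never evaluates code, a malformed or malicious string raises an exception instead of being executed.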

Arms.optimal_selection_probabilities(M, mu)

Compute the optimal selection probabilities of K arms of means \(\mu_i\) by \(1 \leq M \leq K\) players, if they all observe each other's pulls and rewards, as derived in equation (15), page 3 of [[The Effect of Communication on Noncooperative Multiplayer Multi-Armed Bandit Problems, by Noyan Evirgen, Alper Kose, IEEE ICMLA 2017]](https://arxiv.org/abs/1711.01628v1).

Warning

They consider a different collision model than I usually do. When two (or more) players ask for the same resource at the same time t, I usually consider that all the colliding players receive a zero reward (see Environment.CollisionModels.onlyUniqUserGetsReward()), but they consider that exactly one of the colliding players gets the reward, and all the others get a zero reward (see Environment.CollisionModels.rewardIsSharedUniformly()).

Example:

>>> optimal_selection_probabilities(3, [0.1,0.1,0.1])
array([0.33333333,  0.33333333,  0.33333333])
>>> optimal_selection_probabilities(3, [0.1,0.2,0.3])  # weird ? not really...
array([0.        ,  0.43055556,  0.56944444])
>>> optimal_selection_probabilities(3, [0.1,0.3,0.9])  # weird ? not really...
array([0.        ,  0.45061728,  0.54938272])
>>> optimal_selection_probabilities(3, [0.7,0.8,0.9])
array([0.15631866,  0.35405647,  0.48962487])

Note

These results may sound counter-intuitive, but again they use a different collision model. In my usual collision model, it makes no sense to completely drop an arm when K=M=3, no matter the probabilities \(\mu_i\). In their collision model, however, a player wins more (on average) if she has a \(50\%\) chance of being alone on an arm with mean \(0.3\) than if she is sure to be alone on an arm with mean \(0.1\) (see examples 2 and 3).
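
The comparison in this note is a one-line computation. The sketch below counts only the reward a player earns when she is alone on the arm, which is a lower bound on her expected reward in the collision model where one colliding player still gets the reward. The helper name is hypothetical.

```python
def expected_reward_when_alone(mean, probability_alone):
    """Expected reward counting only the rounds where the player is alone."""
    return probability_alone * mean


shared_good_arm = expected_reward_when_alone(0.3, 0.5)  # 0.5 * 0.3 = 0.15
private_bad_arm = expected_reward_when_alone(0.1, 1.0)  # 1.0 * 0.1 = 0.10
print(shared_good_arm > private_bad_arm)
```

Even this lower bound on the arm with mean 0.3 exceeds the guaranteed reward on the arm with mean 0.1, which is why dropping the worst arm can be optimal in that model.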

Arms.geometricChangePoints(horizon=10000, proba=0.001)

Change points following a geometric distribution: at each time, the probability of having a change point at the next step is proba.

>>> np.random.seed(0)
>>> geometricChangePoints(100, 0.1)
array([ 8, 20, 29, 37, 43, 53, 59, 81])
>>> geometricChangePoints(100, 0.2)
array([ 6,  8, 14, 29, 31, 35, 40, 44, 46, 60, 63, 72, 78, 80, 88, 91])
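
A process of this kind can be sketched by flipping an independent biased coin at every time step, which makes the gaps between consecutive change points geometrically distributed. The function name, the rng argument and the choice of time steps 1 to horizon - 1 are assumptions made for this sketch; it is not the package's implementation and will not reproduce the seeded outputs above.

```python
import random


def geometric_change_points_sketch(horizon=10000, proba=0.001, rng=None):
    """Return the sorted time steps at which a change point occurs, each
    step being a change point independently with probability `proba`."""
    if rng is None:
        rng = random.Random()
    return [t for t in range(1, horizon) if rng.random() < proba]


points = geometric_change_points_sketch(100, 0.1, rng=random.Random(0))
print(len(points), points[:5])
```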

Arms.continuouslyVaryingMeans(means, sign=1, maxSlowChange=0.1, horizon=None, lower=0.0, amplitude=1.0, isSorted=True)

New means, slightly modified from the previous ones.

  • The change and the sign of the change are constant.

Arms.randomContinuouslyVaryingMeans(means, maxSlowChange=0.1, horizon=None, lower=0.0, amplitude=1.0, isSorted=True)

New means, slightly modified from the previous ones.

  • The maximum amplitude c of the change is constant, but the change itself is randomly sampled from \(\mathcal{U}([-c,c])\).
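
The two kinds of slow change (a constant shift, and a random shift of bounded amplitude) can be sketched as follows. The function names are hypothetical, and clipping the result to [lower, lower + amplitude] is an assumption of this sketch; the horizon and isSorted parameters are ignored here because their exact role is not described above.

```python
import random


def clip(value, low, high):
    """Restrict `value` to the interval [low, high]."""
    return max(low, min(high, value))


def constant_change_sketch(means, sign=1, max_slow_change=0.1,
                           lower=0.0, amplitude=1.0):
    """Shift every mean by the same signed amount."""
    upper = lower + amplitude
    return [clip(m + sign * max_slow_change, lower, upper) for m in means]


def random_change_sketch(means, max_slow_change=0.1,
                         lower=0.0, amplitude=1.0, rng=None):
    """Shift every mean by an independent draw from U([-c, c])."""
    if rng is None:
        rng = random.Random()
    upper = lower + amplitude
    c = max_slow_change
    return [clip(m + rng.uniform(-c, c), lower, upper) for m in means]


print(constant_change_sketch([0.1, 0.5, 0.95]))
print(random_change_sketch([0.1, 0.5, 0.95], rng=random.Random(0)))
```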