PoliciesMultiPlayers.DepRound module

DepRound(): implementation of the dependent rounding procedure, from [[Dependent rounding and its applications to approximation algorithms, by R Gandhi, S Khuller, S Parthasarathy, Journal of the ACM, 2006](http://dl.acm.org/citation.cfm?id=1147956)].

It solves the problem of efficiently selecting a set of \(k\) distinct actions from \(\{1,\dots,K\}\), while satisfying the condition that each action \(i\) is selected with probability \(p_i\) exactly.

The distribution \((p_1, \dots, p_K)\) on \(\{1,\dots,K\}\) is assumed to be given.

Dependent rounding developed by [Gandhi et al.] is a kind of technique that randomly selects a set of edges from a bipartite graph under some cardinality constraints.

  • It runs in \(\mathcal{O}(K)\) space complexity, and at most \(\mathcal{O}(K^2)\) time complexity (note that the article [Uchiya et al., 2010] wrongly claim it is in \(\mathcal{O}(K)\)).
  • References: see also https://www.cs.umd.edu/~samir/grant/jacm06.pdf
PoliciesMultiPlayers.DepRound.DepRound(weights_p, k=1)[source]

[[Algorithms for adversarial bandit problems with multiple plays, by T.Uchiya, A.Nakamura and M.Kudo, 2010](http://hdl.handle.net/2115/47057)] Figure 5 (page 15) is a very clean presentation of the algorithm.

  • Inputs: \(k < K\) and weights_p \(= (p_1, \dots, p_K)\) such that \(\sum_{i=1}^{K} p_i = k\) (or \(= 1\)).
  • Output: A subset of \(\{1,\dots,K\}\) with exactly \(k\) elements. Each action \(i\) is selected with probability exactly \(p_i\).

Example:

>>> import numpy as np; import random
>>> np.random.seed(0); random.seed(0)  # for reproductibility!
>>> K = 5
>>> k = 2
>>> weights_p = [ 2, 2, 2, 2, 2 ]  # all equal weights
>>> DepRound(weights_p, k)
[3, 4]
>>> DepRound(weights_p, k)
[3, 4]
>>> DepRound(weights_p, k)
[0, 1]
>>> weights_p = [ 10, 8, 6, 4, 2 ]  # decreasing weights
>>> DepRound(weights_p, k)
[0, 4]
>>> DepRound(weights_p, k)
[1, 2]
>>> DepRound(weights_p, k)
[3, 4]
>>> weights_p = [ 3, 3, 0, 0, 3 ]  # decreasing weights
>>> DepRound(weights_p, k)
[0, 4]
>>> DepRound(weights_p, k)
[0, 4]
>>> DepRound(weights_p, k)
[0, 4]
>>> DepRound(weights_p, k)
[0, 1]
PoliciesMultiPlayers.DepRound.random() → x in the interval [0, 1).