Environment.EvaluatorSparseMultiPlayers module

EvaluatorSparseMultiPlayers class to wrap and run the simulations, for the multi-players case with sparsely activated players. It provides many plotting methods to produce various visualizations. See documentation.

Warning

FIXME this environment is not as up-to-date as Environment.EvaluatorMultiPlayers.

Environment.EvaluatorSparseMultiPlayers.REPETITIONS = 1

Default number of repetitions

Environment.EvaluatorSparseMultiPlayers.ACTIVATION = 1

Default probability of activation

Environment.EvaluatorSparseMultiPlayers.DELTA_T_PLOT = 50

Default sampling rate for plotting

Environment.EvaluatorSparseMultiPlayers.MORE_ACCURATE = True

Use the count of selections instead of rewards for a more accurate mean/std reward measure.

Environment.EvaluatorSparseMultiPlayers.FINAL_RANKS_ON_AVERAGE = True

Default value for finalRanksOnAverage

Environment.EvaluatorSparseMultiPlayers.USE_JOBLIB_FOR_POLICIES = False

Default value for useJoblibForPolicies. Using it does not speed anything up (there is too much overhead in spawning too many threads), so it should really stay disabled.

Environment.EvaluatorSparseMultiPlayers.PICKLE_IT = True

Default value for pickleit, for saving the figures. If True, then all plt.figure objects are saved (in pickle format).

class Environment.EvaluatorSparseMultiPlayers.EvaluatorSparseMultiPlayers(configuration, moreAccurate=True)[source]

Bases: Environment.EvaluatorMultiPlayers.EvaluatorMultiPlayers

Evaluator class to run the simulations, for the multi-players case with sparsely activated players.

__init__(configuration, moreAccurate=True)[source]

Initialize self. See help(type(self)) for accurate signature.
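
A minimal usage sketch (illustrative, not a runnable doctest: configuration is a dict of experiment settings in the usual SMPyBandits multi-players format, and the parent class is assumed to expose the list of environments as evaluator.envs):

>>> configuration = {}  # horizon, repetitions, players, arms, activations, collision model, ...
>>> evaluator = EvaluatorSparseMultiPlayers(configuration, moreAccurate=True)
>>> for envId, env in enumerate(evaluator.envs):      # simulate each environment in turn
...     evaluator.startOneEnv(envId, env)
>>> regret = evaluator.getCentralizedRegret(envId=0)  # centralized regret of the first env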

activations = None

Probabilities of activation, one for each player

collisionModel = None

Which collision model should be used

full_lost_if_collision = None

Is there a full loss of rewards in case of a collision? Needed to compute the correct decomposition of the regret.

startOneEnv(envId, env)[source]

Simulate that env.

getCentralizedRegret_LessAccurate(envId=0)[source]

Compute the empirical centralized regret: the cumulative sum over time of the mean rewards of the M best arms, minus the cumulative sum over time of the empirical rewards obtained by the players, based on accumulated rewards.
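
Written out (a hedged restatement of the docstring, where \(\mu^*_1 \geq \dots \geq \mu^*_M\) are the \(M\) largest arm means and \(r^j_t\) is the reward obtained by player \(j\) at time \(t\)): \(\widehat{R}(T) = \sum_{t=1}^{T} \sum_{k=1}^{M} \mu^*_k - \sum_{t=1}^{T} \sum_{j=1}^{M} r^j_t\). In the sparse-activation setting the first (benchmark) sum is presumably weighted by the activation probabilities; this formula is only illustrative.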

getFirstRegretTerm(envId=0)[source]

Extract and compute the first term \((a)\) in the centralized regret: losses due to pulling suboptimal arms.

getSecondRegretTerm(envId=0)[source]

Extract and compute the second term \((b)\) in the centralized regret: losses due to not pulling optimal arms.

getThirdRegretTerm(envId=0)[source]

Extract and compute the third term \((c)\) in the centralized regret: losses due to collisions.
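
These three terms are intended to decompose the centralized regret, i.e. \(R(T) \simeq (a) + (b) + (c)\). A hedged sanity check on an already-run evaluator (the variable name evaluator is an assumption here, and the methods are assumed to return time-series arrays of the same length):

>>> a = evaluator.getFirstRegretTerm(envId=0)   # (a) pulls of suboptimal arms
>>> b = evaluator.getSecondRegretTerm(envId=0)  # (b) missed optimal arms
>>> c = evaluator.getThirdRegretTerm(envId=0)   # (c) collisions
>>> total = evaluator.getCentralizedRegret(envId=0)
>>> # a + b + c should approximately match total, up to estimation noise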

getCentralizedRegret_MoreAccurate(envId=0)[source]

Compute the empirical centralized regret, based on counts of selections and not actual rewards.

getCentralizedRegret(envId=0, moreAccurate=None)[source]

Using either the more accurate or the less accurate regret count.

getLastRegrets_LessAccurate(envId=0)[source]

Extract last regrets, based on accumulated rewards.

getAllLastWeightedSelections(envId=0)[source]

Extract weighted count of selections.

getLastRegrets_MoreAccurate(envId=0)[source]

Extract last regrets, based on counts of selections and not actual rewards.

getLastRegrets(envId=0, moreAccurate=None)[source]

Using either the more accurate or the less accurate regret count.

strPlayers(short=False, latex=True)[source]

Get a string representation of the players and their activation probabilities for this environment.

__module__ = 'Environment.EvaluatorSparseMultiPlayers'

Environment.EvaluatorSparseMultiPlayers.delayed_play(env, players, horizon, collisionModel, activations, seed=None, repeatId=0)[source]

Helper function for the parallelization.
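
A hedged sketch of how this helper might be dispatched with joblib (standard joblib.Parallel/delayed API; the variables env, players, horizon, collisionModel and activations are assumed to be already defined, e.g. taken from the evaluator's configuration):

>>> from joblib import Parallel, delayed
>>> repetitions = 4  # hypothetical number of independent repetitions
>>> results = Parallel(n_jobs=-1, verbose=5)(
...     delayed(delayed_play)(env, players, horizon, collisionModel, activations,
...                           seed=None, repeatId=repeatId)
...     for repeatId in range(repetitions)
... )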

Environment.EvaluatorSparseMultiPlayers.uniform_in_zero_one()

random() -> x in the interval [0, 1).

Environment.EvaluatorSparseMultiPlayers.with_proba(proba)[source]

True with probability = proba, False with probability = 1 - proba.

Examples:

>>> import random; random.seed(0)
>>> tosses = [with_proba(0.6) for _ in range(10000)]; sum(tosses)
5977
>>> tosses = [with_proba(0.111) for _ in range(100000)]; sum(tosses)
11158
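
For reference, a plausible one-liner matching the docstring and the uniform_in_zero_one() helper above (an assumption about the implementation, not necessarily the exact source code):

>>> def with_proba(proba):
...     """True with probability proba, via a single uniform draw in [0, 1)."""
...     return uniform_in_zero_one() < proba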