List of research publications using Lilian Besson’s SMPyBandits project¶
I (Lilian Besson) have started my PhD in October 2016, and this project is a part of my on going research since December 2016.
1st article, about policy aggregation algorithm (aka model selection)¶
I designed and added the Aggregator
policy, in order to test its validity and performance.
It is a “simple” voting algorithm to combine multiple bandit algorithms into one.
Basically, it behaves like a simple MAB bandit just based on empirical means (even simpler than UCB), where arms are the child algorithms A_1 .. A_N
, each running in “parallel”.
For more details, refer to this file: Aggregation.md and this research article.
PDF : BKM_IEEEWCNC_2018.pdf | HAL notice : BKM_IEEEWCNC_2018 | BibTeX : BKM_IEEEWCNC_2018.bib | Source code and documentation Published Maintenance Ask Me Anything !
2nd article, about Multi-players Multi-Armed Bandits¶
There is another point of view: instead of comparing different single-player policies on the same problem, we can make them play against each other, in a multi-player setting.
The basic difference is about collisions : at each time t
, if two or more user chose to sense the same channel, there is a collision. Collisions can be handled in different way from the base station point of view, and from each player point of view.
For more details, refer to this file: MultiPlayers.md and this research article.
PDF : BK__ALT_2018.pdf | HAL notice : BK__ALT_2018 | BibTeX : BK__ALT_2018.bib | Source code and documentation Published Maintenance Ask Me Anything !
3rd article, using Doubling Trick for Multi-Armed Bandits¶
I studied what Doubling Trick can and can’t do to obtain efficient anytime version of non-anytime optimal Multi-Armed Bandits algorithms.
For more details, refer to this file: DoublingTrick.md and this research article.
PDF : BK__DoublingTricks_2018.pdf | HAL notice : BK__DoublingTricks_2018 | BibTeX : BK__DoublingTricks_2018.bib | Source code and documentation Published Maintenance Ask Me Anything !
4th article, about Piece-Wise Stationary Multi-Armed Bandits¶
With Emilie Kaufmann, we studied the Generalized Likelihood Ratio Test (GLRT) for sub-Bernoulli distributions, and proposed the B-GLRT algorithm for change-point detection for piece-wise stationary one-armed bandit problems. We combined the B-GLRT with the kl-UCB multi-armed bandit algorithm and proposed the GLR-klUCB algorithm for piece-wise stationary multi-armed bandit problems. We prove finite-time guarantees for the B-GLRT and the GLR-klUCB algorithm, and we illustrate its performance with extensive numerical experiments.
For more details, refer to this file: NonStationaryBandits.md and this research article.
PDF : BK__COLT_2019.pdf | HAL notice : BK__COLT_2019 | BibTeX : BK__COLT_2019.bib | Source code and documentation Published Maintenance Ask Me Anything !
Other interesting things¶
Single-player Policies¶
- More than 65 algorithms, including all known variants of the
UCB
, kl-UCB,MOSS
and Thompson Sampling algorithms, as well as other less known algorithms (https://smpybandits.github.io/docs/OCUCB
,BESA
,OSSB
etc). SparseWrapper
is a generalization of the SparseUCB from this article.- Implementation of very recent Multi-Armed Bandits algorithms, e.g.,
kl-UCB++
(from this article),UCB-dagger
(from this article), orMOSS-anytime
(from this article). - Experimental policies:
BlackBoxOpt
orUnsupervisedLearning
(using Gaussian processes to learn the arms distributions).
Arms and problems¶
- My framework mainly targets stochastic bandits, with arms following
Bernoulli
, bounded (truncated) or unboundedGaussian
,Exponential
,Gamma
orPoisson
distributions. - The default configuration is to use a fixed problem for N repetitions (e.g. 1000 repetitions, use
MAB.MAB
), but there is also a perfect support for “Bayesian” problems where the mean vector µ1,…,µK change at every repetition (seeMAB.DynamicMAB
). - There is also a good support for Markovian problems, see
MAB.MarkovianMAB
, even though I didn’t implement any policies tailored for Markovian problems. - I’m actively working on adding a very clean support for non-stationary MAB problems, and
MAB.PieceWiseStationaryMAB
is already working well. Use it with policies designed for piece-wise stationary problems, like Discounted-Thompson, CD-UCB, M-UCB, SW-UCB#.
:scroll: License ? GitHub license¶
MIT Licensed (file LICENSE).
© 2016-2018 Lilian Besson.
Note: I have worked on other topics during my PhD, you can find my research articles on my website, or have a look to my Google Scholar profile or résumé on HAL.
Open Source? Yes! Maintenance Ask Me Anything ! Analytics PyPI version PyPI implementation PyPI pyversions PyPI download PyPI status Documentation Status Build Status Stars of https://github.com/SMPyBandits/SMPyBandits/ Releases of https://github.com/SMPyBandits/SMPyBandits/