policy_server module
Server to play a multi-armed bandit problem against.
- Usage:
  policy_server.py [--port=<PORT>] [--host=<HOST>] [--means=<MEANS>] <json_configuration>
  policy_server.py (-h|--help)
  policy_server.py --version
- Options:
  -h --help        Show this screen.
  --version        Show version.
  --port=<PORT>    Port to use for the TCP connection [default: 10000].
  --host=<HOST>    Address to use for the TCP connection [default: 0.0.0.0].
  --means=<MEANS>  Means of arms used by the environment, to print regret [default: None].
- policy_server.default_configuration = {'archtype': 'UCBalpha', 'nbArms': 10, 'params': {'alpha': 1}}
Example of configuration to pass from the command line.
'{"nbArms": 3, "archtype": "UCBalpha", "params": { "alpha": 0.5 }}'
- policy_server.read_configuration_policy(a_string)
Return a valid configuration dictionary to initialize a policy, from the input string.
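To illustrate what reading such a configuration string can involve, here is a minimal sketch. It is not the module's implementation: the helper name `parse_policy_configuration` and its validation rules are assumptions made for this example, based only on the three keys that appear in `default_configuration`.

```python
import json


def parse_policy_configuration(a_string):
    """Hypothetical helper: parse a JSON string and check its top-level keys."""
    config = json.loads(a_string)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object")
    # Assumption: the same three keys as default_configuration are required.
    missing = [key for key in ("archtype", "nbArms", "params") if key not in config]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return config


config = parse_policy_configuration(
    '{"nbArms": 3, "archtype": "UCBalpha", "params": { "alpha": 0.5 }}'
)
print(config["nbArms"], config["archtype"], config["params"]["alpha"])
```

The parsed dictionary still holds the policy name as a plain string; resolving it to an actual policy class is a separate step.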
- policy_server.server(policy, host, port, means=None)
Launch a server that:
- uses sockets to listen to input and reply,
- creates a learning algorithm from a JSON configuration (exactly like main.py when it reads configuration.py),
- then receives feedback (arm, reward) from the network, passes it to the algorithm, listens to its arm = choice() suggestion, and sends this back to the network.
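The receive, update, choose, reply cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the wire format (a text message `"<arm> <reward>"` answered by the next arm index), the `UniformPolicy` stand-in, the `getReward` method name, and the helpers `handle_message` and `serve` are all hypothetical.

```python
import random
import socket


class UniformPolicy:
    """Hypothetical stand-in for a learning algorithm: picks arms at random."""

    def __init__(self, nbArms):
        self.nbArms = nbArms
        self.pulls = [0] * nbArms
        self.rewards = [0.0] * nbArms

    def getReward(self, arm, reward):
        # Record the feedback for the arm that was played.
        self.pulls[arm] += 1
        self.rewards[arm] += reward

    def choice(self):
        return random.randrange(self.nbArms)


def handle_message(policy, message):
    """One round: pass (arm, reward) to the policy, return the next arm as text."""
    arm_text, reward_text = message.split()
    policy.getReward(int(arm_text), float(reward_text))
    return str(policy.choice())


def serve(policy, host="0.0.0.0", port=10000):
    """Accept a single client and answer each message with the next arm."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, port))
        sock.listen(1)
        connection, _ = sock.accept()
        with connection:
            while True:
                # Assumption: each recv() delivers exactly one whole message.
                data = connection.recv(1024)
                if not data:
                    break
                reply = handle_message(policy, data.decode("utf-8"))
                connection.sendall(reply.encode("utf-8"))
```

Calling `serve(UniformPolicy(3))` would block until a client connects. Keeping the per-message logic in `handle_message` makes it possible to exercise one round without opening a socket.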
- policy_server.transform_str(params)
Like a safe exec() on a dictionary that can contain special values:
- strings are interpreted as variable names (e.g., policy names) from the current globals() scope,
- lists are transformed to tuples to be constant and hashable,
- dictionaries are recursively transformed.
Warning
It is still as unsafe as exec(): only use it with trusted inputs!
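The three transformation rules of transform_str can be sketched as below. This is a hedged illustration rather than the module's implementation: the function name `transform_config` is hypothetical, and it deliberately looks names up in an explicit `namespace` dictionary instead of `globals()`, which limits what an input string can resolve to. Leaving unknown strings unchanged is also an assumption of this sketch.

```python
def transform_config(params, namespace):
    """Hypothetical sketch: resolve names, freeze lists, recurse into dicts."""
    result = {}
    for key, value in params.items():
        if isinstance(value, str):
            # Resolve a known name; keep the raw string if it is not listed.
            result[key] = namespace.get(value, value)
        elif isinstance(value, list):
            # Tuples are immutable and hashable.
            result[key] = tuple(value)
        elif isinstance(value, dict):
            result[key] = transform_config(value, namespace)
        else:
            result[key] = value
    return result


class UCBalpha:
    """Placeholder class for this example only."""


known_names = {"UCBalpha": UCBalpha}
out = transform_config(
    {"archtype": "UCBalpha", "arms": [0.1, 0.9], "params": {"alpha": 0.5}},
    known_names,
)
```

Here `out["archtype"]` is the `UCBalpha` class itself, `out["arms"]` is a tuple, and the nested `params` dictionary is transformed recursively.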