BanditsFlow is a framework that supports building a typical evaluation workflow for comparing bandit algorithms. Experimental modules built on the framework are executed automatically as a Metaflow workflow. The workflow also integrates experiment management with MLflow Tracking and hyperparameter optimization with Optuna. Combined with code management in Git, this lets you manage your experiments with high reproducibility.
$ YOUR_BANDIT_FLOW_NAME='sample'
$ python -m banditsflow scaffold $YOUR_BANDIT_FLOW_NAME
$ git add .
$ git commit -m 'Initial commit'
$ git tag first-experiment
$ make run
$ mlflow ui
Then open http://127.0.0.1:5000 in your browser.
- Implement your scenario.
- Implement an actor that acts in the scenario.
- Implement a reporter that reports the results of the actor's actions.
- Prepare a parameter suggestion for each actor. (optional)
The scenario, actor, and reporter must each follow its protocol; see scenario.Scenario, actor.Actor, and reporter.Reporter. Note that each module has a loader.Loader class that returns an instance by name.
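As an illustrative sketch of that loader pattern (the `EpsilonGreedyActor` class and the `load` signature here are assumptions made for the example, not banditsflow's actual protocol definitions):

```python
class EpsilonGreedyActor:
    """Minimal stand-in actor; a real actor must follow actor.Actor."""

    def __init__(self, params):
        self.params = params


class Loader:
    """Returns an actor instance for a given name (illustrative sketch)."""

    _actors = {"epsilon_greedy": EpsilonGreedyActor}

    def load(self, name, params):
        # Look up the class registered under the given name and instantiate it.
        try:
            return self._actors[name](params)
        except KeyError:
            raise ValueError(f"unknown actor: {name}")


actor = Loader().load("epsilon_greedy", {"epsilon": 0.1})
```

The scenario and reporter loaders follow the same pattern, mapping a name to an instance of the corresponding protocol.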
├── actor
│   └── loader.py
├── reporter
│   └── loader.py
├── scenario
│   └── loader.py
└── suggestion
    ├── ACTOR_NAME.yml
    └── loader.py
$ git add .
$ git commit -m 'Customize modules'
$ git tag second-experiment
$ make run
$ mlflow ui
Repeat steps 2 and 3.
BanditsFlow provides the following workflow, which consists of optimize, evaluate, and report steps. Each step's results are saved by Metaflow and MLflow Tracking.
                             ┌─────────┐
                             │  start  │
                             └────┬────┘
                    ┌─────────────┼─────────────┐
                (actor-1)     (actor-2)     (actor-3)
{suggestion}   ┌────┴────┐   ┌────┴────┐   ┌────┴────┐
{scenario } ──►│optimize │   │optimize │   │optimize │──► <best_params>
{actor    }    └────┬────┘   └────┬────┘   └────┬────┘
               best_params   best_params   best_params
               ┌────┴────┐   ┌────┴────┐   ┌────┴────┐
{scenario } ──►│evaluate │   │evaluate │   │evaluate │─┬─► <result>
{actor    }    └────┬────┘   └────┬────┘   └────┬────┘ │
                    │             │             │      ├─► [Parameter]
                 result        result        result    └─► [Metric]
                    └─────────────┼─────────────┘
                             ┌────┴────┐
                             │  join   │
                             └────┬────┘
                               results
                             ┌────┴────┐
{reporter } ────────────────►│ report  │──────────────► [Artifact]
                             └────┬────┘
                             ┌────┴────┐
                             │   end   │
                             └─────────┘
{}: Module
[]: MLflow Tracking
<>: Metaflow
                           Metaflow                          MLflow Tracking
                  ┌─Flow(BanditsFlow)───────┐       ┌─Experiments───────────────────┐
                  │                         │       │                               │
RAW DATA          │                         │       │  ┌─exp-1───────────────────┐  │  REPORT DATA
                  │  ┌─Run───────────────┐  │       │  │  ┌─Run (actor-1)─────┐  │  │
<best_params> ──┬────┤ ID: mt-run-1      ├───┬───────────►│ ID: ml-run-A      ├───────┬─► [Parameter]
Each <result> ──┤ │  │ Tag: exp-1        │  ││      │  │  │ Name: mt-run-1    │  │  │ └─► [Metric]
    <results> ──┘ │  └───────────────────┘  ││      │  │  └───────────────────┘  │  │
                  │                         ││      │  │  ┌─Run (actor-2)─────┐  │  │
                  │                         │├───────────►│ ID: ml-run-B      ├───────┬─► [Parameter]
                  │                         ││      │  │  │ Name: mt-run-1    │  │  │ └─► [Metric]
                  │                         ││      │  │  └───────────────────┘  │  │
                  │                         ││      │  │  ┌─Run (reporter)────┐  │  │
                  │                         │└───────────►│ ID: ml-run-C      ├─────────► [Artifact]
                  │                         │       │  │  │ Name: mt-run-1    │  │  │
                  │                         │       │  │  └───────────────────┘  │  │
                  │                         │       │  │                         │  │
                  │                         │       │  │  -----------------      │  │
                  │                         │       │  │                         │  │
                  │  ┌─Run───────────────┐  │       │  │  ┌─Run (actor-1)─────┐  │  │
                  │  │ ID: mt-run-2      ├───┬───────────►│ ID: ml-run-D      │  │  │
                  │  │ Tag: exp-1        │  ││      │  │  │ Name: mt-run-2    │  │  │
                  │  └───────────────────┘  ││      │  │  └───────────────────┘  │  │
                  │                         ││      │  │  ┌─Run (actor-2)─────┐  │  │
                  │                         │├───────────►│ ID: ml-run-E      │  │  │
                  │                         ││      │  │  │ Name: mt-run-2    │  │  │
                  │                         ││      │  │  └───────────────────┘  │  │
                  │                         ││      │  │  ┌─Run (reporter)────┐  │  │
                  │                         │└───────────►│ ID: ml-run-F      │  │  │
                  │                         │       │  │  │ Name: mt-run-2    │  │  │
                  │                         │       │  │  └───────────────────┘  │  │
                  │                         │       │  └─────────────────────────┘  │
                  │                         │       │                               │
                  │                         │       │  ┌─exp-2───────────────────┐  │
                  │  ┌─Run───────────────┐  │       │  │  ...                    │  │
                  │  │ ID: mt-run-3      ├───────────────►...                    │  │
                  │  │ Tag: exp-2        │  │       │  │  ...                    │  │
                  │  └───────────────────┘  │       │  │  ...                    │  │
                  │                         │       │  └─────────────────────────┘  │
                  └─────────────────────────┘       └───────────────────────────────┘
BanditsFlow stores metrics, results, and reports for every run.
BanditsFlow assumes that results are identical for the same experiment, and shortens re-runs by reusing previous results.
These caches are looked up using the experiment name, scenario name, and actor name as keys.
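Conceptually, the lookup behaves like a key-value cache keyed on those three names (a hypothetical sketch; BanditsFlow's actual storage layout is internal and may differ):

```python
def cache_key(experiment: str, scenario: str, actor: str) -> str:
    # Hypothetical key construction: changing any one of these names
    # (e.g. a new Git tag renaming the experiment) misses the cache
    # and forces a fresh optimize/evaluate run.
    return "/".join((experiment, scenario, actor))
```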
You can re-run an experiment by specifying the --revival_from_optimization_by or --revival_from_evaluation_by option, or by changing the name of the experiment by setting another Git tag.
BanditsFlow uses Optuna for optimization.
Your suggestion loader class returns parameter suggestions for its actor.
If you use the loader generated by scaffold, each actor receives the suggestion prepared in the suggestion module as a YAML file named after the actor.
The YAML file has a suggestions key containing a list of parameter suggestion dictionaries.
Each parameter suggestion has a name, a type, and the parameters required by that type.
An example of a discrete uniform parameter is the following:
suggestions:
  - name: epsilon
    type: discrete_uniform
    low: 0.1
    high: 1.0
    q: 0.1
See the Optuna documentation for the other types and their parameters.
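As a hedged sketch of how such a suggestion entry could map onto Optuna's trial API (BanditsFlow's actual dispatch may differ; the Optuna methods named below do exist, e.g. `trial.suggest_discrete_uniform(name, low, high, q)`):

```python
def suggest_param(trial, s):
    """Dispatch one suggestion dict to the matching Optuna trial method.

    Only three types are shown here; see the Optuna docs for the rest.
    """
    kind = s["type"]
    if kind == "discrete_uniform":
        return trial.suggest_discrete_uniform(s["name"], s["low"], s["high"], s["q"])
    if kind == "int":
        return trial.suggest_int(s["name"], s["low"], s["high"])
    if kind == "categorical":
        return trial.suggest_categorical(s["name"], s["choices"])
    raise ValueError(f"unsupported suggestion type: {kind}")
```

With the YAML above, the epsilon entry would become a `suggest_discrete_uniform("epsilon", 0.1, 1.0, 0.1)` call on the trial.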
$ pip install git+https://github.com/monochromegane/banditsflow