The ecole from ds4dm

How Ecole handles CTRL-C

Describe the bug

At the moment, pressing CTRL-C while Ecole is running has one of the following two effects:

1) SCIP catches the signal and silently interrupts the solving process, so the episode terminates early without any error message.
2) Ecole catches the signal and terminates with an exception.

I think behaviour 2) is OK: CTRL-C signals should stop everything. Behaviour 1) is fine when running SCIP in interactive mode from the console and wanting to resume the optimization process afterwards, but with Ecole it does not make much sense. It is also simply annoying: to kill Ecole in a Jupyter notebook or in a console, I have to keep pressing CTRL-C until Python catches the signal and the current execution terminates.

Setting

All settings.

To Reproduce

Run an MDP loop (while True: reset, then while not done: step). Press CTRL-C during the execution.

Expected behavior

A Python KeyboardInterrupt.

Fix

A fix is to tell SCIP to not catch CTRL-C signals:

"misc/catchctrlc": False

Rename the EnvironmentComposer Python class

Describe the problem or improvement suggested

In the documentation, we talk about environments, while referring to the class EnvironmentComposer. This can be confusing, as it could suggest there is actually an Environment class somewhere, but this class is nowhere to be found.

Describe the solution you would like

Change the Python class name EnvironmentComposer to Environment.

Describe alternatives you have considered

Have a proper Environment class, which inherits from EnvironmentComposer ?

Additional context

NA

Environment Event

Describe the problem or improvement suggested

Currently, given for instance a reward function, the process of calling reset and obtain_reward is not transparent to the user. When is reset called? Before or after the dynamics are themselves reset?
When writing a new function, it also leads to confusion. Should reset return a reward, or will obtain_reward be called right after?
Furthermore, the current setting offers no possibility to access the Model outside of very specific places.
Finally, the code for calling the callback is very similar between observation functions, reward functions, and soon info functions.

To alleviate this, we could generalize the idea of callback events to reset_pre, reset_post, step_pre, step_post, seed_pre, seed_post.
Ideally, such a callback would be created by simply defining a method of the same name.

struct AnyEvent: Event {
    void reset_pre(scip::Model const&) override { /* Brand new model, not yet in its initial state */ }
};

A reward function would then become an event, with a __call__/operator() method for the reward.

struct LpIterations: RewardFunction, Event {
    // Reset the counter before each new episode.
    void reset_pre(scip::Model const&) override { last_lp_iter = 0; }

    Reward operator()(scip::Model const& model) {
        auto reward = ... - last_lp_iter;
        return reward;
    }

private:
    double last_lp_iter = 0;
};

Describe the solution you would like

The base Event class implements every callback as a no-op.
Make Event virtual to limit code duplication with Python? This is only useful if we are able to call the callbacks from C++. In Dynamics?
Overload the callbacks on scip::Model& and scip::Model const&.
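
As a rough illustration of the no-op base class, here is a minimal Python sketch. The callback names come from the proposal above; FakeModel and its lp_iterations attribute are hypothetical stand-ins for scip::Model, and the actual design may well differ.

```python
class Event:
    """Base class: every callback exists and does nothing by default."""

    def reset_pre(self, model): pass
    def reset_post(self, model): pass
    def step_pre(self, model): pass
    def step_post(self, model): pass
    def seed_pre(self, model): pass
    def seed_post(self, model): pass


class LpIterations(Event):
    """Reward: LP iterations spent since the previous call."""

    def __init__(self):
        self.last_lp_iter = 0

    def reset_pre(self, model):
        # New episode: forget the count from the previous one.
        self.last_lp_iter = 0

    def __call__(self, model):
        reward = model.lp_iterations - self.last_lp_iter
        self.last_lp_iter = model.lp_iterations
        return reward


class FakeModel:
    """Hypothetical stand-in exposing only an LP iteration counter."""
    lp_iterations = 0
```

A subclass only overrides the callbacks it cares about, so LpIterations never has to mention step_pre or seed_post.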

Describe alternatives you have considered

__call__/operator() becomes redundant with reset_post and step_post, so should it simply be a getter that is guaranteed to return the same value on two successive calls?
A more general scheme with arbitrary events that could be registered and fired at runtime (e.g. SCIP events, or PyTorch Ignite events). However, this seems to get in the way of simple use cases, and it is not clear what use cases it would serve here.

Handle both randomization/randomseedshift and randomization/permutationseed

Currently, Model.seed() only sets randomization/randomseedshift. The permutation of rows / cols depends on a distinct seed, "randomization/permutationseed". We have to deal with that in Model, so that users only have to change one seed, which then affects both. I suggest setting both of them to the same value, or a shifted value, e.g.,
randomization/permutationseed = randomization/randomseedshift + 42
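
A minimal sketch of the shifted-seed idea, assuming a helper that only builds the parameter dictionary. The function name seed_params is hypothetical, and range checks against SCIP's parameter limits are left out.

```python
# Hypothetical helper: derive both SCIP seed parameters from one user seed.
PERMUTATION_SHIFT = 42  # arbitrary offset taken from the suggestion above


def seed_params(seed, shift=PERMUTATION_SHIFT):
    """Map a single seed to the two SCIP randomization parameters."""
    return {
        "randomization/randomseedshift": seed,
        "randomization/permutationseed": seed + shift,
    }
```

Model.seed() could then apply the returned dictionary in one go, so that users keep a single seed to manage.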

Python State functions composition

State functions could be composed easily to create new ones.

Sum, product, power (also with scalars) of reward functions
Logical operators of early termination functions
Cumulative Reward
TupleObservationFunction and DictObservationFunction for observation functions
Make Environment automatically create Dict/Tuple obs func if given a tuple/dict.
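
To make the arithmetic part of this list concrete, here is a small sketch using operator overloading. The method names reset and obtain_reward follow the ones mentioned in other issues; Constant, Combined, and NodeCount are hypothetical names, and the model is a plain dict standing in for the real one.

```python
import operator


class RewardFunction:
    """Base class: arithmetic operators build new reward functions."""

    def reset(self, model):
        pass

    def obtain_reward(self, model):
        raise NotImplementedError

    def __add__(self, other):
        return Combined(self, other, operator.add)

    def __mul__(self, other):
        return Combined(self, other, operator.mul)

    def __pow__(self, other):
        return Combined(self, other, operator.pow)


class Constant(RewardFunction):
    """Wraps a scalar so it can be combined like any reward function."""

    def __init__(self, value):
        self.value = value

    def obtain_reward(self, model):
        return self.value


class Combined(RewardFunction):
    """Applies a binary operator to the rewards of two functions."""

    def __init__(self, left, right, op):
        self.left = left
        self.right = right if isinstance(right, RewardFunction) else Constant(right)
        self.op = op

    def reset(self, model):
        self.left.reset(model)
        self.right.reset(model)

    def obtain_reward(self, model):
        return self.op(self.left.obtain_reward(model), self.right.obtain_reward(model))


class NodeCount(RewardFunction):
    """Toy reward reading a field of the dict-based model."""

    def obtain_reward(self, model):
        return model["n_nodes"]


reward_function = (NodeCount() * -1 + 10) ** 2
```

Only left-hand operators are shown; supporting expressions such as -1 * NodeCount() would also need the reflected methods (__rmul__ and friends).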

Seeding environment / trajectories

C++ State function composition

Describe the problem or improvement suggested

Implement the same Python API #49 in C++

Describe the solution you'd like

Reward functions all derive from the same ABC, so this could be done with virtual calls (rather than templates) and bound to Python (minimizing code duplication).

For observation functions, templates (and hence code duplication) seem necessary.

Describe alternatives you've considered

Assertions ignored in CI

Because SCIP and Ecole are compiled in Release mode in CI, all assertions are silently ignored in CI tests.
This is problematic, especially since SCIP relies heavily on them.

Find the culprit: CMAKE_BUILD_TYPE? conda CXX_FLAGS?

Lift the GIL

The GIL is currently in place.

Use the PyBind feature to lift it;
Be careful to lock again in Trampoline functions;
Add a multi-threaded test in Python;
If possible, enable ThreadSanitizer on the Python extension

Environment.seed() always returns 0

ecole/libecole/include/ecole/environment/default.hpp

Line 76 in ecd0ae3

m_seed = new_seed = 0;

Don't compute observations in the final state

Describe the problem or improvement suggested

Environments should not ask for an observation in the final state (when done == True). This observation is most likely impossible to compute, since SCIP will likely not be in the SOLVING stage. The final observation of an episode is not relevant for learning anyway. That way, observation functions would no longer have to deal with that situation by silently returning empty observations (the current behaviour).

Describe the solution you'd like

The Environment does not call the registered ObservationFunction when done==True, and the ObservationFunctions do not check any more for SCIP_STAGE_SOLVING.
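
A minimal sketch of that guard, assuming the observation function is a plain callable and the model is a dict. Both helpers are hypothetical and only illustrate the control flow.

```python
def observe_unless_done(observation_function, model, done):
    """Skip the observation function entirely on terminal states."""
    if done:
        return None
    return observation_function(model)


def solving_only_observation(model):
    """Toy observation function that assumes SCIP is in the solving stage."""
    assert model["stage"] == "SOLVING", "must not be called outside solving"
    return model["features"]
```

With the guard in the environment, the observation function no longer needs its own stage check to stay safe on the last transition.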

Describe alternatives you've considered

na

Additional context

na

Add Configuring the Solver with Bandits tutorial

The tutorial should showcase how Ecole can be used with the Configure environment and a bandit algorithm to find good parameters for SCIP.

The file should go under /examples

Recover from exceptions in step

When the user passes a wrong action to Environment::step, the environment needs to be reset.
However, when using the environment interactively, it would be more natural for wrong values to be rejected while still letting the user try other values.

This requires either rolling back on exception, or checking action value at the start of step before attempting to step_state.
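
The second option (checking first) could look like the sketch below. ToyEnvironment is a hypothetical stand-in with a fixed discrete action set, and appending to history plays the role of step_state.

```python
class ToyEnvironment:
    """Toy stand-in for an environment with a discrete action set."""

    def __init__(self, action_set):
        self.action_set = set(action_set)
        self.history = []

    def step(self, action):
        # Validate before touching any state, so a bad action leaves the episode intact.
        if action not in self.action_set:
            raise ValueError(f"invalid action {action!r}")
        self.history.append(action)  # stands in for step_state
        return len(self.history)
```

After a rejected action the user can simply call step again with another value; no reset is needed because nothing was modified.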

Convert Model to and from PySCIPOpt

API for Action set

Some environments, such as selecting the branching variable or the next branching node, have a dynamic action set. This means that the set of possible actions changes between states of the episode.

This is not the same as the Action Space defined by OpenAI Gym, because the latter is a static property of the environment and does not change during an episode (or between episodes).

The set of available actions is critical information that needs to be given to the user.

Here are some ideas I could think of, none is perfect. Any other ideas are welcome.

Option 1

The action set is returned as part of the transition in step and reset

obs, action_set, reward, done, info = env.step(action)

Downsides:

That's starting to be a lot of things in a tuple. I already have to make an effort to remember whether it is ..., reward, done, ..., or ..., done, reward, ..., and this will add much more confusion.
Degrades the API for environments that don't need an action set.

Option 2

An attribute of the environment. Something along the lines of

action = user_policy(obs, env.current_action_set())
obs, reward, done, info = env.step(action)

Downsides:

More implicit. Does not clearly tell the user that there is complexity to pay attention to. Nor does it tell the user when it changes (is it every episode? every transition?)

Option 3

Part of the Observation. After all, this is a POMDP: if the observation is not good enough to tell the agent what to do, it is not a good observation, period.

Some thoughts

For branching, NodeBipartite could have a flag for every variable indicating which variable is fractional, which would have the benefit of avoiding having permutation issues with the action set.
However, this would be different for another observation, e.g. KhalilState.
What of NodeBipartite for other environments, now it needs to contain all action spaces?

Typo in word "developper" (should be developer)

Should be corrected everywhere, e.g. the ECOLE_DEVELOPPER cmake argument.

Obtain an initial reward along with the initial observation (`env.reset()`)

Describe the problem or improvement suggested

Some environment MDPs may be over after reset(), without any call to step(). In such a case, no reward is available. This is annoying, as typically one would like to obtain cumulated rewards for every episode of the MDP, for example when benchmarking different policies. This situation happens often for example in branching, when the instance is solved in preprocessing. We still want to measure the running time, or the number of nodes, or the primal-dual integral etc.

Describe the solution you'd like

observation, action_set, reward, done = env.reset("path/to/problem")
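
A short sketch of why this helps when benchmarking. ToyEnv is a hypothetical environment with a constant reward of -1.0 per transition; the five-value return of step is also an assumption, borrowed from the action set discussion.

```python
class ToyEnv:
    """Toy environment; n_steps == 0 mimics an instance solved in presolving."""

    def __init__(self, n_steps):
        self.n_steps = n_steps
        self.remaining = n_steps

    def reset(self, instance):
        self.remaining = self.n_steps
        # observation, action_set, reward, done
        return None, [0], -1.0, self.remaining == 0

    def step(self, action):
        self.remaining -= 1
        # observation, action_set, reward, done, info
        return None, [0], -1.0, self.remaining == 0, {}


def run_episode(env, instance):
    """Accumulate rewards over one episode, including the one from reset."""
    _, action_set, reward, done = env.reset(instance)
    total = reward
    while not done:
        _, action_set, reward, done, _ = env.step(action_set[0])
        total += reward
    return total
```

An episode that ends right after reset still contributes its reward to the total, which is the case that currently yields nothing.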

Describe alternatives you've considered

Obtaining the cumulated reward at any moment by calling env.getCumulatedReward(). However, that cumulated reward would not match a cumulated reward manually computed by the user. Observing such a difference can be very confusing for users. Also, it introduces an additional function, which must be safe-guarded (can be called only during / after an episode).

Additional context

Add repr for Python classes

Add fmtlib for simpler string creation
Specialize fmtlib for Ecole classes
Edit representations in doc code samples (doctest)

Bind ecole Exception to Python

Bind ecole::scip::Exception
Bind ecole::environment::Exception
Adapt Python Errors tests to test for bound exceptions

Compare against default SCIP

Describe the problem or improvement suggested

Provide a way to compare against a default SCIP baseline, in terms of Ecole metrics (reward functions).

Describe the solution you'd like

A new environment (DefaultSCIP? Default?) with a single-step MDP and an empty action set. That way SCIP is run with its default config, and one can extract Ecole reward functions to compare to other environments. That DefaultSCIP environment could simply be a specialization of the Configuring environment, with an empty action set.

Describe alternatives you've considered

Simply using the Configuring environment to provide a default baseline. This can be a bit confusing for users though.

Add support for pseudo branching candidates

The branching environments should be able to toggle between using regular, fractional branching candidates (SCIPgetLPBranchCands) as hardcoded right now in Ecole, and using all non-fixed variables (SCIPgetPseudoBranchCands), which is what learn2branch uses. We will need it to reproduce learn2branch in Ecole.

Pseudocost observations

Describe the problem or improvement suggested

To reimplement learn2branch, we will need to reproduce pseudocost branching. This requires a new observation function that returns pseudocosts (similar to the current strong branching observation function).

Describe the solution you would like

I would like a new ObservationFunction that returns the pseudocosts, like the current StrongBranchingScores.

Parameter values are silently rounded in Configuring environment

Code to reproduce:

import ecole

env = ecole.environment.Configuring()
env.reset(filename='libecole/tests/data/enlight8.mps')
env.step({'presolving/maxrounds': 0.1})

I suggest we check types more carefully, e.g., with a narrow cast:

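// Convert between numeric types, throwing if the value cannot be represented exactly.
template<class Target, class Source>
Target narrow_cast(Source v) {
    auto r = static_cast<Target>(v);
    // Converting back must give the original value, otherwise information was lost.
    if (static_cast<Source>(r) != v)
        throw Exception("narrow_cast<>() failed");
    return r;
}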
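
The same round-trip check can be sketched on the Python side, for instance if the validation were done before values reach C++. The name narrow_int is hypothetical.

```python
def narrow_int(value):
    """Convert to int, refusing values that would lose information."""
    result = int(value)
    if result != value:
        raise ValueError(f"cannot convert {value!r} to int without loss")
    return result
```

With such a check, passing 0.1 for an integer parameter like presolving/maxrounds would raise instead of being rounded silently.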

Implement Khalil state

Describe the problem or improvement suggested

Implement the state from Khalil et al.
https://github.com/ds4dm/PySCIPOpt/blob/9b88eddcfe99208e737236d39ccd6c9dd1a9bec4/src/pyscipopt/scip.pyx#L4203

Describe the solution you would like

Describe alternatives you have considered

Additional context

Python tests are susceptible to running the wrong Ecole package

Currently, to run the tests, the user is expected to have Ecole importable.
This leaves room for running tests with outdated versions of Ecole.

#32 would make this easier to debug, but a better behavior would be to avoid that altogether.

Tox is a tool that can solve this issue but is hardly compatible with our building tools.

API for stage detection

Describe the problem or improvement suggested

Have a way to programmatically query/declare what SCIP stages are valid for a given observation/reward function, dynamics.
This has the following advantages:

Avoid out of date / incomplete documentation of compatibility;
Provide clear errors to the user rather than obscure exceptions, segfaults, and wrong results;
Automatically add optional on terminal states (if made constexpr), e.g. would help solve #59
Make possible automatic generation of integration tests.

Describe the solution you would like

An observation function for instance could have:

struct NodeBipartite: ObservationFunction<SomeObs> {
    static constexpr std::array<scip::Stage, 1> valid_stages = {scip::Stage::Solving};
    ...
};
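
A Python sketch of how such a declaration could be checked at runtime. The Stage enum below is a reduced, hypothetical version of the SCIP stages, and check_stage is an illustrative helper rather than an existing function.

```python
import enum


class Stage(enum.Enum):
    """Reduced, hypothetical subset of the SCIP stages."""
    PROBLEM = enum.auto()
    PRESOLVING = enum.auto()
    SOLVING = enum.auto()
    SOLVED = enum.auto()


class NodeBipartite:
    # Declared once on the class, so tools can query it without an instance.
    valid_stages = frozenset({Stage.SOLVING})


def check_stage(function, stage):
    """Raise a clear error when a function is used in an unsupported stage."""
    if stage not in function.valid_stages:
        name = type(function).__name__
        raise RuntimeError(f"{name} is not valid in stage {stage.name}")
```

The same class attribute could drive documentation and integration tests, since it can be read without running SCIP.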

Describe alternatives you have considered

Whether this is static/constexpr variable or could be deduced from constructor arguments

Setup pipeline to build documentation

Create doc.ecole.ai repository, and set proper DNS;
Add Doxygen, Sphinx, and Breathe to CMake;
Test building the documentation in CI;
Deploy documentation to GitHub in specific branches.

Refactoring Environment

To simplify the Python bindings, we should bind components (ObservationFunction, RewardFunction, SimpleEnvironment...) independently, and combine them in a Python Environment class.

In practice, the DefaultEnvironment would not be bound but reproduced in Python.

This would remove the need for a common base class for all the state functions, and simplify the default Environment class.
Python Environment inheritance and composition would be much easier.

Change environments to be standalone classes, rather than inheriting from DefaultEnvironment
Simplify Default Environment (remove pointer handling)
Create Python Environment class to mimic DefaultEnvironment
Evaluate relevance of clone methods and dereference
Remove bindings to None types, and manage them in Python

Implement InformationFunction

Describe the problem or improvement suggested

Similarly to RewardFunction and ObservationFunction, we want to have a composable/customizable type to return in the information dictionary.

Describe the solution you would like

An important question is what the type of the information dictionary is in C++, as it cannot be as dynamic as in Python.
Something with std::map<Key, Value> where:

Key is std::string? std::variant<std::string, long int>?
Value is std::variant<std::string, long int, double, xt::xarray>?

Additionally, we also want to construct InformationFunction from RewardFunction.

Describe alternatives you have considered

Alternatively, the Value type could be templated and the variants merged at compile time when merging different information functions. Unfortunately, it may mean wrapping information functions before binding them to Python.

Additional context

Node Bipartite Feature Memoization

Describe the problem or improvement suggested

Some features in the NodeBipartite observation only need to be computed on reset and not on all transitions.

Describe the solution you would like

Cache the features:

Cache the whole observation
Update features that can change
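
The two steps above can be sketched as follows. CachedObservation and the split into static and dynamic feature callables are hypothetical, as is the method name obtain_observation; observations are plain dicts here rather than tensors.

```python
class CachedObservation:
    """Computes static features on reset and dynamic ones on every call."""

    def __init__(self, static_features, dynamic_features):
        self.static_features = static_features
        self.dynamic_features = dynamic_features
        self._static = None

    def reset(self, model):
        # Static features are computed once per episode.
        self._static = self.static_features(model)

    def obtain_observation(self, model):
        # Only the features that can change are recomputed on each transition.
        return {**self._static, **self.dynamic_features(model)}
```

The cost of the static part is then paid once per episode instead of once per transition.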

Describe alternatives you have considered

Additional context

Enable Model::set_param with std::string

Currently not possible via Cast_SFNIAE, because the std::string goes out of scope (and is deallocated) before the call to SCIP.

Possible solutions:

Add an overload rather than a specialization of the template;
Specialize the template with std::string (no &, no const): possible copy of string

Add Branching with Imitation Learning tutorial

The tutorial should showcase how Ecole can be used with the Branching environment, a node bipartite observation, and imitation learning to learn a branching policy.

The file should go under /examples

Write an integration test for inheritance

Write a test that verifies that spaces inherited from their base class or from an existing space have the desired behavior when given back to the C++ end (through an environment).

Create new file test_inheritance.py
Inherit some spaces from their base class
Inherit other spaces from an existing space
Create an environment with these spaces
Verify that the spaces are called correctly

Note:

The test may not work at first, as not all the features are implemented yet, but it will act as test-driven development for said features
Look in other test files for conventions on how to organize tests (e.g. the model fixture to get a valid MILP problem)

Enable Source Distribution

Steps for PyPI:

Additionally

Switch to tox for automated testing

Python type hints

Add type hints for IDE completion, static check, and documentation purposes.

Investigate whether Pybind needs stubs;
The correct types for state function would be to define a Protocol but this is Python 3.8+

Add version and git revision in Python and C++ libraries

To more easily debug Ecole installs, in particular when the user is not running the version they are expecting, it would be useful to add Ecole version and Git revision (hash) in both libraries and tests.

Implement an `ObservationAggregator` observation function

This would make it possible to combine several observation functions. The observations are then returned in a list, in the same order as the ObservationFunctions passed in.

Pseudocode of the use case:

of = ObservationAggregator([NodeBipartite(), StrongBranchingScores()])
env = Environment(observation_function=of)

obs, done = env.reset()
bipartitegraph, sb_scores = obs
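
A possible shape for the aggregator itself, as a minimal sketch. The method names reset and obtain_observation are assumptions about the observation function interface, and Field is a toy observation function over a dict-based model.

```python
class ObservationAggregator:
    """Calls several observation functions and returns their results in order."""

    def __init__(self, functions):
        self.functions = list(functions)

    def reset(self, model):
        for function in self.functions:
            function.reset(model)

    def obtain_observation(self, model):
        return [function.obtain_observation(model) for function in self.functions]


class Field:
    """Toy observation function returning one field of a dict-based model."""

    def __init__(self, name):
        self.name = name

    def reset(self, model):
        pass

    def obtain_observation(self, model):
        return model[self.name]
```

Because the aggregator exposes the same two methods as the functions it wraps, aggregators can be nested without special handling.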

Allow for nested exceptions

Replace Exception with std::exception in order to allow std::throw_with_nested. That would improve error messages for the users.

Convert xtensor to pyxtensor

Figure out whether it is possible to convert xtensor to pyxtensor without a copy.

This would make it possible to remove the template from the Observation and ObservationFunction classes.

Model.set_params/get_params

Describe the improvement suggested

As the plural suggests, get and set multiple SCIP parameters in one call, for convenience.

Describe the solution you'd like

In C++, use the ParamType variant to hold the different parameter values.
It may also be an opportunity to bind the Python code using the variant as well, to reduce code duplication.

Ecole observation features

Modularize feature extraction from scip::Model

Currently, features extracted in NodeBipartite cannot be easily reused by other classes, or customized in the current class.
On the other hand, VarProxy... creates confusion with SCIP variables.

Environment composition

Think about the possibility of composing environments, e.g., Node selection + branching

Rename base.hpp files into abstract.hpp

Implement the `vanillafullstrong` expert

Can be implemented as an ObservationFunction in Ecole.
Will need SCIP 7.0

Python object ownership

Problem:
Python cannot give away ownership of its references. Using holder types such as std::unique_ptr is impossible without making copies (of base::XSpace, or scip::Model).
Move semantics from Python are work in progress in Pybind11, and unexpected to the Python user anyways.

Documentation pages

Installation
First step / Environment example
State functions (full environment API)
MDP formulation and generalization
Difference with OpenAI Gym (and motivation)
- done flag on reset
- initial prob. distribution on reset
- None on terminal states
Extending an environment
Creating an environment
Compatibility with PySCIPOpt (seed and to/from Model), understanding Model, and accessing it through state (e.g. env.state.model.get_param("random/something"))

Lifetime of Model should exceed that of Thread

Currently, the lifetimes of the Controller and of the Model it uses are independent, but this is wrong because:

The Model should not be accessed without validation from the Controller;
The Controller needs a valid Model to be destructed, as it needs to terminate the solving (this cannot be prematurely terminated, as we want SCIP to properly free the solving resources).

The lifetimes are currently manually maintained, but this design is not foolproof.

Proposed solution:

Move State inside of Controller, as std::shared_ptr;
Pass std::shared_ptr<State> to the auxiliary thread, hence avoiding address errors;
Validate external access to State using Controller lock;
Change DefaultEnvironment to access State through Branching Controller (tied with #19)

Increase testing environments

Continuous integration should test the following configurations:

All python >= 3.6
gcc vs clang
scip >= 6.0

To simplify builds, environments should be completely defined by Docker images.

TODO:

Create generic docker files for python-version x compiler;
Build images, and discuss making them public with ZIB;
Simplify Circle CI config and increase parallelism.
Add full compiler support to conda environments, including ld64 and libcxxabi (for sanitizers)
Add tests with sanitizers, see sanitizers-cmake, cpp_starter_project
Add checks of clang-tidy and cppcheck

Coding conventions

Make the following conventions consistent across the codebase:

Brace vs parenthesis constructors;
auto x = declaration (may be a copy in C++<17);
naming private attribute m_var
setter/getter convention:
- void name(val&) and auto name()
- auto& name(val)
- explicit get and set prefixes
snake_case (std convention) vs PascalCase (Python convention) for types

Many of these things can be done by clang-tidy.

Environment composability

Environments are made composable by their observation, reward, termination spaces.
