automl / deepcave

An interactive framework to visualize and analyze your AutoML process in real-time.

Home Page: https://automl.github.io/DeepCAVE/main/

License: Apache License 2.0

Makefile 0.81% Python 98.33% CSS 0.33% Dockerfile 0.11% Shell 0.41%
automl iml visualization analysis interactive real-time hyperparameters hyperparameter-importance sampling-bias

deepcave's People

Contributors

dwoiwode, eddiebergman, helegraf, keckelt, mlindauer, phmueller, renesass, sarah-segel

deepcave's Issues

Tests: Expand

Add more tests, especially for:

  • Run
  • Converters
  • Plugins (check API calls) and add correct typing

[Question] Correct way of shutting down DeepCave

I assumed that closing the DeepCAVE tab and killing the command-line application would shut DeepCAVE down completely. In fact, the Dash port stays in use and is not freed within ~30 minutes (I'm not sure when exactly it is freed; I just observed it still being taken quite a while afterwards). Is there a shutdown command of some sort, or a way to make this cleaner?
Right now, if I want to restart, I have to change my config to use a different port.
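
As a diagnostic aid for the situation described above, here is a minimal sketch that checks whether a TCP port can currently be bound. It uses only the Python standard library and is not part of DeepCAVE; the host and the port number 8050 are assumptions, so substitute the values from your own config.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something still holds the port.
            return False
    return True


# Example: check the port before restarting.
# 8050 is an assumed port number; use the one from your own config.
if port_is_free("127.0.0.1", 8050):
    print("Port is free, safe to restart.")
else:
    print("Port is still in use; a leftover process may need to be stopped.")
```

If the check reports the port as busy after the main process was killed, a leftover child process (for example a worker or server process) may still be holding it.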

Improvements of Sidebar

We could add some more features to the sidebar, especially:

  • History of jobs -> list of jobs from static plugins that have been run already (currently disappear after they are clicked)

  • Favorites -> favorite configurations to be used in all plugins (highlighted)

Documentation: installing redis without root access

The documentation includes instructions on how to install redis without root access, but one step that I needed is missing: after running 'make' in the redis directory, I also had to run 'make install' to be able to actually run the command.

fANOVA shows nothing (Nan Values from RF)

Hey,

first of all: thanks for that super nice tool.
It is really awesome.

Unfortunately, I have encountered a bug in the fANOVA plugin.

The values returned by the random forest (RF) are all NaN.
This might be caused by constant hyperparameters in the search space.
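
To test the hypothesis about constant hyperparameters, one could list the hyperparameters that take a single value across all evaluated configurations. The sketch below is plain Python on a list of dictionaries; the function name and the example data are hypothetical and not taken from DeepCAVE or HpBandSter.

```python
from typing import Dict, List, Sequence


def constant_hyperparameters(configs: Sequence[Dict[str, object]]) -> List[str]:
    """Return names of hyperparameters that have the same value in every config."""
    if not configs:
        return []
    names = list(configs[0].keys())
    return [
        name
        for name in names
        # repr() lets unhashable values (e.g. lists) be compared as well.
        if len({repr(cfg.get(name)) for cfg in configs}) == 1
    ]


# Hypothetical example data: "optimizer" never changes, "lr" does.
example_configs = [
    {"lr": 0.1, "optimizer": "adam"},
    {"lr": 0.01, "optimizer": "adam"},
]
print(constant_hyperparameters(example_configs))
```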

I've attached the results of a hpbandster run to reproduce the error.

bohb_run.zip

Thanks in advance

UI error in Importances tab

Hi,

I've just started using the DeepCAVE tool and noticed in a few screens, some fields remain active even if a run isn't selected. If you try editing the fields, it throws an error. It only happens at the very beginning of operation when no run is selected.

Workflow

  • Start DeepCAVE
  • Open the Parallel Coordinates Tab
  • Without selecting a Run, interact with the Show Important Hyperparameters, Limit Hyperparameters, or Show Unsuccessful Configurations fields. Errors are thrown.

Similarly for the Importances tab when changing the Method, Trees or Limit Hyperparameters fields. Again, it only happens at the very beginning when no run has ever been selected for that session.

It doesn't affect functionality but I think it could be solved with a callback function (input being the id and value of the run menu dropdown and output being the id and visibility/enable property of the relevant fields).
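
The decision logic of the callback suggested above could look roughly like the following sketch. It is deliberately framework-free: the function name and signature are hypothetical, and wiring it into a Dash callback (run dropdown value as input, the fields' `disabled` properties as outputs) is left out.

```python
from typing import List, Optional


def field_disabled_states(selected_run: Optional[str], n_fields: int) -> List[bool]:
    """Return one 'disabled' flag per dependent field.

    All fields are disabled while no run is selected (None or empty string).
    """
    no_run = selected_run is None or selected_run == ""
    return [no_run] * n_fields


# No run selected yet: all three fields would be disabled.
print(field_disabled_states(None, 3))
# A run is selected ("run_1" is a made-up identifier): fields are enabled.
print(field_disabled_states("run_1", 3))
```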

Screens of the Parallel coordinates and Importance tabs where I saw this behavior.

[Screenshots: UI_importance, UI_ParallelCoord]

Best,
Dipti

Cache

Right now, the cache always resets after a few minutes. Implementing our own JSON-based saving is recommended.

Improve README

We would like to improve the README:

  • Add visualization GIFs
  • Add a very minimal example at the top
  • Add very simple installation guide at the top

As part of this, we can also look into similar repos and what makes their READMEs great such that we can benefit from their ideas.

Pre-commit hooks: Check and update

When doing a commit, error messages from the pre-commit hooks show up (with errors not regarding the changes made, but regarding the existing files). This needs to be checked and updated.

Add support for several other HPO tools

In order to maximize the potential users of DeepCave, we should aim to support more HPO tools. In particular, we should write converters for

  • SyneTune
  • BoTorch
  • HEBO
  • Ray Tune
  • Optuna

"state" Key missing in Run object when visualising BOHB runs with DeepCAVE

Hi, :)

I wanted to use the DeepCAVE framework to visualise some runs from HpBandSter. (I can't use SMAC because it has some issues with multi-node runs. I've already posted it.)
In deepcave.runs.converters.bohb, while creating the runs from the bohb Result object, there is no "state" key in the info dict for the run Object within bohb.get_all_runs().

Line 66 in deepcave/runs/converters/bohb.py

status = bohb_run.info["state"]
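
A defensive way to read the key, shown here only as a sketch, is to use `dict.get` so that a missing "state" entry yields `None` instead of a `KeyError`. The helper name and the example value "SUCCESS" are hypothetical; how the converter should treat a missing state is a decision for the maintainers.

```python
from typing import Any, Dict, Optional


def read_state(info: Optional[Dict[str, Any]]) -> Optional[str]:
    """Return info["state"] if present, otherwise None (no KeyError)."""
    if not info:
        # Covers both a missing info dict (None) and an empty one.
        return None
    return info.get("state")


print(read_state({"state": "SUCCESS"}))  # hypothetical value
print(read_state({}))
```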

The bohb.py code also assumes that the configspace is saved, but the library does not do this by default. I've saved configspace.json separately, while hpbandster.core.result.json_result_logger saves results.json and configs.json.

Am I using the wrong version of HpBandSter? I used pip install hpbandster which installs version 0.7.4.

Warm Regards,

[Question] Example data?

Is it possible to include some example data so that first time users can try everything out without having to run anything?

Display configids

When DeepCAVE is the first pillar of my HPO run, I might find some particular configs (or sets of configs) interesting and would like to examine them further. To do so, I need to be able to select configs and get their IDs back.

New plugin: Hardness of AutoML problem

Hi,

We often wonder how hard an AutoML problem is. Can we therefore add some metrics regarding that?
For example

  • an eCDF plot for the cost distribution (i.e., a hard AutoML task should have only a few very well-performing configurations)
  • uni-modal metric from the automl loss landscape paper (but evaluated on our surrogate models)
  • convexity metric from the automl loss landscape paper (but evaluated on our surrogate models)
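
For the first item, the empirical CDF of the observed costs can be computed in a few lines. This is a minimal sketch in plain Python with made-up cost values; plotting and any integration into a plugin are left out.

```python
from typing import List, Sequence, Tuple


def ecdf(costs: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return the sorted costs and the eCDF step heights (i + 1) / n.

    For tied costs, the last of the tied points carries the eCDF value.
    """
    xs = sorted(costs)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


# Made-up costs for illustration.
xs, ys = ecdf([0.3, 0.1, 0.2, 0.4])
for x, y in zip(xs, ys):
    print(f"cost <= {x}: fraction {y}")
```

Under the intuition stated in the first item, a hard task would show a curve that stays close to zero over the low-cost range, since only a few configurations perform very well.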

Hyperparameter Importance: Different values for same run

When using the API and calculating the importance of a run more than one time, the importance values vary a lot. This happens for fANOVA and Local importance. An example code is here (execute more than once and compare the values):

from deepcave.runs.converters.smac3v2 import SMAC3v2Run
from deepcave.evaluators.fanova import fANOVA
from deepcave.evaluators.lpi import LPI
import pandas as pd

run = SMAC3v2Run.from_path("../smac3_output/700b9ad7b27ba991278b31467cbe7fe6/700b9ad7b27ba991278b31467cbe7fe6/")
result = fANOVA(run)
result.calculate(objectives=run.get_objective('1-accuracy'), budget=max(run.get_budgets()), n_trees=10, seed=None)
df_importance = pd.DataFrame(result.get_importances(hp_names=None))
df_importance[df_importance>0].dropna(axis=1)
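
One possible contributor (an assumption, not a confirmed diagnosis) is the `seed=None` argument in the call above: randomized estimators usually give different results on each call unless a fixed seed is passed. The toy function below is not DeepCAVE code; it only illustrates that a fixed seed makes a randomized estimate repeatable.

```python
import random
from typing import List, Optional


def noisy_importances(n_hps: int, seed: Optional[int]) -> List[float]:
    """Toy stand-in for a randomized importance estimate (not DeepCAVE code)."""
    rng = random.Random(seed)  # seed=None draws fresh entropy on every call
    raw = [rng.random() for _ in range(n_hps)]
    total = sum(raw)
    # Normalize so that the toy importances sum to 1.
    return [value / total for value in raw]


# Same seed, same result on every call.
print(noisy_importances(3, seed=0) == noisy_importances(3, seed=0))
# Without a seed, two calls will almost certainly differ.
print(noisy_importances(3, seed=None) == noisy_importances(3, seed=None))
```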

Seaborn is missing as a dependency

Hi!
I just gave DeepCAVE a try (it is great) and noticed that seaborn seems to be missing as a dependency, as I got the following error after starting DeepCAVE:

Using config 'default'
Checking if redis-server is already running...
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Redis server is not running. Starting...
Redis server successfully started.

-------------STARTING WORKER-------------

-------------STARTING SERVER-------------
Traceback (most recent call last):
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 137, in use
    style = _rc_params_in_file(style)
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 866, in _rc_params_in_file
    with _open_file_or_url(fname) as fd:
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 843, in _open_file_or_url
    with open(fname, encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'seaborn'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/klaus/ws/DeepCAVE/deepcave/server.py", line 6, in <module>
    app.layout = MainLayout(config.PLUGINS)()
  File "/home/klaus/ws/DeepCAVE/deepcave/config.py", line 50, in PLUGINS
    from deepcave.plugins.hyperparameter.importances import Importances
  File "/home/klaus/ws/DeepCAVE/deepcave/plugins/hyperparameter/importances.py", line 13, in <module>
    from deepcave.utils.styled_plot import plt
  File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 168, in <module>
    plt = StyledPlot()
  File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 28, in __init__
    plt.style.use("seaborn")
  File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 139, in use
    raise OSError(
OSError: 'seaborn' is not a valid package style, path of style file, URL of style file, or library style name (library styles are listed in `style.available`)
[1]    10519 terminated  deepcave --open

This is how I set it up:

git clone git@github.com:automl/DeepCAVE.git
cd DeepCAVE/
conda create -n DeepCAVE python=3.9
conda activate DeepCAVE
conda install -c anaconda swig
pip install -e .
sudo apt-get install redis-server
deepcave --open

After installing seaborn, everything worked:

conda install seaborn
deepcave --open

Seaborn is used here:

import seaborn as sns

and here:

plt.style.use("seaborn")

but is not part of the requirements.txt.

PDF Report Feature

It would be very nice to have a feature that generates a PDF report from an analysis, covering in principle everything the tool has to offer. For this, a user would need to be able to define which analyses, and under which configuration, they would like to see.

`type` object is not subscriptable

Will try to add from __future__ import annotations at the top.

File "/home/skantify/code/DeepCAVE/deepcave/evaluators/epm/random_forest.py", line 345, in RandomForest
    def _predict(self, X: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
TypeError: 'type' object is not subscriptable

Python 3.8.5
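
For context: subscripting built-in types such as `tuple[...]` at runtime requires Python 3.9 or later, so on 3.8 the annotation fails when the function is defined. Besides `from __future__ import annotations` (which must be the first statement in the module), an alternative that also runs on 3.8 is `typing.Tuple`. The function below is a hypothetical stand-in, not the actual `_predict` method.

```python
from typing import List, Tuple


def predict(xs: List[float]) -> Tuple[List[float], List[float]]:
    """Hypothetical stand-in returning (means, variances) for the inputs."""
    means = [2.0 * x for x in xs]
    variances = [0.0 for _ in xs]
    return means, variances


print(predict([1.0, 2.0]))
```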

Enable using SMAC runs with multiple seeds

Currently, DeepCAVE doesn't support runs performed with deterministic=False. For example, when running examples/1_basics/1_quadratic_function.py with deterministic=False, loading the resulting run in DeepCAVE is not possible and gives the warning message "SMAC3v2: Multiple seeds are not supported..".

ModuleNotFoundError

I get the following error when trying to run deepcave --open: ModuleNotFoundError: No module named 'smac.epm.util_funcs'
I think smac changed the naming of the module from util_funcs to utils.
The error occurred in DeepCAVE/deepcave/evaluators/epm/random_forest.py and should be fixed by changing the import.
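
If both SMAC versions need to be supported, a fallback import is one option. The helper below is a generic sketch using `importlib`; the helper name is hypothetical, and the module path `smac.epm.utils` is inferred from the description above rather than verified.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(module_names: Sequence[str]) -> ModuleType:
    """Import and return the first module in module_names that can be imported."""
    last_error = None
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            last_error = exc
    raise ImportError(f"None of {list(module_names)} could be imported") from last_error


# Intended use (module paths are assumptions based on the issue text):
# util_module = import_first_available(["smac.epm.utils", "smac.epm.util_funcs"])

# Runnable demonstration with standard-library names only.
print(import_first_available(["no_such_module_xyz_123", "json"]).__name__)
```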

Parallel Coordinates

Colouring based on config clusters (using e.g. HDBSCAN, which also allows "noisy observations" with no cluster affiliation)?
Each cluster could define a base colour whose shading is determined by the final performance?

This might help identify commonalities between well-performing configurations.

ConfigCube Projections for higher dimensions.

When dealing with HP spaces of more than three dimensions, looking at 3D slices may become misleading. Maybe try some high-performing projection procedures? There is, for instance, UMAP and its successor. But be wary of transductive versus inductive projections if you want to do it sequentially.

New plugin: Add symbolic explanations

Add a new plugin that applies symbolic regression to the meta-data gathered during HPO in order to obtain symbolic explanations for the dependency between hyperparameters and performance.

To be done:

  • Add Parsimony Hyperparameter
  • Possibly add other SR HPs
  • Add check if #OptimizedHPs != #ExplainedHPs (if so, run PDP before SR)
  • Replace X1 / X2 by HP name

Quality of Surrogate Models

Many of our analyses are based on surrogate models.
It would be fairly important to know how faithful these surrogate models actually are.
Could we add some insights regarding that? In the easiest case, we could start with some RMSE on out-of-bag error.
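
The RMSE itself is simple to compute once out-of-bag predictions are available. The sketch below assumes the true costs and the out-of-bag predictions have already been collected into two equally long sequences; how to obtain them from the surrogate model is not shown, and the example values are made up.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration.
print(rmse([0.2, 0.4, 0.1], [0.25, 0.35, 0.1]))
```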
