automl / deepcave
An interactive framework to visualize and analyze your AutoML process in real-time.
Home Page: https://automl.github.io/DeepCAVE/main/
License: Apache License 2.0
We should make it as easy as possible for users to add their own plugins and provide an in-depth tutorial for that, as the current documentation is rather limited: https://automl.github.io/DeepCAVE/main/plugins/index.html
Add more tests, especially for:
I assumed that when I close the DeepCAVE tab and kill the command-line application, DeepCAVE would shut down completely. In fact, the Dash port is still in use and not freed for roughly 30 minutes (I'm not sure exactly when it is freed; I only observed it still being taken quite a while later). Is there a shutdown command of some sort, or a way to make this cleaner?
The obvious problem right now is that if I want to restart, I have to change my config to a different port.
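Until a clean shutdown command exists, a small check like the following sketch can tell whether the old server still holds the port. The port number is an assumption: 8050 is Dash's default, but DeepCAVE's actual port depends on your config.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is still listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

# Hypothetical usage (8050 is Dash's default; your configured port may differ):
# if port_in_use(8050):
#     print("Old server still holds the port; find it with `lsof -i :8050`.")
```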
DeepCAVE/deepcave/utils/logs.py
Line 13 in 5811bf9
By default, disable_existing_loggers is set to True, so if deepcave is imported late in the program as an optional dependency, it will disable all non-root loggers.
Setting this flag to False at the end of this file should prevent this.
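A minimal sketch of the effect of that flag (the logger name here is hypothetical): loggers created by other libraries before logging is reconfigured survive only when disable_existing_loggers is False.

```python
import logging
import logging.config

# A logger some other library created before deepcave's logging setup runs.
existing = logging.getLogger("some.library.logger")

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,  # the flag the issue asks to set
    "root": {"level": "INFO", "handlers": []},
})

# With disable_existing_loggers=True (the default), `existing.disabled`
# would now be True and the library's log output would silently vanish.
assert not existing.disabled
```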
We could add some more features to the sidebar, especially:
History of jobs -> list of jobs from static plugins that have been run already (currently disappear after they are clicked)
Favorites -> favorite configurations to be used in all plugins (highlighted)
It would be great to add support for reading AutoML-Toolkit output. A corresponding converter needs to be written for that.
Currently, MO-SMAC runs cannot be loaded into DeepCAVE.
The documentation includes instructions on how to install Redis without root access, but a line that was necessary for me is missing: after running 'make' in the Redis directory, I also needed to run 'make install' to actually be able to run the command.
Plots are mainly created with Plotly in DeepCAVE. For some plots, a matplotlib version is available as well. This should be added for more plugins.
Hey,
first of all: thanks for that super nice tool.
It is really awesome.
Unfortunately, I have encountered a bug in the fANOVA plugin.
The importance values returned by the random forest are all NaN.
This might happen due to some constant hyperparameters in the search space.
I've attached the results of a hpbandster run to reproduce the error.
Thanks in advance
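The suspected cause can be reproduced in isolation. This is an illustration under an assumption, not a confirmed diagnosis: fANOVA-style importances are fractions of total variance, so a direction in which nothing varies produces a 0/0 division and hence NaN.

```python
import numpy as np

# A constant hyperparameter column contributes zero variance.
constant_column = np.full(10, 0.5)
total_variance = constant_column.var()

# Normalizing a variance share by zero total variance yields nan.
with np.errstate(invalid="ignore"):
    importance = np.float64(total_variance) / total_variance  # 0.0 / 0.0
```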
For the parallel coordinates plot, a "select all" option would be nice, and it should be set as the default.
Hi,
I've just started using the DeepCAVE tool and noticed in a few screens, some fields remain active even if a run isn't selected. If you try editing the fields, it throws an error. It only happens at the very beginning of operation when no run is selected.
Similarly for the Importances tab when changing the Method, Trees or Limit Hyperparameters fields. Again, it only happens at the very beginning when no run has ever been selected for that session.
It doesn't affect functionality, but I think it could be solved with a callback function (the input being the id and value of the run menu dropdown, and the output being the id and visibility/enabled property of the relevant fields).
Screens of the Parallel coordinates and Importance tabs where I saw this behavior.
Best,
Dipti
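A minimal sketch of the suggested callback logic. The body is plain Python; the Dash wiring shown in the comment uses hypothetical component ids ("run-dropdown", "method-dropdown"), not DeepCAVE's actual ids.

```python
# Hypothetical Dash wiring:
#   @app.callback(Output("method-dropdown", "disabled"),
#                 Input("run-dropdown", "value"))
def disable_until_run_selected(run_value):
    """Disable dependent fields while no run has been selected yet."""
    return run_value is None or run_value == []
```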
Right now, the cache always resets after some minutes. Saving your own JSON is recommended.
We could implement some of the ideas presented in
Fitness Landscape Footprint: A Framework to Compare Neural Architecture Search Problems
We would like to improve the README:
As part of this, we can also look into similar repos and what makes their READMEs great such that we can benefit from their ideas.
As the current implementation of fANOVA is rather inefficient, we could think about how to make it faster.
When doing a commit, error messages from the pre-commit hooks show up (with errors not regarding the changes made, but regarding the existing files). This needs to be checked and updated.
With hyperparameter importance, if you select "Local Importance", the same graph is displayed for all objectives. If you select fANOVA, however, different graphs are displayed.
In order to maximize the potential users of DeepCave, we should aim to support more HPO tools. In particular, we should write converters for
Hi, :)
I wanted to use the DeepCAVE framework to visualise some runs from HpBandSter. (I can't use SMAC because it has some issues with multi-node runs. I've already posted about it.)
In deepcave.runs.converters.bohb
, while creating the runs from the bohb Result object, there is no "state" key in the info dict for the run Object within bohb.get_all_runs()
.
Line 66 in deepcave/runs/converters/bohb.py
status = bohb_run.info["state"]
The bohb.py code assumes the configspace is also saved, but that does not happen by default in the library. I've saved configspace.json separately, while the hpbandster.core.result.json_result_logger
saves results.json and configs.json.
Am I using the wrong version of HpBandSter? I used pip install hpbandster
which installs version 0.7.4.
Warm Regards,
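A hedged workaround sketch for the missing key: guard against HpBandSter run objects whose info dict lacks "state" (observed with hpbandster 0.7.4). The "SUCCESS" default is an assumption for illustration, not DeepCAVE's actual fallback.

```python
def get_bohb_status(info):
    """Read the run state, tolerating info dicts without a "state" key."""
    if info is None:
        return "SUCCESS"  # assumed fallback when no info was recorded
    return info.get("state", "SUCCESS")
```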
Prioritize enhancements / new features and schedule when to work on them.
Is it possible to include some example data so that first time users can try everything out without having to run anything?
When DeepCAVE is the first pillar of my HPO run, I might find some particular configs (or sets of configs) interesting and would like to examine them further. To do so, I need to be able to select configs and be given back their IDs.
Plots over time for multi-fidelity runs, with trajectories of each config, as in the Syne Tune paper/video.
Hi,
We often wonder how hard an AutoML problem is. Can we therefore add some metrics regarding that?
For example
When using the API and calculating the importance of a run more than once, the importance values vary a lot. This happens for fANOVA and local importance. Example code is below (execute it more than once and compare the values):
from deepcave.runs.converters.smac3v2 import SMAC3v2Run
from deepcave.evaluators.fanova import fANOVA
from deepcave.evaluators.lpi import LPI
import pandas as pd
run = SMAC3v2Run.from_path("../smac3_output/700b9ad7b27ba991278b31467cbe7fe6/700b9ad7b27ba991278b31467cbe7fe6/")
result = fANOVA(run)
result.calculate(objectives=run.get_objective('1-accuracy'), budget=max(run.get_budgets()), n_trees=10, seed=None)
df_importance = pd.DataFrame(result.get_importances(hp_names=None))
df_importance[df_importance>0].dropna(axis=1)
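The variability likely comes from seed=None in the calculate call, which draws a fresh seed each time. A minimal stand-in sketch (a scikit-learn random forest instead of DeepCAVE's evaluator, with synthetic data) showing that fixing the seed makes importances reproducible:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the surrogate fit behind fANOVA/LPI.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

def importances(seed):
    # With a fixed seed, repeated fits agree; with random_state=None
    # (the analogue of seed=None above), they generally differ.
    model = RandomForestRegressor(n_estimators=10, random_state=seed)
    model.fit(X, y)
    return model.feature_importances_

assert np.allclose(importances(0), importances(0))
```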
Not all dependencies are up to date. This needs to be checked and updated.
Hi!
I just gave DeepCAVE a try (it is great) and noticed that seaborn seems to be missing as a dependency, as I got the following error after starting DeepCAVE:
Using config 'default'
Checking if redis-server is already running...
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Redis server is not running. Starting...
Redis server successfully started.
-------------STARTING WORKER-------------
-------------STARTING SERVER-------------
Traceback (most recent call last):
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 137, in use
style = _rc_params_in_file(style)
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 866, in _rc_params_in_file
with _open_file_or_url(fname) as fd:
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/contextlib.py", line 119, in __enter__
return next(self.gen)
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 843, in _open_file_or_url
with open(fname, encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'seaborn'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/klaus/ws/DeepCAVE/deepcave/server.py", line 6, in <module>
app.layout = MainLayout(config.PLUGINS)()
File "/home/klaus/ws/DeepCAVE/deepcave/config.py", line 50, in PLUGINS
from deepcave.plugins.hyperparameter.importances import Importances
File "/home/klaus/ws/DeepCAVE/deepcave/plugins/hyperparameter/importances.py", line 13, in <module>
from deepcave.utils.styled_plot import plt
File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 168, in <module>
plt = StyledPlot()
File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 28, in __init__
plt.style.use("seaborn")
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 139, in use
raise OSError(
OSError: 'seaborn' is not a valid package style, path of style file, URL of style file, or library style name (library styles are listed in `style.available`)
[1] 10519 terminated deepcave --open
This is how I set it up:
git clone [email protected]:automl/DeepCAVE.git
cd DeepCAVE/
conda create -n DeepCAVE python=3.9
conda activate DeepCAVE
conda install -c anaconda swig
pip install -e .
sudo apt-get install redis-server
deepcave --open
After installing seaborn, everything worked:
conda install seaborn
deepcave --open
Seaborn is used here:
DeepCAVE/deepcave/utils/styled_plot.py
Line 150 in 1851126
and here:
DeepCAVE/deepcave/utils/styled_plot.py
Line 28 in 1851126
but is not part of the requirements.txt.
We should think about a nice logo!
The example examples/api/parallel_coordinates.py
throws an error and needs to be fixed.
The plugin shows nothing when selecting two different hyperparameters. This needs to be checked.
It would be very nice to have a feature that allows to generate a PDF report from an analysis with everything the tool has to offer in principle. For this, a user would need to be able to define which analyses under which configuration they would like to see.
Will try to add from __future__ import annotations
at the top.
File "/home/skantify/code/DeepCAVE/deepcave/evaluators/epm/random_forest.py", line 345, in RandomForest
def _predict(self, X: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
TypeError: 'type' object is not subscriptable
Python 3.8.5
There seems to be a problem with the reproducibility of the Partial Dependence Plot that needs to be checked.
Currently, DeepCAVE doesn't support runs done with deterministic=False. For example, when running examples/1_basics/1_quadratic_function.py with deterministic=False, loading the resulting run in DeepCAVE is not possible and gives the warning message "SMAC3v2: Multiple seeds are not supported..".
Is there a sensible way to reorder columns to make the depiction more orderly?
There is quite some code in the init files that should be moved to separate files instead.
I get the following error when trying to run deepcave --open
: ModuleNotFoundError: No module named 'smac.epm.util_funcs'
I think SMAC changed the name of the module from util_funcs to utils.
The error occurred in DeepCAVE/deepcave/evaluators/epm/random_forest.py and should be fixed by changing the import.
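A hedged compatibility sketch (not DeepCAVE's actual fix): try the new module path first and fall back to the old name if it is missing, so the import works across SMAC versions.

```python
import importlib

def import_first(*module_names):
    """Return the first importable module among the candidates."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {module_names} could be imported")

# Hypothetical usage for the renamed SMAC module:
# epm_utils = import_first("smac.epm.utils", "smac.epm.util_funcs")
```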
Colouring based on config clusters (using e.g. HDBSCAN, which also allows having "noisy observations" with no affiliation)?
The clusters could provide a base color whose shading is determined by the final performance.
This might help identify commonalities between well-performing configurations.
When dealing with more than three-dimensional HP spaces, it may become unwieldy to look at 3D slices. Maybe try some high-performing projection procedures? There is, for instance, UMAP and its successor. But be wary of the difference between transductive and inductive projections if you want to apply them sequentially.
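A minimal sketch of the projection idea with synthetic stand-in data. PCA from scikit-learn is used here as a simple inductive stand-in; UMAP itself lives in the third-party umap-learn package and offers the same fit/transform interface.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for encoded configurations from a high-dimensional HP space.
rng = np.random.default_rng(0)
configs = rng.uniform(size=(100, 8))

# Fit a 2D projection for plotting.
projector = PCA(n_components=2).fit(configs)
embedding = projector.transform(configs)

# Inductive: newly sampled configs can be projected into the same 2D map
# later, which matters if the analysis is updated sequentially.
new_points = projector.transform(rng.uniform(size=(5, 8)))
```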
Add a new plugin that allows applying symbolic regression to the metadata gathered during HPO and obtaining symbolic explanations for the dependency between hyperparameters and performance.
To be done:
Many of our analyses are based on surrogate models.
It would be fairly important to know how faithful these surrogate models actually are.
Could we add some insights regarding that? In the easiest case, we could start with an RMSE on out-of-bag samples.
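A sketch of that easiest case with synthetic stand-in data: a random forest's out-of-bag predictions act as an internal hold-out set, so an OOB RMSE quantifies surrogate faithfulness without a separate validation split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in configs and costs (assumed data, not a real DeepCAVE run).
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))
y = (X ** 2).sum(axis=1) + rng.normal(scale=0.05, size=300)

# oob_score=True makes the forest retain per-sample out-of-bag predictions.
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, y)
oob_rmse = float(np.sqrt(np.mean((model.oob_prediction_ - y) ** 2)))
```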
Add one or more plugins to visualize neural networks.
See e.g.: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8732351
In order to make it as easy as possible to add your own converter, it would be great to have a more in-depth tutorial on how to do that, as the current documentation is rather sparse: https://automl.github.io/DeepCAVE/main/converters.html