automl / deepcave
An interactive framework to visualize and analyze your AutoML process in real-time.
Home Page: https://automl.github.io/DeepCAVE/main/
License: Apache License 2.0
We should make it as easy as possible for users to add their own plugins and provide an in-depth tutorial for that, as the current documentation is rather limited: https://automl.github.io/DeepCAVE/main/plugins/index.html
Add more tests, especially for:
I assumed that when I close the DeepCAVE tab and kill the command-line application, DeepCAVE would shut down completely. In fact, the Dash port is still in use and not freed for roughly 30 minutes (I'm not sure exactly when it is freed; I only observed it still being taken quite a while later). Is there a shutdown command of some sort, or a way to make this cleaner?
The obvious problem right now is that if I want to restart, I have to change my config to a different port.
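Until a clean shutdown command exists, a small check like the following sketch can tell whether the old server still holds the port. The port number is an assumption: 8050 is Dash's default, but DeepCAVE's actual port depends on your config.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is still listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

# Hypothetical usage (8050 is Dash's default; your configured port may differ):
# if port_in_use(8050):
#     print("Old server still holds the port; find it with `lsof -i :8050`.")
```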
DeepCAVE/deepcave/utils/logs.py
Line 13 in 5811bf9
By default, disable_existing_loggers is set to True, so if deepcave is imported late in the program as an optional dependency, it will disable all non-root loggers.
Setting this flag to False at the end of this file should prevent this.
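A minimal sketch of the effect of that flag (the logger name here is hypothetical): loggers created by other libraries before logging is reconfigured survive only when disable_existing_loggers is False.

```python
import logging
import logging.config

# A logger some other library created before deepcave's logging setup runs.
existing = logging.getLogger("some.library.logger")

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,  # the flag the issue asks to set
    "root": {"level": "INFO", "handlers": []},
})

# With disable_existing_loggers=True (the default), `existing.disabled`
# would now be True and the library's log output would silently vanish.
assert not existing.disabled
```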
We could add some more features to the sidebar, especially:
History of jobs -> list of jobs from static plugins that have been run already (currently disappear after they are clicked)
Favorites -> favorite configurations to be used in all plugins (highlighted)
It would be great to add support for reading AutoML-Toolkit output. A corresponding converter needs to be written for that.
Currently, MO-SMAC runs cannot be loaded into DeepCAVE.
The documentation includes instructions on how to install Redis without root access, but a line that was necessary for me is missing: after running 'make' in the Redis directory, I also needed to run 'make install' to actually be able to run the command.
Plots are mainly created with Plotly in DeepCAVE. For some plots, a matplotlib version is available as well. This should be added for more plugins.
Hey,
first of all: thanks for that super nice tool.
It is really awesome.
Unfortunately, I have encountered a bug in the fANOVA plugin.
The importance values returned by the random forest are all NaN.
This might happen due to some constant hyperparameters in the search space.
I've attached the results of a hpbandster run to reproduce the error.
Thanks in advance
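The suspected cause can be reproduced in isolation. This is an illustration under an assumption, not a confirmed diagnosis: fANOVA-style importances are fractions of total variance, so a direction in which nothing varies produces a 0/0 division and hence NaN.

```python
import numpy as np

# A constant hyperparameter column contributes zero variance.
constant_column = np.full(10, 0.5)
total_variance = constant_column.var()

# Normalizing a variance share by zero total variance yields nan.
with np.errstate(invalid="ignore"):
    importance = np.float64(total_variance) / total_variance  # 0.0 / 0.0
```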
For the parallel coordinates plot, a "select all" option would be nice, and it should be set as the default.
Hi,
I've just started using the DeepCAVE tool and noticed in a few screens, some fields remain active even if a run isn't selected. If you try editing the fields, it throws an error. It only happens at the very beginning of operation when no run is selected.
Similarly for the Importances tab when changing the Method, Trees or Limit Hyperparameters fields. Again, it only happens at the very beginning when no run has ever been selected for that session.
It doesn't affect functionality, but I think it could be solved with a callback function (the input being the id and value of the run menu dropdown, and the output being the id and visibility/enabled property of the relevant fields).
Screens of the Parallel coordinates and Importance tabs where I saw this behavior.
Best,
Dipti
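A minimal sketch of the suggested callback logic. The body is plain Python; the Dash wiring shown in the comment uses hypothetical component ids ("run-dropdown", "method-dropdown"), not DeepCAVE's actual ids.

```python
# Hypothetical Dash wiring:
#   @app.callback(Output("method-dropdown", "disabled"),
#                 Input("run-dropdown", "value"))
def disable_until_run_selected(run_value):
    """Disable dependent fields while no run has been selected yet."""
    return run_value is None or run_value == []
```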
Right now, the cache always resets after some minutes. Saving your own JSON is recommended.
We could implement some of the ideas presented in
Fitness Landscape Footprint: A Framework to Compare Neural Architecture Search Problems
We would like to improve the README:
As part of this, we can also look into similar repos and what makes their READMEs great such that we can benefit from their ideas.
As the current implementation of fANOVA is rather inefficient, we could think about how to make it faster.
When doing a commit, error messages from the pre-commit hooks show up (with errors not regarding the changes made, but regarding the existing files). This needs to be checked and updated.
With hyperparameter importance, if you select "Local Importance", the same graph is displayed for all objectives. If you select fANOVA, however, different graphs are displayed.
In order to maximize the potential users of DeepCave, we should aim to support more HPO tools. In particular, we should write converters for
Hi, :)
I wanted to use the DeepCAVE framework to visualise some runs from HpBandSter. (I can't use SMAC because it has some issues with multi-node runs. I've already posted about it.)
In deepcave.runs.converters.bohb
, while creating the runs from the bohb Result object, there is no "state" key in the info dict for the run Object within bohb.get_all_runs()
.
Line 66 in deepcave/runs/converters/bohb.py
status = bohb_run.info["state"]
The bohb.py code assumes the configspace is also saved, but that does not happen by default in the library. I've saved configspace.json separately, while the hpbandster.core.result.json_result_logger
saves results.json and configs.json.
Am I using the wrong version of HpBandSter? I used pip install hpbandster
which installs version 0.7.4.
Warm Regards,
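A hedged workaround sketch for the missing key: guard against HpBandSter run objects whose info dict lacks "state" (observed with hpbandster 0.7.4). The "SUCCESS" default is an assumption for illustration, not DeepCAVE's actual fallback.

```python
def get_bohb_status(info):
    """Read the run state, tolerating info dicts without a "state" key."""
    if info is None:
        return "SUCCESS"  # assumed fallback when no info was recorded
    return info.get("state", "SUCCESS")
```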
Prioritize enhancements / new features and schedule when to work on them.
Is it possible to include some example data so that first time users can try everything out without having to run anything?
When DeepCAVE is the first pillar of my HPO run, I might find some particular configs (or sets of configs) interesting and would like to examine them further. To do so, I need to be able to select configs and be given back their IDs.
Plots over time for multi-fidelity runs, with trajectories of each config, as in the Syne Tune paper/video.
Hi,
We often wonder how hard an AutoML problem is. Can we therefore add some metrics regarding that?
For example
When using the API and calculating the importance of a run more than once, the importance values vary a lot. This happens for fANOVA and local importance. Example code is below (execute it more than once and compare the values):
from deepcave.runs.converters.smac3v2 import SMAC3v2Run
from deepcave.evaluators.fanova import fANOVA
from deepcave.evaluators.lpi import LPI
import pandas as pd
run = SMAC3v2Run.from_path("../smac3_output/700b9ad7b27ba991278b31467cbe7fe6/700b9ad7b27ba991278b31467cbe7fe6/")
result = fANOVA(run)
result.calculate(objectives=run.get_objective('1-accuracy'), budget=max(run.get_budgets()), n_trees=10, seed=None)
df_importance = pd.DataFrame(result.get_importances(hp_names=None))
df_importance[df_importance>0].dropna(axis=1)
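The variability likely comes from seed=None in the calculate call, which draws a fresh seed each time. A minimal stand-in sketch (a scikit-learn random forest instead of DeepCAVE's evaluator, with synthetic data) showing that fixing the seed makes importances reproducible:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the surrogate fit behind fANOVA/LPI.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

def importances(seed):
    # With a fixed seed, repeated fits agree; with random_state=None
    # (the analogue of seed=None above), they generally differ.
    model = RandomForestRegressor(n_estimators=10, random_state=seed)
    model.fit(X, y)
    return model.feature_importances_

assert np.allclose(importances(0), importances(0))
```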
Not all dependencies are up to date. This needs to be checked and updated.
Hi!
I just gave DeepCAVE a try (it is great) and noticed that seaborn seems to be missing as a dependency, as I got the following error after starting DeepCAVE:
Using config 'default'
Checking if redis-server is already running...
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Redis server is not running. Starting...
Redis server successfully started.
-------------STARTING WORKER-------------
-------------STARTING SERVER-------------
Traceback (most recent call last):
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 137, in use
style = _rc_params_in_file(style)
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 866, in _rc_params_in_file
with _open_file_or_url(fname) as fd:
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/contextlib.py", line 119, in __enter__
return next(self.gen)
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/__init__.py", line 843, in _open_file_or_url
with open(fname, encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'seaborn'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/klaus/ws/DeepCAVE/deepcave/server.py", line 6, in <module>
app.layout = MainLayout(config.PLUGINS)()
File "/home/klaus/ws/DeepCAVE/deepcave/config.py", line 50, in PLUGINS
from deepcave.plugins.hyperparameter.importances import Importances
File "/home/klaus/ws/DeepCAVE/deepcave/plugins/hyperparameter/importances.py", line 13, in <module>
from deepcave.utils.styled_plot import plt
File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 168, in <module>
plt = StyledPlot()
File "/home/klaus/ws/DeepCAVE/deepcave/utils/styled_plot.py", line 28, in __init__
plt.style.use("seaborn")
File "/home/klaus/miniconda3/envs/deepcave/lib/python3.9/site-packages/matplotlib/style/core.py", line 139, in use
raise OSError(
OSError: 'seaborn' is not a valid package style, path of style file, URL of style file, or library style name (library styles are listed in `style.available`)
[1] 10519 terminated deepcave --open
This is how I set it up:
git clone [email protected]:automl/DeepCAVE.git
cd DeepCAVE/
conda create -n DeepCAVE python=3.9
conda activate DeepCAVE
conda install -c anaconda swig
pip install -e .
sudo apt-get install redis-server
deepcave --open
After installing seaborn, everything worked:
conda install seaborn
deepcave --open
Seaborn is used here:
DeepCAVE/deepcave/utils/styled_plot.py
Line 150 in 1851126
and here:
DeepCAVE/deepcave/utils/styled_plot.py
Line 28 in 1851126
but is not part of the requirements.txt.
We should think about a nice logo!
The example examples/api/parallel_coordinates.py
throws an error and needs to be fixed.
The plugin shows nothing when selecting two different hyperparameters. This needs to be checked.
It would be very nice to have a feature that allows to generate a PDF report from an analysis with everything the tool has to offer in principle. For this, a user would need to be able to define which analyses under which configuration they would like to see.
Will try to add from __future__ import annotations
at the top.
File "/home/skantify/code/DeepCAVE/deepcave/evaluators/epm/random_forest.py", line 345, in RandomForest
def _predict(self, X: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
TypeError: 'type' object is not subscriptable
Python 3.8.5
There seems to be a problem with the reproducibility of the Partial Dependence Plot that needs to be checked.
Currently, DeepCAVE doesn't support runs done with deterministic=False. For example, when running examples/1_basics/1_quadratic_function.py with deterministic=False, loading the resulting run in DeepCAVE is not possible and gives the warning message "SMAC3v2: Multiple seeds are not supported..".
Is there a sensible way to reorder columns to make the depiction more orderly?
There is quite some code in the init files that should be moved to separate files instead.
I get the following error when trying to run deepcave --open
: ModuleNotFoundError: No module named 'smac.epm.util_funcs'
I think SMAC changed the name of the module from util_funcs to utils.
The error occurred in DeepCAVE/deepcave/evaluators/epm/random_forest.py and should be fixed by changing the import.
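A hedged compatibility sketch (not DeepCAVE's actual fix): try the new module path first and fall back to the old name if it is missing, so the import works across SMAC versions.

```python
import importlib

def import_first(*module_names):
    """Return the first importable module among the candidates."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {module_names} could be imported")

# Hypothetical usage for the renamed SMAC module:
# epm_utils = import_first("smac.epm.utils", "smac.epm.util_funcs")
```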
Colouring based on config clusters (using e.g. HDBSCAN, which also allows having "noisy observations" with no affiliation)?
The clusters could provide a base color whose shading is determined by the final performance.
This might help identify commonalities between well-performing configurations.
When dealing with more than three-dimensional HP spaces, it may become unwieldy to look at 3D slices. Maybe try some high-performing projection procedures? There is, for instance, UMAP and its successor. But be wary of the difference between transductive and inductive projections if you want to apply them sequentially.
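A minimal sketch of the projection idea with synthetic stand-in data. PCA from scikit-learn is used here as a simple inductive stand-in; UMAP itself lives in the third-party umap-learn package and offers the same fit/transform interface.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for encoded configurations from a high-dimensional HP space.
rng = np.random.default_rng(0)
configs = rng.uniform(size=(100, 8))

# Fit a 2D projection for plotting.
projector = PCA(n_components=2).fit(configs)
embedding = projector.transform(configs)

# Inductive: newly sampled configs can be projected into the same 2D map
# later, which matters if the analysis is updated sequentially.
new_points = projector.transform(rng.uniform(size=(5, 8)))
```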
Add a new plugin that allows applying symbolic regression to the metadata gathered during HPO and obtaining symbolic explanations for the dependency between hyperparameters and performance.
To be done:
Many of our analyses are based on surrogate models.
It would be fairly important to know how faithful these surrogate models actually are.
Could we add some insights regarding that? In the easiest case, we could start with an RMSE on out-of-bag samples.
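A sketch of that easiest case with synthetic stand-in data: a random forest's out-of-bag predictions act as an internal hold-out set, so an OOB RMSE quantifies surrogate faithfulness without a separate validation split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in configs and costs (assumed data, not a real DeepCAVE run).
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))
y = (X ** 2).sum(axis=1) + rng.normal(scale=0.05, size=300)

# oob_score=True makes the forest retain per-sample out-of-bag predictions.
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, y)
oob_rmse = float(np.sqrt(np.mean((model.oob_prediction_ - y) ** 2)))
```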
Add one or more plugins to visualize neural networks.
See e.g.: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8732351
In order to make it as easy as possible to add your own converter, it would be great to have a more in-depth tutorial on how to do that, as the current documentation is rather sparse: https://automl.github.io/DeepCAVE/main/converters.html