cavalab / srbench
A living benchmark framework for symbolic regression
Home Page: https://cavalab.org/srbench/
License: GNU General Public License v3.0
I think this is terrible.
def model(est, X):
    mapping = {'x_'+str(i): k for i, k in enumerate(X.columns)}
    new_model = est.model_
    for k, v in mapping.items():
        new_model = new_model.replace(k, v)
This replaces x_1 even when the expression also contains x_11, corrupting the longer feature names.
A simple fix is to iterate over the mapping in reverse order, so that x_11 is replaced before x_1:
for k, v in reversed(list(mapping.items())):
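A more defensive variant (just a sketch, reusing the est.model_ string and X.columns from the snippet above) is to substitute the longest placeholder names first, which removes any dependence on the dictionary's insertion order:

```python
def model(est, X):
    # map placeholder names (x_0, x_1, ...) to the dataset's column names
    mapping = {'x_' + str(i): k for i, k in enumerate(X.columns)}
    new_model = est.model_
    # replace the longest keys first so 'x_1' is never matched inside 'x_11'
    for key in sorted(mapping, key=len, reverse=True):
        new_model = new_model.replace(key, mapping[key])
    return new_model
```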
@MilesCranmer can you specify the set of 6 hyperparameters you would like to use for benchmarking PySR? I'm going to start the runs and hope to have them by the end of tomorrow. They should match the original constraints:
Hyperparameters are currently set small for testing, but evaluate_model.py now recognizes and shrinks some PySR parameters during testing, so the desired set should be specified directly in the model file. (In the updated version for the competition, you can specify test_params explicitly.)
This is the current version:
hyper_params = [
    {
        "annealing": (True,),  # (True, False)
        "denoise": (True,),  # (True, False)
        "binary_operators": (["+", "-", "*", "/"],),
        "unary_operators": (
            [],
            # poly_basis,
            # poly_basis + trig_basis,
            # poly_basis + exp_basis,
        ),
        "populations": (20,),  # (40, 80),
        "alpha": (1.0,),
        "model_selection": ("best",),
        # "alpha": (0.01, 0.1, 1.0, 10.0),
        # "model_selection": ("accuracy", "best"),
    }
]
Hi, I am encountering the following ImportError:
ImportError: cannot import name 'HalvingGridSearchCV' from 'sklearn.model_selection'
But I checked, and my scikit-learn version is 0.24.1, as required in the environment. Should I be using a different version?
Note: I also encountered errors when running conda env create -f environment.yml, which prevented aifeynman and operon from being installed, but I believe this is unrelated, since the scikit-learn version is correct.
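For what it's worth, in scikit-learn 0.24 the successive-halving searches are still experimental and have to be enabled explicitly before they can be imported; a minimal sketch:

```python
# HalvingGridSearchCV is experimental in scikit-learn 0.24 and must be
# enabled before it can be imported from sklearn.model_selection.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingGridSearchCV
```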
I had a quick look at the benchmarking pipeline to better understand how the comparison is performed. During that review I noticed that scaling is always applied while reading the data files, using a RobustScaler from sklearn.
The actual model is generated in the evaluate_model script, which additionally has parameters scale_x and scale_y that determine whether the input data X and target y should be scaled.
This means that if scale_x is set to true, the input data is scaled twice when using the benchmarking pipeline. I don't know if this behavior is intended, but I suspect the RobustScaler is an artifact from previous experimentation and should be removed. Otherwise, even when I set scale_x to false, scaling is still performed while reading the data.
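As a sketch of the behaviour I would expect (an assumed helper, not the repository's code), scaling would happen in exactly one place and only when the flag asks for it:

```python
from sklearn.preprocessing import StandardScaler

def maybe_scale(X_train, X_test, scale_x=True):
    # scale only when requested, fitting on the training split only
    if not scale_x:
        return X_train, X_test
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test)
```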
I attempted to run ITEA within a Docker image; however, I encountered an error. I suspect that the issue arises from recent refactoring in ITEA. @folivetti It would be greatly appreciated if you could update the relevant code in srbench or adjust the Git version in the ITEA installation script.
Traceback (most recent call last):
File "ITEARegressor.py", line 1, in <module>
from ITEA import itea_srbench as itea
ImportError: cannot import name 'itea_srbench' from 'ITEA' (unknown location)
Hi,
I wanted to point out some issues with the environment/install scripts:
Bash scripts
The install scripts should not require elevated privileges ("sudo"). anaconda/miniconda is normally installed in an unprivileged location (home folder), so it shouldn't be necessary to do anything as root. This seems to trip some things up in some cases.
Some rudimentary way to check if a script succeeded would be useful. I would also redirect the compile output to a log file.
failed=()
# install all methods, collecting the names of any scripts that fail
for install_file in *.sh ; do
    echo "vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv"
    echo "Running $install_file"
    echo "////////////////////////////////////////////////////////////////////////////////"
    if ! bash "$install_file" ; then
        failed+=("$install_file")
    fi
    echo "////////////////////////////////////////////////////////////////////////////////"
    echo "Finished $install_file"
    echo "^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^"
done
echo "failed: ${failed[@]}"
(The compile logs could go next to the install scripts in the src folders.)

Conda environment
- pkg-config needs to be added to environment.yml, otherwise build scripts relying on it will fail (e.g. GP-GOMEA).
- cxxopts is not necessary (operon only needs it for the command-line program, not for the python module).
- On my Ubuntu 16.04 machine, ellyn installs but fails to run. (Nevermind, it's not supposed to be called directly.)
My conda info:
$ conda info
active environment : srbench
active env location : /home/bogdb/miniconda3/envs/srbench
shell level : 2
user config file : /home/bogdb/.condarc
populated config files :
conda version : 4.10.1
conda-build version : not installed
python version : 3.9.1.final.0
virtual packages : __linux=5.11.13=0
__glibc=2.23=0
__unix=0=0
__archspec=1=x86_64
base environment : /home/bogdb/miniconda3 (writable)
conda av data dir : /home/bogdb/miniconda3/etc/conda
conda av metadata url : https://repo.anaconda.com/pkgs/main
channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /home/bogdb/miniconda3/pkgs
/home/bogdb/.conda/pkgs
envs directories : /home/bogdb/miniconda3/envs
/home/bogdb/.conda/envs
platform : linux-64
user-agent : conda/4.10.1 requests/2.25.1 CPython/3.9.1 Linux/5.11.13 ubuntu/16.04.7 glibc/2.23
UID:GID : 1001:1001
netrc file : None
offline mode : False
In 2014, a paper published in JMLR reported the results of more than 100 classification algorithms on numerous classification benchmark datasets [1].
However, that paper does not consider genetic programming based methods such as M4GP [2]. Would it be possible to develop a classification benchmark to further advance genetic programming, and the machine learning domain more broadly?
[1] Fernández-Delgado M, Cernadas E, Barro S, et al. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 2014, 15(1): 3133-3181.
[2] La Cava W, Silva S, Danai K, et al. Multidimensional genetic programming for multiclass classification. Swarm and Evolutionary Computation, 2019, 44: 260-272.
At the moment a lot of preprocessing is done to convert the models returned by different methods into a common, sympy-compatible format in experiment/symbolic_utils.py.
I would like to remove this post-processing step and, in the future, require methods to return sympy-compatible strings. Steps:
See updated contribution guide
@foolnotion the operon install script now installs a bunch of packages with no version control. We need to pin them to stable versions.
It's causing the current CI build to fail and is probably going to cause more failures down the road. Can you add git checkout {version} to each clone, or use packaged versions of the dependencies in conda instead?
I've noticed that you're standardizing the predictors and the target variable before fitting the regression models:
In my experience, my algorithm works best without any normalization. I also tested a few datasets with KernelRidge, with and without standardization, and it also seems to work better without this transformation. Maybe add an option to use the standardized dataset or not.
I have executed the experiment script and it produced several JSON files. However, there does not seem to be a script that aggregates these results into a single file. Is there any additional analysis tool affiliated with this benchmark package?
In this line of code I'm getting the error ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all():
I suggest changing it to:
sample_idx = np.random.choice(np.arange(len(labels)), size=n_samples)
Hi there,
Some quick-start help needed:
After installing, how do I run the benchmarks on user-supplied data?
I'm struggling to get this working, and I want to make sure there's nothing wrong with my SRBench install.
Many thanks!
@folivetti I'm noticing some of the links on the website are broken, probably because they are copied straight from the markdown files that have relative paths (e.g., https://epistasislab.github.io/srbench/CONTRIBUTING.md).
Do you have any free time to update them? Thank you!
The command git clone https://github.com/EpistasisLab/pmlb/tree/feynman does not work; it says
fatal: repository 'https://github.com/EpistasisLab/pmlb/tree/feynman/' not found
but using git clone https://github.com/EpistasisLab/pmlb.git seems to work ok. The fetch command downloads 299 files in total.
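(The /tree/feynman URL is a web view rather than a clonable repository; assuming the data lives on a branch named feynman, as the URL suggests, it can be checked out directly:)

```bash
# clone only the feynman branch of the pmlb repository
git clone --branch feynman https://github.com/EpistasisLab/pmlb.git
```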
One other suggestion: use micromamba instead of conda (mamba is a C++ rewrite of conda; it uses the same package servers but is much faster). The docker image mambaorg/micromamba, which has micromamba built in, seems to work if all conda calls are replaced with micromamba. It's much faster for me!
Originally posted by @MilesCranmer in #59 (comment)
Adding @janoPig's other suggestions here (also ref #131 (comment))
Some suggestions for the published code here -> https://github.com/janoPig/srbench/tree/srcomp
- Fix running on the local machine; resolve the errors
  "ValueError: option names {'--ml'} already added"
  "ValueError: too many values to unpack (expected 3)"
- Added scripts for running on a local computer: submit_stage[N]_local.sh METHOD_NAME
- Fix the number of threads per job (it always runs with the default of 4); add a new -n_threads parameter to the submit_stage[N].sh script.
The submit scripts are there for reproducibility, i.e. they are fixed calls to python analyze.py [args]; analyze.py can be configured for both of these purposes directly.
- Added a DataFrame=False parameter to the HROCH regressor so it runs correctly.
- Test for the input data featureselection.csv. The data was created using the function '0.11x1^3 + 0.91x2x3 + 0.68x4x5 + 0.26x6^2x7 + 0.13x8x9x10' and feature_absence_score was evaluated for '0.11x1^3 + 0.91x3x5 + 0.68x7x9 + 0.26x11^2x13 + 0.13x15x17x19'.
- Refactor the validation scripts (on the srcomp branch) so that each method is validated with the same lines of code.
This project is awesome and I believe it can greatly promote the advancement of the symbolic regression domain. However, a critical issue is that we don't have enough computational resources to repeat the full experiment. Would it be possible to provide the complete experimental results for us to use directly in our paper? I would be grateful if such results could be provided.
Keep a standing leaderboard of methods in the documentation. As new results/methods are developed, add them to the leaderboard.
We should measure the complexity of the final models produced by the methods. Since the representations vary quite a bit, this isn't trivial. For the SR methods, we should be able to count the number of nodes (operators and literals) in the solutions.
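As a minimal sketch (assuming the model strings can be parsed by sympy, which is not true for every method), node counting could look like:

```python
import sympy

def complexity(model_str):
    # count every node (operators, variables, constants) in the expression tree
    expr = sympy.sympify(model_str)
    return sum(1 for _ in sympy.preorder_traversal(expr))

print(complexity("3*x**2 - 2*x + 1"))  # counts Add, Mul, Pow, symbols and numbers
```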
I think at this line you meant to use X_train_scaled and X_test_scaled instead of X_train and X_test (X_test_scaled was created but never used).
Hi!
Thank you for your great work and framework!
I wanted to try the benchmarked methods on the ground-truth datasets (i.e., the Feynman and Strogatz datasets) and followed the instructions in the README.
However, the datasets fetched from the pmlb repository look broken. Here is one of the errors I got when running
python analyze.py -results ../results_sym_data -target_noise 0.0 "/data/pmlb/datasets/strogatz*" -sym_data -n_trials 10 -time_limit 9:00 -tuned --local
for the Strogatz dataset. (The same errors occurred for the Feynman dataset with "/data/pmlb/datasets/feynman_*" as well.)
========================================
Evaluating tuned.FEATRegressor on
/data/pmlb/datasets/strogatz_bacres1/strogatz_bacres1.tsv.gz
========================================
compression: gzip
filename: /data/pmlb/datasets/strogatz_bacres1/strogatz_bacres1.tsv.gz
Traceback (most recent call last):
File "evaluate_model.py", line 291, in <module>
**eval_kwargs)
File "evaluate_model.py", line 39, in evaluate_model
features, labels, feature_names = read_file(dataset)
File "/opt/app/srbench/experiment/read_file.py", line 19, in read_file
engine='python')
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/pandas/util/_decorators.py",
line 311, in wrapper
return func(*args, **kwargs)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/pandas/io/parsers/readers.py",
line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/pandas/io/parsers/readers.py",
line 482, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/pandas/io/parsers/readers.py",
line 811, in __init__
self._engine = self._make_engine(self.engine)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/pandas/io/parsers/readers.py",
line 1040, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "/opt/conda/envs/srbench/lib/python3.7/site-
packages/pandas/io/parsers/python_parser.py", line 100, in __init__
self._make_reader(self.handles.handle)
File "/opt/conda/envs/srbench/lib/python3.7/site-
packages/pandas/io/parsers/python_parser.py", line 203, in _make_reader
line = f.readline()
File "/opt/conda/envs/srbench/lib/python3.7/gzip.py", line 300, in read1
return self._buffer.read1(size)
File "/opt/conda/envs/srbench/lib/python3.7/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/opt/conda/envs/srbench/lib/python3.7/gzip.py", line 474, in read
if not self._read_gzip_header():
File "/opt/conda/envs/srbench/lib/python3.7/gzip.py", line 422, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b've')
I also tried to manually gunzip the file, but the error message still says it's not in gzip format:
$ gunzip /data/pmlb/datasets/strogatz_bacres1/strogatz_bacres1.tsv.gz
gzip: /data/pmlb/datasets/strogatz_bacres1/strogatz_bacres1.tsv.gz: not in gzip format
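(One thing that might be worth checking, purely a guess on my side: if these .tsv.gz files are tracked with Git LFS, a plain clone leaves small text pointer files in place of the real data, which would produce exactly this "Not a gzipped file" error.)

```bash
# an LFS pointer file is plain text starting with "version https://git-lfs.github.com/spec/v1"
head -c 60 /data/pmlb/datasets/strogatz_bacres1/strogatz_bacres1.tsv.gz
# if so, fetching the real files (from inside the pmlb clone) should fix it
git lfs install
git lfs pull
```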
Could you please resolve this issue for both Feynman and Strogatz datasets?
Thank you!
Hello,
So I followed the conda installation guidelines using Anaconda (Anaconda3-2021.11-Linux-x86_64)
Below is the conda env description.
I use a forked version of srbench. Unfortunately, I am not able to push to it; I don't know if the problem comes from me or from the initial repo, as I get the following error even though I did not add large files to it: "Account responsible for LFS bandwidth should purchase more data packs to restore access".
Maybe you could at least store the feather results files somewhere, as they are not heavy?
Thanks a lot.
active environment : srbench
active env location : /private/home/pakamienny/anaconda3/envs/srbench
shell level : 3
user config file : /private/home/pakamienny/.condarc
populated config files : /private/home/pakamienny/.condarc
conda version : 4.10.3
conda-build version : 3.21.5
python version : 3.9.7.final.0
virtual packages : __cuda=11.4=0
__linux=5.4.0=0
__glibc=2.31=0
__unix=0=0
__archspec=1=x86_64
base environment : /private/home/pakamienny/anaconda3 (writable)
conda av data dir : /private/home/pakamienny/anaconda3/etc/conda
conda av metadata url : None
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
package cache : /private/home/pakamienny/anaconda3/pkgs
/private/home/pakamienny/.conda/pkgs
envs directories : /private/home/pakamienny/anaconda3/envs
/private/home/pakamienny/.conda/envs
platform : linux-64
user-agent : conda/4.10.3 requests/2.26.0 CPython/3.9.7 Linux/5.4.0-81-generic ubuntu/20.04.2 glibc/2.31
UID:GID : 1185300944:1185300944
netrc file : None
offline mode : False
Create a docker environment for installing and running srbench. Include an image with releases.
Docker isn't supported by the cluster I'm using, unfortunately. But I think it could be a good way to manage version control going forward.
Also: see #56 for an example starting point
Hi there,
When attempting to fit the ground-truth equation "3x^2 - 2x + 1" on x values in [0.0, 10] with 60 data points, using different methods within SRBench, the result consistently evolves to something close to "0.297677x^2 + 0.951x - 0.298461".
Steps to reproduce:
Launch "evaluate_model.py" locally on a file with 60 data points with x values ranging from 0.0 to 10, for example with Operon (other methods give a similar result):
python evaluate_model.py /pmlb/datasets/eq1/eq1.tsv -ml OperonRegressor -seed 42
Observe that the result, consistently across several methods within SRBench, evolves to something close to
"0.297677x^2 + 0.951x - 0.298461",
rather than the expected
"3x^2 - 2x + 1".
Specifically, other methods either yield the above or "not implemented".
It seems we can rule out some sort of linear normalization being responsible, and 60 points is quite enough for, say, TuringBot to find the correct solution on the same data.
Many thanks!
PS: Relevant files at this dropbox folder,
http://bit.ly/3kICKKs
Hi @foolnotion
Here are the Dockerfile (zipped, as GitHub doesn't accept a Dockerfile attachment here) and the steps that fail to install Operon, as you requested in #55.
Dockerfile.zip
Unzip the dockerfile
unzip Dockerfile.zip
Build docker image
docker build --pull --rm -f "Dockerfile" -t srbench:latest "."
Run docker
docker run --runtime=nvidia --gpus all \
-it srbench /bin/bash
As suggested by @folivetti, remove the following from environment.yml:
vi environment.yml
In srbench/experiments/methods/src/operon_install.sh remove the line:
git checkout 015d420944a64353a37e0493ae9be74c645b4198
vi ./experiments/methods/src/operon_install.sh
Finish conda setup
conda update conda -y
bash configure.sh
conda activate srbench
bash install.sh
You will then see that it fails to install Operon.
Thank you
As discussed in #62 with @lacava (and discussed a bit in #24 by others last year), I think a wall clock time benchmark would be a really nice complement to comparing over a fixed number of evaluations.
I think fixing the number of evaluations is only one way of enforcing a level playing field. One could also fix:
or any other way of bottlenecking the internal processes of a search code. Measuring against only one of these artificially biases a comparison against algorithms which happen to use more of that particular resource, for whatever reason.
Some algorithms are sample intensive, while others do a lot of work between steps. Comparing algorithms based only on the number of evaluations artificially biases any comparison against sample-intensive algorithms.
An algorithm and its implementation are not easily separable, so I would argue that you really need to measure by wall clock time to see the whole picture. Not only is this much more helpful from a user's point of view, but it lets algorithms which are intrinsically designed for parallelism actually demonstrate their performance gains. The same can be said for other algorithmic sacrifices, like those required for rigid data structures, data batching, etc.
Of course, there is no single best solution, and every type of benchmark provides additional information. So I think this wall-clock benchmark should be included alongside the normal fixed-evaluation benchmark, using a separate set of tuned parameters, to give a more complete picture of performance.
Finally, I note I have a conflict of interest since PySR/SymbolicRegression.jl are designed for parallelism and fast evaluation, but hopefully my above points are not too influenced by this!
Eager to hear others' thoughts
Add SR methods for comparison. The following come to mind:
I am glad that most of this repository is replicable. However, it appears that the statistical comparison part cannot correctly handle the provided experiment results: there are several missing values for FEAT and MRGP in the provided file, so the Wilcoxon signed-rank test cannot run during the statistical comparison. How should we approach this problem?
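One possible workaround (just a sketch, assuming the scores are available as two aligned arrays of per-dataset results) is to drop the incomplete pairs before running the signed-rank test:

```python
import numpy as np
from scipy.stats import wilcoxon

def paired_wilcoxon(scores_a, scores_b):
    # keep only datasets where both methods have a result
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    mask = ~(np.isnan(a) | np.isnan(b))
    return wilcoxon(a[mask], b[mask])
```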
Hi there,
Would it be possible to add to your instructions how to run exactly one of the several experiments?
Perhaps the simplest, fastest, non-empty experiment, just for validation?
I can't get docker to finish executing, so I need to debug / validate my installation of srbench.
I'm completely new to this toolchain, so I'm a bit lost.
Many thanks.
Hi,
I am opening this issue to start a discussion about the settings and computational limits of the methods. I added some of the email comments in there, feel free to edit!
There will be a fixed evaluation budget that each method can expend in its own way. Suggested budget: 500,000 evaluations.
It seems reasonable to take into account local search iterations and adjust for minibatch sampling.
It might also be interesting to monitor the load average on the cluster, if you can isolate it on a per-method basis, maybe combine with other measurements like memory usage. This would give a more general measure of each method's computational requirements.
Not much to comment here, maybe just a minor nitpick: I noticed some methods also use the "AQ" (analytical quotient) symbol, which can be decomposed into basic math operations: aq(a,b) = a / sqrt(1 + b^2). What, then, is the complexity of the AQ symbol?
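For reference, the decomposition written out (a sketch only, not srbench's operator set):

```python
import numpy as np

def aq(a, b):
    # analytical quotient: behaves like division but is defined everywhere,
    # since the denominator sqrt(1 + b^2) is never zero
    return a / np.sqrt(1.0 + b * b)
```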
Thanks @lacava for helping me resolve the dataset issue last time.
Based on the command in the README, I tried to reproduce the results reported in Figure 3 of your recently accepted paper for both the Strogatz and Feynman datasets, and have some concerns/questions.
For the Strogatz dataset, I ran
python analyze.py -results ../results_sym_data -target_noise 0.0 "/path/to/pmlb/datasets/strogatz*" -sym_data -n_trials 10 -time_limit 9:00 -tuned --local
Following that, many JSON files were produced. In strogatz_bacres1_tuned.FE_AFPRegressor_15795.json (AFP_FE), I found the following values:
>>> with open('strogatz_bacres1_tuned.FE_AFPRegressor_15795.json', 'r') as fp:
... fe_afp_result = json.load(fp)
...
>>> fe_afp_result['r2_test']
0.9984022413915545
>>> fe_afp_result['symbolic_model']
'(log(((((((-0.567/cos(1.486))^2)/(x_1+(x_0/exp((cos(((0.069^2)*(x_0*
(x_0+cos((cos(((-0.065^2)*(x_0*x_0)))*x_1))))))^3)))))^3)^3)+exp(sin(log(((sqrt(|
(cos(((-0.064^2)*(x_0*x_0)))^3)|)/((0.286+
(x_1*0.017))^2))^2))))))*cos((log(x_0)/(log((x_0^3))-x_1))))'
>>> fe_afp_result['true_model']
' 20 - x - \\frac{x \\cdot y}{1+0.5 \\cdot x^2}$'
I think a) r2_test is what the paper calls Accuracy, b) symbolic_model is the symbolic expression resulting from training on strogatz_bacres1, and c) the true symbolic expression is given in true_model.
Is my understanding correct for a), b), and c)?
Also, is the above symbolic_model the expected output of AFP_FE for strogatz_bacres1? Since the method is the 2nd best on the ground-truth datasets shown in Fig. 3, I expected a clearer expression.
Could you please clarify how the solution rate in Fig. 3 is derived?
Did you manually compare the produced expression symbolic_model to the true expression true_model and consider it solved only when the two match exactly?
Or, if it is fully based on Definition 4.1 (Symbolic Solution) in the paper, what values of a and b are used in Fig. 3?
On Ubuntu 18.04 and 20.04, the operon build with your provided install.sh failed due to a version discrepancy between libceres-dev (which expects Eigen 3.4.0) and libeigen3-dev (the latest available version is 3.3.7). I even tried to build Eigen v3.4.0 from source, but the build still failed.
Do you remember how you set up the dependencies for Operon?
Could you provide the exact commands to reproduce the results in Fig. 3?
For Strogatz datasets with target noise = 0.0, I think the following command was used
python analyze.py -results ../results_sym_data -target_noise 0.0 "/path/to/pmlb/datasets/strogatz*" -sym_data -n_trials 10 -time_limit 9:00 -tuned
but how about the Feynman datasets?
Also, how should we determine -time_limit?
To estimate how long reproducing the results in Fig. 3 will take, could you share the details of the computing resources used in the paper, e.g., how many machines with 24-28 core Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz chipsets and 250 GB of RAM were used, and a (rough) estimate of the runtime, if you remember?
On a machine with a 4-core CPU, 128 GB of RAM and 2 GPUs, even strogatz_bacres1 (400 samples) is taking more than a day to complete python analyze.py -results ../results_sym_data -target_noise 0.0 /path/to/pmlb/datasets/strogatz_bacres1/strogatz_bacres1.tsv.gz -sym_data -n_trials 10 -time_limit 9:00 -tuned --local
Sorry for the many questions, but your responses would be greatly appreciated and helpful for using this great work in my research.
Thank you!
Hi,
I believe you have an issue at this line:
srbench/experiment/symbolic_utils.py
Line 158 in a7875bf
Here are the steps to reproduce it:
instance: feynman_I_11_19
candidate model: x0*x3 + x1*x4 + x2*x5
true model: x1*y1+x2*y2+x3*y3
yaml features order: x1, x2, x3, y1, y2, y3 == x0, x1, x2, x3, x4, x5
replacing feature 0 with x1
x1*x3+x1*x4+x2*x5
replacing feature 1 with x2
x2*x3+x2*x4+x2*x5
replacing feature 2 with x3
x3*x3+x3*x4+x3*x5
replacing feature 3 with y1
y1*y1+y1*x4+y1*x5
replacing feature 4 with y2
y1*y1+y1*y2+y1*x5
replacing feature 5 with y3
y1*y1+y1*y2+y1*y3
So, the candidate model x0*x3 + x1*x4 + x2*x5 is renamed to y1*y1+y1*y2+y1*y3, which is obviously not the same thing.
A possible fix would be to perform the renaming backwards (an order-independent sketch follows after the steps below):
for i,f in enumerate(features): --> for i,f in reversed(list(enumerate(features))):
Now, the steps will be:
replacing feature 5 with y3
x0*x3+x1*x4+x2*y3
replacing feature 4 with y2
x0*x3+x1*y2+x2*y3
replacing feature 3 with y1
x0*y1+x1*y2+x2*y3
replacing feature 2 with x3
x0*y1+x1*y2+x3*y3
replacing feature 1 with x2
x0*y1+x2*y2+x3*y3
replacing feature 0 with x1
x1*y1+x2*y2+x3*y3
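An order-independent alternative (just a sketch with an assumed helper, not the repository's code) would be to let sympy substitute all feature names simultaneously:

```python
import sympy

def rename_features(model_str, features):
    # substitute x0, x1, ... with the real feature names in one simultaneous step,
    # so earlier replacements can never be clobbered by later ones
    subs = {sympy.Symbol('x' + str(i)): sympy.Symbol(f) for i, f in enumerate(features)}
    return sympy.sympify(model_str).subs(subs, simultaneous=True)

print(rename_features("x0*x3 + x1*x4 + x2*x5", ["x1", "x2", "x3", "y1", "y2", "y3"]))
# -> x1*y1 + x2*y2 + x3*y3
```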
Best regards,
Aleksandar
As for the default hyperparameter tuning strategy, I see this benchmark uses the halving grid search method. To begin, I admit that a traditional grid search is impractical due to the prohibitively expensive computational cost. However, when the halving grid search is used with a large parameter grid, as in the case of XGBoost, the first few rounds of the search train on only a few data points, and the results may be unreliable. So, is this truly a good method for hyperparameter tuning, and is this tuning protocol sufficient to persuade reviewers?
Have you considered including "TensorGP" in your collection of benchmarked methods?
It seems highly relevant; however, I don't have any relation to the author. Links:
TensorGP:
https://github.com/AwardOfSky/TensorGP
Symbolic Regression example:
https://github.com/AwardOfSky/TensorGP/blob/dev/example_symreg.py
Hi there,
Trying to use Operon with "evaluate_model.py" on the Docker image returns the error:
» python evaluate_model.py /data/eq1.tsv -ml OperonRegressor -seed 42 -skip_tuning
(...)
ImportError: libpython3.9.so.1.0: cannot open shared object file: No such file or directory
whereas, for example:
python evaluate_model.py /data/eq1.tsv -ml sembackpropgp -seed 42 -skip_tuning
does launch the regressor.
PS: Operon also fails to launch on the "192_vineyard" dataset - where other methods do not.
Many thanks!
I encountered the following error while running the PySR Regressor. It appears that PySR has undergone some changes to its parameters, and the alpha value is no longer supported. @MilesCranmer, could you please take a look at this? Thank you!
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/joblib/parallel.py", line 819, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 597, in __init__
self.results = batch()
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/joblib/parallel.py", line 289, in __call__
for func, args, kwargs in self.items]
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/joblib/parallel.py", line 289, in <listcomp>
for func, args, kwargs in self.items]
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/sklearn/utils/fixes.py", line 216, in __call__
return self.function(*args, **kwargs)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 668, in _fit_and_score
estimator = estimator.set_params(**cloned_parameters)
File "/opt/conda/envs/srbench/lib/python3.7/site-packages/sklearn/base.py", line 248, in set_params
"with `estimator.get_params().keys()`." % (key, self)
ValueError: Invalid parameter alpha for estimator PySRRegressor(equations=0.0). Check the list of available parameters with `estimator.get_params().keys()`.
Thank you for creating srbench! While running the installation I noticed a typo in feat_install.sh, line 12 (https://github.com/cavalab/srbench/blob/master/experiment/methods/src/feat_install.sh#L12): checkotu should be checkout.
Not sure whether this makes a big difference, but since I saw it, I thought I would let you know :)
Hi,
Since this issue does not require other changes in this repo except for flake.nix, I took the liberty of working with the dev branch.
This allows nix to be used as an alternative to conda, with a bunch of advantages:
- a reproducible development shell (nix develop)
- it can be tried without installing anything: docker run -p 8888:8888 -ti --rm docker.nix-community.org/nixpkgs/nix-flakes nix develop github:cavalab/srbench/dev --no-write-lock-file
- pyoperon pulls its own dependencies automatically (no need to keep adding things to an environment file)
- flake.lock files can fix versions/revisions

This is obviously a low priority issue right now, but I've been using it to deploy srbench/operon without conda.
My frustration with conda began with not being able to add gcc/gxx-11.2.0 to the environment.
This issue is meant to track integration of other frameworks with nix. So far I have also integrated FEAT and Ellyn (wip). Other frameworks should be easily integrated as long as they use standard packaging.
There are some aspects that will need attention from other authors:
FEAT:
>>> import feat
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/nix/store/h3gfxkz8a31l60qmpzj15ryq12nqsspm-python3.9-feat_ml-0.5.2/lib/python3.9/site-packages/feat/__init__.py", line 1, in <module>
from .feat import Feat, FeatRegressor, FeatClassifier
File "/nix/store/h3gfxkz8a31l60qmpzj15ryq12nqsspm-python3.9-feat_ml-0.5.2/lib/python3.9/site-packages/feat/feat.py", line 12, in <module>
from .pyfeat import PyFeat
File "feat/pyfeat.pyx", line 1, in init feat.pyfeat
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
- setup.py seems to be really tailored for conda environments; it took some tricks with setting ENV vars to get it to work

Ellyn:
- setup.py is hardcoded for Conda, does not work with nix (I will try to patch it)

Best,
Bogdan
Just a quick note that new PRs should be submitted to the dev branch. The contributing guidelines have been updated accordingly
add a local script mirroring the CI process for users to test their submissions locally.
Came up in #81
Currently, the reports generated by those report scripts seem too crude. Some important indicators in the AutoML domain, such as significance tests and mean accuracy across all tasks, are not presented. In my research domain, there is a benchmark project named "srbench" (https://github.com/EpistasisLab/srbench) that provides a good example of presenting the above-mentioned metrics. I hope this benchmark project can also implement similar features.
Hi,
Recently I have been implementing my SR algorithm for the competition. Everything is fine, but I still have a question about one competition detail, even though I have read the Competition Guidelines.
Say a ground-truth model is -1.6+x**2 and my model produces the string log(|-0.2|)+x**2. According to the "regressor guide" in the Competition Guidelines, I should convert this string to a sympy-compatible string, here by removing '|' since sympy doesn't recognize it (as is done in "submission/feat-example/regressor.py"):
sympy_str = est.model_.replace("|", "")
When I do this, the model string becomes log(-0.2)+x**2, which is definitely not the same expression as my model: if I run f = sympy.simplify("log(-0.2)+x**2"), then sympy_str becomes "x**2 - 1.6094379124341 + I*pi".
So this result is totally different from the ground-truth model -1.6+x**2. My question is: will "simplify" be used for model comparison during the competition? If so, how should I deal with this problem?
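(For what it's worth, one way to keep the expression's value while making the string sympy-compatible would be to map the bars to Abs() instead of deleting them; a rough sketch, assuming the bars are not nested:)

```python
import re
import sympy

model_str = "log(|-0.2|)+x**2"
# convert |...| into Abs(...) so sympy keeps the absolute value
sympy_str = re.sub(r"\|([^|]+)\|", r"Abs(\1)", model_str)
print(sympy.simplify(sympy_str))  # x**2 - 1.6094..., same value as x**2 + log(0.2)
```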
Hi,
I think these seeds are the ones used in srbench, not those used for the competition.
Line 2 in 2faf1fe
@nightdr your submission didn't pass the tests on merge. Can you figure out what is going on?
evaluate_model.py:146: in evaluate_model
results['symbolic_model'] = model(est, X_train_scaled)
methods/Bingo/regressor.py:68: in model
model_str = str(est.get_best_individual())
/usr/share/miniconda3/envs/srcomp-Bingo/lib/python3.9/site-packages/bingo/symbolic_regression/symbolic_regressor.py:225: in get_best_individual
return self.best_estimator_.get_best_individual()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = SymbolicRegressor(clo_threshold=1e-05, crossover_prob=0.3, max_time=350.0,
mutation_prob=0.45,
...', 'exp', 'log',
'sqrt'],
population_size=2500, use_simplification=True)
def get_best_individual(self):
if self.best_ind is None:
print("Best individual is None, setting to X_0")
from bingo.symbolic_regression import AGraph
self.best_ind = AGraph()
self.best_ind.command_array = np.array([[0, 0, 0]], dtype=int) # X_0
> self.best_ind._update()
E AttributeError: 'bingo.bingocpp.AGraph' object has no attribute '_update'
/usr/share/miniconda3/envs/srcomp-Bingo/lib/python3.9/site-packages/bingo/symbolic_regression/symbolic_regressor.py:153: AttributeError
link:
https://github.com/cavalab/srbench/runs/6446882859?check_suite_focus=true#step:9:111
Originally posted by @lacava in #123 (comment)