uwdata / boba
Specifying and executing multiverse analysis
License: BSD 3-Clause "New" or "Revised" License
Need to update the example/simple/template.py code to get rid of these future warnings:
/home/shiven/.local/lib/python3.7/site-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
import pandas.util.testing as tm
/home/shiven/.local/lib/python3.7/site-packages/statsmodels/compat/pandas.py:23: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
data_klasses = (pandas.Series, pandas.DataFrame, pandas.Panel)
When a universe exits with a non-zero status code, the process seems to stop and subsequent universes do not run. Ideally, we want to run every universe the user specifies even if some of the universes fail in the middle.
To reproduce: add exit(1) on line 16 here, then run boba run --all --jobs 0 --batch_size 1
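The desired behaviour can be sketched as follows. This is a minimal illustration, not boba's actual runner: the run_all helper and the three stand-in commands are hypothetical, and the only assumption is that each universe is launched as a separate child process whose exit code can be recorded without aborting the loop.

```python
import subprocess
import sys

def run_all(commands):
    """Run every command in order and record its exit code, never stopping early."""
    exit_codes = {}
    for name, cmd in commands.items():
        # subprocess.run does not raise on a non-zero exit unless check=True
        result = subprocess.run(cmd, capture_output=True, text=True)
        exit_codes[name] = result.returncode
    return exit_codes

# Hypothetical stand-ins for universe scripts; the second one fails.
commands = {
    "universe_1": [sys.executable, "-c", "print('ok')"],
    "universe_2": [sys.executable, "-c", "import sys; sys.exit(1)"],
    "universe_3": [sys.executable, "-c", "print('still runs')"],
}
codes = run_all(commands)
failed = [name for name, code in codes.items() if code != 0]
print(codes, failed)
```

Collecting the failures and reporting them at the end lets the user see which universes need attention without losing the results of the others.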
When trying the fertility_r example, the scripts that are generated contain
df <- read.csv2("...\boba-master\example\fertility_r\multiverse\summary.csv", sep = ",", stringsAsFactors = FALSE, check.names=FALSE)
df[1, 8] = summar$coefficients[4, 4]
write.csv(df, file="...\boba-master\example\fertility_r\multiverse\summary.csv", row.names=FALSE)
This is probably the reason for the error I get when trying to compile all scripts with boba run --all.
I cannot see, however, where I can change the paths during generation of the scripts.
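One possible cause, stated here as an assumption rather than a diagnosis: a backslash inside an R string literal starts an escape sequence, so a Windows path written verbatim into the generated script may be misread or rejected. A generator written in Python could emit forward slashes instead, which R accepts on Windows. The r_safe_path helper and the example path below are hypothetical.

```python
from pathlib import PureWindowsPath

def r_safe_path(path_str):
    """Return the path with forward slashes instead of backslashes."""
    return PureWindowsPath(path_str).as_posix()

# Hypothetical path, for illustration only.
raw = r"C:\work\multiverse\summary.csv"
safe = r_safe_path(raw)

# The line that would be written into the generated R script.
line = 'df <- read.csv("{}")'.format(safe)
print(line)
```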
Besides the command line interface, we should make it possible for other people to use the Parser class to compile a multiverse from within Python.
It is already possible to import and use the Parser class by importing the boba package. However, Parser prints and exits upon any error, which is undesirable.
The first step is to modify Parser to raise an exception instead of printing and exiting. We should also consider whether other aspects need improvement.
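A minimal sketch of the raise-instead-of-exit pattern is shown below. ParseError, parse, and cli_main are hypothetical names used for illustration, and the brace check is a toy stand-in for real validation; none of this is boba's actual Parser code.

```python
import sys

class ParseError(Exception):
    """Hypothetical error type raised on invalid input instead of exiting."""

def parse(text):
    # Toy check standing in for real validation.
    if text.count("{{") != text.count("}}"):
        raise ParseError("unbalanced placeholder braces")
    return text.count("{{")

def cli_main(text):
    """A command-line wrapper can still print the error and return a status code."""
    try:
        parse(text)
    except ParseError as err:
        print("Error: {}".format(err), file=sys.stderr)
        return 1
    return 0

# Library callers decide for themselves how to handle the failure.
try:
    parse("x = {{a")
    outcome = "parsed"
except ParseError as err:
    outcome = "caught: {}".format(err)
print(outcome)
```

With this split, the command line keeps its current print-and-exit behaviour while code that imports the package is never terminated by a parse error.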
Currently, a universe logs/prints stdout and stderr only after it finishes running. Is it possible to log/print these outputs more frequently? Individual scripts can take a long time to run, and it would be helpful to track their progress via these outputs (e.g. a script may periodically print its progress, but we cannot currently view it in real time).
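One way to do this, sketched under the assumption that each universe runs as a child process, is to read the child's output line by line while it is still running. The stream_output helper is hypothetical, and the child here is a small Python one-liner run with -u so that its own output is unbuffered; a real script would also need to flush its output for lines to arrive promptly.

```python
import subprocess
import sys

def stream_output(cmd):
    """Yield each line of the child's output as soon as it is produced."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,                 # line-buffered on the reading side
    )
    with proc.stdout:
        for line in proc.stdout:
            yield line.rstrip("\n")
    proc.wait()

# Hypothetical long-running script that reports progress periodically.
child_code = "import time\nfor i in range(3):\n    print('step', i)\n    time.sleep(0.1)"
lines = []
for line in stream_output([sys.executable, "-u", "-c", child_code]):
    print(line)        # shown while the child is still running
    lines.append(line)
```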
The scipy version in requirements.txt is incompatible with the scipy version in boba-visualizer.
In tutorial/cli.rst, under the boba run section, the --dir option description is missing.
Using the boba compile command fails with Python 3.8.2 but succeeds with Python 3.7.5.
To test this, I used pyenv.
Here is the output from running boba compile in the /example/simple/ directory with Python 3.8.2:
Creating multiverse from ./template.py
Traceback (most recent call last):
File "/home/shiven/.local/bin/boba", line 8, in <module>
sys.exit(main())
File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/shiven/.local/lib/python3.8/site-packages/boba/cli.py", line 29, in compile
ps.main()
File "/home/shiven/.local/lib/python3.8/site-packages/boba/parser.py", line 394, in main
self._write_server_config()
File "/home/shiven/.local/lib/python3.8/site-packages/boba/parser.py", line 334, in _write_server_config
self.adg.create(self.code_parser.blocks)
File "/home/shiven/.local/lib/python3.8/site-packages/boba/adg.py", line 211, in create
self._merge()
File "/home/shiven/.local/lib/python3.8/site-packages/boba/adg.py", line 87, in _merge
for g in gp:
RuntimeError: dictionary keys changed during iteration
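This error is raised when a dictionary's keys are changed while a loop is iterating over it. The code in adg.py is not shown here, so the following is only a generic sketch of the usual remedy: iterate over a snapshot of the keys and mutate the dictionary freely inside the loop. The merge_groups function and its renaming rule are hypothetical.

```python
def merge_groups(groups):
    """Rename keys ending in '_tmp' without mutating the dict being iterated."""
    for key in list(groups):  # list(...) takes a snapshot of the keys
        if key.endswith("_tmp"):
            groups[key[:-4]] = groups.pop(key)
    return groups

gp = {"a_tmp": [1], "b": [2]}
merged = merge_groups(gp)
print(merged)
```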
For the syntax of the BOBA_CONFIG block, we might add documentation to the bottom of this file, alongside before_execute and after_execute:
https://github.com/uwdata/boba/blob/master/tutorial/rules.md
For the syntax of the separate JSON spec for the language, we might create another markdown document in the tutorial folder.
I ran a multiverse with about 4,000 universes on 40 processes. About 30 processes were done within the first few minutes, but the remaining ones took longer than 2 hours to finish.
This might be because boba assigns universes to processes at the beginning instead of dynamically. This needs further investigation.
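The hypothesis can be illustrated with a small simulation. It is not boba's scheduler: it assumes static assignment means contiguous equal-size blocks handed out up front, and the workload (slow universes clustered at the end) is invented for illustration.

```python
import heapq

def static_makespan(durations, workers):
    """Split tasks into contiguous equal-size blocks up front; return the slowest block's total."""
    size = -(-len(durations) // workers)  # ceiling division
    return max(sum(durations[i:i + size]) for i in range(0, len(durations), size))

def dynamic_makespan(durations, workers):
    """Each worker pulls the next task as soon as it becomes free."""
    finish_times = [0] * workers
    heapq.heapify(finish_times)
    for d in durations:
        earliest = heapq.heappop(finish_times)  # the worker that frees up first
        heapq.heappush(finish_times, earliest + d)
    return max(finish_times)

# Hypothetical workload: six fast universes followed by two slow ones.
durations = [1] * 6 + [10] * 2
static_total = static_makespan(durations, 4)
dynamic_total = dynamic_makespan(durations, 4)
print(static_total, dynamic_total)
```

In this toy workload the static split leaves one worker with both slow universes while the others sit idle, which matches the pattern reported above.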
requirement.txt
, some needs Rscript and R dependencies)boba-server
Each universe should redirect stdout and stderr to a separate file in the boba_logs folder. For example, the stdout and stderr of universe_1 will be in boba_logs/log_1.txt.
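A minimal sketch of this redirection, assuming each universe is launched as a child process: the run_with_log helper is hypothetical, the child is a stand-in one-liner, and a temporary directory is used so the example does not write into a real project.

```python
import os
import subprocess
import sys
import tempfile

def run_with_log(uid, cmd, log_dir):
    """Run cmd and send both stdout and stderr to log_dir/log_<uid>.txt."""
    os.makedirs(log_dir, exist_ok=True)
    log_path = os.path.join(log_dir, "log_{}.txt".format(uid))
    with open(log_path, "w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return log_path, result.returncode

# Stand-in for universe_1: writes one line to each stream.
log_dir = os.path.join(tempfile.mkdtemp(), "boba_logs")
cmd = [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"]
log_path, code = run_with_log(1, cmd, log_dir)
with open(log_path) as f:
    content = f.read()
print(log_path, code)
```

Because both streams share one file, the relative order of stdout and stderr lines depends on the child's buffering.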
We might need a comprehensive description of the available continuous variable syntax in https://github.com/uwdata/boba/blob/master/tutorial/rules.md and perhaps walk the users through the simple_cont example in https://github.com/uwdata/boba/blob/master/tutorial/simple.md
The universe column in logs.csv should be renamed to uid, and its value should be the integer universe ID instead of a string filename. This will make joining with other metadata much easier.
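The conversion could look roughly like this. The filename pattern universe_<n> with an extension, the exit_code column, and the helper names are all assumptions made for illustration; the actual layout of logs.csv may differ.

```python
import csv
import io
import re

def to_uid(filename):
    """Extract the integer ID from a name such as 'universe_12.py' (assumed pattern)."""
    match = re.search(r"universe_(\d+)", filename)
    if match is None:
        raise ValueError("no universe ID in {!r}".format(filename))
    return int(match.group(1))

def rename_universe_column(rows):
    """Replace the 'universe' filename column with an integer 'uid' column."""
    converted = []
    for row in rows:
        new_row = {"uid": to_uid(row["universe"])}
        new_row.update((k, v) for k, v in row.items() if k != "universe")
        converted.append(new_row)
    return converted

# Hypothetical log contents.
sample = "universe,exit_code\nuniverse_1.py,0\nuniverse_2.py,1\n"
rows = list(csv.DictReader(io.StringIO(sample)))
converted = rename_universe_column(rows)
print(converted)
```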
The current design of boba constraints has the following caveat:
"For placeholder variables, the decision is NOT made at the beginning, but until the placeholder variable first appears in the code. Any unmade decision will have option None and index -1."
This is really annoying and we should fix it.
Dear Boba-team,
We are working on a manuscript about various open science tools in neuroscience, in which we plan to include an overview figure of the available tools. We would therefore like to include a boba logo. Do you have one, and are there any license restrictions, or can we simply include the graphic in our manuscript?
Best regards,
Tina
Increase the version of boba-visualizer to 1.0.1 in boba's dependencies.
Suppose we have a multiverse of size 2. I should be able to run the 2nd universe with boba run 2, but it returns an error saying "There are only 2 universes."
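This looks like an off-by-one bound check, although the actual code is not shown here. Assuming universes are numbered from 1, an inclusive check would accept every number from 1 to the multiverse size. The validate_universe_number function is a hypothetical illustration.

```python
def validate_universe_number(num, total):
    """Accept 1-based universe numbers from 1 to total, inclusive."""
    if not 1 <= num <= total:
        raise ValueError("There are only {} universes.".format(total))
    return num

# With a multiverse of size 2, universe 2 is valid and universe 3 is not.
accepted = validate_universe_number(2, 2)
try:
    validate_universe_number(3, 2)
    rejected = False
except ValueError:
    rejected = True
print(accepted, rejected)
```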
Since we replaced the shell script code with Python code for running all the universes, should we delete exec_template and write_sh, since nothing in the project uses them anymore?