Comments (13)
The fix was merged, and version 0.9.3
published ✅
Closing issue.
from madminer.
Hi @rbarrue, the changes seem to come from the refactor in #483. The comment above the code you linked (# Filter out callable definitions when saving into HDF5 file
) seems to imply that this may be on purpose. In case @Sinclert might have time to take a look or remember more about this, he might know.
What was previously used to refer to the custom observable in the file? If you have a minimal reproducible example of this that you could share, this would be very useful to understand what the behavior should be. I'm happy to help with the Python side of things, but do not know this codebase very well unfortunately.
from madminer.
@rbarrue I think I found the "bug":
📝 Context:
After the migration of column-oriented models to object-oriented models described by issue #482 (to which PRs #483, #484 and #485 belong) was completed, both LHEReader
and DelphesReader
classes are safety-checking the definition of their internal models, to ensure that they do not have SyntaxErrors
(NameErrors
are actually fine at this stage).
You can see that all reader-based data classes do so:
- The
Cut
model in these lines. - The
Efficiency
model in these lines. - The
Observable
model in these lines.
This standardisation of safety-checks within models creation has caused that, classes that did not include them in the first place (like the LHEReader
class), are now doing so (and for good!). You can check how reader classes differ in this approach before Madminer v0.9.0
:
- The
LHEReader
class did not safety check the definitions. - The
DelphesReader
class did safety check the definitions.
🐞 The bug:
Now that both LHEReader
and DelphesReader
classes use safety-checks under-the-hood, an already existing bug within the Madminer codebase has resurfaced:
Setting the empty string value (
""
) as default for callableObservable
definitions (in this line), is incompatible with safety-checkingObservable
models at creation (these lines).
🔧 The solution:
The solution would involve using a dummy parseable Python symbol (one that when doing eval(<symbol>)
does not raise any errors), so that when safety-checking the callables, no exception is raised. I think the best candidate is None
as a string (therefore "None"
).
Knowing that using the empty string is a problem when safety-checking using Python's eval()
, this fix can also be included in other serialization functions, such as:
- The
NuisanceParameter
model (both in loading and saving operations). - The
Systematics
model (both in loading and saving operations).
from madminer.
@rbarrue can you confirm whether or not the described bug-fix is enough to solve your problem?
Hi @Sinclert, I'll test the proposed implementation tomorrow and give feedback.
from madminer.
Hi @rbarrue
Can you provide us with the snippet where you use add_observable_from_function? Maybe there are useful hints.
Also posting the entire error output can be helpful.
Thank you!
from madminer.
Hey @rbarrue 👋🏻
As Alex pointed out, the hdf5.py interface suffered a big refactor in PR #483. Among the changes, multiple HDF5 keys were renamed, making pre madminer
0.9.0 setting files incompatible with madminer
0.9.0+ HDF5 parsing (as specified in the 0.9.0 release notes). Make sure you have generated your settings with the latest madminer
version (0.9.2
).
Considering that has been done, I can see that the referenced lines:
madminer/madminer/utils/interfaces/hdf5.py
Lines 1024 to 1026 in 6556f16
were already present before the refactor:
madminer/madminer/utils/interfaces/madminer_hdf5.py
Lines 524 to 529 in 58d6319
So... if you are having problems with an observable definition that worked fine pre v0.9.0, it may be a bug. As my colleagues have pointed out, could you share your set up?
- What version of
madminer
are you using to load the settings. - What version of
madminer
was used to save the settings. - What custom observables did you include in the madminer settings.
from madminer.
I was running this up until now on MadMiner v0.7.6 (it was working fine there), now I'm rerunning everything (even the setup file creation) on v0.9.2. The calculation of the observables is done properly, since the values are in the 'observations' column in the analysed .h5 files. To mention that this problem appears already in the processing stage if I do lhe.save(shuffle=True).
I'm uploading a minimal working example here with two 50k event .lhe files (sorry for the size, but these were the smallest ones I had), which are analysed and can then be combined at the end. I've already produced a set of analysed .h5 files which you can inspect, and if you want to see the error just run the code as-is, with the process_events functions commented. If you want to reanalyse the files, just uncomment the lines.
@irinaespejo the error is:
File "/cvmfs/sw.el7/python/3.7.2/lib/python3.7/site-packages/madminer/sampling/combine.py", line 127, in combine_and_shuffle
) = load_madminer_settings(filename, include_nuisance_benchmarks=False)
File "/cvmfs/sw.el7/python/3.7.2/lib/python3.7/site-packages/madminer/utils/interfaces/hdf5.py", line 115, in load_madminer_settings
observables[o_name] = Observable(o_name, o_def)
File "", line 6, in init
File "/cvmfs/sw.el7/python/3.7.2/lib/python3.7/site-packages/madminer/models/readers.py", line 50, in post_init
eval(self.val_expression)
File "", line 0
^
from madminer.
(sorry, closed this on accident)
So, the solution would simply be to replace
observable_defs = [d if isinstance(d, str) else "" for d in observable_defs]
-> observable_defs = [d if isinstance(d, str) else "None" for d in observable_defs]
and do the same in the NuisanceParameter
and Systematics
operations ?
from madminer.
Yes. That should be it.
Could you run your analysis either (A) modifying your local copy of madminer
, or (B) installing a customised fork of madminer
, to verify whether or not my proposal fixes your crash?
If the answer is positive, I am open to review a PR including those changes in addition to the necessary tests so that this loading - saving setting inconsistency do not happen again. Finally, I could release version 0.9.3
.
from madminer.
@rbarrue can you confirm whether or not the described bug-fix is enough to solve your problem?
from madminer.
Hi @Sinclert, I tested the proposed changes and everything seems to work as expected. In your comment, you mentioned "in addition to the necessary tests so that this loading - saving setting inconsistency do not happen again", are you talking about adding some tests to the tests folder ?
from madminer.
Hi @rbarrue ,
Yes, exactly. Right now the tests folder is pretty empty, but I (we?) would like that to change.
What a better opportunity to start? 🙂
I would envision creating a tests/utils/interfaces/test_hdf5.py
file (mimicking the folder structure under madminer
), with a small set of test functions covering the saving-loading of madminer settings, in order to check that the model definitions being saved (whether it is Observable
,NuisanceParameter
or Systematics
) are valid upon eval(self.val_definition)
(such as the one done within the models).
from madminer.
Thank you very much @Sinclert !
from madminer.
Related Issues (20)
- Migrate CI and CD from Travis CI to GitHub Actions workflows HOT 5
- Python 2 → 3 conversion error HOT 2
- [Bug?] An unsuitable calling to initializing function(__init__) of class MadMinerParticle HOT 10
- Define Jupyter notebooks automatic tests HOT 2
- [Internal] Define data models HOT 1
- LHE analysis error when parsing observables
- Handling of all PDF uncertainty sets HOT 6
- Bug in calculation of full Fisher information for ensemble with calculate_covariance=False HOT 1
- Increase test coverage
- Remove restrictions on Dependabot GitHub Actions update HOT 2
- Add Overdetermined morphing and addtional testings HOT 1
- Error calculating Fisher information in MadMiner(v0.9.3) HOT 6
- Systematics cannot be read by MadGraph HOT 3
- AttributeError: 'AsymptoticLimits' object has no attribute '_calculate_partition_bound' HOT 3
- Use gh-action-pypi-publish v1.7.0+ APIs HOT 2
- FileNotFoundError for event generation HOT 1
- Stuck in reweighting
- Possible error in filepath lookup?
- Add matthewfeickert as collaborator with owner role on PyPI HOT 5
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from madminer.