When running load_madminer_settings (<a href="https://github.com/madminer-too

The fix was merged, and <a href="https://pypi.org/project/madminer/0.9.3/" rel="nofoll

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Hey <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url=

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Crash in load_madminer_settings in files with custom observables about madminer HOT 13 CLOSED

rbarrue commented on September 17, 2024

Crash in load_madminer_settings in files with custom observables

from madminer.

Comments (13)

Sinclert commented on September 17, 2024 2

The fix was merged, and version 0.9.3 published ✅

Closing issue.

from madminer.

alexander-held commented on September 17, 2024 1

Hi @rbarrue, the changes seem to come from the refactor in #483. The comment above the code you linked (# Filter out callable definitions when saving into HDF5 file) seems to imply that this may be on purpose. In case @Sinclert might have time to take a look or remember more about this, he might know.

What was previously used to refer to the custom observable in the file? If you have a minimal reproducible example of this that you could share, this would be very useful to understand what the behavior should be. I'm happy to help with the Python side of things, but do not know this codebase very well unfortunately.

from madminer.

Sinclert commented on September 17, 2024 1

@rbarrue I think I found the "bug":

📝 Context:

After the migration of column-oriented models to object-oriented models described by issue #482 (to which PRs #483, #484 and #485 belong) was completed, both LHEReader and DelphesReader classes are safety-checking the definition of their internal models, to ensure that they do not have SyntaxErrors (NameErrors are actually fine at this stage).

You can see that all reader-based data classes do so:

The Cut model in these lines.
The Efficiency model in these lines.
The Observable model in these lines.

This standardisation of safety-checks within models creation has caused that, classes that did not include them in the first place (like the LHEReader class), are now doing so (and for good!). You can check how reader classes differ in this approach before Madminer v0.9.0:

The LHEReader class did not safety check the definitions.
The DelphesReader class did safety check the definitions.

🐞 The bug:

Now that both LHEReader and DelphesReader classes use safety-checks under-the-hood, an already existing bug within the Madminer codebase has resurfaced:

Setting the empty string value ("") as default for callable Observable definitions (in this line), is incompatible with safety-checking Observable models at creation (these lines).

🔧 The solution:

The solution would involve using a dummy parseable Python symbol (one that when doing eval(<symbol>) does not raise any errors), so that when safety-checking the callables, no exception is raised. I think the best candidate is None as a string (therefore "None").

Knowing that using the empty string is a problem when safety-checking using Python's eval(), this fix can also be included in other serialization functions, such as:

The NuisanceParameter model (both in loading and saving operations).
The Systematics model (both in loading and saving operations).

from madminer.

rbarrue commented on September 17, 2024 1

@rbarrue can you confirm whether or not the described bug-fix is enough to solve your problem?

Hi @Sinclert, I'll test the proposed implementation tomorrow and give feedback.

from madminer.

irinaespejo commented on September 17, 2024

Hi @rbarrue

Can you provide us with the snippet where you use add_observable_from_function? Maybe there are useful hints.
Also posting the entire error output can be helpful.

Thank you!

from madminer.

Sinclert commented on September 17, 2024

Hey @rbarrue 👋🏻

As Alex pointed out, the hdf5.py interface suffered a big refactor in PR #483. Among the changes, multiple HDF5 keys were renamed, making pre madminer 0.9.0 setting files incompatible with madminer 0.9.0+ HDF5 parsing (as specified in the 0.9.0 release notes). Make sure you have generated your settings with the latest madminer version (0.9.2).

Considering that has been done, I can see that the referenced lines:

madminer/madminer/utils/interfaces/hdf5.py

Lines 1024 to 1026 in 6556f16

    
           # Filter out callable definitions when saving into HDF5 file 
        
           observable_defs = [d if isinstance(d, str) else "" for d in observable_defs] 
        
           observable_defs = _encode_strings(observable_defs)

were already present before the refactor:

madminer/madminer/utils/interfaces/madminer_hdf5.py

Lines 524 to 529 in 58d6319

    
           for key in observable_names: 
        
               definition = observables[key] 
        
               if isinstance(definition, str): 
        
                   observable_definitions.append(definition.encode("ascii", "ignore")) 
        
               else: 
        
                   observable_definitions.append("".encode("ascii", "ignore"))

So... if you are having problems with an observable definition that worked fine pre v0.9.0, it may be a bug. As my colleagues have pointed out, could you share your set up?

What version of madminer are you using to load the settings.
What version of madminer was used to save the settings.
What custom observables did you include in the madminer settings.

from madminer.

rbarrue commented on September 17, 2024

I was running this up until now on MadMiner v0.7.6 (it was working fine there), now I'm rerunning everything (even the setup file creation) on v0.9.2. The calculation of the observables is done properly, since the values are in the 'observations' column in the analysed .h5 files. To mention that this problem appears already in the processing stage if I do lhe.save(shuffle=True).

I'm uploading a minimal working example here with two 50k event .lhe files (sorry for the size, but these were the smallest ones I had), which are analysed and can then be combined at the end. I've already produced a set of analysed .h5 files which you can inspect, and if you want to see the error just run the code as-is, with the process_events functions commented. If you want to reanalyse the files, just uncomment the lines.

@irinaespejo the error is:
File "/cvmfs/sw.el7/python/3.7.2/lib/python3.7/site-packages/madminer/sampling/combine.py", line 127, in combine_and_shuffle
) = load_madminer_settings(filename, include_nuisance_benchmarks=False)
File "/cvmfs/sw.el7/python/3.7.2/lib/python3.7/site-packages/madminer/utils/interfaces/hdf5.py", line 115, in load_madminer_settings
observables[o_name] = Observable(o_name, o_def)
File "", line 6, in init
File "/cvmfs/sw.el7/python/3.7.2/lib/python3.7/site-packages/madminer/models/readers.py", line 50, in post_init
eval(self.val_expression)
File "", line 0

from madminer.

rbarrue commented on September 17, 2024

(sorry, closed this on accident)
So, the solution would simply be to replace

observable_defs = [d if isinstance(d, str) else "" for d in observable_defs] -> observable_defs = [d if isinstance(d, str) else "None" for d in observable_defs] and do the same in the NuisanceParameter and Systematics operations ?

from madminer.

Sinclert commented on September 17, 2024

Yes. That should be it.

Could you run your analysis either (A) modifying your local copy of madminer, or (B) installing a customised fork of madminer, to verify whether or not my proposal fixes your crash?

If the answer is positive, I am open to review a PR including those changes in addition to the necessary tests so that this loading - saving setting inconsistency do not happen again. Finally, I could release version 0.9.3.

from madminer.

Sinclert commented on September 17, 2024

@rbarrue can you confirm whether or not the described bug-fix is enough to solve your problem?

from madminer.

rbarrue commented on September 17, 2024

Hi @Sinclert, I tested the proposed changes and everything seems to work as expected. In your comment, you mentioned "in addition to the necessary tests so that this loading - saving setting inconsistency do not happen again", are you talking about adding some tests to the tests folder ?

from madminer.

Sinclert commented on September 17, 2024

Hi @rbarrue ,

Yes, exactly. Right now the tests folder is pretty empty, but I (we?) would like that to change.

What a better opportunity to start? 🙂

I would envision creating a tests/utils/interfaces/test_hdf5.py file (mimicking the folder structure under madminer), with a small set of test functions covering the saving-loading of madminer settings, in order to check that the model definitions being saved (whether it is Observable,NuisanceParameter or Systematics ) are valid upon eval(self.val_definition) (such as the one done within the models).

from madminer.

rbarrue commented on September 17, 2024

Thank you very much @Sinclert !

from madminer.

Crash in load_madminer_settings in files with custom observables about madminer HOT 13 CLOSED

Comments (13)

📝 Context:

🐞 The bug:

🔧 The solution:

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

	# Filter out callable definitions when saving into HDF5 file
	observable_defs = [d if isinstance(d, str) else "" for d in observable_defs]
	observable_defs = _encode_strings(observable_defs)

	for key in observable_names:
	definition = observables[key]
	if isinstance(definition, str):
	observable_definitions.append(definition.encode("ascii", "ignore"))
	else:
	observable_definitions.append("".encode("ascii", "ignore"))