Comments (6)
Could I have the DB please?
from hadrons.
Here are two very similar DBs with different schedules; both exhibit the issue.
db.tar.gz
Grid Commit: 0174f5f742782d1b43e49213bd9d729f7094962e [0174f5f7]
Hadrons Commit: 04e06e8 [04e06e8]
i.e. the latest Grid, but Hadrons predates Fionn's latest changes and the HadronsXmlValidate fix.
I can reproduce it. The error you get is about Grid pointers being inconsistent, which the DB knows nothing about (it does not save the geometries). This is weird; let me investigate.
Ufff, that was a painful one... but it was indeed a bug related to the DB, so good catch!
The error message did not suggest anything like that; it was a randomly appearing, indirect side effect of the problem.
The short version is: the DB was not restoring the storage type (standard, cache or tmp) of the objects. The momentum phase in the sink function got 'standard' by default instead of 'cache', which made it subject to garbage collection. Because no other module needs it, it was destroyed as soon as the sink module finished. Later, in the meson contraction, the sink function is called, and at random it was possible to dereference the pointer to the destroyed object without triggering a bad access error.
So one additional issue is that the sink function was just capturing the address of the phase, which is fine but unsafe if the phase gets destroyed in the meantime. So once more, it is very important to use the envGet macro, as it actually checks that the object is still alive. This is a general comment; I actually wrote the unsafe code myself 😄.
It is now fixed in develop: the DB restores the storage types and the phases are accessed in a safer way (i.e. the bug would result in a meaningful error message). Let me know if it works on your side.
This would have happened to us in production at one point or another, so thanks for the thorough testing; it saved us a lot of headaches.
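To illustrate the failure mode described above, here is a minimal sketch in plain C++. It is a toy model and not Hadrons code: `ToyEnvironment`, `Storage`, `collectGarbage` and `makeSink` are hypothetical names invented for this example, and the garbage collector is simplified to "free every 'standard' object", on the assumption that no later module needs them. The point is only that a sink which captures the object's name and goes through a checked lookup fails with a meaningful error once the object has been collected, instead of silently reading freed memory through a captured address.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Toy model only: none of these names are part of the Hadrons or Grid API.
enum class Storage { standard, cache, temporary };

struct Object {
    std::unique_ptr<double> data;          // stand-in for a lattice field
    Storage storage = Storage::standard;   // the default that caused the bug
};

class ToyEnvironment {
public:
    // Register an object under a name with an explicit storage type.
    void create(const std::string &name, Storage storage, double value) {
        Object &obj = objects_[name];
        obj.data = std::make_unique<double>(value);
        obj.storage = storage;
    }

    // Checked access: throws if the object was never created or was freed.
    double &get(const std::string &name) {
        auto it = objects_.find(name);
        if (it == objects_.end() || !it->second.data) {
            throw std::runtime_error("object '" + name + "' does not exist");
        }
        return *it->second.data;
    }

    // Simplified garbage collection (assumption): every 'standard' object is
    // treated as no longer needed and freed; 'cache' objects survive.
    void collectGarbage() {
        for (auto &entry : objects_) {
            if (entry.second.storage == Storage::standard) {
                entry.second.data.reset();
            }
        }
    }

private:
    std::map<std::string, Object> objects_;
};

// Safe sink: captures the phase by name and looks it up on every call,
// rather than capturing the raw address of the phase object.
// The environment must outlive the returned function.
std::function<double(double)> makeSink(ToyEnvironment &env,
                                       const std::string &phaseName) {
    return [&env, phaseName](double x) { return env.get(phaseName) * x; };
}
```

In this sketch, a phase created with `Storage::cache` remains usable by the sink after `collectGarbage()`, while one created with `Storage::standard` makes the sink throw `std::runtime_error` on its next call, which is the kind of explicit failure the fix aims for.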
Awesome! Thanks for finding all that out and fixing it so quickly!
I'll recompile for both CPU and GPU then restart my jobs - should be a fairly thorough test.
I'll report back how it goes.
Thanks again