jeffersonlab / analyzer
Hall A C++ Analyzer
License: BSD 3-Clause "New" or "Revised" License
Change all modules to use the DBRequest for database lookups. Decide if we want to keep the legacy flat-file support functions (probably not).
Remove the slew of binary-compatibility workarounds that have accumulated in rel 1.5.
It would be nice if only the source files (*.h and *.C) that are actually needed for building the analyzer were located in these directories, so that wildcards could be used in the build files (either Makefiles or SConstruct/SConscripts). One could possibly move the pieces required for building standalone codes to another parallel directory, or to an appropriate subdirectory.
After the dust clears a bit on the recent changes, and if the event handler scheme is accepted, I'd like to propose to remove the /hana_scaler directory and "all things related to scalers" in /src, except for the scaler event handler. This will remove about 7K lines of unneeded code. Such a large cleanup may seem surprising, but the reasons are 1) we don't need the scaler GUI in Podd, 2) database handling is simplified -- handled now by THaAnalysisObject base class of event handlers, and 3) all the standalone tscal*_main.C codes can be replaced by variants of the scaler event handler plug-in; besides, I don't think anyone used them except me.
In addition to the above, I would remove THaCodaDecoder and THaFastBusWord.
The archaic THa prefix to class names should be replaced with appropriate namespaces like Podd and HallA. The old names may still be made available to legacy code via typedefs.
Scons files are not included in the srcdist tarball generated by make.
On that note, a few more things to fix:
Add scons build scripts from Ed
When replaying multiple runs, THaAnalyzer currently insists on reinitializing the apparatuses for every run. This is inefficient if it is known that the database does not change between runs, e.g. for simulation data or, in almost all cases, for multiple file segments of the same CODA run.
Perhaps add a mode setting, THaAnalyzer::SetReInitMode( Bool_t ) to turn reinitialization off. Or make it more flexible: mode = { no, auto, always }, where "auto" tries to be intelligent about detecting database changes.
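The proposed mode setting could be sketched as follows; the enum, function, and member names here are assumptions for illustration, not existing Podd API:

```cpp
// Hypothetical sketch of the proposed reinitialization mode setting.
enum EReInitMode { kNoReInit, kAutoReInit, kAlwaysReInit };

// Decide whether apparatuses must be reinitialized for the next run.
// In "auto" mode, reinitialize only if the database timestamp changed
// (one possible way to "be intelligent about detecting database changes").
bool NeedReInit( EReInitMode mode, long prev_db_time, long new_db_time )
{
  switch( mode ) {
    case kNoReInit:     return false;
    case kAlwaysReInit: return true;
    case kAutoReInit:   return new_db_time != prev_db_time;
  }
  return true; // unknown mode: reinitialize to be safe
}
```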
Currently, clients of THaEvData and THaCodaData see EVIO return values from various member functions. For example, THaCodaData is used by THaCodaFile which is used by THaCodaRun which is used by THaRun which is used by THaAnalyzer. All these classes have to include evio.h to understand the return codes. (At least, that's what they should be doing.) This is not optimal as we have an unnecessary dependence on EVIO.
We should hide the underlying EVIO implementation completely by letting these two class hierarchies define constants for the return values of all their functions.
This will allow us to link only the decoder library, not the whole analyzer, with libevio.
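A minimal sketch of the wrapping, assuming illustrative EVIO code values (the actual S_SUCCESS and end-of-file codes are defined in evio.h, which only the translation layer would then need to include):

```cpp
// Hide raw EVIO return codes behind our own status constants so that
// callers of THaCodaData etc. never need evio.h.
enum class CodaStatus { kOK, kEOF, kError };

// Map a raw library return code to the class-level status.
// The specific raw values below are assumptions for this sketch.
CodaStatus TranslateEvioCode( int raw )
{
  const int kEvioSuccess = 0;   // assumed success code
  const int kEvioEOF     = -1;  // assumed end-of-file code
  if( raw == kEvioSuccess ) return CodaStatus::kOK;
  if( raw == kEvioEOF )     return CodaStatus::kEOF;
  return CodaStatus::kError;
}
```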
Integrate the changes to the core part needed for hcana.
With g2p run 3132, I get gazillions of messages like this from THaScintillator (perhaps 20-30 per 1000 events):
Warning in THaScintillator::"L.s1"::Decode: 2 hits on TDC channel 4/10/89
Warning in THaScintillator::"L.s1"::Decode: 2 hits on TDC channel 4/10/81
Warning in THaScintillator::"L.s1"::Decode: 2 hits on TDC channel 4/10/89
Warning in THaScintillator::"L.s1"::Decode: 2 hits on TDC channel 4/10/81
Warning in THaScintillator::"L.s1"::Decode: 2 hits on TDC channel 4/10/91
These can't be silenced in analyzer-1.5.25, but there should be a way to acknowledge them and continue quietly. Perhaps print only a warning summary by default, and per-event warnings only with fDebug > 1? Also, if we do print them, the messages should include the event number.
Support for global variables on std::vector objects was added in issue #22. However, we still do not have the ability to define such variables via RTTI, i.e. via the convenient call to DefineVarsFromList, which everyone uses. Given that the ROOT dictionary has full STL support, this should not be too hard to do.
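In essence, such a variable only needs the address of the user's vector to expose its (possibly changing) length and elements; the struct and member names below are hypothetical stand-ins for what the THaVar machinery would store:

```cpp
#include <vector>
#include <cstddef>

// Illustrative sketch of a global-variable entry backed by a
// std::vector<double> data member of a user class.
struct VectorVar {
  const std::vector<double>* fVec; // address of the user's vector

  // Length is re-read on every access, so resizing "just works".
  std::size_t GetLen() const { return fVec ? fVec->size() : 0; }
  double GetValue( std::size_t i ) const { return (*fVec)[i]; }
};
```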
THaRun should automatically detect files that are split into segments with the usual CODA *.0, *.1 etc. naming convention. Such split runs should be presented as one large run, as far as that is possible.
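The segment detection itself is simple string matching; a self-contained sketch (the function name and interface are assumptions, and the real implementation would scan the filesystem rather than take a directory listing as input):

```cpp
#include <string>
#include <vector>
#include <set>

// Given a run file stem ("g2p_3132.dat") and a directory listing,
// return the ordered CODA segments stem.0, stem.1, ... found there.
std::vector<std::string> FindSegments( const std::string& stem,
                                       const std::vector<std::string>& dir )
{
  std::set<int> segs; // sorted, duplicates impossible
  for( const std::string& f : dir ) {
    if( f.size() > stem.size()+1 && f.compare(0, stem.size(), stem) == 0
        && f[stem.size()] == '.' ) {
      std::string tail = f.substr(stem.size()+1);
      // accept only purely numeric extensions: ".0", ".1", ...
      if( !tail.empty() &&
          tail.find_first_not_of("0123456789") == std::string::npos )
        segs.insert( std::stoi(tail) );
    }
  }
  std::vector<std::string> out;
  for( int n : segs )
    out.push_back( stem + "." + std::to_string(n) );
  return out;
}
```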
Rewrite decoder with object oriented design, allowing new hardware to be added via plugins.
Different event types (physics, scalers, slow control, etc.) need different processing chains. These chains should be made pluggable so users can easily extend or replace them. This also helps split the large main analyzer into smaller and hopefully more maintainable subclasses.
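One way to make the chains pluggable is a registry mapping event type to handler; the class shape and type codes below are assumptions sketching the idea, not the actual Podd event handler API:

```cpp
#include <map>
#include <functional>
#include <utility>

// Sketch of a pluggable dispatch from CODA event type to a processing
// chain. Users register their own chains without touching the analyzer.
using EvtHandler = std::function<int(const unsigned* evbuffer)>;

class EventDispatcher {
public:
  void Register( int evtype, EvtHandler h ) {
    fHandlers[evtype] = std::move(h);
  }
  // Returns the handler's status, or -1 if no chain handles this type.
  int Dispatch( int evtype, const unsigned* evbuffer ) const {
    auto it = fHandlers.find(evtype);
    return ( it != fHandlers.end() ) ? it->second(evbuffer) : -1;
  }
private:
  std::map<int, EvtHandler> fHandlers;
};
```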
As reported by Vlassis Petousis [email protected]:
Compilation on MacOSX 10.9 with official ROOT 5.34.19 fails completely with
Generating Decoder Dictionary...
/usr/local/root-5.34.19/bin/rootcint -f THaDecDict.C -c -pthread -stdlib=libc++ -m64 -I/usr/local/root-5.34.19/include/root -I../src -DHAS_SSTREAM -DWITH_DEBUG THaUsrstrutils.h THaCrateMap.h THaCodaData.h THaEpics.h THaFastBusWord.h THaCodaFile.h THaSlotData.h THaEvData.h evio.h THaCodaDecoder.h THaBenchmark.h haDecode_LinkDef.h
-s : Step into function/loop mode
!!!Calling default constructor (G__CINT_ENDL()) 0x7fcd494c9210 for declaration of endl cint/cint/src/decl.cxx:2785
followed by a string of similar, CINT-related errors about calling (default) constructors.
All analysis objects should use TTimeStamp instead of TDatime to indicate run dates etc. TDatime is not portable between time zones.
THaSpectrometer is currently limited to generating THaTrack objects. THaTracks, in turn, describe tracks in TRANSPORT style, appropriate for small-acceptance focusing spectrometers. Other spectrometer types are conceivable, for example 4pi detectors for which spherical coordinates are best suited, or SoLID with its cylindrical coordinate system. One might even want a different type of tracks for focusing spectrometers than THaTrack.
To support this, turn THaSpectrometer into a base class for various spectrometer classes that differ by the tracks that they generate. Similarly, THaTrack should become a base class for various types of tracks. Common among all tracks is that they represent physics 4-vectors at the vertex; everything else is specific to the spectrometer type.
THaVar::GetObjArrayLenPtr returns a pointer to a local static variable, which breaks thread safety.
Shared libraries should be split into a core part with generally useful classes and hall A/C-specific libraries containing only hall code. The Hall C library will be managed in the hcana repository.
With more recent ROOT versions (tested 5.34/17 and 5.34/24), I get the following crash (with both analyzer 1.5 and 1.6). It seems to be CINT-related, but could be an awkward interaction. An old ROOT installation, 5.28, does not show this behavior, although, unlike 5.34, it does crash when doing the same thing with TBrowser.
[21] fedora21:~ > analyzer
*
W E L C O M E to the *
H A L L A C++ A N A L Y Z E R *
*
Release 1.6.0-devel Jan 31 2015 *
*
For information visit *
http://hallaweb.jlab.org/root/ *
*
CINT/ROOT C/C++ Interpreter version 5.18.00, July 2, 2010
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
analyzer [0] THaAnalyzer a
analyzer [1] .q
analyzer [2]
*** Break *** segmentation violation
There was a crash.
This is the entire stack trace of all threads:
#0 0x00000039f96c451a in waitpid () from /lib64/libc.so.6
#1 0x00000039f9642a2b in do_system () from /lib64/libc.so.6
#2 0x00000039fcc7185f in TUnixSystem::StackTrace() () from /usr/lib64/root/libCore.so.5.34
#3 0x00000039fcc73b2c in TUnixSystem::DispatchSignals(ESignals) () from /usr/lib64/root/libCore.so.5.34
#4
#5 0x00000000014c6f90 in ?? ()
#6 0x00007f62271ecd64 in G__haDict_769_0_84 (result7=0x7fffb97c1b60, funcname=0x110bed0 "", libp=0x7fffb97c1ba0, hash=0) at haDict.C:25650
#7 0x00000039fbb2c5f3 in Cint::G__ExceptionWrapper(int ()(G__value, char const_, G__param_, int), G__value_, char_, G__param*, int) () from /usr/lib64/root/libCint.so.5.34
#8 0x00000039fba7cfe4 in G__execute_call () from /usr/lib64/root/libCint.so.5.34
#9 0x00000039fba7d3ec in G__call_cppfunc () from /usr/lib64/root/libCint.so.5.34
#10 0x00000039fbae628d in G__interpret_func () from /usr/lib64/root/libCint.so.5.34
#11 0x00000039fba5d513 in G__getfunction () from /usr/lib64/root/libCint.so.5.34
#12 0x00000039fbb59737 in G__destroy_upto () from /usr/lib64/root/libCint.so.5.34
#13 0x00000039fbb59dac in G__scratch_globals_upto () from /usr/lib64/root/libCint.so.5.34
#14 0x00000039fcc33050 in TCint::ResetGlobals() () from /usr/lib64/root/libCore.so.5.34
#15 0x00000039fcbc4eb1 in TROOT::EndOfProcessCleanups(bool) () from /usr/lib64/root/libCore.so.5.34
#16 0x00000039fcb93336 in TApplication::~TApplication() () from /usr/lib64/root/libCore.so.5.34
#17 0x00007f62270f4fcd in THaInterface::~THaInterface (this=0x112d900, __in_chrg=) at src/THaInterface.C:140
#18 0x00007f62270f5026 in THaInterface::~THaInterface (this=0x112d900, __in_chrg=) at src/THaInterface.C:159
#19 0x0000000000401601 in main (argc=1, argv=0x7fffb97cc968) at src/main.C:22
We developed a special analysis mode for the HRS VDCs in 2010 that correlates clusters in u and v by their timing. This mode is important for experiments operating with very high singles rates in the HRS, which cause a significant accidental coincidence rate within the VDC drift time window, e.g. APEX.
This new algorithm should be integrated into the VDC code. However, it should only be enabled on demand via a configuration option since (a) it is slow and (b) it may reduce the tracking efficiency in the analysis of standard, low-rate experiments because of the presence of multiple in-time clusters coming from knock-ons, scraping etc.
To be clear: this new algorithm only filters out accidentally coincident tracks, but does little, if anything, to eliminate "multitracks" resulting from multiple in-time clusters (see item (b) above).
Currently, THaOutput writes out every variable as Double_t. At the minimum, Int_t should also be a supported output type. Having at least a basic integer type will improve accuracy and reduce the output file size.
Fix the bugs discovered in the VDC tracking algorithm audit:
All these bugs only affect events with more than one cluster per plane, i.e. typically a few percent of events, which are normally rejected in the analysis anyway. Thus, the impact of these bugs is considered minor.
Unbundle ancient EVIO. Support latest EVIO version, loaded from external libs.
It should somehow be possible to make a debug build with the SCons system.
Currently, we set DEBUG=1 in the Makefiles during development. DEBUG builds are more or less the default for development, so it would be good if SCons could be configured to remember this option.
Steps to reproduce:
./scons/scons.py debug
scons: *** Do not know how to make File target `debug' (/home/ole/Develop/analyzer/debug). Stop.
Expected behavior:
Builds debug version with CXXFLAGS = -g -O0
In my work on the decoder upgrade, I find that the handling of the cratemap is/was very delicate. Even small typos lead to a SILENT failure of the code. I spent nearly an hour tracking it down. I'll work on fixing THaCrateMap so that it is more robust and also makes sensible complaints so that users know what's wrong. The following illustrates. These are lines in db_cratemap.dat read by THaCrateMap. Of course, this was all my (Bob M.) fault !
==== Crate 3 type fastbus
I get this with analyzer-1.5.25:
Normal end of file /home/ole/tutorial/data/g2p_3132.dat.0 encountered
End of file
Counter summary:
314292 events read
314292 events decoded
313476 physics events
314172 scaler events
115 slow control events
5 other event types
313476 physics events analyzed
314292 events accepted
314172 is not the number of scaler events, but "all"-"slow control"-"other". There should be 696 scaler events. Looks like a bug.
It looks like I temporarily forgot about mutable data members when I added fDim as Int_t*. I think if I make it mutable Int_t fDim, things should work just fine, and we have one less ugly code spot.
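A minimal sketch of the fix, with illustrative class and member names: the static local (shared by all instances and all threads) becomes a mutable per-object cache that a const accessor may legally update.

```cpp
// A const interface that lazily caches a computed value in a mutable
// member instead of a function-local static.
class ArrayVar {
public:
  explicit ArrayVar( int dim ) : fDim(-1), fTrueDim(dim) {}

  int GetObjArrayLen() const {
    if( fDim < 0 )
      fDim = fTrueDim;  // OK in a const method because fDim is mutable
    return fDim;
  }
private:
  mutable int fDim;  // per-object cache; no cross-instance sharing
  int fTrueDim;      // stand-in for the actual dimension lookup
};
```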
I'm not too fond of bundling SCons with the main distribution. After all, we're not bundling GNU make either, SCons is widely available for distros, and we're not a Python project, not even talking about licensing issues. If people really can't get a working SCons installation on their box, we could offer installation instructions, or perhaps an installation script like get_scons.sh.
Also, it would be helpful if we could get the scons system working with the currently supported version 2.0.1 on RHEL6, which is one of our officially supported platforms.
Unlike the Scintillator and Cherenkov classes, the Shower class only calculates the track projection coordinates for the first reconstructed track. This is confusing, and information is missing.
To fix this, move the track projection business into the THaSpectrometerDetector base class, so all detectors can use it without any need for extra code.
If a VDC cluster (in THaVDCPlane::fClusters) is assigned to a reconstructed track, record the track index with the cluster and export it as a global variable.
fgNeedInit and fgCrateMapName in THaEvData were made static to implement fixes that would otherwise have broken binary compatibility. No need for this anymore in Release 1.6. Replace with regular member variables.
When using analyzer 1.5.22 against recent ROOT version 5.34, and replaying a CODA file in a different time zone (French time in my case), the run time is not correctly identified (by correctly I mean EST time, in which the db_run.dat file is usually written). This obviously messes up the reconstruction... This didn't happen with ROOT version 5.18.
I show screen outputs for the same run, with different ROOT version below. Could someone confirm this ?
Thanks,
Carlos
*
W E L C O M E to the *
H A L L A C++ A N A L Y Z E R *
*
Release 1.5.22 Nov 7 2013 *
*
For information visit *
http://hallaweb.jlab.org/root/ *
*
CINT/ROOT C/C++ Interpreter version 5.18.00, July 2, 2010
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
Processing ProcessRun.C...
The target is Hydrogen 15cm
Will analyze file ./dvcs10_9639.dat.0
Opening file ./dvcs10_9639.dat.0
TCaloEvent constructor
Creating new output file: ./dvcs_9639_0.root
Prestart at 1
Prescales at 2
THaRunParameters::ReadDatabase: Opened database file /afs/in2p3.fr/home/throng/clas/carlos/DVCS2/onlana/DB2/db_run.dat
OBJ: THaRun RUN_9639
Run number: 9639
Run date: Tue Dec 14 21:22:41 2010
However:
*
W E L C O M E to the *
H A L L A C++ A N A L Y Z E R *
*
Release 1.5.22 Nov 20 2013 *
*
For information visit *
http://hallaweb.jlab.org/root/ *
*
CINT/ROOT C/C++ Interpreter version 5.16.29, Jan 08, 2008
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
Processing ProcessRun.C...
The target is Hydrogen 15cm
Will analyze file ./dvcs10_9639.dat.0
Opening file ./dvcs10_9639.dat.0
TCaloEvent constructor
Creating new output file: ./dvcs_9639_0.root
Prestart at 1
Prescales at 2
THaRunParameters::ReadDatabase: Opened database file /afs/in2p3.fr/home/throng/clas/carlos/DVCS2/onlana/DB2/db_run.dat
OBJ: THaRun RUN_9639
Run number: 9639
Run date: Tue Dec 14 15:22:41 2010
Currently, all trigger bits use the same cut, which is hardcoded in the source. We need the cut limits to be configurable via database parameters. There should be an option to override global cut limits with ones specific to a certain trigger bit. Since this requires new member variables, it will not be binary compatible and so has to wait until release 1.6.
THaVar global variables currently do not know which object they belong to (usually a THaAnalysisObject). It would be useful for reports (and other things like safer variable deletion) if they did. THaVarList would need to be extended to take advantage of this new information.
Example: Define a cut
FullBackTrack MC.btr.n==1&&MC.btr.planes[0]==0x1f
where MC.btr.n is a scalar and MC.btr.planes is a variable-sized array of size MC.btr.n. The above fails to compile:
Error in THaCut::Compile: Bad numerical expression : "MC.btr.planes[0]"
This is obviously a bug. MC.btr.planes[0]==0x1f should evaluate to false if MC.btr.n==0, and to true if MC.btr.n>0 and MC.btr.planes[0]==0x1f. Also, if we assume C-style operator evaluation rules, MC.btr.planes[0] should not be evaluated at all if MC.btr.n!=1.
Now change the definition to
FullBackTrack MC.btr.n==1&&MC.btr.planes==0x1f
This DOES compile and it even works in some way, but for every event with MC.btr.n==0, one gets a nuisance message
Warning in THaVar::GetValue(): Whoa! Index out of range, variable MC.btr.planes, index 0.
This is also clearly a bug since, as mentioned above, MC.btr.planes should not be evaluated if there are no elements.
It is a bit unclear how a test involving variable-sized arrays is supposed to work in general. Should it take the AND or the OR of the test MC.btr.planes[i]==0x1f if MC.btr.n>1? It looks like the specifications for the cut package need to be reviewed and, if necessary, improved.
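For the scalar case discussed above, the desired C-style short-circuit behavior can be written out explicitly; this sketch hard-codes the example cut rather than showing how the formula engine itself should be fixed:

```cpp
#include <vector>

// Desired semantics of "MC.btr.n==1 && MC.btr.planes[0]==0x1f":
// the array element is accessed only after the scalar test passes,
// so an empty (zero-length) array never triggers an out-of-range read.
bool FullBackTrackCut( int n, const std::vector<int>& planes )
{
  return n == 1 && !planes.empty() && planes[0] == 0x1f;
}
```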
In updating my forked version of the analyzer recently, I discovered an issue that prevents compilation on Mac OS X (XCode + command line tools version 6.0 on OS X Yosemite). The problem is related to the files VDCeff.C and VDCeff.h.
In VDCeff.h, there is a structure called VDCvar_t which holds data needed for the efficiency calculation for one VDC plane/wire spectrum. In the original version of this header file, VDCvar_t is a protected member of the VDCeff class. In VDCeff.C, there is an iterator that is defined which is a vector of these structures (called variter_t). On Linux, this works fine and the code compiles. However, on Mac OS X, the compiler complains at the typedef statement that there is an error because VDCvar_t is a protected member. Honestly, I can't figure out why this is a problem to do it this way (and this was confirmed in asking a couple of my CS colleagues as well).
Nonetheless, if one makes VDCvar_t a public member of the VDCeff class (along with the typedefs for Vcnt_t and CVar_t which VDCvar_t uses), the code will then compile.
While I can't explain the error on Mac OS X, I also can't see that it is a huge issue to make this structure definition public. One is not, as far as I can tell, exposing the data publicly, just the type.
I have not yet done a pull request on this, as I wanted Ole and others to comment on it first.
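The reported fix boils down to the following pattern; the members here are simplified stand-ins for the real VDCeff code, shown only to demonstrate that a file-scope typedef can name the nested struct once it is public:

```cpp
#include <vector>

// Nested struct made public so out-of-class code may name it.
class VDCeff {
public:
  struct VDCvar_t {          // the type is exposed, not the data
    int nwire;
    std::vector<long> hist;
  };
protected:
  std::vector<VDCvar_t> fVDCvar;  // the data itself stays protected
};

// This file-scope typedef is ill-formed if VDCvar_t is protected,
// since file scope has no access to protected members of VDCeff.
typedef std::vector<VDCeff::VDCvar_t>::iterator variter_t;
```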
std::vector support would be very useful with more modern user classes
If one replays multiple input files in a single analysis session (to add all their events together in the output), duplicate event numbers appear in the event header. This is expected since the event number recorded there is simply the CODA event number of the respective input file, which usually starts at 1 for each file. However, what's missing is some sort of continuous counter for the effective event number. Perhaps we should replace fEvtHdr.fEvtNum with this counter value. To keep the original information available, we could add fRawEvtNum (= CODA event number) and fInputFileIdx (input file index). Also, at least the continuous counter should be a 64-bit integer since one might conceivably have more than 2G/4G events.
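The extended header could look like this; the field names follow the suggestion above, but the exact types and layout are assumptions:

```cpp
#include <cstdint>

// Event header carrying both the per-file CODA event number and a
// continuous 64-bit counter across all input files of the session.
struct EvtHeader {
  std::int64_t  fEvtNum;       // continuous counter, 64-bit for >2G events
  std::uint32_t fRawEvtNum;    // CODA event number within the current file
  std::uint32_t fInputFileIdx; // index of the current input file
};

// Advance the header when a new raw event is read from file 'ifile'.
void NextEvent( EvtHeader& hdr, std::uint32_t raw_num, std::uint32_t ifile )
{
  ++hdr.fEvtNum;               // never resets between files
  hdr.fRawEvtNum = raw_num;    // may restart at 1 in each file
  hdr.fInputFileIdx = ifile;
}
```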
It was probably a poor design decision to include the s1 and s2 scintillators in the THaHRS spectrometer class. Including the VDC by default is fine, but the scintillators are not really required and not always installed. So, from version 1.6 on, they should be removed.
Also, we should probably create a THaBareHRS or THaHRSBase class without any predefined detectors. This would be useful for checkout of individual detector systems.
Target reconstruction should be moved from the THaVDC class into THaHRSBase. It operates on focal plane tracks regardless of how these tracks are found, so it is not a property of the VDCs, but definitely one of the spectrometer (per its physical construction). THaHRS would then inherit from THaHRSBase.
This issue is related to issue #31 - The THaHRSBase class should implement a MakeTrack method that produces tracks with HRS-style coordinates.
When I merged to the main branch of the analyzer, i.e. the version supported by Ole, I found that the feature of "vector histograms" did not work. This is a feature whereby a histogram for a vector of variables, for example for the shower counter, would expand to a vector of histograms like Rsha_p0 ... Rsha_p79 (there are 80 of those blocks). It's more convenient than defining 80 histograms in the output.def file. Currently you get no histograms at all, and the line in output.def is reported as an error. What I've observed so far is that if I copy over the tagged-release-156 version of the following classes, the feature is restored. I don't understand WHY yet, but I'm investigating. Those classes are THaFormula, THaCut, and THaVform.
THaDecData is consistently a performance hog, largely for the reasons implied in the FIXMEs. The entire class should probably be rewritten with (a) configurability and (b) performance in mind.
Another local static discovered in THaAnalysisObject::Here, which is used in error messages all over the place. Note the big all-caps FIXME in the code.
The same is probably true for all applications of ROOT's Form() function in the analyzer.
One possible solution is to return TString, which conveniently casts itself to const char* and so, to first order, would not require code changes.
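A sketch of the thread-safe variant using std::string in place of ROOT's TString (the function name mirrors THaAnalysisObject::Here; the message format is an assumption modeled on the warning messages quoted earlier):

```cpp
#include <string>

// Build the message prefix in a returned string instead of a local
// static buffer, so each caller gets its own copy and there is no
// shared mutable state between threads.
std::string Here( const char* method, const char* prefix )
{
  std::string s;
  if( prefix && *prefix ) {
    s = "\"";
    s += prefix;
    s += "\"::";
  }
  s += method ? method : "";
  return s;
}
```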
THaFormulas containing parameter expressions, or special functions that expect parameters, may crash. Examples:
f1 = new THaFormula("f1","gaus");
f1->Eval()
*** Break *** segmentation violation
double x = 1;
gHaVars->Define("x",x);
f2 = new THaFormula("f2","[0]+x")
f2->Eval()
*** Break *** segmentation violation
I can now reproduce a spurious error I've been getting lately:
./scons/scons.py -c (clean up)
./scons/scons.py (build everything)
....
g++ -o analyzer -pthread src/main.o -L/usr/lib64/root -L. -L/home/ole/evio-4.0/Linux-x86_64/lib -Lsrc -Lhana_decode -Lhana_scaler -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -lMathCore -lThread -lm -ldl -levio -lHallA -ldc -lscaler
/usr/bin/ld: cannot find -lHallA
collect2: ld returned 1 exit status
scons: *** [analyzer] Error 1
scons: building terminated because of errors.
It looks like the cleanup leaves the versioned library files behind. After scons -c, I still have
libHallA.so.1.6 -> libHallA.so.1.6.0
libHallA.so.1.6.0
and similarly for libdc and libscaler. Apparently, these files make scons think that libHallA is already built, so the link libHallA.so ->libHallA.so.1.6 isn't made.
As a workaround, clean up with make clean instead...
With complex setups, one easily gets over 1000 global variable definitions. At the same time, THaVarList::Find crawls through a linked list. Fortunately, Find() is normally only used at initialization time and so its obviously poor performance usually goes unnoticed. Since future applications may not be so forgiving, and since it would be quite easy, we should speed up this function with a map or a hash table.
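The hash-table variant is a small amount of code; this sketch uses a simplified stand-in for THaVar and keeps a name index in front of whatever container holds the variables:

```cpp
#include <string>
#include <unordered_map>

// Simplified stand-in for THaVar, for illustration only.
struct Var { std::string name; double* addr; };

// Average O(1) lookup by name instead of a linked-list crawl.
class VarIndex {
public:
  void Add( Var* v ) { fIndex[v->name] = v; }
  Var* Find( const std::string& name ) const {
    auto it = fIndex.find(name);
    return ( it != fIndex.end() ) ? it->second : nullptr;
  }
private:
  std::unordered_map<std::string, Var*> fIndex; // name -> variable
};
```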
Write all available analysis metadata (replay time/host/directory, user, database keys loaded, etc.) to the ROOT output file to make data provenance as transparent as possible.
It looks like currently the number of versions of all objects in the output file is pruned to 2. This is fine for trees and histograms (I guess), but if more than 2 input files were analyzed into a single output, one wants all the Run_Data info.
Probably closely related to #5.
I needed to repair two source files, THaScaler and THaScalerDB, which had a few bad lines that broke when we moved to 64-bit. Generally, one cannot assign a long to an int, or compare a long to an int. When doing this before, I suppressed warnings by using a cast, but that breaks too. Lesson learned. The code will be checked in shortly.