EMTF BDT Performance Plotter

This repository contains tools to evaluate the performance of the EMTF BDT after retraining.

Setup

For Running

source /cvmfs/cms.cern.ch/cmsset_default.sh
cmsrel CMSSW_10_6_1_patch2 
cd CMSSW_10_6_1_patch2/src
cmsenv

git clone git@github.com:jrotter2/EMTF_BDT_PerformancePlotter.git
cd EMTF_BDT_PerformancePlotter

pip3 install -r requirements.txt --user

For Developing

You should first fork this repository.

source /cvmfs/cms.cern.ch/cmsset_default.sh
cmsrel CMSSW_10_6_1_patch2
cd CMSSW_10_6_1_patch2/src
cmsenv

git clone git@github.com:<your_GitHub_username>/EMTF_BDT_PerformancePlotter.git
cd EMTF_BDT_PerformancePlotter
git checkout -b <your_branch_name>
git push -u origin <your_branch_name>

pip3 install -r requirements.txt --user

After you have made changes, you can push them to your branch with:

git add .
git commit -m "Some Message..."
git push

Once your changes are stable and complete, they can be merged into the master branch via a PR.

Upon Logging In (Each Session)

To access files from EOS, set up your environment at the start of each session with the commands below. It is recommended that you add them to your bash profile.

source /cvmfs/cms.cern.ch/cmsset_default.sh
voms-proxy-init --voms cms
cd ~/path/to/your/directory/CMSSW_10_6_1_patch2/src/
cmsenv

Structure

The repository is structured so that you can run each individual plotter separately, or call the general plotter to make several types of performance plots at once.

Additionally, the helpers directory contains helper classes for multiuse functions and useful calculations.

General Plotter

plotter.py is responsible for making general plots. It can make efficiency plots and resolution plots for different selections. It can be called with:

python3 plotter.py <options> outputDir outputFileName inputFile

The option -e (or --eff) sets a flag to create efficiency plots, and -r (or --res) sets a flag to create resolution plots. Additional options are listed by running python3 plotter.py --help.
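As an illustration of the call signature above, the following is a minimal sketch of how such a command line could be parsed with argparse. It is not the actual contents of plotter.py: only the -e/--eff and -r/--res flags and the three positional arguments come from this README, while the function name build_parser, the help strings, and the example argument values are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented plotter.py call signature (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="plotter.py")
    # Flags documented in this README.
    parser.add_argument("-e", "--eff", action="store_true", help="create efficiency plots")
    parser.add_argument("-r", "--res", action="store_true", help="create resolution plots")
    # Positional arguments, in the documented order.
    parser.add_argument("outputDir")
    parser.add_argument("outputFileName")
    parser.add_argument("inputFile")
    return parser


# Example values are placeholders, not files shipped with the repository.
args = build_parser().parse_args(["-e", "-r", "plots", "performance.pdf", "input.root"])
print(args.eff, args.res, args.outputDir, args.outputFileName, args.inputFile)
```

With both flags set, a general plotter of this shape would produce efficiency and resolution plots in a single run.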

Efficiency Plotter

efficiencyPlotter.py is responsible for making efficiency plots. It can be called directly by:

python3 efficiencyPlotter.py <options> outputDir outputFileName inputFile

To see the full list of options, run python3 efficiencyPlotter.py --help.

This plotter generates efficiency vs pT, efficiency vs eta, and efficiency vs phi plots for the selections passed through <options>. The plots are saved to a PDF specified by outputDir and outputFileName.

Occupancy Plotter

occupancyPlotter.py is responsible for making occupancy plots.

Resolution Plotter

resolutionPlotter.py is responsible for making resolution plots, which show the distribution of the difference between the generated pT and the BDT-assigned pT over a set of events (i.e. how precisely the trigger estimates muon pT). The resolution plots also give more information on how to scale the efficiency plots to reach a turn-on efficiency of >=90%. It can be called directly by:

python3 resolutionPlotter.py <options> outputDir inputFile

This plotter will generate resolutions using a Gaussian distribution.

Helpers

The helper classes are stored in the helpers directory and hold multiuse functions and useful calculations.

Details and Information

General EMTF BDT

One side effect of the GBDT Regression Algorithm is that the pT assignment will be 50% efficient at the pT threshold. As a convention of the L1 Trigger, the trigger should be >90% efficient at the pT threshold. Therefore, a scaling factor is implemented to make the BDT pT assignment fit the convention. The scaling is:

pT_xml = min(20, pT_unscaled)
pT_scaled = A * pT_unscaled / (1 + B * pT_xml)

For Run 2, A = 1.2 and B = 0.015. For Run 3, A = 1.3 and B = 0.004.
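A minimal sketch of this scaling in Python, implementing the two lines above exactly as written, including the sign in the denominator. The function name scale_pt, the dictionary names, and the sample pT values are assumptions made for illustration; only the formula and the A and B values come from this README.

```python
def scale_pt(pt_unscaled, a, b, pt_cap=20.0):
    """Return pT_scaled = A * pT_unscaled / (1 + B * pT_xml), with pT_xml = min(pt_cap, pT_unscaled)."""
    pt_xml = min(pt_cap, pt_unscaled)  # the correction stops growing above the cap
    return a * pt_unscaled / (1.0 + b * pt_xml)


# Constants quoted in this README.
RUN2 = {"a": 1.2, "b": 0.015}
RUN3 = {"a": 1.3, "b": 0.004}

# Illustrative unscaled pT values in GeV.
for pt in (5.0, 20.0, 50.0):
    print(pt, scale_pt(pt, **RUN2), scale_pt(pt, **RUN3))
```

Because pT_xml is capped at 20, the scaling becomes a constant multiplicative factor for any unscaled pT above 20.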

Calculating Efficiency

For each track in the input file there is unbinned information for GEN_pt, BDTG_AWB_sq, GEN_eta, GEN_phi, and TRK_hit_ids, which are of interest to our plotters. These are stored in unbinned_EVT_data.

Efficiency is calculated by first generating a set of tracks that meet certain denominator cuts (e.g. eta, phi, or track hit ID cuts), then generating a subset of those tracks by applying numerator cuts (e.g. a pT cut on the BDT output). These can be thought of as the denominator and numerator sets, respectively.
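The two-stage selection can be sketched as follows. This is a hypothetical illustration, not the repository's code: tracks are represented as plain dictionaries, the key BDT_pt stands in for the pT derived from the BDT output, and the eta range and pT threshold are example values only.

```python
def select_tracks(tracks, eta_range, pt_threshold):
    """Return (denominator, numerator) track lists for an efficiency calculation."""
    eta_min, eta_max = eta_range
    # Denominator cut: keep tracks whose generated |eta| lies in the requested range.
    denominator = [t for t in tracks if eta_min <= abs(t["GEN_eta"]) <= eta_max]
    # Numerator cut: of those, keep tracks whose BDT-assigned pT passes the threshold.
    numerator = [t for t in denominator if t["BDT_pt"] >= pt_threshold]
    return denominator, numerator


# Toy tracks, invented for illustration.
tracks = [
    {"GEN_eta": 1.5, "BDT_pt": 30.0},
    {"GEN_eta": -2.0, "BDT_pt": 10.0},
    {"GEN_eta": 0.5, "BDT_pt": 40.0},
]
denominator, numerator = select_tracks(tracks, eta_range=(1.2, 2.4), pt_threshold=22.0)
print(len(denominator), len(numerator))
```

Building the numerator from the denominator, rather than from the full track list, guarantees that the numerator is a subset and the ratio never exceeds one.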

Then the efficiency is generated by binning both the denominator and numerator sets and finding the ratio in each bin. The confidence interval for the efficiency is based on the confidence interval for a binomial distribution with x successes in k trials (where x is the number of tracks in the binned numerator and k is the number of tracks in the binned denominator). This confidence interval is known as the Clopper-Pearson exact confidence interval.
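A minimal, standard-library-only sketch of the binned ratio and its Clopper-Pearson interval is shown below. It is not the repository's implementation: the function names, the bisection approach, and the default confidence level of 0.68 are assumptions (this README does not state which level is used), and the term-by-term binomial sum is only suitable for modest per-bin counts.

```python
def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    term = (1.0 - p) ** n  # P(X = 0)
    total = term
    for i in range(k):
        term *= (n - i) / (i + 1) * p / (1.0 - p)  # P(X = i + 1) from P(X = i)
        total += term
    return min(total, 1.0)


def _solve_p(k, n, target):
    """Find p in [0, 1] with binom_cdf(k, n, p) == target by bisection (the CDF decreases with p)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(x, k, cl=0.68):
    """Clopper-Pearson interval for x successes in k trials at confidence level cl (assumed default)."""
    alpha = 1.0 - cl
    lower = 0.0 if x == 0 else _solve_p(x - 1, k, 1.0 - alpha / 2.0)
    upper = 1.0 if x == k else _solve_p(x, k, alpha / 2.0)
    return lower, upper


def binned_efficiency(den_values, num_values, edges, cl=0.68):
    """Return (bin_low, bin_high, efficiency, ci_low, ci_high) per bin; empty bins give None."""
    results = []
    for low, high in zip(edges[:-1], edges[1:]):
        k = sum(low <= v < high for v in den_values)  # tracks in the binned denominator
        x = sum(low <= v < high for v in num_values)  # tracks in the binned numerator
        if k == 0:
            results.append((low, high, None, None, None))
            continue
        ci_low, ci_high = clopper_pearson(x, k, cl)
        results.append((low, high, x / k, ci_low, ci_high))
    return results


# Toy GEN_pt values, invented for illustration; the numerator is a subset of the denominator.
results = binned_efficiency([1, 2, 3, 11, 12], [3, 11, 12], edges=[0, 10, 20])
for row in results:
    print(row)
```

The interval stays inside [0, 1] and remains well defined when a bin has zero or all tracks passing, which is where a simple normal approximation breaks down.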

Calculating Resolution

For each track in the input file there is unbinned information for GEN_pt, BDTG_AWB_sq, GEN_eta, GEN_phi, and TRK_hit_ids, which are of interest to our plotters. These are stored in unbinned_EVT_data.

Resolution is calculated by creating an unbinned array of (GEN_pt - BDT_pt)/GEN_pt or (log(GEN_pt) - log(BDT_pt))/log(GEN_pt). The distribution should be roughly normal around zero.
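Both resolution definitions can be sketched in a few lines of Python. This is an illustrative sketch rather than the repository's code: the function name resolution, the use_log switch, and the sample values are assumptions, and the sample mean and standard deviation are used here only as a quick stand-in for a Gaussian fit.

```python
import math
import statistics


def resolution(gen_pt, bdt_pt, use_log=False):
    """Return the unbinned resolution values for paired GEN and BDT pT lists."""
    values = []
    for gen, bdt in zip(gen_pt, bdt_pt):
        if use_log:
            # Log form; undefined for GEN_pt == 1, where log(GEN_pt) is zero.
            values.append((math.log(gen) - math.log(bdt)) / math.log(gen))
        else:
            values.append((gen - bdt) / gen)
    return values


# Toy pT values in GeV, invented for illustration.
gen_pt = [10.0, 20.0, 40.0, 25.0]
bdt_pt = [9.0, 22.0, 38.0, 26.0]
res = resolution(gen_pt, bdt_pt)

# Location and width of the distribution; a mean near zero indicates an unbiased assignment.
print(statistics.mean(res), statistics.stdev(res))
```

A positive value means the BDT underestimates the generated pT for that track, and a negative value means it overestimates it.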
