

Welcome to the home of the LHC Olympics 2020!


Despite an impressive and extensive effort by the LHC collaborations, there is currently no convincing evidence for new particles produced in high-energy collisions.  At the same time, there has been a growing interest in machine learning techniques to enhance potential signals using all of the available information.  

In the spirit of the first LHC Olympics (circa 2005-2006) [1st, 2nd, 3rd, 4th], we are organizing the 2020 LHC Olympics.  Our goal is to ensure that the LHC search program is sufficiently well-rounded to capture "all" rare and complex signals.  The final state for this Olympics will be more focused (generic multijet events) but the observable phase space and potential BSM parameter space(s) are large: all hadrons in the event can be used for learning (be it "cuts", supervised machine learning, or unsupervised machine learning). One class of BSM topology captured by this challenge is illustrated in the following picture.

[Figure: one class of BSM topology captured by this challenge]

We provide two types of files (from this Zenodo link):

  • "Monte Carlo Simulation Background": This is a simulated sample that does not have signal. Be warned that both the physics and the detector modeling for this simulation may not exactly reflect the “Data”.

  • "Data": These are the LHCO 2020 black boxes. These samples may contain some new signal(s). We will release three black boxes during this challenge.  The first one was released on November 19. The second one was released on December 4. 

Both the "Simulation" and "Data" have the following event selection: at least one anti-kT R = 1.0 jet with pseudorapidity |η| < 2.5 and transverse momentum pT > 1.2 TeV.   For each event, we provide a list of all hadrons (pT, η, φ, pT, η, φ, ...) zero-padded up to 700 hadrons.
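The zero-padded layout described above can be unpacked with a few lines of Python. This is only a minimal sketch, not the organizers' reader script: it assumes each event has already been loaded as one flat sequence of 3 × 700 floats, and that padded slots have pT exactly equal to zero. The names `unpack_event` and `MAX_HADRONS` are hypothetical.

```python
from typing import List, Sequence, Tuple

Hadron = Tuple[float, float, float]  # (pT, eta, phi)

MAX_HADRONS = 700  # padding length stated in the text


def unpack_event(row: Sequence[float], max_hadrons: int = MAX_HADRONS) -> List[Hadron]:
    """Split one flat, zero-padded event row into (pT, eta, phi) triplets."""
    expected = 3 * max_hadrons
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    hadrons = []
    for i in range(0, expected, 3):
        pt, eta, phi = row[i], row[i + 1], row[i + 2]
        if pt == 0.0:  # assumption: padded slots have pT exactly zero
            continue
        hadrons.append((pt, eta, phi))
    return hadrons


# Toy event with two hadrons, padded to the full length (values are made up).
toy_row = [250.0, 0.1, 1.5, 80.0, -0.4, -2.0] + [0.0] * (3 * MAX_HADRONS - 6)
toy_hadrons = unpack_event(toy_row)
```

If the files carry extra columns (for example a label), those would need to be stripped before calling a function like this one.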

What you should report:

  1. A p-value associated with the dataset having no new particles (null hypothesis).

  2. As complete a description of the new physics as possible. For example: the masses and decay modes of all new particles (and uncertainties on those parameters).

  3. How many signal events (+uncertainty) are in the dataset (before any selection criteria).
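Item 1 asks for a p-value under the null hypothesis. If you also want to quote it as a Gaussian significance, one common convention is the one-sided conversion sketched below. The challenge text does not prescribe this convention, so treat it as an assumption; the function name `p_value_to_z` is hypothetical.

```python
from statistics import NormalDist


def p_value_to_z(p_value: float) -> float:
    """One-sided Gaussian significance Z such that P(X > Z) = p_value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    # Use the lower tail and flip the sign; this avoids computing 1 - p,
    # which loses precision for very small p-values.
    return -NormalDist().inv_cdf(p_value)


# Round trip: the upper-tail probability of a 3 sigma excess maps back to Z = 3.
p_three_sigma = NormalDist().cdf(-3.0)
z_back = p_value_to_z(p_three_sigma)
```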

Partial submissions in only a subset of the categories are welcome! You can submit your findings at this Google form.  Outcomes will be judged based on the accuracy of the new physics characterization. For accuracy, we will use the number of sigmas between your answer and the right answer, |(your answer - right answer) / your uncertainty|, wherever applicable.
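The accuracy measure above is simple enough to check by hand before submitting. A minimal sketch follows; the function name `n_sigma` and the numbers in the example are purely illustrative and are not taken from any black box.

```python
def n_sigma(answer: float, truth: float, uncertainty: float) -> float:
    """Return |(answer - truth) / uncertainty|, the distance in units of sigma."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    return abs((answer - truth) / uncertainty)


# Hypothetical example: a quoted value of 4.0 +/- 0.25 against a true value of 3.5.
example = n_sigma(answer=4.0, truth=3.5, uncertainty=0.25)
```

Note that the score is normalized by your own quoted uncertainty, so underestimating the uncertainty inflates the number of sigmas for a given miss.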

For setting up, developing, and validating your methods, we provide background events and a benchmark signal model.  You can download these from this page.  To help get you started, we have also prepared simple Python scripts to read in the data and do some basic processing.   The page describing the R&D phase of the challenge can be found here.

Please do not hesitate to ask questions: we will use the ML4Jets slack channel to discuss technical questions related to this challenge. You are also encouraged to sign up for the mailing list [email protected] using the e-groups.cern.ch interface for infrequent announcements and communications.

Good luck!

Gregor Kasieczka, Ben Nachman, and David Shih

Workshops

Winter Olympics

The deadline for the Winter Olympics (Black Box 1) challenge was Sunday, January 12, 2020 at 5pm Eastern US Time. Results were presented in a dedicated session at the ML4Jets2020 conference.

See the outcome of the Winter Olympics here.

Summer Olympics

Black boxes 2 and 3 were opened at an event originally scheduled to be hosted in Hamburg in July 2020. Given the situation with COVID-19, the event was held virtually.

Publications

We strongly encourage you to publish your original research methods using these datasets. We are currently compiling a community comparison / summary paper; please contact the organizers for details (anyone who participated in the Olympics has been invited to contribute). Here are papers published with the LHCO dataset. Please send links to your papers if you have used this dataset! Many more preliminary studies can be found in the workshops listed above.

  • Anomalous Jet Identification via Sequence Modeling, Alan Kahn, Julia Gonski, Ines Ochoa, Daniel Williams, and Gustaaf Brooijmans, hep-ph/2105.09274

  • Comparing Weak- and Unsupervised Methods for Resonant Anomaly Detection, Jack H. Collins, Pablo Martin-Ramiro, Benjamin Nachman, David Shih, hep-ph/2104.02092

  • Bump Hunting in Latent Space, B. Bortolato et al., hep-ph/2103.06595

  • The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics, G. Kasieczka et al., hep-ph/2101.08320

  • QUAK: Quasi Anomalous Knowledge: Searching for new physics with embedded knowledge, Sang Eon Park, Dylan Rankin, Silviu-Marian Udrescu, Mikaeel Yunus, Philip Harris, hep-ph/2011.03550

  • UCluster: Unsupervised clustering for collider physics, Vinicius Mikuni and Florencia Canelli, hep-ph/2010.07106

  • Simulation-Assisted Decorrelation for Resonant Anomaly Detection, Kees Benkendorfer, Luc Le Pottier, and Benjamin Nachman, hep-ph/2009.02205

  • Tag N' Train: A Technique to Train Improved Classifiers on Unlabeled Data, Oz Amram and Cristina Mantilla Suarez, hep-ph/2002.12376

  • Simulation Assisted Likelihood-free Anomaly Detection, Anders Andreassen, Benjamin Nachman, David Shih, hep-ph/2001.05001

  • Anomaly Detection with Density Estimation, Benjamin Nachman, David Shih, hep-ph/2001.04990
