Machine-Learning-for-Glitch-Classification-in-Gravitational-Wave-Data-Analysis

To cite this work, please refer to link or use the following: R. Abani, "Glitch Classification for Gravitational Wave Interferometry Using Machine Learning," 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT), Bangalore, India, 2022, pp. 1-3, doi: 10.1109/GCAT55367.2022.9971811.

This project is part of the coursework for ECS 308 Data Science and Machine Learning, taught by Dr Tanmay Basu and mentored by Mr. Vishisht Sharma, in the 6th semester at IISER Bhopal.

Introduction

Gravitational waves are disturbances in the curvature of spacetime, generated by accelerated masses, that propagate outward from their source at the speed of light. Detecting them demands a thorough understanding of the instrument's response amid environmental noise. Of particular interest, therefore, is the study of anomalous non-Gaussian noise transients called ‘glitches’. Classifying glitches is essential because they occur at high rates in LIGO data and can obscure or mimic true gravitational wave signals.

The data used in this project was extracted from LIGO’s Gravity Spy portal and contains metadata about these glitches. The train data contains the characteristics of each glitch, such as bandwidth and signal-to-noise ratio (7 such features in total). The test data contains the glitch labels, i.e. the 22 types of glitches, along with unique identification labels.

In this project, several machine learning models from the scikit-learn (sklearn) library in Python, namely K-nearest neighbours, Support Vector Machines, Random Forest and Decision Trees, were trained on this data to develop an accurate glitch classifier. The resulting model was then run on the test data. In a second part of the project, one-hot encoding was used to handle the categorical variable ifo (the detector location), in order to address the research question ‘Does the location of the interferometer have any impact on the classification of glitches?’

The scatter plot below shows the correlation between various variables in the train data set.

image
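The one-hot encoding of the ifo variable described in the introduction can be illustrated with a minimal sketch. It is not the project's actual code: the detector codes "H1" and "L1" and the two numeric columns are assumed, made-up values used only to show how a categorical column becomes indicator columns that sit alongside the numeric features.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Assumed detector codes for illustration: "H1" (Hanford), "L1" (Livingston).
ifo = np.array([["H1"], ["L1"], ["L1"], ["H1"]])

# One indicator column per distinct detector; categories are sorted alphabetically.
encoder = OneHotEncoder(handle_unknown="ignore")
ifo_encoded = encoder.fit_transform(ifo).toarray()

# Made-up numeric features (think SNR and bandwidth), one row per glitch.
numeric = np.array([[12.5, 300.0], [8.1, 120.0], [20.3, 950.0], [9.7, 64.0]])

# Append the indicator columns to the numeric features.
features = np.hstack([numeric, ifo_encoded])

print(encoder.categories_[0])  # the detector codes, in column order
print(features.shape)          # 2 numeric columns + 2 indicator columns
```

With two detectors, each glitch gains two extra columns, exactly one of which is 1. A classifier trained on `features` can then use the detector location as an input.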

Methodology

This project explored the use of sklearn ML models (KNN, SVM, Decision Tree and Random Forest) via a pipeline based on the ‘Divide and Conquer’ approach, as opposed to the commonly used pipeline that runs everything, from pre-processing through hyperparameter tuning to evaluation and prediction, in one go. The pseudo-code and other technical details can be found in my report. Splitting the pipeline into a training routine and a parameter tuning routine reduced the time cost, as evidenced by the shorter execution time, and the whole workflow ran comfortably on a local Intel i7 NVIDIA machine (without a GPU). For each of the ML models, i.e. KNN, SVM, Decision Tree and Random Forest, parameter tuning was done first, and the training routine was then run with those optimal parameters. We considered 2 scenarios: the first did not consider the location of the detector; in the second, one-hot encoding was used to convert the location (a categorical variable) into numerical form.
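One possible reading of the split between a tuning routine and a training routine is sketched below. This is not the pipeline from the report: the function names `tune` and `train`, the parameter grid, and the synthetic 7-feature data set are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier


def tune(X, y, param_grid):
    """Tuning routine (hypothetical): search the grid, return only the best parameters."""
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        param_grid,
        scoring="f1_macro",
        cv=3,
    )
    search.fit(X, y)
    return search.best_params_


def train(X, y, params):
    """Training routine (hypothetical): fit one model with already-chosen parameters."""
    return DecisionTreeClassifier(random_state=0, **params).fit(X, y)


# Synthetic stand-in for the glitch metadata: 7 numeric features, 3 classes.
X, y = make_classification(
    n_samples=300, n_features=7, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: tune once. Step 2: train with the stored parameters.
best_params = tune(X_train, y_train, {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5]})
model = train(X_train, y_train, best_params)
score = f1_score(y_test, model.predict(X_test), average="macro")
print(best_params, round(score, 3))
```

Keeping the two routines separate means the grid search, which is the expensive step, only has to run once per model. The training routine can then be re-run with the stored parameters, for example once per scenario.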

Analysis and Discussion

The results obtained do not conclusively answer the research problem in Scenario 2, where we tried to find a correlation between the location of the interferometer and glitch classification. The Listed Color Maps show how the visualized data distribution varies after running the model with the highest F-score. Over-fitting and, most importantly, class imbalance might be contributing factors. As can be seen from the glitch distribution bar plot, the number of non-glitch events is extremely low and the percentage of Blip glitches is very high.

image
Listed Color Map for the Decision Tree based classifier model in scenario 1

image

Listed Color Map for classification performed on data from Hanford after one-hot encoding

image

Listed Color Map for classification performed on data from Livingston after one-hot encoding

image

Distribution of Glitches
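To see why class imbalance matters when models are compared by F-score, consider the toy example below. The label counts are invented for illustration and are not taken from the project data. The example shows a predictor that always answers "Blip": it scores well on accuracy but poorly on macro-averaged F1, because the rare classes are never recovered.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, with F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented, heavily imbalanced labels: 8 Blip events and 2 rare ones.
y_true = ["Blip"] * 8 + ["Whistle", "No_Glitch"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["Blip"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(Counter(y_true))
print(round(accuracy, 3), round(macro_f1(y_true, y_pred), 3))  # prints: 0.8 0.296
```

In scikit-learn, `f1_score(y_true, y_pred, average="macro")` computes the same macro average.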

References

  1. https://bigdata.oden.utexas.edu/project/divide-conquer-methods-for-big-data-analytics/
  2. https://medium.com/@kohlishivam5522/understanding-a-classification-report-for-your-machine-learning-model-88815e2ce397#:~:text=The%20F1%20score%20is%20a,and%20recall%20into%20their%20computation.
  3. https://www.ams.org/publications/journals/notices/201707/rnoti-p684.pdf
