To cite this work, please refer to
or use the following: R. Abani, "Glitch Classification for Gravitational Wave Interferometry Using Machine Learning," 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT), Bangalore, India, 2022, pp. 1-3, doi: 10.1109/GCAT55367.2022.9971811.
This project is part of the coursework for ECS 308 Data Science and Machine Learning, taught by Dr. Tanmay Basu and mentored by Mr. Vishisht Sharma, in the 6th semester at IISER Bhopal.
Gravitational waves are disturbances in the curvature of spacetime, generated by accelerated masses, that propagate as waves outward from their source at the speed of light. Detecting them demands a thorough understanding of the instrumental response within an ecosystem of environmental noise. Of pertinent interest, therefore, is the study of anomalous non-Gaussian noise transients called 'glitches'. Classifying glitches is essential because they occur at high rates in LIGO data and often obscure or mimic true gravitational wave signals.

The data used in this project was extracted from LIGO's Gravity Spy portal and contains metadata about these glitches. The training data describes the characteristics of each glitch, such as bandwidth and signal-to-noise ratio (7 features in total). The test data contains the glitch labels, i.e. the 22 glitch classes, along with unique identification labels.

In this project, several machine learning models from Python's scikit-learn library, namely K-Nearest Neighbours, Support Vector Machines, Random Forests and Decision Trees, were trained on the data to develop an accurate glitch classifier, which was then run on the test data. In a sub-instance of this project, one-hot encoding was used to handle the categorical variable ifo (detector location) in order to address the research question: 'Does the location of the interferometer have any impact on the classification of glitches?' The scatter plot below shows the correlation between the variables in the training data set.
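As a rough illustration of the one-hot encoding step mentioned above, the sketch below encodes a hypothetical `ifo` column with pandas. The column names and values (`ifo`, `snr`, `bandwidth`, `H1`, `L1`) are assumptions for demonstration; the actual Gravity Spy metadata columns may differ.

```python
# Hypothetical sketch: one-hot encoding the `ifo` (detector location) column.
# Column names and values are illustrative, not the actual dataset schema.
import pandas as pd

df = pd.DataFrame({
    "ifo": ["H1", "L1", "H1"],        # Hanford / Livingston (assumed labels)
    "snr": [12.3, 8.7, 45.1],         # signal-to-noise ratio
    "bandwidth": [64.0, 128.0, 32.0],
})

# get_dummies replaces the categorical column with one binary column per level
encoded = pd.get_dummies(df, columns=["ifo"], prefix="ifo")
print(encoded.columns.tolist())
```

After encoding, the frame carries `ifo_H1` and `ifo_L1` indicator columns in place of `ifo`, so distance-based models like KNN and SVM can consume the location information numerically.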
This project explored the use of scikit-learn ML models, namely KNN, SVM, Decision Trees and Random Forests, via a pipeline based on a 'divide and conquer' strategy, in contrast to the commonly used pipeline that runs everything, from pre-processing through hyperparameter tuning to evaluation and prediction, in one go. The pseudo-code and other technical details can be found in my
This divide-and-conquer approach of splitting the pipeline into a training routine and a parameter-tuning routine reduced the overall running time, evident from the shorter execution times achieved on a local Intel i7 processor (without a GPU). For each of the ML models, i.e. KNN, SVM, Decision Tree and Random Forest, parameter tuning was performed first, followed by the training routine with the optimal parameters. Two scenarios were considered: the first ignored the location of the detector; in the second, one-hot encoding was used to convert the location (a categorical variable) into numerical form.
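The two-stage routine described above can be sketched as follows: stage 1 tunes hyperparameters with cross-validation, and stage 2 retrains a fresh model with the chosen parameters. The synthetic dataset and the KNN parameter grid here are illustrative stand-ins, not the project's actual data or grids.

```python
# Sketch of the two-stage "divide and conquer" pipeline: tune first, then train.
# Dataset and parameter grid are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for the glitch metadata: 7 features, multi-class labels
X, y = make_classification(n_samples=300, n_features=7, n_informative=5,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1: parameter-tuning routine (cross-validated grid search)
search = GridSearchCV(KNeighborsClassifier(),
                      {"n_neighbors": [3, 5, 7, 9]}, cv=5)
search.fit(X_tr, y_tr)

# Stage 2: training routine using the optimal parameters found above
model = KNeighborsClassifier(**search.best_params_).fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
```

The same tune-then-train pattern applies unchanged to the SVM, Decision Tree and Random Forest models by swapping the estimator and grid.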
The research question in Scenario 2, whether there is a correlation between the location of the interferometer and glitch classification, has not been answered conclusively by the results obtained. The listed colour maps show the variations in visualizing the data distribution after running the model with the highest F-score. Over-fitting and, most importantly, class imbalance may be contributing factors: as the glitch-distribution bar plot shows, the number of non-glitch events is extremely low while the percentage of Blip glitches is very high.
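The kind of imbalance described above can be quantified directly from the label column. The sketch below uses made-up labels and counts purely for demonstration; the real distribution is the one shown in the bar plot.

```python
# Illustrative check of label imbalance; the labels and counts here are
# invented for demonstration and do not reflect the actual dataset.
import pandas as pd

labels = pd.Series(["Blip"] * 50 + ["Koi_Fish"] * 10 + ["No_Glitch"] * 2)
dist = labels.value_counts(normalize=True)
print(dist)
```

When one class dominates to this degree, accuracy alone is misleading and a classifier can score well while rarely predicting minority classes, which is why the F-score was used and why the Scenario 2 results remain inconclusive.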
Listed Color Map for the Decision Tree based classifier model in Scenario 1
Listed Color Map for classification performed on data from Hanford after one-hot encoding
Listed Color Map for classification performed on data from Livingston after one-hot encoding
Distribution of Glitches