Arya.ai-Assignment

Exploratory Data Analysis

1. General Data Characteristics:

a. Size of Training Set - (3910, 58)
b. Size of Testing Set - (691, 57)
c. The response column, ['Y'], is binary.
d. There are 57 predictors, none of which has an informative header.

2. Data Imbalance:

 The two classes of the response are not significantly imbalanced.
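
One way to check the balance claim is to look at the share of each class in the response. The sketch below is illustrative only: the `train` frame is synthetic stand-in data (the real file is not shown here), and only the column name `Y` and the row count come from the text above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the training set (assumption): a binary column "Y".
rng = np.random.default_rng(0)
train = pd.DataFrame({"Y": rng.integers(0, 2, size=3910)})

# Fraction of rows in each class; values near 0.5 indicate a balanced target.
class_share = train["Y"].value_counts(normalize=True).sort_index()
print(class_share)

# Minority-to-majority ratio: 1.0 means perfectly balanced.
imbalance_ratio = class_share.min() / class_share.max()
print(f"minority/majority ratio: {imbalance_ratio:.2f}")
```

A ratio close to 1 supports skipping resampling or class weighting.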

3. Missing Values:

a. There are no null values in any column (Missing data handling not required).
b. There is no column filled with all zeroes. 
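
Both checks can be expressed in a few lines of pandas. This is a minimal sketch on synthetic data; the frame and its column names (`X0` to `X4`) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the predictors (assumption).
rng = np.random.default_rng(0)
train = pd.DataFrame(rng.random((100, 5)), columns=[f"X{i}" for i in range(5)])

# Number of null entries per column, and whether any column has at least one.
null_counts = train.isnull().sum()
has_nulls = bool(null_counts.any())

# Columns in which every single value is zero.
all_zero_cols = [c for c in train.columns if (train[c] == 0).all()]

print("any nulls:", has_nulls)
print("all-zero columns:", all_zero_cols)
```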

4. Outlier Detection:

  The dataset contains a significant amount of noise, which produces a large number of apparent outliers. These were not eliminated: “An Outlier is that observation which is significantly different from all other observations,” and that is not the case for these points.
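
The text does not say which detection rule was used. As an illustration, the sketch below assumes Tukey's 1.5 × IQR fences (a common default, not necessarily the one used here) and runs on synthetic heavy-tailed data. It shows why a noisy dataset flags so many rows that removing them all is hard to justify.

```python
import numpy as np
import pandas as pd

# Synthetic heavy-tailed data standing in for noisy predictors (assumption).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.standard_t(df=2, size=(1000, 4)),
                 columns=["X0", "X1", "X2", "X3"])

# Tukey fences: values beyond 1.5 * IQR from the quartiles are flagged.
q1 = X.quantile(0.25)
q3 = X.quantile(0.75)
iqr = q3 - q1
outside = (X < q1 - 1.5 * iqr) | (X > q3 + 1.5 * iqr)

# Share of flagged values per column, and share of rows with any flagged value.
per_column_share = outside.mean()
rows_flagged = outside.any(axis=1).mean()
print(per_column_share)
print(f"rows with at least one flagged value: {rows_flagged:.1%}")
```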

5. Feature Correlation using Heatmaps
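
A heatmap renders the pairwise correlation matrix of the predictors. The sketch below computes that matrix on synthetic data and lists the strongly correlated pairs. The column names and the 0.9 threshold are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic predictors (assumption): X1 is built to be nearly collinear with X0.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
X = pd.DataFrame({
    "X0": base,
    "X1": base + rng.normal(scale=0.1, size=500),
    "X2": rng.normal(size=500),
})

# Pearson correlation matrix; this is the grid a heatmap would colour.
corr = X.corr()

# Keep only the upper triangle so each pair appears once, then filter.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(upper).stack()
high_pairs = pairs[pairs.abs() > 0.9]
print(high_pairs)
```

Passing `corr` to a plotting call such as `seaborn.heatmap` would draw the corresponding figure.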

6. Dimensionality Reduction using PCA:

 PCA is applied to handle highly correlated features, since two or more collinear features (correlated in some way but not necessarily with a strictly linear relationship) can give unexpected results when computing feature importance.
 
 New Training Set Shape - (3910, 52)
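
The exact PCA configuration is not given here. The sketch below shows the general pattern on synthetic data: standardise the features, then keep enough principal components to explain a chosen share of the variance. The 95% threshold and the data are assumptions, so the output shape will not match the (3910, 52) reported above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 10 features that are noisy mixes of 5 latent factors,
# so many of the observed columns are strongly correlated.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 5))
X = latent @ rng.normal(size=(5, 10)) + 0.05 * rng.normal(size=(300, 10))

# PCA is scale-sensitive, so standardise first.
X_scaled = StandardScaler().fit_transform(X)

# A float n_components keeps the smallest number of components whose
# cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)

print("original shape:", X.shape)
print("reduced shape:", X_reduced.shape)
```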

7. Feature Importance using Random Forest Regressor

Final Training Set Shape - (3910, 36)
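
The selection rule that reduces the feature count is not stated, so the sketch below assumes a simple top-k cutoff on the impurity-based importances of a `RandomForestRegressor`. The data, the value of k, and the forest settings are all illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (assumption): only the first two of eight features drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = (X[:, 0] + 2 * X[:, 1] > 0).astype(int)

# Fit the forest and read the impurity-based importances (they sum to 1).
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)
importances = forest.feature_importances_

# Rank features from most to least important and keep the top k.
order = np.argsort(importances)[::-1]
k = 4  # hypothetical cutoff
X_selected = X[:, order[:k]]

print("ranking:", order)
print("selected shape:", X_selected.shape)
```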

Methodology

1. Machine Learning Classifiers

Step 1: Hyperparameter Tuning using GridSearch
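
As an illustration of this step, the sketch below runs scikit-learn's `GridSearchCV` for one of the listed classifiers. The parameter grid, the F1 scoring choice, the fold count, and the synthetic data are assumptions, not the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical grid: every combination is evaluated with 3-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    ExtraTreesClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=3,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated F1:", round(search.best_score_, 3))
```

After fitting, `search.best_estimator_` holds the model refit on all the data with the winning parameters.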

Step 2: Training

Classifiers Trained

a. Logistic Regression
b. K-nearest Neighbors
c. Support Vector Classifier
d. Random Forest Classifier
e. Extra Trees Classifier (best F1/AUC score)
f. Stochastic Gradient Descent Classifier
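
The comparison across these six classifiers can be sketched as a loop that fits each model and records F1 and ROC AUC on a held-out split. Default hyperparameters, the 75/25 split, and the synthetic data are assumptions, so the ranking printed here says nothing about the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data and split (assumptions).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "svc": SVC(),
    "random_forest": RandomForestClassifier(random_state=0),
    "extra_trees": ExtraTreesClassifier(random_state=0),
    "sgd": SGDClassifier(random_state=0),
}

def ranking_scores(model, features):
    """Continuous scores for AUC: margins if available, else class-1 probabilities."""
    if hasattr(model, "decision_function"):
        return model.decision_function(features)
    return model.predict_proba(features)[:, 1]

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "f1": f1_score(y_val, model.predict(X_val)),
        "auc": roc_auc_score(y_val, ranking_scores(model, X_val)),
    }

for name, scores in results.items():
    print(f"{name:20s} F1={scores['f1']:.3f} AUC={scores['auc']:.3f}")
```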

2. Binary Neural Network Classifier
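
The network architecture and framework are not described here. As a stand-in, the sketch below uses scikit-learn's `MLPClassifier` with two hidden layers on synthetic data with 36 features, matching the final training width. The layer sizes, iteration limit, and data are assumptions, and the original may well have used a different library.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic binary data with 36 features (assumption).
X, y = make_classification(
    n_samples=600, n_features=36, n_informative=10, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Neural networks train more reliably on standardised inputs.
scaler = StandardScaler().fit(X_train)
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)

# Hypothetical architecture: two hidden layers of 32 and 16 units.
net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
net.fit(X_train, y_train)

val_accuracy = net.score(X_val, y_val)
print(f"validation accuracy: {val_accuracy:.3f}")
print("training loss recorded for", len(net.loss_curve_), "iterations")
```

The `loss_curve_` attribute holds the per-iteration training loss, which is the kind of series a training curve plots.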

Result Visualizations

Confusion Matrix for best classifier

image
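
For readers unfamiliar with the layout of such a figure, the sketch below builds a confusion matrix with scikit-learn from a small set of made-up labels (not this project's predictions) and derives precision, recall, and F1 from its four cells.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration only.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(cm)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```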

Training and Validation Curves (Neural Network)

image

Confusion Matrix for Neural Network

image
