
performance-metrics-from-scratch

This repository implements common performance metrics (e.g. F1 score, AUC, accuracy) from scratch, without using the scikit-learn library.

Datasets used in this project


  1. sample1.csv : A highly imbalanced classification dataset (number of positive data points >> number of negative data points). It has 2 columns, 'y' and 'proba', which are the actual label of the data point and the predicted probability score respectively.
  2. sample2.csv : A highly imbalanced classification dataset (number of positive data points << number of negative data points). It has the same 2 columns, 'y' and 'proba'.
  3. sample3.csv : A regression dataset, meaning the labels are continuous numbers (97, 101.23, etc.). This dataset is used to calculate metrics like MSE, MAPE, and R squared.

Performance Metrics Explored

1. Confusion Matrix -

The confusion matrix summarizes predictions as four counts: TP (True Positive), FP (False Positive), TN (True Negative), FN (False Negative).
True Positive - the classifier predicted the label as Positive and the actual label is Positive.
False Positive - the classifier predicted the label as Positive but the actual label is Negative.
True Negative - the classifier predicted the label as Negative and the actual label is Negative.
False Negative - the classifier predicted the label as Negative but the actual label is Positive.
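A minimal sketch of how these four counts could be computed in plain Python (the function name and binary 0/1 label encoding are illustrative assumptions, not necessarily the repository's actual code):

```python
def confusion_matrix(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn
```

These four counts are the building blocks for every classification metric below.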

2. F1 Score -

F1 score is the harmonic mean of precision and recall; the harmonic mean of a and b is 2ab/(a+b).
Precision - out of all the data points predicted positive, the fraction that are actually positive.
Precision = TP/(TP+FP).
Recall - out of all the actually positive data points, the fraction the classifier predicted positive.
Recall = TP/(TP+FN).
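Putting the three formulas together, a hedged sketch (illustrative function name; the zero-denominator guards are an assumption about edge-case handling):

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall, from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, with 2 true positives, 0 false positives, and 1 false negative, precision is 1.0 and recall is 2/3, giving F1 = 0.8.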

3. Accuracy -

Accuracy is the fraction of all predictions, positive and negative alike, that the classifier gets right.
Accuracy = (TP+TN)/(TP+FP+TN+FN).
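Since TP+TN is simply the number of correct predictions and the denominator is the total count, this reduces to (illustrative sketch, not the repository's exact code):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

Note that on the highly imbalanced datasets above, accuracy can look deceptively high, which is why precision, recall, and AUC are also computed.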

4. AUC Score -

AUC (Area Under the ROC Curve) measures classification performance across all possible threshold settings. The ROC curve plots TPR against FPR as the decision threshold varies, and AUC is the area under that curve.
False Positive Rate (FPR) - out of all the actually negative data points, the fraction the classifier incorrectly predicted as positive.
FPR = FP/(FP+TN).
True Positive Rate (TPR) - out of all the actually positive data points, the fraction the classifier correctly predicted as positive; this is identical to recall.
TPR = TP/(TP+FN).
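One common from-scratch approach, sketched below under the assumption that each unique probability score is used as a threshold and the area is computed by the trapezoidal rule (the function name is illustrative):

```python
def auc_score(y_true, y_proba):
    """AUC via trapezoidal integration of the ROC curve, using every
    unique probability score as a classification threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    tpr_list, fpr_list = [0.0], [0.0]  # ROC curve starts at (0, 0)
    for thr in sorted(set(y_proba), reverse=True):
        tp = sum(1 for t, p in zip(y_true, y_proba) if t == 1 and p >= thr)
        fp = sum(1 for t, p in zip(y_true, y_proba) if t == 0 and p >= thr)
        tpr_list.append(tp / pos)
        fpr_list.append(fp / neg)
    # Trapezoidal rule: sum of (width along FPR) * (average TPR height)
    auc = 0.0
    for i in range(1, len(fpr_list)):
        auc += (fpr_list[i] - fpr_list[i - 1]) * (tpr_list[i] + tpr_list[i - 1]) / 2
    return auc
```

A perfectly separating classifier scores 1.0; random scoring hovers around 0.5. This brute-force loop is O(n²); sorting once and sweeping gives O(n log n), which matters for large datasets.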

5. Mean Squared Error (MSE) -

MSE is the average of the squared differences between predicted and actual values. It is used for regression problems, where the labels are continuous, so unlike the classification metrics above its output is not bounded to an interval such as [0, 1].
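The definition translates directly (illustrative sketch):

```python
def mse(y_true, y_pred):
    """Mean of squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```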

6. Mean Absolute Percentage Error (MAPE) -

MAPE is the average of the absolute percentage errors between predicted and actual values. Like MSE, it is used for regression problems where the labels are continuous.
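A sketch of the standard formulation, which divides each error by the actual value (this assumes no actual label is zero; some variants divide by the mean of the actuals instead to avoid that problem):

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes all actual values are nonzero."""
    errors = (abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return sum(errors) / len(y_true) * 100
```

For example, predicting 110 against an actual of 100 and 180 against 200 gives errors of 10% each, so MAPE is 10.0.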

7. R Squared Error -

R squared compares the residual sum of squares (SSres) with the total sum of squares (SStotal): R² = 1 - SSres/SStotal. It represents the goodness of fit of a regression model.
SSres - the sum of the squared residual errors (differences between actual and predicted values); dividing it by the number of points gives the Mean Squared Error (MSE).
SStotal - the sum of the squared differences between the actual labels and their mean; dividing it by the number of points gives the variance of the labels.
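A minimal sketch of R² = 1 - SSres/SStotal (illustrative function name; assumes the actual labels are not all identical, so SStotal is nonzero):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSres / SStotal."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_total = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_total
```

A perfect fit gives 1.0; a model that always predicts the mean gives 0.0, and a model worse than the mean can go negative.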
