notjameshan / ml-binary-classification

This project forked from acm-research/coding-challenge-s22

Machine Learning Binary Classification using Jupyter Notebook and Matplotlib

ml-binary-classification's Introduction

ACM Research Coding Challenge (Spring 2022)

Question

Binary classification is a type of classification task that labels elements of a set (i.e. dataset) into two different groups. An example of this type of classification would be identifying if people had a specific disease or not based on certain health characteristics. The dataset found in mushrooms.csv holds data (22 different characteristics, specifically) about different types of mushrooms, including a mushroom's cap shape, cap surface texture, cap color, bruising, odor, and more. Remember to split the data into test and training sets (you can choose your own percent split). Information about the meaning of the letters under each column can be found within the file attributelegend.txt.

With the file mushrooms.csv, use an algorithm of your choice to classify whether a mushroom is poisonous or edible.

Explanation for my answer

Libraries used

  • pandas
  • numpy
  • Matplotlib
  • seaborn
  • scikit-learn (sklearn) package
    • preprocessing
    • model_selection
    • metrics
    • linear_model

Documentation used

Approach

When I started this challenge, I had no idea what machine learning or binary classification were. To learn the basics, I searched "binary classification python tutorial" on YouTube and found CS Dojo's video, which helped me set up the environment and learn how to use Jupyter Notebook.

Analyzing Data

Once I understood how to use Jupyter Notebook, I analyzed the CSV data to find out which attributes help separate poisonous mushrooms from edible ones. For each column, I graphed only the poisonous mushrooms and looked at how many fell under each value. I found that gill-attachment, gill-spacing, veil-color, ring-number, and veil-type had the largest poisonous mushroom populations.
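The per-column counting described above can be sketched roughly as follows. This is only an illustration: the tiny DataFrame is a hypothetical stand-in for mushrooms.csv, the label column name "class" and the 'p'/'e' codes are assumptions, and the helper poisonous_counts is not taken from the notebook.

```python
import pandas as pd

# Hypothetical miniature stand-in for mushrooms.csv
# (assumed coding: 'p' = poisonous, 'e' = edible).
df = pd.DataFrame({
    "class": ["p", "e", "p", "p", "e", "p"],
    "gill-attachment": ["f", "f", "f", "a", "f", "f"],
    "ring-number": ["o", "o", "o", "t", "o", "o"],
})

# Keep only the poisonous rows, as in the analysis step.
poisonous = df[df["class"] == "p"]


def poisonous_counts(frame, column):
    """Count how many poisonous mushrooms fall under each value of a column."""
    return frame[column].value_counts()


# One table of counts per attribute column.
counts = {
    col: poisonous_counts(poisonous, col)
    for col in poisonous.columns
    if col != "class"
}

for col, table in counts.items():
    print(col)
    print(table)
    # With Matplotlib installed, table.plot(kind="bar") would graph these counts.
```

With the real dataset, df would come from pd.read_csv("mushrooms.csv") instead of the inline sample.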

Preprocessing

I divided the dataset into two parts: one with every attribute except the class column, and one with only the class column. Then I converted all non-numerical data to numerical data using scikit-learn's preprocessing module. Finally, I split the data into training and test sets.
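A minimal sketch of these preprocessing steps is shown here. It assumes LabelEncoder is the encoder in question (the README does not name one), uses a hypothetical ten-row sample in place of mushrooms.csv, and picks an 80/20 split with a fixed random_state purely for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical ten-row sample; the real data would come from mushrooms.csv.
df = pd.DataFrame({
    "class": ["p", "e", "p", "e", "p", "e", "p", "e", "p", "e"],
    "odor": ["f", "n", "p", "a", "f", "n", "p", "l", "f", "n"],
    "cap-shape": ["x", "b", "x", "f", "x", "b", "s", "f", "x", "b"],
})

# Part 1: every attribute except the class label. Part 2: the class label only.
X = df.drop(columns=["class"])
y = df["class"]

# Replace each letter code with an integer, one encoder per column.
# LabelEncoder is an assumption; it numbers the sorted unique values from 0.
X_encoded = X.copy()
for col in X.columns:
    X_encoded[col] = LabelEncoder().fit_transform(X[col])
y_encoded = LabelEncoder().fit_transform(y)  # 'e' -> 0, 'p' -> 1

# Hold out 20% of the rows for testing (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y_encoded, test_size=0.2, random_state=0, stratify=y_encoded
)
```

Integer codes from LabelEncoder impose an arbitrary ordering on the categories; one-hot encoding is a common alternative when that ordering is a concern.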

Machine Learning and Confusion Matrix

I used the Logistic Regression algorithm because it is widely used for binary classification. It models the logit (log-odds) of the outcome as a linear combination of the input features. The sigmoid function turns that value into a probability between 0 and 1, and the probability is then thresholded to classify each sample as 0 or 1. Using the Logistic Regression algorithm, I achieved 97.05% accuracy.
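The relationship between the logit, the sigmoid, and the final 0/1 label can be illustrated with a small standalone sketch. The weights, bias, and 0.5 threshold below are made-up values for demonstration, not parameters learned by the notebook's model.

```python
import math


def sigmoid(z):
    """Map a real-valued score (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of the sigmoid: map a probability back to log-odds."""
    return math.log(p / (1.0 - p))


def predict(features, weights, bias, threshold=0.5):
    """Return (probability, label) for one sample under a linear model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return p, int(p >= threshold)


# Hypothetical parameters, chosen only to show both outcomes.
weights = [2.0, -1.0]
bias = -0.5

p_a, label_a = predict([1.0, 0.5], weights, bias)  # score z = 1.0
p_b, label_b = predict([0.0, 1.0], weights, bias)  # score z = -1.5
print(p_a, label_a)
print(p_b, label_b)
```

In scikit-learn, LogisticRegression from linear_model learns the weights and bias from the training data and applies the same 0.5 threshold in its predict method for binary problems.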

Looking at the confusion matrix, there were only 48 misclassifications (the breakdown given is 21 false negatives and 28 false positives, which adds up to 49, so one of these figures is off by one).
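For reference, the confusion matrix counts and the accuracy are related as in the sketch below. The label lists are hypothetical and do not reproduce the notebook's results; the calls are scikit-learn's confusion_matrix and accuracy_score from the metrics module.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels (1 = poisonous, 0 = edible).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Every misclassified sample is either a false positive or a false negative.
misclassified = fp + fn
accuracy = accuracy_score(y_true, y_pred)

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"misclassified={misclassified} accuracy={accuracy:.2f}")
```

Accuracy equals (tp + tn) divided by the total number of test samples, so the misclassification count and the accuracy figure should always agree with each other.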

ml-binary-classification's People

Contributors

notjameshan, thomasabigail
