
lsq117 / breast-cancer-prediction


This project forked from birajaghoshal/breast-cancer-prediction


Breast Cancer Prediction Using Machine Learning is a project that applies machine learning techniques to breast cancer classification. Machine learning, a type of artificial intelligence that computers can use to detect and classify objects in images, could improve the detection of breast cancer lesions in mammograms and help classify breast cancer.

License: MIT License

Jupyter Notebook 100.00%


Breast-Cancer-Prediction

  • Breast cancer is now the most common cancer in Indian women, having recently overtaken cervical cancer in this respect. For every 2 women newly diagnosed with breast cancer, one woman dies of it in India.

  • For women diagnosed during 2010-14, five-year survival for breast cancer is now 89.5% in Australia and 90.2% in the USA, but international differences remain very wide, with levels as low as 66.1% in India.

The dataset describes patients whose breast tumors were diagnosed as one of two kinds:

  • Malignant
  • Benign

Pairplot with comparison between diagnosis and radius_mean as well as texture_mean

Classification Report

Code Requirements

You can install Conda for Python, which resolves all the dependencies needed for machine learning.

Finding Out Accuracy:

⚡ Logistic Regression

Logistic Regression measures the relationship between the dependent variable (our label, what we want to predict) and one or more independent variables (our features) by estimating probabilities using its underlying logistic function.

A simple example of a Logistic Regression problem would be an algorithm for cancer detection that takes a screening image as input and reports whether a patient has cancer (1) or not (0).
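The logistic function described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's code: the function names, the two features, and the weights are all hypothetical and are not fitted to any data.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Estimated probability of the positive class (1 = cancer)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict(features, weights, bias, threshold=0.5):
    """Turn the estimated probability into a 0/1 label."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical weights for two hypothetical features; not fitted to any data.
weights = [0.8, -0.4]
bias = -1.0
print(predict_proba([3.0, 1.0], weights, bias))
print(predict([3.0, 1.0], weights, bias))
```

In practice the weights are learned from labeled examples; only the prediction step is shown here.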

  • The mean accuracy with 10-fold cross-validation is 88.96%
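For readers unfamiliar with the metric, the following sketch shows one way a mean k-fold cross-validation accuracy can be computed. It is an assumption-laden illustration rather than the notebook's procedure: the helper names, the majority-class stand-in model, and the toy data are all hypothetical, and the folds are taken in order without the shuffling or stratification that a real pipeline would usually apply.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(X, y, fit, predict, k=10):
    """Mean accuracy over k folds; each fold is held out once for testing."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Hypothetical stand-in model: always predicts the most common training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)

def predict_majority(model, x):
    return model

# Made-up data, for illustration only.
X = [[i] for i in range(20)]
y = [1] * 15 + [0] * 5
print(cross_val_accuracy(X, y, fit_majority, predict_majority, k=10))
```

Any classifier can be plugged in through the `fit` and `predict` arguments; the majority-class model is used only to keep the example short.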

🏠 kNN (k-Nearest-Neighbor)

k is the number of training data points nearest to the test data point that are used to determine its class. k-nearest neighbors is a classification algorithm that assigns a data point to a group by looking at the data points around it.
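The voting idea can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the two-dimensional points are made up and do not come from the project's dataset.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(X_train, y_train)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up 2-D points, for illustration only ("B" = benign, "M" = malignant).
X_train = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
y_train = ["B", "B", "B", "M", "M", "M"]
print(knn_predict(X_train, y_train, [1.1, 1.0]))
print(knn_predict(X_train, y_train, [4.0, 4.0]))
```

Because the vote depends on raw distances, features on very different scales are usually standardized before kNN is applied.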

  • The mean accuracy with 10-fold cross-validation is 87.21%

⚡ Naive Bayes

Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.
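As a sketch of how the independence assumption is used, the example below implements a small Gaussian variant of Naive Bayes. The Gaussian likelihood is an assumption made here for illustration (the text above does not say which variant the notebook uses), and the function names and data points are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Pick the class with the highest log posterior (up to a constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        # Independence assumption: per-feature log likelihoods simply add up.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)

# Made-up 2-D points, for illustration only ("B" = benign, "M" = malignant).
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
y = ["B", "B", "B", "M", "M", "M"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.0, 1.0]))
print(predict_gaussian_nb(model, [4.0, 4.0]))
```

Working in log space avoids numerical underflow when many per-feature likelihoods are multiplied together.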

  • The mean accuracy with 10-fold cross-validation is 92.21%

🌲 Random Forest

Random Forest is a supervised learning algorithm. As its name suggests, it builds a forest and introduces randomness into it. The forest is an ensemble of decision trees, usually trained with the “bagging” method. The general idea of bagging is that combining several learning models improves the overall result.
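The bagging idea can be illustrated with a deliberately simplified sketch. It assumes one-split decision stumps in place of full decision trees and omits the per-split feature sampling that a real Random Forest adds, so it shows bagging only. All names and data points are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """One-feature threshold rule with the fewest training errors."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[feature] <= threshold else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, left, right)
    return best[1:]

def predict_stump(stump, x):
    feature, threshold, left, right = stump
    return left if x[feature] <= threshold else right

def fit_bagged_stumps(X, y, n_models=25, seed=0):
    """Bagging: train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Combine the ensemble by majority vote."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Made-up 2-D points, for illustration only (0 = benign, 1 = malignant).
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
y = [0, 0, 0, 1, 1, 1]
models = fit_bagged_stumps(X, y)
print(predict_bagged(models, [0.5, 0.5]))
print(predict_bagged(models, [5.0, 5.0]))
```

Each stump sees a slightly different resample of the data, so individual mistakes tend to be outvoted by the rest of the ensemble.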

  • The accuracy on test data is 94.74%

Execution:

Run the notebook file BreastCancer.ipynb.

Contributors

  • ishubhamsharma
