Giter Site home page Giter Site logo

machinelearningnotebooks's Introduction

MachineLearningNotebooks

A collection of small machine learning projects.

The projects in this repository are all written in Jupyter Notebooks. This repository is not limited to any particular model or dataset type. Requirements such as the dataset and modules are listed for each notebook. Many of the notebooks have extensive markdown, therefore their section of the README will be minimal on the description.

Pulsar Stars

This project deals with the detection of pulsar stars. The dataset is structured data (csv), several models were utilized they are as follows: Logistic Regression, K Neighbors Classifier, Decision Tree Classifier, Random Forest Classifier, and Support Vector Classifier (SVC). The models overall performed well with the Random Forest Classifier producing the best results, 98% accuracy. I as of now have no future plans for this project.

Dataset and Required Modules

Dataset: https://www.kaggle.com/spacemod/pulsar-dataset

Requirements:

  • Python 3.7.10
  • numpy 1.20.2
  • pandas 1.2.4
  • scikit-learn 0.24.2

Malaria Detection

This project does preprocessing and then evaluates image data of cells to detect if they contain the Malaria parasite. The model used is a Constitutional Neural Network (CNN). The images were run on the model in two forms one with little preprocessing, and once more with significant feature engineering. The feature engineering produced a noticeable improvement on this dataset. The first iteration had a 81% accuracy as it started to over fit the training data. The second iteration with the feature engineering produced a 95% accuracy. If any project would be a candidate for a complete implementation it would be this model.

Dataset and Required Modules

Dataset: https://www.kaggle.com/iarunava/cell-images-for-detecting-malaria

Requirements:

  • Python 3.7.10
  • matplotlib 3.3.4
  • numpy 1.20.2
  • opencv 3.4.2
  • scikit-image 0.18.1
  • scikit-learn 0.24.2
  • tensorflow 2.0.0

Question Answer

This project aims to create a model capable of answering passage based questions with a true of false answer. At the time of this project I did not realize the difficulty of this problem. The dataset is text based composed of a question, title, answer, and passage. A recurrent neural network (RNN) was utilized, but despite many alterations produced less than desirable results. The final accuracy was 61%, which due to the dataset distribution (60/40). The conclusion here is that the model learned the distribution and nothing else. With better feature engineering, and layers that can recognize key words from the question and passage, then relate them may improve the model. At the time of making this model, there were some new natural language processing layers in testing.

Dataset and Required Modules

Dataset: https://github.com/google-research-datasets/boolean-questions

Text vectorization file: https://nlp.stanford.edu/projects/glove/ glove.840B.300d.txt

Requirements:

  • Python 3.7.10
  • matplotlib 3.3.4
  • numpy 1.20.2
  • scikit-learn 0.24.2
  • scipy 1.6.2
  • tensorflow 2.0.0

machinelearningnotebooks's People

Contributors

wisno33 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.