unsupervised-machine-learning-challenge

GA Tech Data Science and Analytics Boot Camp Module 20

Description

In this module, we use unsupervised machine learning to fit a model to the data and apply clustering algorithms to place the data into groups.

This activity is broken into four parts:

  • Part 1: Prepare the Data

To prepare the data, we remove the MYOPIC target column, which would bias the unsupervised model by exposing the label to it. That column is better suited to supervised modeling. We then standardize the data using StandardScaler from sklearn.
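As a minimal sketch of the preparation step, the snippet below drops the target column and standardizes what remains. Only the MYOPIC column name comes from this README. The feature names and values are hypothetical placeholders, not the real dataset.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the real dataset: only the MYOPIC column
# name is taken from the README, the features are made up.
df = pd.DataFrame({
    "feature_a": [6.0, 7.0, 6.0, 8.0, 7.0],
    "feature_b": [21.9, 22.4, 23.1, 22.0, 22.8],
    "feature_c": [3.7, 3.5, 3.2, 3.9, 3.4],
    "MYOPIC": [1, 0, 0, 1, 0],
})

# Drop the target so the clustering only sees the features.
features = df.drop(columns=["MYOPIC"])

# Rescale every column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(features)
```

Standardizing matters here because PCA and k-means are both sensitive to the scale of each feature.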

  • Part 2: Apply Dimensionality Reduction

After the data is prepared, we reduce the number of features by applying PCA, a dimensionality reduction technique. The assignment calls for retaining enough principal components to explain 90% of the variance.
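One way to express the 90% requirement, sketched here on synthetic data rather than the real dataset, is to pass a float to `n_components`. scikit-learn then keeps the smallest number of components whose cumulative explained variance exceeds that fraction. The array shapes and random data below are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 200 samples, 10 correlated features built
# from 3 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 10))
X_scaled = StandardScaler().fit_transform(X)

# A float in (0, 1) asks PCA to keep enough components to explain
# at least that share of the variance.
pca = PCA(n_components=0.90)
X_pca = pca.fit_transform(X_scaled)

print("features before:", X_scaled.shape[1])
print("components kept:", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum())
```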

We further reduce the dataset dimension with t-SNE and display our results on a scatter plot.
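The t-SNE step might look like the sketch below. It runs on synthetic blobs standing in for the PCA output, and the perplexity value and the `plot_embedding` helper are illustrative choices, not settings taken from the assignment.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for the PCA output: three blobs in 5 dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 5, [6.0] * 5, [-6.0] * 5])
X_pca = np.vstack([c + rng.normal(size=(30, 5)) for c in centers])

# Project to 2 dimensions. Perplexity must be smaller than the
# number of samples; 15 is an arbitrary choice for 90 points.
tsne = TSNE(n_components=2, perplexity=15, random_state=0)
embedding = tsne.fit_transform(X_pca)


def plot_embedding(points):
    """Hypothetical helper: scatter plot of a 2-D embedding."""
    import matplotlib.pyplot as plt

    plt.scatter(points[:, 0], points[:, 1])
    plt.xlabel("t-SNE 1")
    plt.ylabel("t-SNE 2")
    plt.show()
```

Calling `plot_embedding(embedding)` draws the scatter plot. Distinct clumps in that plot hint that the data may contain clusters.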

  • Part 3: Perform a Cluster Analysis with K-means

To identify the best number of clusters, we create an elbow plot of the k-means inertia. We do this with a for loop that computes the inertia for each k from 1 through 10.

Based on the elbow plot, we can see that the elbow is at roughly k = 4.
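The inertia loop described above can be sketched as follows. The data is synthetic and deliberately built from four groups, so the bend it produces illustrates the method and does not reproduce the result on the real dataset. The `plot_elbow` helper is a hypothetical name.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in: four well-separated 2-D blobs of 40 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
X = np.vstack([c + rng.normal(size=(40, 2)) for c in centers])

# Fit k-means for k = 1..10 and record the inertia (the sum of
# squared distances from each point to its nearest centroid).
k_values = list(range(1, 11))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(X)
    inertias.append(model.inertia_)


def plot_elbow(ks, values):
    """Hypothetical helper: line plot of inertia against k."""
    import matplotlib.pyplot as plt

    plt.plot(ks, values, marker="o")
    plt.xticks(ks)
    plt.xlabel("Number of clusters (k)")
    plt.ylabel("Inertia")
    plt.show()
```

Calling `plot_elbow(k_values, inertias)` draws the curve. The elbow is the k after which adding more clusters reduces inertia only slightly.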

  • Part 4: Make a Recommendation

Based on our findings, we conclude that the patients can be grouped into clusters. The elbow plot bends at about k = 4, which suggests roughly four clusters. These clusters can also be seen in the scatter plot.
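To act on that recommendation, one could fit a final k-means model with four clusters and inspect the group sizes. This is a sketch on synthetic data, with k = 4 taken from the elbow reading above. The array `X` is a placeholder for the reduced dataset.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic placeholder for the reduced dataset: four 2-D blobs.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
X = np.vstack([c + rng.normal(size=(30, 2)) for c in centers])

# Fit with the k suggested by the elbow plot and label every sample.
model = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = model.fit_predict(X)

# Count how many samples landed in each cluster.
cluster_sizes = np.bincount(labels, minlength=4)
print(cluster_sizes)
```

The `labels` array can also be passed as the color argument of a scatter plot to check the clusters visually.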

Submission Requirements

Disclaimer

The program may fail with recent NumPy versions. Downgrading to NumPy 1.21.4 fixes this issue.

Contributors

aimeevu
