A capstone project to predict the adoption-speed of listed pets

Pet Adoption Speed Predictor

A Data Science capstone project completed in fulfilment of the Coursera course IBM Advanced Data Science Capstone, the final course in the IBM Advanced Data Science Specialization.

Dataset

The Kaggle PetAdoption dataset is used. It contains tabular, image, and text (descriptive) data, which makes it challenging and interesting to work with. In this project, only the tabular and image data are used to train and validate the model. More information about the dataset, its attributes, and the image data can be found on the Kaggle dataset page.

Notebook

The primary tasks in the notebook are:

  • ETL (Extract Transform Load)
  • Data Quality Assessment
  • Data Exploration
  • Data Visualization
  • Feature Engineering
  • Model Definition
  • Model Training
  • Model Evaluation

Proposed Model

The proposed model comprises three neural networks, one of which is pretrained on the ImageNet dataset. This three-network approach was chosen because the data mixes categorical and continuous inputs (including images).

  • The pretrained network (DenseNet169) is used to obtain image features (feature_vec1) from the pet images.
  • The 1st network (NN-1) is trained on the tabular/relational dataset (all the attributes after feature engineering are categorical) with the actual labels.
  • The trained NN-1 is then used to prepare the input for the 2nd network (NN-2): a 1-D feature vector (feature_vec2) is extracted from a Dense layer of the trained NN-1.
  • Both feature vectors (feature_vec1 and feature_vec2, both 1-D) are then concatenated to form a new 1-D feature vector, which is the input for training NN-2.
  • The architecture is similar to the following (except that the figure omits the pretrained network, which takes in the continuous data):

Image taken from StackExchange.
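The steps above can be sketched in Keras roughly as follows. This is a minimal illustration, not the notebook's actual code: the layer sizes, the layer name `feature_layer`, the 5-class output, the 64x64 image size and the random data are all assumptions made for the example. `weights=None` is used only so the sketch runs without downloading anything; the project itself uses ImageNet weights (`weights="imagenet"`).

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet169

NUM_CLASSES = 5   # assumption: number of adoption-speed classes
N_TABULAR = 20    # assumption: number of engineered tabular features
IMG_SIZE = 64     # assumption: small images to keep the sketch fast


def build_nn1(n_features, n_classes):
    """NN-1: classifier on the tabular data (hypothetical layer sizes)."""
    inputs = tf.keras.Input(shape=(n_features,))
    x = layers.Dense(64, activation="relu")(inputs)
    x = layers.Dense(32, activation="relu", name="feature_layer")(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return Model(inputs, outputs, name="nn1")


def build_nn2(n_inputs, n_classes):
    """NN-2: classifier on the concatenated feature vector."""
    inputs = tf.keras.Input(shape=(n_inputs,))
    x = layers.Dense(128, activation="relu")(inputs)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return Model(inputs, outputs, name="nn2")


# Random stand-in data (the real project uses the Kaggle dataset).
rng = np.random.default_rng(0)
X_tab = rng.random((8, N_TABULAR)).astype("float32")
X_img = rng.random((8, IMG_SIZE, IMG_SIZE, 3)).astype("float32")
y = rng.integers(0, NUM_CLASSES, size=8)

# 1. Train NN-1 on the tabular data with the actual labels.
nn1 = build_nn1(N_TABULAR, NUM_CLASSES)
nn1.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
nn1.fit(X_tab, y, epochs=1, verbose=0)

# 2. Extract feature_vec2 from a Dense layer of the trained NN-1.
tab_extractor = Model(nn1.input, nn1.get_layer("feature_layer").output)
feature_vec2 = tab_extractor.predict(X_tab, verbose=0)

# 3. Extract feature_vec1 with DenseNet169 (global average pooling gives 1-D vectors).
image_extractor = DenseNet169(
    include_top=False,
    weights=None,  # use "imagenet" for the pretrained network
    pooling="avg",
    input_shape=(IMG_SIZE, IMG_SIZE, 3),
)
feature_vec1 = image_extractor.predict(X_img, verbose=0)

# 4. Concatenate both 1-D feature vectors and train NN-2 on the result.
combined = np.concatenate([feature_vec1, feature_vec2], axis=1)
nn2 = build_nn2(combined.shape[1], NUM_CLASSES)
nn2.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
nn2.fit(combined, y, epochs=1, verbose=0)
predictions = nn2.predict(combined, verbose=0)
```

Here the features are precomputed for a small array to keep the example short; with a full image dataset they would more likely be computed per batch during training.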

Technology

Keras (latest version) with TensorFlow 2.2 is used in the project, and the solution is written in Python 3.7.x; it should work with any Python version >= 3.5. NN-1 can be trained on a CPU, whereas training NN-2 requires a GPU (a TPU is recommended), because batches of tabular and image data are generated dynamically during training, with image preprocessing applied to each generated batch before it is fed into the network.
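Dynamic batch generation of this kind could look roughly like the generator below. It is a sketch under stated assumptions, not the notebook's implementation: `batch_generator` and `preprocess_images` are hypothetical names, the arrays are random in-memory stand-ins, and scaling pixels to [0, 1] stands in for whatever preprocessing the project actually applies.

```python
import numpy as np


def preprocess_images(batch):
    """Stand-in preprocessing (assumption): scale uint8 pixels to [0, 1]."""
    return batch.astype("float32") / 255.0


def batch_generator(tabular, images, labels, batch_size, shuffle=True, seed=0):
    """Yield ((tabular_batch, image_batch), label_batch) tuples indefinitely.

    Images are preprocessed per batch, so only one batch of float images
    is held in memory at a time.
    """
    rng = np.random.default_rng(seed)
    n = len(labels)
    while True:
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield (tabular[idx], preprocess_images(images[idx])), labels[idx]


# Random stand-in data: 10 samples, 4 tabular features, 8x8 RGB images.
data_rng = np.random.default_rng(1)
tabular = data_rng.random((10, 4)).astype("float32")
images = data_rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
labels = data_rng.integers(0, 5, size=10)

gen = batch_generator(tabular, images, labels, batch_size=4, shuffle=False)
(tab_batch, img_batch), label_batch = next(gen)
```

Because the generator loops forever, a training loop that consumes it needs to be told how many batches make up one epoch.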

Additional

  • The ADD (Architectural Decisions Document) can be found in the repo along with the notebook
  • A gist of the notebook - notebook gist
  • Project Presentation (with a demo of the notebook) can be found here

Citation (if referred)

Cite the work as:
Abhilash Neog, Pet Adoption-Speed predictor (July 2020), GitHub repository, https://github.com/abhilash97/Pet-Adoption-Speed-Predictor
