
data_mining_avocado_price's Introduction

Predict Avocado Price

This project predicts avocado prices in order to guide avocado wholesalers and retailers and to help customers buy the cheapest fruit at a given time and place. The data comes from Kaggle (https://www.kaggle.com/neuromusic/avocado-prices).

Data Description

The original dataset has 12 features, including the date of observation, the average price of a single avocado, the numbers of avocados sold with PLU 4046, 4225, and 4770, the numbers of small, large, and xlarge bags, the type of fruit (conventional or organic), and the place of observation.

Code Description

  1. Data Visualization

Before processing, generated plots showing the relationship between selected features and price.

  2. Data Preprocessing and Model Building
  • Extracted the year, month, and day from the observation date.

  • Filled in NaN values with the mean values computed from the training data.

  • Applied one-hot encoding to convert categorical features into numerical ones.

  • Standardization: subtracted the mean and divided by the standard deviation.

  • Applied different models (linear model, KNN, SVR, XGBoost, etc.) and compared their performance by R squared.
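The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the frame is synthetic, the column names (`Date`, `Total Volume`, `type`, `region`, `AveragePrice`) are assumed to match the Kaggle file, the helper names are hypothetical, and only two of the listed model families are compared.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def make_toy_frame(n_rows=200, seed=0):
    """Build a small synthetic frame (assumed column names, made-up values)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Date": pd.date_range("2015-01-04", periods=n_rows, freq="7D"),
        "Total Volume": rng.uniform(1e3, 1e5, n_rows),
        "type": rng.choice(["conventional", "organic"], n_rows),
        "region": rng.choice(["Albany", "Boston", "Chicago"], n_rows),
    })
    # Toy price: organic costs more, plus noise.
    df["AveragePrice"] = (
        1.0 + 0.5 * (df["type"] == "organic") + rng.normal(0, 0.1, n_rows)
    )
    # Blank out a few values so the imputation step has something to do.
    df.loc[rng.choice(n_rows, 10, replace=False), "Total Volume"] = np.nan
    return df


def preprocess(df):
    """Date parts, one-hot encoding, mean imputation, standardization."""
    df = df.copy()
    # Extract year, month, and day from the observation date.
    df["year"] = df["Date"].dt.year
    df["month"] = df["Date"].dt.month
    df["day"] = df["Date"].dt.day
    df = df.drop(columns="Date")
    # One-hot encode the categorical features.
    df = pd.get_dummies(df, columns=["type", "region"], dtype=float)

    X = df.drop(columns="AveragePrice")
    y = df["AveragePrice"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    # Fill NaNs with means computed from the training split only.
    train_mean = X_train.mean()
    X_train = X_train.fillna(train_mean)
    X_test = X_test.fillna(train_mean)
    # Standardize: subtract the mean, divide by the standard deviation.
    mu = X_train.mean()
    sigma = X_train.std(ddof=0).replace(0, 1.0)  # guard constant columns
    return (X_train - mu) / sigma, (X_test - mu) / sigma, y_train, y_test


X_train, X_test, y_train, y_test = preprocess(make_toy_frame())

# Compare models by R squared on the held-out split.
models = {"linear": LinearRegression(), "knn": KNeighborsRegressor(n_neighbors=5)}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
print(scores)
```

Computing the imputation means and the scaling statistics on the training split alone keeps information from the test split out of the fitted transformation, which is why the split happens before those two steps.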

  3. Model Optimization

Tuned the hyperparameters of each model. After tuning, the three models with the highest R squared scores were Random Forest (0.87), bagging regressor (0.85), and KNN (0.75).
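The tuning method is not named here, so the following is only one plausible sketch: an exhaustive grid search with cross-validation, scored by R squared, using scikit-learn's `GridSearchCV`. The data is synthetic and the parameter grid is an assumption chosen for illustration, so the resulting score is unrelated to the figures reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data (made up for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid; the values searched in the project are not stated.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="r2",  # select the configuration with the best mean R squared
    cv=3,          # 3-fold cross-validation on the training split
)
search.fit(X_train, y_train)

print(search.best_params_)
# Score the refitted best model on the held-out split (also R squared).
test_r2 = search.score(X_test, y_test)
print(round(test_r2, 3))
```

The same pattern applies to the other regressors by swapping the estimator and its grid, for example `KNeighborsRegressor` with a range of `n_neighbors` values.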

Disclaimer

This is the final project for the Data Mining course at JHU.
