
This project builds a content-based Scotch whisky recommendation system to help sell Scotch whiskies.


Scotch Whisky Recommendation System

This project aims to build a content-based recommendation system that helps Scotch whisky salespeople recommend a Scotch whisky to a customer based on the quantified qualities of the whiskies.

Please refer to the Report to understand how the recommendation system was built. There is also a Medium post that discusses this project.

If you are interested in the application that recommends whisky to you, check out the Recommendation Application folder. There is also a Medium post about the frontend upgrade of this application.

Here is a brief overview of the structure of this repository.

Data

The original data was downloaded from Kaggle, which obtained the data set from WhiskyClassified.com.

The original data set contains 12 columns of characters or flavors, such as body, sweetness, and smoky. Besides those features, there are columns for the distillery name, postcode, UTM latitude, and UTM longitude of each distillery.

In addition to the original data set of 86 rows by 12 columns, I added three more columns:

  • Latitude in degree
  • Longitude in degree
  • Region (Region Classification of Whisky Distillery)

You may find the data set [here](Data/whisky.csv) or the detail of the data set [here](Data).

Goal of this Project

The goal of this project is to build a content-based recommendation system for whiskies, that is, a system that recommends a whisky based on how similar it is to another whisky. There are more than 86 brands of Scotch whisky, and I want a model or system that recommends other brands based on their character and flavor.
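As a minimal sketch of what "similarity between two whiskies" can mean, the snippet below ranks distilleries by the Euclidean distance between their flavor scores. The distillery names, the three features, and every score are invented for illustration, and Euclidean distance is an assumption rather than the measure confirmed by this repository.

```python
import math

# Hypothetical flavor profiles (body, sweetness, smoky). The values are
# invented for illustration and are not taken from the data set.
profiles = {
    "Distillery A": (2, 2, 1),
    "Distillery B": (2, 3, 1),
    "Distillery C": (4, 1, 4),
}


def most_similar(name, profiles):
    """Return the other distilleries, closest flavor profile first."""
    target = profiles[name]
    others = [n for n in profiles if n != name]
    # A smaller Euclidean distance means a more similar flavor profile.
    return sorted(others, key=lambda n: math.dist(target, profiles[n]))


print(most_similar("Distillery A", profiles))
```

The same idea extends to all 12 flavor columns by using longer tuples.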

Region Classification

The first approach is to classify each whisky distillery into its whisky region. The idea is that each region has its own general flavor and character. The assumption is that if a person likes one Highland whisky, I can recommend other Highland whiskies to that person. The plan is that once a model has been trained on the 86 distilleries, it can classify the region of an 87th distillery.

The details of the code may be found in the Region Classification folder.

However, the results of the classification model did not meet expectations, so the second approach is hierarchical clustering.
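To illustrate the idea of predicting a region from flavor scores, here is a small sketch using a k-nearest-neighbours vote. This is only an assumed stand-in: the README does not say which classifier the Region Classification folder uses, and the training rows, region labels, and scores below are invented.

```python
import math
from collections import Counter

# Hypothetical training rows: (flavor profile, region). All values invented.
training = [
    ((2, 2, 1), "Highlands"),
    ((2, 3, 1), "Highlands"),
    ((3, 2, 2), "Highlands"),
    ((4, 1, 4), "Islay"),
    ((3, 1, 4), "Islay"),
]


def predict_region(profile, training, k=3):
    """Predict a region by majority vote among the k nearest profiles."""
    nearest = sorted(training, key=lambda row: math.dist(profile, row[0]))[:k]
    votes = Counter(region for _, region in nearest)
    return votes.most_common(1)[0][0]


# Classify a new, unseen distillery (the "87th" one in the text above).
print(predict_region((4, 2, 4), training))
```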

Dendrogram

The second approach is to use a dendrogram to display the hierarchical relationships among distilleries. A dendrogram is the tree diagram produced by hierarchical clustering. The idea is to use the quantified characters and flavors to calculate the similarity between distilleries.

The Python code for the dendrogram can be found in the Dendrogram folder.
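The merging process that a dendrogram draws can be sketched in a few lines. The example below is a plain-Python agglomerative clustering with single linkage and Euclidean distance; both choices, as well as the profiles, are assumptions made for illustration and may differ from the code in the Dendrogram folder. In practice, libraries such as SciPy (`scipy.cluster.hierarchy.linkage` and `dendrogram`) handle this, including the plot.

```python
import math
from itertools import combinations

# Hypothetical flavor profiles (body, sweetness, smoky); values are invented.
profiles = {
    "Distillery A": (2, 2, 1),
    "Distillery B": (2, 3, 1),
    "Distillery C": (4, 1, 4),
    "Distillery D": (3, 1, 3),
}


def agglomerate(profiles):
    """Repeatedly merge the two closest clusters (single linkage).

    Returns the merges as (cluster_a, cluster_b, distance) tuples in the
    order they happen; a dendrogram draws exactly these merges as a tree.
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(profiles[x], profiles[y]) for x in a for y in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((a, b, linkage(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return merges


for a, b, d in agglomerate(profiles):
    print(a, "+", b, "at distance", round(d, 2))
```

Distilleries that merge at a small distance sit close together at the bottom of the tree and are the natural candidates to recommend to one another.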

Clustering

The last approach is to cluster similar distilleries with k-means; the Python code is in the Clusters folder. The problem is to find the best k for the algorithm.

You may go to the Clusters folder to see how the recommendation system is built with k-means.
The result looks like this:
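One common heuristic for choosing k is the elbow method: run k-means for several values of k and look for the point where the within-cluster sum of squares (inertia) stops dropping quickly. The README does not state which criterion was actually used, so treat the sketch below as an assumption. It implements plain k-means (Lloyd's algorithm) on invented flavor profiles; scikit-learn's `KMeans` exposes the same quantity as `inertia_`.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm); returns (labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points

    def assign():
        # Label each point with the index of its nearest centroid.
        return [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                for p in points]

    for _ in range(iterations):
        labels = assign()
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    labels = assign()
    inertia = sum(math.dist(p, centroids[lab]) ** 2
                  for p, lab in zip(points, labels))
    return labels, inertia


# Hypothetical flavor profiles (body, sweetness, smoky); values are invented.
points = [(2, 2, 1), (2, 3, 1), (3, 2, 2), (4, 1, 4), (3, 1, 4), (4, 1, 3)]

# Print the inertia for each candidate k and look for the "elbow".
for k in range(1, len(points) + 1):
    _, inertia = kmeans(points, k)
    print(k, round(inertia, 2))
```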


The result of the optimal clustering looks like this on the map:


The k-means model is useful for recommending Scotch whisky, so I decided to use k-means with k=6 to build the recommendation system.

Application

The application trains a k-means model with k=6. There are three ways to get a list of recommendations:

  • Enter a whisky distillery name
  • Choose from a list of characters and flavors
  • Nothing

1. If we enter a whisky distillery name, the application returns a list of whiskies within the same cluster. The list is sorted by flavor similarity.
2. If we choose from a list of characters and flavors, the application returns a list of whiskies that meet the criteria. Note that the whiskies on the list do not necessarily belong to the same cluster.
3. If nothing is entered, the application suggests Macallan. (See the Application folder for an explanation.)

You may find the code and the screenshots in the RecommendationApplication folder.
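The three input modes above can be sketched as a single function. Everything in this example is hypothetical: the profiles, the cluster labels (which would come from the trained k-means model), the function name `recommend`, and the interpretation of "criteria" as minimum scores per flavor. The actual application may implement each mode differently.

```python
import math

# Hypothetical flavor profiles and k-means cluster labels; all values invented.
profiles = {
    "Distillery A": {"body": 2, "sweetness": 2, "smoky": 1},
    "Distillery B": {"body": 2, "sweetness": 3, "smoky": 1},
    "Distillery C": {"body": 3, "sweetness": 2, "smoky": 2},
    "Distillery D": {"body": 4, "sweetness": 1, "smoky": 4},
    "Distillery E": {"body": 3, "sweetness": 1, "smoky": 4},
}
clusters = {"Distillery A": 0, "Distillery B": 0, "Distillery C": 0,
            "Distillery D": 1, "Distillery E": 1}


def distance(a, b):
    """Euclidean distance between the flavor profiles of two distilleries."""
    return math.dist(list(profiles[a].values()), list(profiles[b].values()))


def recommend(name=None, criteria=None):
    """Return a list of recommendations for one of the three input modes."""
    if name is not None:
        # Mode 1: whiskies in the same cluster, most similar flavor first.
        peers = [n for n in profiles
                 if n != name and clusters[n] == clusters[name]]
        return sorted(peers, key=lambda n: distance(name, n))
    if criteria:
        # Mode 2: every whisky meeting all minimum scores, in any cluster.
        return [n for n, p in profiles.items()
                if all(p[f] >= v for f, v in criteria.items())]
    # Mode 3: nothing entered, so fall back to the default suggestion.
    return ["Macallan"]


print(recommend(name="Distillery A"))
print(recommend(criteria={"smoky": 2}))
print(recommend())
```

In the second call the result mixes distilleries from both clusters, which matches the note that a criteria-based list is not restricted to one cluster.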

Report

The Report folder contains a report that explains how the recommendation system was built by choosing the best model among Region Classification, Dendrogram, and K-means Clustering.
