Recommender System Development

Project Overview

This project aims to develop and deploy an advanced recommender system. It involves data gathering, algorithm selection and optimization, performance evaluation, and deployment on AWS with Hadoop and Spark integration. The goal is to improve user experience and increase sales by providing personalized product recommendations.

Installation

  1. Clone the repository:

    git clone https://github.com/saurabh4269/recommender_system.git
    cd recommender_system

  2. Install the required packages:

    pip install -r requirements.txt

Dataset

  • Movies Dataset: Contains movie metadata such as movie titles and genres.
  • Ratings Dataset: Contains user ratings for various movies.

Download the dataset from MovieLens and extract the files into the project directory.
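
As a minimal sketch of the loading step, the snippet below reads the two tables with pandas and fills missing genres. The column names follow the MovieLens "latest" CSV layout, which is an assumption about the release being used; the helper name `load_movielens`, the in-memory CSV strings, and the rating rows are made up for illustration. In the project itself the file paths would replace the `io.StringIO` objects.

```python
import io

import pandas as pd

# Tiny in-memory stand-ins for movies.csv and ratings.csv.
# Column names are assumed from the MovieLens "latest" layout;
# the third movie and all rating rows are made up for illustration.
MOVIES_CSV = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Example Movie (2000),
"""

RATINGS_CSV = """userId,movieId,rating,timestamp
1,1,4.0,964982703
1,2,3.5,964981247
2,1,5.0,964982224
"""


def load_movielens(movies_src, ratings_src):
    """Read both tables and replace missing genres with an empty string."""
    movies = pd.read_csv(movies_src)
    ratings = pd.read_csv(ratings_src)
    movies["genres"] = movies["genres"].fillna("")
    return movies, ratings


movies, ratings = load_movielens(io.StringIO(MOVIES_CSV), io.StringIO(RATINGS_CSV))
print(movies.shape, ratings.shape)
```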

Content-Based Filtering

Implemented a content-based recommendation system using TfidfVectorizer and cosine_similarity from sklearn.

How It Works

  1. Data Preparation: Load the movies dataset and preprocess the genres column by filling missing values.
  2. TF-IDF Vectorization: Convert the genres into a TF-IDF matrix, which quantifies the importance of each genre in each movie.
  3. Dimensionality Reduction: Apply Truncated SVD to reduce the dimensionality of the TF-IDF matrix for more efficient similarity calculations.
  4. Cosine Similarity: Compute the cosine similarity between movies based on their reduced TF-IDF vectors.
  5. Recommendation Function: Define a function that takes a movie title as input and returns the top 10 most similar movies.

    # Example usage
    print(get_recommendations('Toy Story (1995)'))
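
The steps above can be sketched roughly as follows. This is not the notebook's code: the toy catalogue, the `token_pattern` that splits genres on `|`, and `n_components=2` are all assumptions chosen so the example runs on five made-up movies. The function name mirrors the README's `get_recommendations`, but its body is only one plausible implementation.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy catalogue; titles and genres are illustrative, not the real dataset.
titles = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"]
genres = [
    "Adventure|Animation|Children|Comedy",
    "Adventure|Animation|Children",
    "Crime|Drama|Thriller",
    "Drama|Romance",
    "Animation|Comedy",
]

# Step 2: TF-IDF over genre tokens (each "|"-separated genre is one token).
tfidf = TfidfVectorizer(token_pattern=r"[^|]+")
tfidf_matrix = tfidf.fit_transform(genres)

# Step 3: reduce dimensionality; 2 components only because the toy data is tiny.
svd = TruncatedSVD(n_components=2, random_state=0)
reduced = svd.fit_transform(tfidf_matrix)

# Step 4: pairwise cosine similarity between the reduced vectors.
similarity = cosine_similarity(reduced)

title_to_index = {title: i for i, title in enumerate(titles)}


def get_recommendations(title, top_n=10):
    """Return up to top_n titles most similar to `title`, excluding itself."""
    idx = title_to_index[title]
    order = np.argsort(-similarity[idx])  # highest similarity first
    return [titles[i] for i in order if i != idx][:top_n]


print(get_recommendations("Movie A", top_n=2))
```

Splitting on `|` keeps multi-part genre names such as `Sci-Fi` as a single token, which the default word-based tokenizer would break apart.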

Collaborative Filtering

Implemented a collaborative filtering recommendation system using the Surprise library and its SVD algorithm.

How It Works

  1. Data Preparation: Load the ratings dataset and prepare it for the Surprise library by specifying the rating scale.
  2. Train-Test Split: Split the data into training and testing sets.
  3. SVD Algorithm: Use Surprise's SVD algorithm, a matrix-factorization method in the style popularized during the Netflix Prize, to factorize the user-item interaction matrix into latent factors.
  4. Model Training: Train the SVD model on the training set.
  5. Recommendation Function: Define a function that takes a user ID as input and returns the top 10 movie recommendations for that user.

    # Example usage
    print(get_collaborative_recommendations(1))

    # Evaluate the model on the held-out test set
    from surprise import accuracy

    predictions = algo.test(testset)
    rmse = accuracy.rmse(predictions, verbose=False)  # verbose=False: values are printed below
    mae = accuracy.mae(predictions, verbose=False)

    print(f"Collaborative Filtering RMSE: {rmse}")
    print(f"Collaborative Filtering MAE: {mae}")
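
To illustrate what this kind of factorization does, here is a from-scratch NumPy sketch of biased matrix factorization trained with stochastic gradient descent. It is deliberately not the Surprise API; the ratings, hyperparameters, and the helper names `predict`, `rmse`, `mae`, and `top_n_for_user` are all assumptions made for the example. For brevity it measures error on the training data, whereas the README evaluates on a held-out test set.

```python
import numpy as np

# Made-up (user, item, rating) triples standing in for the ratings dataset.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.5), (1, 1, 4.0), (1, 3, 1.5),
    (2, 2, 4.5), (2, 3, 5.0), (2, 0, 1.0),
    (3, 2, 4.0), (3, 3, 4.5), (3, 1, 1.5),
]
n_users, n_items, n_factors = 4, 4, 2

rng = np.random.default_rng(0)
mu = np.mean([r for _, _, r in ratings])        # global mean rating
bu, bi = np.zeros(n_users), np.zeros(n_items)   # user and item biases
P = rng.normal(0, 0.1, (n_users, n_factors))    # user latent factors
Q = rng.normal(0, 0.1, (n_items, n_factors))    # item latent factors


def predict(u, i):
    """Predicted rating: global mean + biases + latent-factor dot product."""
    return mu + bu[u] + bi[i] + P[u] @ Q[i]


def rmse(data):
    return float(np.sqrt(np.mean([(r - predict(u, i)) ** 2 for u, i, r in data])))


def mae(data):
    return float(np.mean([abs(r - predict(u, i)) for u, i, r in data]))


rmse_before = rmse(ratings)
lr, reg = 0.05, 0.02                             # illustrative hyperparameters
for _ in range(200):                             # SGD epochs
    for u, i, r in ratings:
        err = r - predict(u, i)
        bu[u] += lr * (err - reg * bu[u])
        bi[i] += lr * (err - reg * bi[i])
        pu = P[u].copy()                         # keep old value for the Q update
        P[u] += lr * (err * Q[i] - reg * pu)
        Q[i] += lr * (err * pu - reg * Q[i])
rmse_after = rmse(ratings)


def top_n_for_user(u, n=10):
    """Rank the items the user has not rated by predicted rating."""
    seen = {i for uu, i, _ in ratings if uu == u}
    unseen = [i for i in range(n_items) if i not in seen]
    return sorted(unseen, key=lambda i: predict(u, i), reverse=True)[:n]


print(f"train RMSE {rmse_before:.3f} -> {rmse_after:.3f}, MAE {mae(ratings):.3f}")
print(top_n_for_user(0))
```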

Hybrid Recommendation System

Combined content-based and collaborative filtering recommendations to create a more robust system.

How It Works

  1. Combine Recommendations: Get recommendations from both the content-based and collaborative filtering systems.
  2. Merge Results: Combine the results from both systems while removing duplicates.
  3. Final Recommendations: Return the top 10 combined recommendations.

    # Example usage
    print(hybrid_recommendations('Toy Story (1995)', 1))
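
The README does not say how the two lists are ordered when merged, so the sketch below assumes one simple policy: interleave the two ranked lists, drop duplicates, and keep the first `top_n`. The helper name `merge_recommendations` is hypothetical, and the two input lists are made up for illustration.

```python
from itertools import chain, zip_longest


def merge_recommendations(content_recs, collab_recs, top_n=10):
    """Interleave two ranked lists, drop duplicates, keep the first top_n."""
    merged, seen = [], set()
    # zip_longest pads the shorter list with None, which is skipped below.
    for title in chain.from_iterable(zip_longest(content_recs, collab_recs)):
        if title is not None and title not in seen:
            seen.add(title)
            merged.append(title)
    return merged[:top_n]


# Illustrative inputs; which system produced which title is made up.
content = ["Antz (1998)", "Toy Story 2 (1999)", "Monsters, Inc. (2001)"]
collab = ["Toy Story 2 (1999)", "Shrek the Third (2007)"]
print(merge_recommendations(content, collab, top_n=4))
```

Interleaving gives both systems equal weight at every rank; a weighted score would be an alternative if one system is trusted more than the other.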

Evaluation

Evaluated the collaborative filtering model with RMSE and MAE, and the overall recommendations with precision and recall.
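
For reference, the two rating-prediction error metrics are conventionally defined as follows, where T is the set of (user, item) pairs in the test set, r_ui is the observed rating, and r̂_ui is the predicted rating. The symbol names are chosen here for illustration and do not come from the project code.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{|T|} \sum_{(u,i) \in T} \left( r_{ui} - \hat{r}_{ui} \right)^{2}},
\qquad
\mathrm{MAE} = \frac{1}{|T|} \sum_{(u,i) \in T} \left| r_{ui} - \hat{r}_{ui} \right|
```

Both are expressed in rating units, and lower is better. RMSE penalizes large individual errors more heavily than MAE, so RMSE is never smaller than MAE on the same predictions.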

How It Works

  1. Collaborative Filtering Evaluation: Calculate RMSE and MAE using the predictions from the collaborative filtering model.
  2. Precision and Recall: Define a function to calculate precision and recall based on ground truth and predicted recommendations.

    # Evaluation metrics
    precision, recall = evaluate_recommendations(ground_truth_recommendations, predicted_recommendations)
    print(f"Precision: {precision}")
    print(f"Recall: {recall}")
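
One plausible body for such a function is sketched below using set overlap. Only the function name and argument order come from the README; the implementation and the sample lists are assumptions.

```python
def evaluate_recommendations(ground_truth, predicted):
    """Return (precision, recall) of `predicted` against `ground_truth`."""
    truth, preds = set(ground_truth), set(predicted)
    hits = len(truth & preds)                       # items present in both lists
    precision = hits / len(preds) if preds else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


precision, recall = evaluate_recommendations(
    ["A", "B", "C", "D"],           # ground truth (made up)
    ["A", "B", "X", "Y", "Z"],      # predictions (made up)
)
print(precision, recall)  # 0.4 0.5
```

With this definition, precision and recall are equal whenever the two lists contain the same number of distinct items.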

Usage

  1. Run Content-Based Filtering:

    Open the notebook and run the cells for Content-Based Filtering

  2. Run Collaborative Filtering:

    Open the notebook and run the cells for Collaborative Filtering

  3. Run Hybrid Recommendation System:

    Open the notebook and run the cells for Hybrid Recommendation System

Results

The hybrid recommendation system combines the strengths of the content-based and collaborative filtering approaches. Sample recommendations and evaluation figures are listed below.

Sample Recommendations for "Toy Story (1995)":

  • Antz (1998)
  • Toy Story 2 (1999)
  • Adventures of Rocky and Bullwinkle, The (2000)
  • Emperor's New Groove, The (2000)
  • Monsters, Inc. (2001)
  • DuckTales: The Movie - Treasure of the Lost Lamp (1990)
  • Wild, The (2006)
  • Shrek the Third (2007)
  • Tale of Despereaux, The (2008)
  • Asterix and the Vikings (Astérix et les Vikings) (2006)

Collaborative Filtering Evaluation:

  • RMSE: 0.7779
  • MAE: 0.5868

Evaluation Metrics:

  • Precision: 0.375
  • Recall: 0.375

How It Works

Content-Based Filtering

  1. TF-IDF Vectorizer: Converts the text data (genres) into numerical features.
  2. Cosine Similarity: Measures the similarity between movies based on their genre features.
  3. Recommendation: Finds movies with the highest similarity scores to a given movie.
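
The similarity measure itself is the standard cosine formula. Here x_a and x_b denote the genre feature vectors of movies a and b; the notation is introduced for illustration only.

```latex
\mathrm{sim}(a, b) = \frac{\mathbf{x}_a \cdot \mathbf{x}_b}{\lVert \mathbf{x}_a \rVert \, \lVert \mathbf{x}_b \rVert}
```

As a small worked case, for x_a = (1, 1, 0) and x_b = (1, 0, 0) the dot product is 1 and the norms are √2 and 1, so the similarity is 1/√2 ≈ 0.707. Because TF-IDF features are non-negative, the score on raw TF-IDF vectors always lies between 0 and 1.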

Collaborative Filtering

  1. SVD Algorithm: Decomposes the user-item interaction matrix into latent factors.
  2. Prediction: Predicts user ratings for unseen movies based on learned latent factors.
  3. Recommendation: Recommends movies with the highest predicted ratings for a given user.
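
In the biased matrix-factorization model that Surprise documents for its SVD class, the predicted rating takes the following form. Whether the project keeps the library's default bias terms is an assumption.

```latex
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top} p_u
```

Here μ is the global mean rating, b_u and b_i are the user and item biases, and p_u and q_i are the latent factor vectors of user u and item i. Recommending for a user then amounts to computing r̂_ui for every unseen item i and keeping the highest values.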

Hybrid System

  1. Combination: Merges recommendations from both content-based and collaborative filtering systems.
  2. Deduplication: Ensures no duplicates in the final recommendation list.
  3. Final Output: Provides a diverse set of recommendations leveraging both systems.
