
Training Deep AutoEncoders for Collaborative Filtering

An implementation of the research described in "Training Deep AutoEncoders for Collaborative Filtering" (https://arxiv.org/abs/1708.01715), written in TensorFlow & Keras.

The Model

The model is based on Deep AutoEncoders with dense re-feeding.

(Figure: the deep autoencoder architecture.)
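
Below is a minimal sketch of the two key ideas from the paper, not the notebook's exact code: a deep autoencoder trained with a masked MSE loss (only observed ratings contribute) plus dense re-feeding, where the network's own dense reconstruction is fed back as an extra training example. Layer sizes here are illustrative.

import numpy as np
import tensorflow as tf
from tensorflow import keras

n_items = 1000                 # demo-sized item dimension (see Note below)
layer_sizes = [128, 32, 128]   # illustrative, not the paper's sizes

# Build a symmetric deep autoencoder over a user's rating vector.
inputs = keras.Input(shape=(n_items,))
x = inputs
for units in layer_sizes:
    x = keras.layers.Dense(units, activation="selu")(x)
outputs = keras.layers.Dense(n_items)(x)
autoencoder = keras.Model(inputs, outputs)

def masked_mse(y_true, y_pred):
    # Zeros in y_true mark unrated items; exclude them from the loss.
    mask = tf.cast(tf.not_equal(y_true, 0.0), tf.float32)
    sq_err = tf.square((y_true - y_pred) * mask)
    return tf.reduce_sum(sq_err) / tf.maximum(tf.reduce_sum(mask), 1.0)

autoencoder.compile(optimizer="adam", loss=masked_mse)

# Toy data: random ratings in {0..5}, with 0 meaning "unrated".
ratings = np.random.randint(0, 6, size=(256, n_items)).astype("float32")

# Dense re-feeding: after each update on the real (sparse) batch, treat
# the dense reconstruction f(x) as a new example and train on (f(x), f(x)).
for epoch in range(2):
    autoencoder.train_on_batch(ratings, ratings)
    dense = autoencoder.predict_on_batch(ratings)
    autoencoder.train_on_batch(dense, dense)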

Requirements

Please install the following on a Linux- or Windows-based machine; setup instructions for Ubuntu are given below.

  • Python 3.6
  • TensorFlow 1.11.0
  • CUDA 9.0
  • NumPy (>=1.15.0)
  • Pandas (>=0.23.0)
  • Jupyter (>=1.0.0)
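
If requirements.txt ever needs to be recreated, a minimal version consistent with the pins above might look like this (an assumption, not the repo's exact file):

tensorflow==1.11.0
numpy>=1.15.0
pandas>=0.23.0
jupyter>=1.0.0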

Getting Started

Environment Setup

Python 3.6 Installation

$ sudo apt-get update
$ sudo apt-get install python3-dev python3-pip

Pip & related libraries

$ pip3 install -r requirements.txt
$ pip3 install --user --upgrade tensorflow

TensorFlow may require Virtualenv; if the above process fails, please see the detailed instructions here. One possible Virtualenv setup is sketched below.
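
For reference, a typical Virtualenv workflow (the environment name and path are illustrative):

$ pip3 install --user virtualenv
$ python3 -m virtualenv ~/tf-env
$ source ~/tf-env/bin/activate
$ pip install -r requirements.txt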

Launch Jupyter Notebook & TensorBoard

Go to the project directory and run the following commands:

$ jupyter notebook

In your browser, open http://localhost:8888/. In another terminal window, run:

$ tensorboard --logdir=./logs

In your browser, open http://localhost:6006.
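
The tensorboard command above reads event files from ./logs. Assuming training runs in the notebook, such logs can be written with the standard Keras TensorBoard callback; this tiny self-contained example (model and data are placeholders, not the notebook's) shows the mechanism:

import numpy as np
from tensorflow import keras

# Any Keras model can log training curves under ./logs for TensorBoard.
model = keras.Sequential([keras.layers.Dense(8, input_shape=(16,)),
                          keras.layers.Dense(16)])
model.compile(optimizer="adam", loss="mse")
x = np.random.rand(64, 16).astype("float32")
tb = keras.callbacks.TensorBoard(log_dir="./logs")
model.fit(x, x, epochs=2, callbacks=[tb], verbose=0)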

Get the data

Note: this step is not required to run the demo and can be skipped; the demo files are already preprocessed and included for convenience. Download the full dataset only if you want to run the original architecture.

Download the Netflix Prize dataset from here into your DeepRecommender folder, then run:

$ tar -xvf nf_prize_dataset.tar.gz
$ tar -xf download/training_set.tar
$ python ./netflix_data_convert.py training_set nf_data
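
The exact output format depends on netflix_data_convert.py. Assuming it writes (user, item, rating) triples as delimited text, the result could be inspected with pandas; the file name and separator below are guesses to adapt to the actual output in nf_data/:

import pandas as pd

# Hypothetical file name and tab separator: check the converter's
# actual output and adjust accordingly.
ratings = pd.read_csv("nf_data/train.txt", sep="\t",
                      names=["user", "item", "rating"])
print(ratings.head())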

Run the Code

  • Open the file Functional Model.ipynb in Jupyter Notebook.
  • Restart the kernel and clear all output.
  • Follow the steps in the notebook, executing the code snippets in order.
  • Check the results and plots in TensorBoard.

Note

The actual dataset is not used in this demo because of limited computing power on the user's end. The original dataset is large and forms a 311315 x 17736 user-movie matrix, which cannot be processed on a regular laptop.

We have reduced the dataset to 13118 x 1000 for demo purposes, scaling the movie dimension down by a factor k = 17736/1000 (about 17.7), although the smaller training set produces a higher error. The layer sizes have also been reduced proportionately to demonstrate the autoencoder's functionality.

Larger layer sizes perform better, but they defeat the purpose of an autoencoder, which is to compress the data.

The original dataset can be used to produce more accurate results, but it would require far more computation, memory, and time to process.
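
For illustration, here is one way such a reduction might be done; the repo's actual method may differ. This keeps the 1000 most-rated movies and pivots the rating triples into a user-movie matrix:

import pandas as pd

# Illustrative only: `ratings` is a DataFrame of (user, item, rating)
# triples, e.g. as loaded in the "Get the data" section above.
def reduce_to_top_items(ratings, n_items=1000):
    # Keep only the n_items most frequently rated movies.
    top = ratings["item"].value_counts().index[:n_items]
    kept = ratings[ratings["item"].isin(top)]
    # Pivot into a dense user x item matrix, 0 meaning "unrated".
    return kept.pivot_table(index="user", columns="item",
                            values="rating", fill_value=0)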

Results

It should be possible to achieve the following results (exact numbers will vary due to randomization):

Dataset      RMSE
Train        0.8580
Validation   1.6281
Test         1.5679

