
Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5)

This repo presents the implementation of the P5 large language model (LLM) for recommendation:

Paper: Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5)
Paper link: https://arxiv.org/pdf/2203.13366.pdf

A reorganized and simplified repo of P5 called OpenP5 is also available on GitHub, which is an open-source library for benchmarking foundation models for recommendation under the Pre-train, Personalized Prompt and Predict Paradigm (P5):

Paper: OpenP5: Benchmarking Foundation Models for Recommendation
Paper link: https://arxiv.org/pdf/2306.11134.pdf
GitHub link: https://github.com/agiresearch/OpenP5

Another relevant repo, on how to create item IDs for recommendation foundation models, is available here:

Paper: How to Index Item IDs for Recommendation Foundation Models
Paper link: https://arxiv.org/pdf/2305.06569.pdf
GitHub link: https://github.com/Wenyueh/LLM-RecSys-ID

Teaser

Introduction

We present a flexible and unified big foundation model for recommendation: the "Pretrain, Personalized Prompt, and Predict Paradigm" (P5). It unifies various recommendation tasks in a shared framework. In P5, all data -- such as user-item interactions, item metadata, and user reviews -- are converted to a common format: natural language sequences. Specifically, P5 learns different tasks with the same language modeling objective during pretraining. It thus serves as a foundation model for downstream recommendation tasks, allows easy integration with other modalities, and enables instruction-based recommendation via prompts. P5 advances recommender systems from shallow models to deep models to big models, and points toward a universal recommendation engine. With adaptive personalized prompts for different users, P5 is able to make predictions in a zero-shot or few-shot manner, largely reducing the need for extensive fine-tuning. We conduct experiments on several recommendation benchmarks to show the effectiveness of P5. To help advance future research on Recommendation as Language Processing (RLP), Personalized Foundation Models (PFM), and Universal Recommendation Engines (URE), the source code, datasets, prompts, and pretrained P5 models are released in this repository.
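As a minimal illustration of the core idea, an interaction record can be rendered as a natural-language (input, target) text pair that a sequence-to-sequence language model trains on. The template below is a hypothetical stand-in for the personalized prompts shipped in this repo, not the exact wording:

```python
# Sketch: turn a user-item interaction into a natural-language
# (input, target) pair, P5-style. The prompt template here is a
# hypothetical example, not one of the repo's actual prompts.
def to_prompt(user_id, history, next_item):
    """Build a sequential-recommendation prompt and its target text."""
    input_text = (
        f"User_{user_id} has purchased items {', '.join(history)} . "
        f"Predict the next item this user will interact with ."
    )
    target_text = next_item  # the model learns to generate the item token
    return input_text, target_text

src, tgt = to_prompt("23", ["item_7391", "item_190", "item_60"], "item_1025")
print(src)
print(tgt)
```

Because every task (sequential recommendation, rating prediction, review generation, etc.) is expressed this way, a single language modeling objective can cover all of them.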

Requirements

  • Python 3.9.7
  • PyTorch 1.10.1
  • transformers 4.2.1
  • tqdm
  • numpy
  • sentencepiece
  • pyyaml

Usage

  1. Clone this repo

    git clone https://github.com/jeykigung/P5.git
    
  2. Download the preprocessed data from this Google Drive link and put it into the data folder. If you would like to preprocess your own data, follow the Jupyter notebooks in the preprocess folder. Raw data can be downloaded from this Google Drive link and placed into the raw_data folder.

  3. Download pretrained checkpoints into the snap folder. If you would like to train your own P5 models, the snap folder will also be used to store your P5 checkpoints.

  4. Pretrain with scripts in scripts folder, such as

    bash scripts/pretrain_P5_base_beauty.sh 4
    

    Here, 4 means using 4 GPUs to conduct parallel pretraining.

  5. Evaluate with the example Jupyter notebooks in the notebooks folder. Before testing, create a soft link to the data folder inside the notebooks folder:

    cd notebooks
    ln -s ../data .
    

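The symlink step above is easy to get wrong. A quick sanity check of the folder layout the steps assume (data/, snap/, and a data link inside notebooks/; the commands below recreate it idempotently) might look like:

```shell
# Sanity-check the layout the notebooks expect.
mkdir -p data snap notebooks   # folders used throughout this README
cd notebooks
ln -sf ../data .               # -f: replace a stale link if one exists
cd ..
[ -e notebooks/data ] && echo "layout ok"
```

If "layout ok" is not printed, the notebooks will fail to find the preprocessed data.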
Pretrained Checkpoints

See CHECKPOINTS.md.

You can also explore P5 on the Hugging Face Hub (https://huggingface.co/makitanikaze/P5).

Citation

Please cite the following paper corresponding to this repository:

@inproceedings{geng2022recommendation,
  title={Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt \& Predict Paradigm (P5)},
  author={Geng, Shijie and Liu, Shuchang and Fu, Zuohui and Ge, Yingqiang and Zhang, Yongfeng},
  booktitle={Proceedings of the Sixteenth ACM Conference on Recommender Systems},
  year={2022}
}

Acknowledgements

VL-T5, PETER, and S3-Rec

Contributors

evison, jeykigung
