
zyang1580 / collm


The implementation for the work "CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation".

License: BSD 3-Clause "New" or "Revised" License

Python 60.20% Jupyter Notebook 39.80%

Contributors: zhangyang1588, zyang1580

collm's Issues

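Confusion about v1 and v2

In collm_pretrain_mf_ood.yaml you write run.mode:'v2' please not change it
but in collm_pretrain_mf_ood_amazon.yaml you write run.mode: 'v2' # stage1: v1,
Which one should actually be used?

Also, in stage 1, how do you ensure that the ID embedding is not inserted into the prompt? In the current code I found that stage 1 only freezes proj, while the ID embedding is in fact still inserted into the prompt. How should this be handled?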

About the code

I read your paper and think it is great. When will you release the code?

hot and cold datasets

Hello author, I noticed that your code does not seem to provide the hot and cold datasets mentioned in the paper. Are they missing? I look forward to you adding them.

What is the dataset config format

Hello, when I run your code on my server I hit a problem: it needs a dataset config file, and I can't find this file in the code.

The error occurs in the file minigpt4/datasets/builders/rec_pair_builder.py

The error says:
FileNotFoundError: [Errno 2] No such file or directory: './CoLLM/minigpt4/configs/datasets/movielens/default.yaml'

How can I write this file, or where can I get it?

Thanks!

ValueError: Input contains NaN.

When I run the program on Amazon Book, the loss becomes NaN once training reaches 11 epochs.
All settings are the defaults. The GPU is a single A100.

Can you give some suggestions?

Input contains NaN.

I followed your README step by step and used your preprocessed ml-1m dataset directly. Why do I get the error "Input contains NaN"?
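
A message with this wording is typically raised by input validation inside a metric routine (scikit-learn's checks use it) when the scores passed in contain NaN, which would fit a loss that has already diverged. A minimal sketch, assuming that is what happens here, of a guard that locates the first non-finite value before evaluation. `first_non_finite` and `assert_finite` are hypothetical helpers, not part of the CoLLM code:

```python
import math


def first_non_finite(values):
    """Return (index, value) of the first NaN or infinite entry, or None."""
    for index, value in enumerate(values):
        if not math.isfinite(value):
            return index, value
    return None


def assert_finite(name, values):
    """Raise early, naming the offending values, instead of failing inside a metric."""
    bad = first_non_finite(values)
    if bad is not None:
        raise ValueError(f"{name} has non-finite value {bad[1]!r} at index {bad[0]}")


# Hypothetical usage: plain floats stand in for per-batch losses or model scores.
assert_finite("train_loss", [0.69, 0.52, 0.48])      # passes silently
print(first_non_finite([0.69, float("nan"), 0.48]))  # reports index 1
```

Running such a check on the loss after each step may help narrow down whether the NaN first appears during training or only in the inputs to evaluation.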

Hyperparameters to reproduce the result of collab. model on amazon book dataset?

Hi! I've read the paper and found it very interesting, so I'm trying to reproduce it. However, I'm stuck at the very first step: training the base collab. model.
I used baseline_train_sasrec_amazon.py and baseline_train_mf_ood_amazon.py to train the SASRec and MF models separately, with the hyperparameters in the scripts unchanged (except batch_size: 10240 -> 1024 in baseline_train_sasrec_amazon.py; I thought it might be a typo, since the value is always 1024 in baseline_train_sasrec.py, baseline_train_mf_ood.py and baseline_train_mf_ood_amazon.py).
But my results are much lower than those reported in the paper. So I wonder whether the hyperparameters in the scripts are just experimental settings rather than the optimal values, or whether something I didn't notice is causing the gap.

The training logs produced by my run are as follows:

# SASRec
train_config: {'lr': 0.01, 'wd': 0.0001, 'embedding_size': 64, 'epoch': 5000, 'eval_epoch': 1, 'patience': 50, 'batch_size': 1024, 'maxlen': 20} 
best result: {'valid_auc': 0.6550170900138883, 'valid_uauc': 0, 'test_auc': 0.6478201601802253, 'test_uauc': 0, 'epoch': 64}
# MF
train_config: {'lr': 0.001, 'wd': 0.0001, 'embedding_size': 256, 'epoch': 5000, 'eval_epoch': 1, 'patience': 50, 'batch_size': 1024} 
best result: {'valid_auc': 0.5837625758627063, 'valid_uauc': 0.5173472295556202, 'test_auc': 0.5749810031561386, 'test_uauc': 0.5242115883441811, 'epoch': 10}

Thanks for your advice!

Update:
It seems that setting batch_size to 10240 makes sense for SASRec on the Amazon dataset. My result is close to the value reported in the paper after doing so.
For the MF model, weight_decay seems to be the key parameter: performance improves after I set it to 1e-5.

Minimum hardware to reproduce the work.

Hi! Congratulations on your work!

I would like to try reproducing your work and was wondering whether I could do it on more limited hardware, such as an RTX 3090 with 24 GB of VRAM.

In the LoRA Tuning section you say:
To launch the first stage training, run the following command. In our experiments, we use 2 A100.

If you could give me some guidelines (or some advice) I would really appreciate it.

Thank you very much in advance!

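A question about parameter settings in LoRA fine-tuning.

Hello!
How should the pretrained_path parameter be set for LoRA fine-tuning: to None, or to a file path such as 0925-OODv2_lgcn_book_best_model_d256lr-0.0001wd1e-07.pth? My understanding is that it should be None, because the first fine-tuning stage does not add the embedding, but the code does not run with that setting.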

Inconsistency between the ML-1M statistics in the paper and the released preprocessed dataset

The paper reports that the ML-1M dataset has 839 users and 3,256 items.

These statistics are inconsistent with the released dataset, as can be reproduced with the following script:

import pandas as pd

train_ = pd.read_pickle('train_ood2.pkl')
valid_ = pd.read_pickle('valid_ood2.pkl')
test_ = pd.read_pickle('test_ood2.pkl')

uids = set(train_.uid.unique()) | set(valid_.uid.unique()) | set(test_.uid.unique())
iids = set(train_.iid.unique()) | set(valid_.iid.unique()) | set(test_.iid.unique())

print(len(uids), len(iids)) # 838, 3255
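
Both counts differ from the paper by exactly one. One thing that may be worth checking, purely as an assumption that neither the paper nor the repository confirms, is whether the reported figures are `max id + 1` (the size an embedding table needs for 0-based ids) rather than the number of distinct ids. A small sketch with a hypothetical helper `id_stats` and toy ids:

```python
def id_stats(ids):
    """Return (number of distinct ids, max id + 1) for an iterable of integer ids."""
    ids = list(ids)
    distinct = len(set(ids))
    table_size = max(ids) + 1  # rows needed if ids are 0-based indices
    return distinct, table_size


# Toy data only: id 2 never occurs, so the two numbers differ by one.
print(id_stats([0, 1, 3, 3]))  # (3, 4)
```

Applied to the union of uid (or iid) values from the three splits, a gap between the two numbers would point to unused ids rather than missing data.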

CIE tuning

Can Hybrid Encoding and CIE tuning be viewed as a prompt tuning process? If so, does that mean this method can only be used with open-source models?

running time

Hello author, can your code run on a 3090? And roughly how long did training take on the two A100s you used?
