zyang1580 / collm

The implementation for the work "CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation".

License: BSD 3-Clause "New" or "Revised" License

Python 60.20% Jupyter Notebook 39.80%

collm's Issues

Confusion about v1 and v2

In collm_pretrain_mf_ood.yaml you wrote run.mode:'v2' please not change it
but in collm_pretrain_mf_ood_amazon.yaml you wrote run.mode: 'v2' # stage1: v1,
Which mode should actually be used?

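Also, how do you ensure in stage 1 that the ID embedding is not inserted into the prompt? In the current code, stage 1 appears only to freeze proj, while the ID embedding is still inserted into the prompt. How exactly should this be handled?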

About the code

I read your paper and I feel that your paper is great. When will you release the code?

hot and cold datasets

Hello author, I noticed that your code does not seem to provide the hot and cold datasets mentioned in the paper. Are they missing? I look forward to your adding them.

What is the dataset config format

Hello, when I run your code on my server I hit a problem: it needs a dataset config file, and I can't find this file in the code.

The error occurs in the file minigpt4/datasets/builders/rec_pair_builder.py

The error message is:
FileNotFoundError: [Errno 2] No such file or directory: './CoLLM/minigpt4/configs/datasets/movielens/default.yaml'

How can I write this file, or where can I get it?

Thanks!

where is the trained version of stage 1?

https://xxxxx/. There is no pretrained model.

ValueError: Input contains NaN.

When I train on Amazon Book, the loss becomes NaN by epoch 11.
All settings are left at their defaults, and the run uses a single A100 GPU.

Can you give some suggestions?

Input contains NaN.

Following your README step by step, using the dataset directly from your preprocessed ml-1m file, why does it show the error "Input contains NaN"?
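
For the "Input contains NaN" reports above, one way to narrow down the cause is to check losses or predicted scores for non-finite values before they reach the metric computation. The following is a minimal sketch using only the standard library; the function name first_non_finite and the sample loss values are hypothetical and are not part of the CoLLM code.

```python
import math

def first_non_finite(values):
    """Return the index of the first NaN/inf entry, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None

# Hypothetical per-epoch losses; the last entry imitates a diverged run.
losses = [0.69, 0.61, 0.58, float("nan")]

bad = first_non_finite(losses)
if bad is not None:
    print(f"non-finite value at index {bad}; stop before computing metrics")
```

Running such a check every epoch shows whether the NaN first appears in the training loss or only later in the evaluation scores.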

No such file or directory: 'prompts/rec_alignment.txt'

Hi there! I'm trying to train the collm_sasrec version, but I can't find the prompt file named rec_alignment.txt. Is it 'collm_movie.txt' or 'collm_amazon.txt'?

Hyperparameters to reproduce the result of collab. model on amazon book dataset?

Hi! I've read the paper and found it very interesting, so I'm trying to reproduce it. However, I'm stuck at the very first step, training the base collab. model.
I used baseline_train_sasrec_amazon.py and baseline_train_mf_ood_amazon.py to train the SASRec and MF models separately, with the hyperparameters in the scripts unchanged (except batch_size: 10240 -> 1024 in baseline_train_sasrec_amazon.py; I thought it might be a typo, since the value is always 1024 in baseline_train_sasrec.py, baseline_train_mf_ood.py and baseline_train_mf_ood_amazon.py).
But my results are much lower than those reported in the paper. So I wonder whether the hyperparameters in the scripts were only used for experiments and are not the optimal values, or whether something I didn't notice is causing the gap.

The training logs produced by my run are as follows:

# SASRec
train_config: {'lr': 0.01, 'wd': 0.0001, 'embedding_size': 64, 'epoch': 5000, 'eval_epoch': 1, 'patience': 50, 'batch_size': 1024, 'maxlen': 20} 
best result: {'valid_auc': 0.6550170900138883, 'valid_uauc': 0, 'test_auc': 0.6478201601802253, 'test_uauc': 0, 'epoch': 64}
# MF
train_config: {'lr': 0.001, 'wd': 0.0001, 'embedding_size': 256, 'epoch': 5000, 'eval_epoch': 1, 'patience': 50, 'batch_size': 1024} 
best result: {'valid_auc': 0.5837625758627063, 'valid_uauc': 0.5173472295556202, 'test_auc': 0.5749810031561386, 'test_uauc': 0.5242115883441811, 'epoch': 10}

Thanks for your advice!

Update:
It seems that setting batch_size to 10240 does make sense for SASRec on the Amazon dataset. My result is close to the value reported in the paper after doing so.
For the MF model, weight_decay seems to be the key parameter; the performance improves after I set it to 1e-5.
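
To make the two changes in the update explicit, here is a small sketch that starts from the train_config values in the logs above and applies the overrides. The helper with_overrides is hypothetical, and the 'wd' key is assumed to be the weight_decay setting mentioned in the update; the baseline scripts may organize their settings differently.

```python
# Values copied from the training logs above.
sasrec_config = {'lr': 0.01, 'wd': 0.0001, 'embedding_size': 64, 'epoch': 5000,
                 'eval_epoch': 1, 'patience': 50, 'batch_size': 1024, 'maxlen': 20}
mf_config = {'lr': 0.001, 'wd': 0.0001, 'embedding_size': 256, 'epoch': 5000,
             'eval_epoch': 1, 'patience': 50, 'batch_size': 1024}

def with_overrides(base, **overrides):
    """Return a copy of base with overrides applied; reject unknown keys."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    return {**base, **overrides}

# Changes reported in the update: larger batch for SASRec, smaller weight decay for MF.
sasrec_updated = with_overrides(sasrec_config, batch_size=10240)
mf_updated = with_overrides(mf_config, wd=1e-5)

print(sasrec_updated['batch_size'], mf_updated['wd'])
```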

Minimum hardware to reproduce the work.

Hi! Congratulations on your work!

I would like to try reproducing your work and was wondering whether I could do it with more limited hardware, such as an RTX 3090 with 24 GB of VRAM.

In the LoRA Tuning section you say:
To launch the first stage training, run the following command. In our experiments, we use 2 A100.

If you could give me some guidelines (or some advice) I would really appreciate it.

Thank you very much in advance!

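Question about parameter settings in LoRA tuning.

Hello!
How should the pretrained_path parameter be set during LoRA tuning: to None, or to a file path such as 0925-OODv2_lgcn_book_best_model_d256lr-0.0001wd1e-07.pth? My understanding is that it should be None, because the first round of tuning does not include the embedding, but the code does not run with that setting.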

Inconsistency of ML-1M statistics in paper and released preprocessed dataset

The paper reports that the ML-1M dataset has 839 users and 3,256 items.

These statistics are inconsistent with the released dataset, as the following script shows:

import pandas as pd

train_ = pd.read_pickle('train_ood2.pkl')
valid_ = pd.read_pickle('valid_ood2.pkl')
test_ = pd.read_pickle('test_ood2.pkl')

uids = set(train_.uid.unique()) | set(valid_.uid.unique()) | set(test_.uid.unique())
iids = set(train_.iid.unique()) | set(valid_.iid.unique()) | set(test_.iid.unique())

print(len(uids), len(iids)) # 838, 3255
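
Both counts differ from the paper by exactly one. One possible explanation, which is only a guess and is not confirmed by the paper or the repository, is that the reported numbers are derived from the largest id plus one rather than from the number of distinct ids. The sketch below shows how the two ways of counting can differ; the helper id_stats and the toy ids are hypothetical.

```python
def id_stats(ids):
    """Summarize an id column: distinct count plus smallest and largest id."""
    unique = set(ids)
    return {"n_unique": len(unique), "min_id": min(unique), "max_id": max(unique)}

# Toy ids starting at 0, where id 2 never appears in any split.
toy_uids = [0, 1, 1, 3, 4]

stats = id_stats(toy_uids)
print(stats["n_unique"])    # number of distinct ids
print(stats["max_id"] + 1)  # size an embedding table indexed by id would need
```

Applying the same comparison to the uid and iid columns of the released pickles would show whether the ids are contiguous and whether this accounts for the gap.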