fajieyuan / sigir2020_peterrec

Universal User Representation Pre-training for Cross-domain Recommendation and User Profiling

Topics: transfer-learning, recommender-system, cold-start, user-profiling, recommendation, transfer, representation-learning, lifelong-learning, continual-learning


SIGIR2020_PeterRec

Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation

Posts: https://zhuanlan.zhihu.com/p/437671278

https://zhuanlan.zhihu.com/p/139048117

https://blog.csdn.net/abcdefg90876/article/details/109505669

https://blog.csdn.net/weixin_44259490/article/details/114850970

https://programmersought.com/article/36196143813/

https://zhuanlan.zhihu.com/p/430145630

🤗 New resources: four large-scale datasets for evaluating foundation / transferable / multi-modal / LLM recommendation models.

The PeterRec PyTorch code is here: https://github.com/yuangh-x/2022-NIPS-Tenrec



Please cite our paper if you use our code or datasets in your publication.

@article{yuan2020parameter,
  title={Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation},
  author={Yuan, Fajie and He, Xiangnan and Karatzoglou, Alexandros and Zhang, Liguang},
  journal={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2020}
}

PeterRec_cau_parallel.py: PeterRec with causal cnn and parallel insertion (see the sketch after this file list for what parallel vs. serial insertion means)

PeterRec_cau_serial.py: PeterRec with causal cnn and serial insertion

PeterRec_cau_serial_lambdafm.py: the same as PeterRec_cau_serial.py but with a LambdaFM-based negative sampler; it evaluates over all items rather than sampling 100 items for evaluation.

PeterRec_noncau_parallel.py: PeterRec with noncausal cnn and parallel insertion

PeterRec_noncau_serial.py: PeterRec with noncausal cnn and serial insertion

NextitNet_TF_Pretrain.py: pretraining with NextItNet [1] (i.e., causal cnn)

GRec_TF_Pretrain.py: pretraining with the encoder of GRec [2] (i.e., noncausal cnn)
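For orientation, the following is a minimal sketch of how a small "model patch" (MP) can be inserted either serially or in parallel into a pretrained dilated-CNN residual block. It is an illustration only, not the repository's exact code; the layer names, scope names, and the use of 'same' padding (the noncausal case) are assumptions.

    import tensorflow as tf

    def model_patch(x, channels, bottleneck=8, scope="mp"):
        # tiny bottleneck network; only these weights are updated during finetuning
        with tf.variable_scope(scope):
            h = tf.layers.dense(x, bottleneck, activation=tf.nn.relu, name="down")
            h = tf.layers.dense(h, channels, name="up")
        return h

    def residual_block(x, channels, dilation, insertion="serial", scope="block"):
        # pretrained dilated convolution; its weights stay frozen during finetuning
        # (padding="same" is the noncausal case; the causal variant pads only on the left)
        with tf.variable_scope(scope):
            y = tf.layers.conv1d(x, channels, kernel_size=3, padding="same",
                                 dilation_rate=dilation, name="dilated_conv")
            y = tf.nn.relu(y)
            if insertion == "serial":
                y = y + model_patch(y, channels)   # serial: MP applied to the conv output
            else:
                y = y + model_patch(x, channels)   # parallel: MP applied to the block input
            return x + y                           # original residual connection

During finetuning only the MP parameters (plus the new target-task output layer) are trained, which is what makes the transfer parameter-efficient.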

Demo Steps:

You can directly run our code:

First: python NextitNet_TF_Pretrain_topk.py (NextitNet_TF_Pretrain.py is slower than NextitNet_TF_Pretrain_topk.py because it computes a full softmax over all items during evaluation.)

After convergence (you can stop training once the pretrained model has been saved!)

Second: python PeterRec_cau_serial.py (or PeterRec_cau_serial_lambdafm.py)

Note that you can use either of two evaluation methods: sampled top-N as in our paper (i.e., PeterRec_cau_serial.py) or evaluation over all items (i.e., PeterRec_cau_serial_lambdafm.py). Be careful: if you use PeterRec_cau_serial_lambdafm.py, you are optimizing top-N metrics, so you must evaluate prediction accuracy over all items (as shown in that file) rather than with sampled metrics, since sampled metrics are more consistent with AUC than with true top-N. If you use BPR or CE loss with a random negative sampler, you should use sampled metrics, because these losses with a random sampler directly optimize AUC rather than top-N metrics. See the recent paper "On Sampled Metrics for Item Recommendation" for more details. In short, sampled metrics correspond to AUC rather than true top-N; BPR optimizes AUC, while LambdaFM optimizes true top-N metrics (e.g., MRR@N, NDCG@N). With the correct evaluation method, all insights and conclusions in our paper hold.
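To make the distinction concrete, here is a rough sketch of the two evaluation modes for a single test case (illustrative only; variable names are assumptions and this is not the scripts' actual evaluation code):

    import numpy as np

    def evaluate_one(scores, target, num_items, n=5, num_negatives=None):
        # scores: model score for every item; target: the ground-truth item id
        if num_negatives is None:
            candidates = np.arange(num_items)                  # all-item (true top-N) evaluation
        else:
            # sampled evaluation, e.g. 100 candidates; for simplicity the sampled
            # negatives are not filtered against the target here
            negatives = np.random.choice(num_items, num_negatives, replace=False)
            candidates = np.append(negatives, target)
        rank = 1 + int(np.sum(scores[candidates] > scores[target]))
        hit = 1.0 if rank <= n else 0.0                        # HR@n
        mrr = 1.0 / rank if rank <= n else 0.0                 # MRR@n
        return hit, mrr

Averaging these over the test set with num_negatives=None gives true top-N metrics, while num_negatives=99 gives the sampled metrics discussed above.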

or

First: python GRec_TF_Pretrain_topk.py

Second: python PeterRec_noncau_parallel.py

Running our paper:

Replace the demo dataset with our public datasets (for both pretraining and finetuning):

You will reproduce the results reported in our paper using the paper's settings, including learning rate, embedding size, dilations, batch size, etc. Note that the reported results are based on the same hyper-parameter settings for fair comparison and ablation tests; you may further tune hyper-parameters to obtain the best performance. For example, we use 0.001 as the learning rate during finetuning, but you may find that 0.0001 performs better (all insights in the paper remain consistent). There is also room for other improvements, such as the negative sampling used for finetuning. For simplicity we implement a very basic uniform sampler; you can use a more advanced sampler such as LambdaFM (LambdaFM: Learning Optimal Ranking with Factorization Machines Using Lambda Surrogates), i.e., PeterRec_cau_serial_lambdafm.py. Similarly, our pretraining network (e.g., NextitNet_TF_Pretrain.py) also employs a basic sampling function in TF; you can replace it with your own if you are dealing with hundreds of millions of items in a very large-scale system.
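The difference between the basic sampler and a LambdaFM-style sampler can be sketched as follows (illustrative only; the exact sampler in PeterRec_cau_serial_lambdafm.py may differ):

    import numpy as np

    def uniform_negatives(num_items, k):
        # basic sampler used in the demo scripts: every item is equally likely
        return np.random.randint(0, num_items, size=k)

    def lambdafm_style_negatives(item_popularity, k, rho=0.3):
        # LambdaFM-style static sampler: oversample items near the head of the
        # popularity ranking, which yields more informative negatives for top-N
        # optimization; rho controls how concentrated the sampling is
        order = np.argsort(-item_popularity)             # item ids sorted by popularity
        ranks = np.arange(1, len(order) + 1)
        probs = np.exp(-ranks / (len(order) * rho))      # probability decays with popularity rank
        probs /= probs.sum()
        return order[np.random.choice(len(order), size=k, p=probs)]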

Dataset (desensitized) links:

Recommendation Dataset for pretraining, transfer learning and user representation learning:
ColdRec2: https://drive.google.com/open?id=1OcvbBJN0jlPTEjE0lvcDfXRkzOjepMXH
ColdRec1: https://drive.google.com/open?id=1N7pMXLh8LkSYDX30-zId1pEMeNDmA7t6
    
These datasets can be used for recommender-system pretraining, transfer learning, cross-domain recommendation, cold-start recommendation, user representation learning, self-supervised learning, and related tasks.

Note that we provide the original dataset used in the paper as well as several preprocessed datasets for an easy start. For simplicity, we provide one source dataset and one target dataset per task; in practice it is recommended to pretrain on a single source dataset and use it to serve all target tasks (make sure your source dataset covers all ID indices in the target task).
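A quick sanity check for this ID-coverage requirement might look like the following (a hypothetical helper; how you extract the item IDs depends on your data format):

    def check_id_coverage(pretrain_item_ids, finetune_item_ids):
        # every item id appearing in the finetuning (target) data should already exist
        # in the pretraining (source) vocabulary, otherwise its embedding was never pretrained
        missing = set(finetune_item_ids) - set(pretrain_item_ids)
        if missing:
            print("WARNING: %d target-task item ids never appear in the source data" % len(missing))
        return len(missing) == 0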

In fact, the ColdRec2 dataset contains both clicking and liking actions. We provide the following dataset, in which clicking and liking data are separated, for use in future research.

Dataset (desensitized) links:

Transfer Learning Recommendation Dataset:
ColdRec2 (clicking and liking data are separated): https://drive.google.com/file/d/1imhHUsivh6oMEtEW-RwVc4OsDqn-xOaP/view?usp=sharing

Recommendation settings (be careful!)

Training will be much slower if 'eval_iter' is smaller, since it controls how often evaluation is performed. Convergence may take only 1 or 2 iterations.

Also, please change the number of batches you want to evaluate; we evaluate only 20 batches as a demo, and you may want to increase this to, e.g., 2000.

NextitNet_TF_Pretrain_topk.py

    parser.add_argument('--eval_iter', type=int, default=10000,
                        help='Sample generator output every x steps')
    parser.add_argument('--save_para_every', type=int, default=10000,
                        help='save model parameters every x steps')
    parser.add_argument('--datapath', type=str, default='Data/Session/coldrec2_pre.csv',
                        help='data path')
    model_para = {
        'item_size': len(items),
        'dilated_channels': 64,  # note: in the paper we use 256
        'dilations': [1,4,1,4,1,4,1,4,],  # note: 1 4 stands for 1 2 4 8
        'kernel_size': 3,
        'learning_rate':0.001,
        'batch_size':32,  # you can try 32, 64, 128, 256, etc.
        'iterations':5,  # you can stop pretraining once performance on the testing set stops improving; 5 iterations may not be needed
        'is_negsample':True  # False denotes no negative sampling (full softmax)
    }
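For reference, 'is_negsample': True means the pretraining loss is computed against a small set of sampled negatives rather than a full softmax over all items. One common way to express this in TF 1.x is sketched below (an illustration under assumed tensor names, not necessarily the script's exact implementation):

    import tensorflow as tf

    def build_sampled_loss(hidden, labels, item_size, channels, num_sampled=100):
        # hidden: [batch*seq_len, channels] final hidden states; labels: [batch*seq_len] item ids
        softmax_w = tf.get_variable("softmax_w", [item_size, channels])
        softmax_b = tf.get_variable("softmax_b", [item_size])
        losses = tf.nn.sampled_softmax_loss(
            weights=softmax_w, biases=softmax_b,
            labels=tf.reshape(labels, [-1, 1]), inputs=hidden,
            num_sampled=num_sampled, num_classes=item_size)
        return tf.reduce_mean(losses)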

PeterRec settings (e.g., PeterRec_cau_serial.py / PeterRec_cau_serial_lambdafm.py):

    parser.add_argument('--eval_iter', type=int, default=500,
                        help='Sample generator output every x steps')
    parser.add_argument('--save_para_every', type=int, default=500,
                        help='save model parameters every x steps')
    parser.add_argument('--datapath', type=str, default='Data/Session/coldrec2_fine.csv',
                        help='data path')
    model_para = {
        'item_size': len(items),
        'target_item_size': len(targets),
        'dilated_channels': 64,
        'cardinality': 1,  # 1 means ResNet; other values mean ResNeXt (performs similarly but is slower)
        'dilations': [1,4,1,4,1,4,1,4,],
        'kernel_size': 3,
        'learning_rate':0.0001,
        'batch_size':512,  # do not use batch_size=1: np.squeeze below would remove a needed dimension
        'iterations': 20,  # not a default setup; set it for your own dataset by watching performance on your testing set
        'has_positionalembedding': args.has_positionalembedding
    }
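During finetuning, only the model-patch and the new target-task parameters are updated, while the pretrained backbone stays fixed. A minimal TF 1.x sketch of that idea (the variable-scope names are assumptions, not the repository's exact ones):

    import tensorflow as tf

    def build_finetune_op(loss, learning_rate):
        # update only the model-patch and target-task softmax variables; the pretrained
        # backbone variables are excluded from var_list and therefore stay frozen
        finetune_vars = [v for v in tf.trainable_variables()
                         if "mp_" in v.name or "target_softmax" in v.name]
        optimizer = tf.train.AdamOptimizer(learning_rate)
        return optimizer.minimize(loss, var_list=finetune_vars)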
  

Environments

  • TensorFlow (version: 1.7.0)
  • Python 2.7

Related work:

[1]
@inproceedings{yuan2019simple,
  title={A simple convolutional generative network for next item recommendation},
  author={Yuan, Fajie and Karatzoglou, Alexandros and Arapakis, Ioannis and Jose, Joemon M and He, Xiangnan},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  pages={582--590},
  year={2019}
}
[2]
@inproceedings{yuan2020future,
  title={Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation},
  author={Yuan, Fajie and He, Xiangnan and Jiang, Haochuan and Guo, Guibing and Xiong, Jian and Xu, Zhezhao and Xiong, Yilin},
  booktitle={Proceedings of The Web Conference 2020},
  pages={303--313},
  year={2020}
}
[3]
@article{sun2020generic,
  title={A Generic Network Compression Framework for Sequential Recommender Systems},
  author={Sun, Yang and Yuan, Fajie and Yang, Ming and Wei, Guoao and Zhao, Zhou and Liu, Duo},
  journal={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  year={2020}
}
[4]
@inproceedings{yuan2021one,
  title={One person, one model, one world: Learning continual user representation without forgetting},
  author={Yuan, Fajie and Zhang, Guoxiao and Karatzoglou, Alexandros and Jose, Joemon and Kong, Beibei and Li, Yudong},
  booktitle={Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages={696--705},
  year={2021}
}

Hiring

If you want to work with Fajie (https://fajieyuan.github.io/), please contact him by email: [email protected]. His lab is recruiting visiting students, interns, research assistants, postdocs (450,000-550,000 Chinese yuan per year), and research scientists. You can also contact him if you want to pursue a PhD degree at Westlake University. Feel free to reach out (WeChat: wuxiangwangyuan) if you have ideas or papers for collaboration; he is open to various collaborations. Fajie Yuan's group at Westlake University has long-term openings for research assistants, PhD students, postdocs, visiting scholars, and research scientists in recommender systems and bioinformatics (especially protein-related research).


sigir2020_peterrec's Issues

Pretrain performance?

When pretraining on the ColdRec2 dataset, what should a normal HR@5 be at convergence?
'mrr_5:', 0.030770833333333344, 'hit_5:', 0.05375, 'ndcg_5:', 0.03643626106347057
Does this range of values indicate that the pretraining was unsuccessful?

Very slow training with the example code on a single GPU

Hi, I ran your dataset on a single Tesla P40 using the parameter settings from the paper:
model_para = { 'item_size': len(items), 'dilated_channels': 256, 'dilations': [1, 2, 4, 8, 1, 2, 4, 8, 1, 2, 4, 8, 1, 2, 4, 8], 'kernel_size': 3, 'learning_rate':0.001, 'batch_size':32, 'iterations':400, 'is_negsample':True }
Training is extremely slow: one iteration takes 416 minutes, so the default 400 iterations would take forever. I am using the coldrec2_pre.csv dataset. Have you seen this slow-training problem before?
-------------------------------------------------------train1
LOSS: 5.77672100067 ITER: 0 BATCH_NO: 169 STEP:170 total_batches:23006
TIME FOR BATCH 1.08627700806
TIME FOR ITER (mins) 416.514814123

Reproducing causal serial on LifeEST

Hi, is there any problem with the original PeterRec_noncau_parallel_classifier.py? I remember everything was tested at the time.

Using PyTorch and taking PeterRec_noncau_parallel_classifier.py as a reference, I wrote a causal_serial_classifier and ran experiments on the LifeEST data, but I cannot reproduce the results of Table 2 in the paper. Please point out what I got wrong; my reproduction is as follows:

(1) Pretraining the NextitNet model
Code: based on NextitNet_TF_Pretrain_topk.py
Dataset: lifestatus_pretrain_desent.csv
Parameters: lr 0.001, batch size 32, 6 epochs, split_rate 0.1, embedding dim 256, 99 negative samples, sampled softmax
'cardinality': 1, 'dilations': [1, 4, 1, 4, 1, 4, 1, 4, ]

(2) Finetuning
Dataset: lifestatus_finetune_desent.csv
Data split: 70% training, 3% validation, 27% test
model_patch insertion: 2 MP serial
Training procedure: the same as PeterRec_noncau_parallel_classifier.py, except that negative_samples is set to 99
Epochs: 80

(3) Programming language: PyTorch
(4) Reproduced result: HR@5 0.538 (0.610 in the paper)

I have spent a long time on this without success and would appreciate your guidance. If convenient, we could discuss it over WeChat (my WeChat: 875526037). Thank you very much.
On RecCold I was able to reproduce the results, and they are slightly better than those in the paper.

Reproducing causal serial on LifeEST

Has anyone reproduced the causal serial results on LifeEST? I have tried all kinds of approaches without success.

PS: I rewrote PeterRec_noncau_parallel_classifier.py in PyTorch, only changing the negative-sample count to 99.

Algorithm online practice

Hi Prof Yuan~

Do you have any suggestions on how to deploy the finetuned model for online applications for best efficiency and accuracy?

Question about the serial finetuning network structure

[figure 1]
Figure 1 above shows the 2mp-serial network structure from the paper.
[figures 2-3]
However, in the code implementation (figures 2-3 above), the return of get_mp is input_ + output rather than output, so the reproduced structure is as follows:
[figure 4]
My question is: should the right-hand branch in the reproduced figure (figure 4), i.e. the output coming from model_patch + output, exist at all?
Thanks!

The finetuned model gives identical predictions

Hello, I tested a reproduced model on the coldrec2 data. In the pretraining stage, mrr@5 = 0.0407;
in the finetuning stage, mrr@5 = 0.3044, and it basically stops converging after two epochs.
Then I used the finetuned model to make top-k predictions on the pretraining data and found that the top-k results are identical regardless of the input. Below are the top-50 results for several sequences:
[figure]
What could be causing this, given that the finetuning metric is already close to the 0.33 reported in the paper?
Many thanks.

Data issue for gender prediction

Hello, in the TenRec dataset the gender label can be inconsistent across domains; for example, gender=1 in the source domain while 0, 1, and 2 all appear in the target domain. How did you handle this situation?

Is it a bug?

Hi Prof Yuan~

In the evaluation part of PeterRec_cau_serial.py line 243, the rank of the retrieved items is given by
rank = predictmap.get(negtive_samples)
where negtive_samples is 'the number of negative examples for each positive one'.

Is it a bug? How can negtive_samples be the key of a retrieved item?
