rucaibox / crslab

CRSLab is an open-source toolkit for building Conversational Recommender System (CRS).

Home Page: https://github.com/RUCAIBox/CRSLab

License: MIT License

Python 100.00%
conversational-recommendation pytorch graph-neural-network pretrained-models human-machine-interaction deep-learning dialog-system recommender-system conversation-system recommendation text-generation knowledge-graph

crslab's Introduction

CRSLab

Paper | Docs | Chinese version

CRSLab is an open-source toolkit for building Conversational Recommender Systems (CRS). It is built on Python and PyTorch. CRSLab has the following highlights:

  • Comprehensive benchmark models and datasets: We have integrated 6 commonly used datasets and 18 models, including graph neural network and pre-trained models such as R-GCN, BERT and GPT-2. We have preprocessed these datasets to support these models and released them for download.
  • Extensive and standard evaluation protocols: We support a series of widely adopted evaluation protocols for testing and comparing different CRSs.
  • General and extensible structure: We designed a general and extensible structure that unifies various conversational recommendation datasets and models and integrates built-in interfaces and functions for quick development.
  • Easy to get started: We provide a simple yet flexible configuration so that new researchers can get started with the library quickly.
  • Human-machine interaction interfaces: We provide flexible human-machine interaction interfaces for researchers to conduct qualitative analysis.

Figure 1: The overall framework of CRSLab

Installation

CRSLab works with the following operating systems:

  • Linux
  • Windows 10
  • macOS

CRSLab requires Python 3.7 or later.

CRSLab requires torch 1.8. To use CRSLab with a GPU, ensure that your CUDA or cudatoolkit version is 10.2 or later, and use one of the combinations shown in this Link so that PyTorch Geometric works correctly.

Install PyTorch

Use the PyTorch Locally Installation or Previous Versions Installation commands to install PyTorch. For example, on Linux and Windows 10:

# CUDA 10.2
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch

# CUDA 11.1
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

# CPU Only
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cpuonly -c pytorch

If you want to use CRSLab with GPU, make sure the following command prints True after installation:

$ python -c "import torch; print(torch.cuda.is_available())"
>>> True

Install PyTorch Geometric

Ensure that at least PyTorch 1.8.0 is installed:

$ python -c "import torch; print(torch.__version__)"
>>> 1.8.0

Find the CUDA version PyTorch was installed with:

$ python -c "import torch; print(torch.version.cuda)"
>>> 11.1

For Linux:

Install the relevant packages:

conda install pyg -c pyg

For others:

Check PyG installation documents to install the relevant packages.

Install CRSLab

You can install from pip:

pip install crslab

OR install from source:

git clone https://github.com/RUCAIBox/CRSLab && cd CRSLab
pip install -e .

Quick-Start

With the source code, you can try the library using the provided script, which runs on CPU by default:

python run_crslab.py --config config/crs/kgsf/redial.yaml

The system performs data preprocessing, then training, validation and testing of each model in turn, and finally reports the evaluation results of the specified models.

If you want to save pre-processed datasets and training results of models, you can use the following command:

python run_crslab.py --config config/crs/kgsf/redial.yaml --save_data --save_system

In summary, run_crslab.py accepts the following arguments:

  • --config or -c: relative path to the configuration file (YAML).
  • --gpu or -g: GPU id(s) to use; multiple GPUs are supported. Defaults to CPU (-1).
  • --save_data or -sd: save the pre-processed dataset.
  • --restore_data or -rd: restore the pre-processed dataset from file.
  • --save_system or -ss: save the trained system.
  • --restore_system or -rs: restore the trained system from file.
  • --debug or -d: use the validation dataset to debug your system.
  • --interact or -i: interact with your system instead of training.
  • --tensorboard or -tb: enable TensorBoard to monitor training performance.
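As a rough illustration of how the flags listed above fit together, the sketch below rebuilds them with argparse. It is not the actual run_crslab.py: the argument types, the string default for --gpu and the help texts are assumptions made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="run_crslab.py")
    parser.add_argument("-c", "--config", type=str,
                        help="relative path to the YAML configuration file")
    # Assumption: GPU ids are passed as a comma-separated string, "-1" meaning CPU.
    parser.add_argument("-g", "--gpu", type=str, default="-1",
                        help="GPU id(s) to use, -1 for CPU")
    switches = [
        ("-sd", "--save_data", "save the pre-processed dataset"),
        ("-rd", "--restore_data", "restore the pre-processed dataset"),
        ("-ss", "--save_system", "save the trained system"),
        ("-rs", "--restore_system", "restore the trained system"),
        ("-d", "--debug", "debug on the validation dataset"),
        ("-i", "--interact", "interact with the system instead of training"),
        ("-tb", "--tensorboard", "enable TensorBoard monitoring"),
    ]
    for short, long, text in switches:
        # Every switch is a plain on/off flag that takes no value.
        parser.add_argument(short, long, action="store_true", help=text)
    return parser


args = build_parser().parse_args(
    ["--config", "config/crs/kgsf/redial.yaml", "--save_data", "--save_system"]
)
print(args.config, args.gpu, args.save_data, args.save_system, args.restore_data)
```

Because the on/off switches are declared with store_true, they take no value; any flag that is not passed stays False.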

Models

In CRSLab, we unify the task description of conversational recommendation into three sub-tasks: recommendation (recommend user-preferred items), conversation (generate proper responses) and policy (select proper interactive actions). The recommendation and conversation sub-tasks are the core of a CRS and have been studied in most works. The policy sub-task is required by more recent works, in which the CRS interacts with users through a purposeful strategy. In the first release, we have implemented 18 models in four categories: CRS model, Recommendation model, Conversation model and Policy model.

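Category | Model | Graph Neural Network? | Pre-training Model?
CRS Model | ReDial | × | ×
CRS Model | KBRD | √ | ×
CRS Model | KGSF | √ | ×
CRS Model | TG-ReDial | × | √
CRS Model | INSPIRED | × | √
Recommendation model | Popularity | × | ×
Recommendation model | GRU4Rec | × | ×
Recommendation model | SASRec | × | ×
Recommendation model | TextCNN | × | ×
Recommendation model | R-GCN | √ | ×
Recommendation model | BERT | × | √
Conversation model | HERD | × | ×
Conversation model | Transformer | × | ×
Conversation model | GPT-2 | × | √
Policy model | PMI | × | ×
Policy model | MGCG | × | ×
Policy model | Conv-BERT | × | √
Policy model | Topic-BERT | × | √
Policy model | Profile-BERT | × | √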

Among them, the four CRS models integrate the recommendation model and the conversation model to improve each other, while others only specify an individual task.

For the Recommendation, Conversation and Policy models, we have implemented the following commonly used automatic evaluation metrics:

Category Metrics
Recommendation Metrics Hit@{1, 10, 50}, MRR@{1, 10, 50}, NDCG@{1, 10, 50}
Conversation Metrics PPL, BLEU-{1, 2, 3, 4}, Embedding Average/Extreme/Greedy, Distinct-{1, 2, 3, 4}
Policy Metrics Accuracy, Hit@{1,3,5}
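For reference, when each test case has a single ground-truth item, the three recommendation metrics reduce to simple functions of that item's rank. The sketch below assumes this single-target setting; the function names are hypothetical and this is not CRSLab's evaluator code.

```python
import math
from typing import Sequence


def hit_at_k(ranked: Sequence[int], target: int, k: int) -> float:
    """1.0 if the target appears among the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def mrr_at_k(ranked: Sequence[int], target: int, k: int) -> float:
    """Reciprocal of the target's 1-based rank within the top-k, else 0.0."""
    top = list(ranked[:k])
    return 1.0 / (top.index(target) + 1) if target in top else 0.0


def ndcg_at_k(ranked: Sequence[int], target: int, k: int) -> float:
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    top = list(ranked[:k])
    return 1.0 / math.log2(top.index(target) + 2) if target in top else 0.0


ranked_items = [7, 3, 9, 1]   # model output, best first
target_item = 9               # ground-truth item, ranked third
for k in (1, 10):
    print(k,
          hit_at_k(ranked_items, target_item, k),
          mrr_at_k(ranked_items, target_item, k),
          ndcg_at_k(ranked_items, target_item, k))
```

Dataset-level scores would then be the mean of these per-sample values.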

Datasets

We have collected and preprocessed 6 commonly-used human-annotated datasets, and each dataset was matched with proper KGs as shown below:

Dataset Dialogs Utterances Domains Task Definition Entity KG Word KG
ReDial 10,006 182,150 Movie -- DBpedia ConceptNet
TG-ReDial 10,000 129,392 Movie Topic Guide CN-DBpedia HowNet
GoRecDial 9,125 170,904 Movie Action Choice DBpedia ConceptNet
DuRecDial 10,200 156,000 Movie, Music Goal Plan CN-DBpedia HowNet
INSPIRED 1,001 35,811 Movie Social Strategy DBpedia ConceptNet
OpenDialKG 13,802 91,209 Movie, Book Path Generate DBpedia ConceptNet

Performance

We have trained and tested the integrated models on the TG-ReDial dataset, which is split into training, validation and test sets with a ratio of 8:1:1. For each conversation, we start from the first utterance and let the model generate reply utterances or recommendations in turn. We perform the evaluation on the three sub-tasks.

Recommendation Task

Model Hit@1 Hit@10 Hit@50 MRR@1 MRR@10 MRR@50 NDCG@1 NDCG@10 NDCG@50
SASRec 0.000446 0.00134 0.0160 0.000446 0.000576 0.00114 0.000445 0.00075 0.00380
TextCNN 0.00267 0.0103 0.0236 0.00267 0.00434 0.00493 0.00267 0.00570 0.00860
BERT 0.00722 0.00490 0.0281 0.00722 0.0106 0.0124 0.00490 0.0147 0.0239
KBRD 0.00401 0.0254 0.0588 0.00401 0.00891 0.0103 0.00401 0.0127 0.0198
KGSF 0.00535 0.0285 0.0771 0.00535 0.0114 0.0135 0.00535 0.0154 0.0259
TG-ReDial 0.00793 0.0251 0.0524 0.00793 0.0122 0.0134 0.00793 0.0152 0.0211

Conversation Task

Model BLEU@1 BLEU@2 BLEU@3 BLEU@4 Dist@1 Dist@2 Dist@3 Dist@4 Average Extreme Greedy PPL
HERD 0.120 0.0141 0.00136 0.000350 0.181 0.369 0.847 1.30 0.697 0.382 0.639 472
Transformer 0.266 0.0440 0.0145 0.00651 0.324 0.837 2.02 3.06 0.879 0.438 0.680 30.9
GPT2 0.0858 0.0119 0.00377 0.0110 2.35 4.62 8.84 12.5 0.763 0.297 0.583 9.26
KBRD 0.267 0.0458 0.0134 0.00579 0.469 1.50 3.40 4.90 0.863 0.398 0.710 52.5
KGSF 0.383 0.115 0.0444 0.0200 0.340 0.910 3.50 6.20 0.888 0.477 0.767 50.1
TG-ReDial 0.125 0.0204 0.00354 0.000803 0.881 1.75 7.00 12.0 0.810 0.332 0.598 7.41

Policy Task

Model Hit@1 Hit@10 Hit@50 MRR@1 MRR@10 MRR@50 NDCG@1 NDCG@10 NDCG@50
MGCG 0.591 0.818 0.883 0.591 0.680 0.683 0.591 0.712 0.729
Conv-BERT 0.597 0.814 0.881 0.597 0.684 0.687 0.597 0.716 0.731
Topic-BERT 0.598 0.828 0.885 0.598 0.690 0.693 0.598 0.724 0.737
TG-ReDial 0.600 0.830 0.893 0.600 0.693 0.696 0.600 0.727 0.741

The above results were obtained with CRSLab in preliminary experiments. However, these algorithms were implemented and tuned based on our own understanding and experience, so they may not have reached their optimal performance. If you obtain a better result for a specific algorithm, please let us know. We will update this table once the results are verified.

Releases

Releases Date Features
v0.1.1 1 / 4 / 2021 Basic CRSLab
v0.1.2 3 / 28 / 2021 CRSLab

Contributions

Please let us know if you encounter a bug or have any suggestions by filing an issue.

We welcome all contributions from bug fixes to new features and extensions.

We expect all contributions to be discussed in the issue tracker and to go through PRs.

We thank the nice contributions through PRs from @shubaoyu, @ToheartZhang.

Citing

If you find CRSLab useful for your research or development, please cite our Paper:

@article{crslab,
    title={CRSLab: An Open-Source Toolkit for Building Conversational Recommender System},
    author={Kun Zhou and Xiaolei Wang and Yuanhang Zhou and Chenzhan Shang and Yuan Cheng and Wayne Xin Zhao and Yaliang Li and Ji-Rong Wen},
    year={2021},
    journal={arXiv preprint arXiv:2101.00939}
}

Team

CRSLab was developed and maintained by AI Box group in RUC.

License

CRSLab uses MIT License.

crslab's People

Contributors

icedpanda, icyfish332, lancelot39, oran-ac, shubaoyu, toheartzhang, txy77, wxl1999, zilize, zyh716

crslab's Issues

Are there any pretrained models to test? And a question about the human-machine interaction interfaces

Thank you for sharing your awesome work! I have some questions about this repo.

I am a first-year master's student and need to do a project on IR/RecSys/search engines for my course final project. I saw your ad somewhere, and I think a conversational recommendation demo would be really COOL for this project.

But we have run into some trouble.

  1. I have NO GPU to train the whole model. Could you share some pretrained models for us to test? Thanks.
  2. I saw this description in README.md:
    "Human-machine interaction interfaces: We provide flexible human-machine interaction interfaces for researchers to conduct qualitative analysis."
    Does that mean the interface is something like a terminal-based recommendation version of Siri?

Bugs in evaluator

When running ind2txt, we get a string:
image
Then, when the n-grams are computed, the unique n-grams are counted at character granularity:
image
Example:
image
Correct:
image
code result:
image
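The difference described in this report can be reproduced in a few lines. The sketch below is not CRSLab's evaluator; the helper name and the example sentence are made up for illustration. It shows that passing a string instead of a token list makes the n-grams character-level.

```python
from typing import List, Sequence, Tuple


def ngrams(items: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a sequence."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


sentence = "i like the movie"

# Iterating over a str yields characters, so these are character bigrams.
char_bigrams = set(ngrams(sentence, 2))

# Splitting into tokens first gives the word bigrams that Distinct-n expects.
token_bigrams = set(ngrams(sentence.split(), 2))

print(len(char_bigrams), len(token_bigrams))
print(sorted(token_bigrams))
```

Under this reading, a possible fix is to split the decoded string into tokens before counting unique n-grams.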

Inspired data Processing

Dear authors,

I would like to use a slightly modified version of the INSPIRED dataset (in the same format as the original) with the different models implemented in CRSLab.
As far as I can see, these are the preprocessed data files for the original INSPIRED dataset, so I guess I need to create similar files to work with the modified data.
image

Could you please guide me in producing similar files, or provide the script used to convert the original dataset files?

Thanks in advance!

Have an error

Hi, thanks for your fantastic work!

I tried to follow the instructions but I got the following error message when I ran python run_crslab.py --config config/crs/kgsf/redial.yaml --save_data --save_system --gpu 0

2021-09-11 10:09:15.634 | INFO | crslab.data.dataloader.base:get_data:54 - [Finish dataset process before batchify]
25%|██████████████████████████████████████████████████████▋ | 92/363 [02:20<07:22, 1.63s/it]free(): invalid pointer
[1] 79193 abort (core dumped) python run_crslab.py --config config/crs/kgsf/tgredial.yaml --save_data --gp
(crslab) ✘ NORTHAMERICA.t-yooli@x86_64-conda_cos6-linux-gnu  ~/crslab/CRSLab   main  gcc --version
gcc (GCC) 11.1.0

I am using torch: 1.8.0 + cu111

thanks for your help!

which dataset did NTRD use exactly?

Hi,

First, thank you for open-sourcing such a fantastic tool! It is a great job.

I have a small question about NTRD: which dataset did it use? It seems that I saw a conflict between the paper and the code here.
In the paper, the authors say: "To evaluate the performance of our method, we conduct comprehensive experiments on the REDIAL dataset, ......"
However, the yaml file for NTRD clearly shows "dataset: TGReDial".

I am confused about which one was actually used to train NTRD: TGReDial or ReDial?

Thanks!

Can't use CPU; maybe something is wrong in config.py

I think there may be a problem at line 37 of CRSLab/crslab/config/config.py:
# gpu
os.environ['CUDA_VISIBLE_DEVICES'] = gpu
self.opt['gpu'] = [i for i in range(len(gpu.split(',')))]
If gpu is -1, self.opt['gpu'] will be [0].
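The behaviour reported here can be checked in isolation. The sketch below copies the list comprehension quoted in this report into a standalone function and contrasts it with one possible CPU-aware variant; both function names are hypothetical, and the variant is only a suggestion, not the fix adopted by the project.

```python
from typing import List


def device_ids_as_reported(gpu: str) -> List[int]:
    """Same expression as quoted in the report: one index per comma-separated id."""
    return [i for i in range(len(gpu.split(',')))]


def device_ids_cpu_aware(gpu: str) -> List[int]:
    """Hypothetical variant that keeps -1 as the CPU marker."""
    if gpu.strip() == "-1":
        return [-1]
    # With CUDA_VISIBLE_DEVICES set, the visible devices are renumbered from 0.
    return list(range(len(gpu.split(','))))


print(device_ids_as_reported("-1"))   # the CPU request turns into device 0
print(device_ids_cpu_aware("-1"))
print(device_ids_cpu_aware("2,3"))
```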

Should we specify attention_mask when using gpt2 for the conversation task?

Here's an example using the ReDial dataset.
We pad the dialogues to the same length so that we can process them in batches.

  • Dataloader process: code

To make GPT-2 pay no attention to the padding, should we specify attention_mask when using GPT-2 for the conversation task?

  • gpt2 forward process: code
  • helpful issues: link

Because the loss is computed only on the response, should the padding labels be set to -100 rather than 0 (code) so that the model ignores them?
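To make the question concrete, here is a minimal sketch of how a padded example could be built so that padding is masked out and only response tokens contribute to the loss. It assumes right padding and uses plain Python lists; the function name and token ids are made up. The value -100 is the default ignore_index of PyTorch's cross-entropy loss, which Hugging Face's GPT-2 language-model head relies on when labels are passed.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def pad_for_causal_lm(
    context: List[int], response: List[int], max_len: int, pad_id: int
) -> Tuple[List[int], List[int], List[int]]:
    """Return (input_ids, attention_mask, labels) for one right-padded example."""
    tokens = (context + response)[:max_len]
    n_real = len(tokens)
    n_pad = max_len - n_real
    n_ctx = min(len(context), n_real)

    input_ids = tokens + [pad_id] * n_pad
    # 1 marks real tokens, 0 marks padding the model should not attend to.
    attention_mask = [1] * n_real + [0] * n_pad
    # Context and padding positions are excluded from the loss.
    labels = [IGNORE_INDEX] * n_ctx + tokens[n_ctx:] + [IGNORE_INDEX] * n_pad
    return input_ids, attention_mask, labels


input_ids, attention_mask, labels = pad_for_causal_lm(
    context=[11, 12, 13], response=[21, 22], max_len=7, pad_id=0
)
print(input_ids)
print(attention_mask)
print(labels)
```

If the labels used pad_id (0) instead, the padded positions would be scored as if token 0 were the correct next token.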

NTRD

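I created a redial.config file among the NTRD configuration files. I saw that replace_token is ['ITEM'] in tgredial.yaml, so I set the same value, but when I run it I keep getting a replace_token error. If I want to run on the ReDial dataset, what should replace_token be changed to?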

Unable to reproduce the results from paper using default config

Hi,

I was trying to reproduce the result from this paper:

| Xiaolei Wang*, Kun Zhou*, Ji-Rong Wen, Wayne Xin Zhao. Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning. KDD 2022.

In this paper, the performance of KGSF and KBRD on the INSPIRED dataset is reported as:

image

However, when I use the default config to run KGSF and KBRD on the INSPIRED dataset:

python run_crslab.py --config CRSLab/config/crs/kbrd/inspired.yaml

and

python run_crslab.py --config CRSLab/config/crs/kgsf/inspired.yaml

I got much worse results on the test set
(first for KBRD, then for KGSF):
image
image


I am not sure why I fail to reproduce the results. I noticed that the whole training process is quite short, lasting only 1 epoch. I think that is reasonable because the INSPIRED dataset is small, and early stopping should prevent overfitting.

Any clue will be helpful, thanks!

bert_param is always None!

if hasattr(self.rec_model, 'bert'):
    if os.environ["CUDA_VISIBLE_DEVICES"] == '-1':
        bert_param = list(self.rec_model.bert.named_parameters())
    else:
        bert_param = list(self.rec_model.module.bert.named_parameters())
    bert_param_name = ['bert.' + n for n, p in bert_param]
else:
    bert_param = []
    bert_param_name = []

KGSF Performance on ReDial dataset

Hi, thanks for sharing such a great project.

I have run a benchmark on the ReDial dataset using KGSF. However, I got worse results than those in the original paper.

This is the command I used; all configurations are set to default.

python run_crslab.py --config config/crs/kgsf/redial.yaml --gpu 0 --save_data --save_system --tensorboard --restore_data

I noticed that the default parameters differ from those in the original paper:

  • batch_size: 32 -> 128
  • epochs for training recommendation module: 30 -> 9

Are there any suggested parameters for reproducing the results? I found that a batch size of 32 is extremely slow, and I tried a batch size of 256, which led to worse results.

Results Log with default setting

KG

2022-05-30 17:08:13.563 | INFO     | crslab.system.kgsf:pretrain:120 - [Pretrain epoch 2]
2022-05-30 17:08:13.578 | INFO     | crslab.data.dataloader.base:get_data:54 - [Finish dataset process before batchify]
2022-05-30 17:09:51.380 | INFO     | crslab.evaluator.standard:report:98 - 
    grad norm  info_loss
        1.479      .4573

Recommendation

Results from the paper:

  • R@1: 0.039,
  • R@10: 0.183
  • R@50: 0.378

CRSLab:

2022-05-30 17:18:57.713 | INFO     | crslab.system.kgsf:train_recommender:147 - [Test]
2022-05-30 17:18:57.861 | INFO     | crslab.data.dataloader.base:get_data:54 - [Finish dataset process before batchify]
2022-05-30 17:18:59.518 | INFO     | crslab.evaluator.standard:report:98 - 
    hit@1  hit@10  hit@50  info_loss  mrr@1  mrr@10  mrr@50  ndcg@1  ndcg@10  ndcg@50  rec_loss
   .03522   .1774   .3687      .7471 .03522  .07128  .08048  .03522   .09601    .1385     8.069

Conversation

Results from the paper:

  • Dist-2: 0.289,
  • Dist-3: 0.434
  • Dist-4: 0.519

CRSLab:

2022-05-30 18:44:43.492 | INFO     | crslab.system.kgsf:train_conversation:176 - [Test]
2022-05-30 18:44:43.500 | INFO     | crslab.data.dataloader.base:get_data:54 - [Finish dataset process before batchify]
2022-05-30 18:45:06.845 | INFO     | crslab.evaluator.standard:report:98 - 
    average  bleu@1  bleu@2  bleu@3  bleu@4  dist@1  dist@2  dist@3  dist@4  extreme    f1  greedy
      .7300   .1671  .03262  .01538 .009669  .01072   .1129   .6729   1.855    .4991 .2173   .5993

ReDial Recommender Results

Hi! I'm trying to reproduce the baseline results reported in the paper "Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning", available here: https://arxiv.org/pdf/2206.09363.pdf

Using the configuration in config/crs/redial/redial.yaml for ReDial I'm getting the following results on the recommendation task:

  • With early stop (8 epochs): {"hit@1": 0.0002498, "hit@10": 0.009493, "hit@50": 0.08818, "mrr@1": 0.0002498, "mrr@10": 0.002024, "mrr@50": 0.005041, "ndcg@1": 0.0002498, "ndcg@10": 0.003739, "ndcg@50": 0.02005, "rec_loss": 10.3}
  • Without early stop (50 epochs): {"hit@1": 0, "hit@10": 0.0002498, "hit@50": 0.0602, "mrr@1": 0, "mrr@10": 3.123e-05, "mrr@50": 0.00158, "ndcg@1": 0, "ndcg@10": 7.881e-05, "ndcg@50": 0.01134, "rec_loss": 10.29}

I was wondering if someone can provide the instructions for getting results similar to those shown in the paper I mentioned.

Missing implementation for data preprocessing

Hi,

I am aiming to extend the current datasets. However, the implementation of the data preprocessing is missing.
Could it be included in a new version?

Thanks

Error when training the TG-ReDial model on the ReDial dataset

Hello, I am using a GPU to train the TG-ReDial model on the ReDial dataset. After the recommender finished training, conversation training failed right at the start with: Assertion srcIndex < srcSelectDimSize failed.
...
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

I don't know whether the problem is with my installed packages or with some detail of your code. Please advise.
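An assertion of the form srcIndex < srcSelectDimSize is commonly triggered when an index lookup, such as an embedding layer, receives an id that is not smaller than the table size. Whether that is the cause in this report is not known. The sketch below is a plain-Python check that could be run on a batch before it reaches the GPU; the function name and the example numbers are made up.

```python
from typing import Iterable, List, Tuple


def find_out_of_range_ids(
    batch: Iterable[Iterable[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, token_id) for every id outside [0, vocab_size)."""
    bad = []
    for row, sequence in enumerate(batch):
        for col, token_id in enumerate(sequence):
            if not 0 <= token_id < vocab_size:
                bad.append((row, col, token_id))
    return bad


# Hypothetical batch: the id 50 does not fit a vocabulary of size 10.
example_batch = [[1, 2, 3], [4, 50, 0]]
print(find_out_of_range_ids(example_batch, vocab_size=10))
```

A non-empty result would point to a mismatch between the tokenizer or vocabulary used to build the dataset and the size of the model's embedding table.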

Missing model freezing in KBRD

Hi, thanks for sharing such a good toolbox.

In the original implementation of KBRD, the author freezes the parameters in the KG and recommender layers while training the conversation module. You can find the code here.

However, in this repo I could only find freeze_parameters in KGSF and NTRD, not in KBRD.

Error in a CPU environment: libc10_cuda.so: cannot open shared object file: No such file or directory

(crslab) ubuntu@ip-10-0-1-244:~/cls/CRSLab$ python run_crslab.py --config config/crs/kgsf/redial.yaml
Traceback (most recent call last):
  File "run_crslab.py", line 13, in <module>
    from crslab import run_crslab
  File "/home/ubuntu/cls/CRSLab/crslab/__init__.py", line 5, in <module>
    from crslab.system import get_system
  File "/home/ubuntu/cls/CRSLab/crslab/system/__init__.py", line 12, in <module>
    from .kbrd import KBRDSystem
  File "/home/ubuntu/cls/CRSLab/crslab/system/kbrd.py", line 16, in <module>
    from crslab.system.base import BaseSystem
  File "/home/ubuntu/cls/CRSLab/crslab/system/base.py", line 19, in <module>
    from transformers import AdamW, Adafactor
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/transformers/__init__.py", line 34, in <module>
    from . import dependency_versions_check
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/transformers/dependency_versions_check.py", line 34, in <module>
    from .file_utils import is_tokenizers_available
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/transformers/file_utils.py", line 231, in <module>
    import torch_scatter
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch_scatter/__init__.py", line 12, in <module>
    library, [osp.dirname(__file__)]).origin)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/_ops.py", line 105, in load_library
    ctypes.CDLL(path)
  File "/home/ubuntu/anaconda3/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libc10_cuda.so: cannot open shared object file: No such file or directory

OSError: libcudart.so.11.0: cannot open shared object file: No such file or directory

(d2l) tislab20@tislab20-Precision-7920-Tower:~/Desktop/lq/CRSLab-main$ CUDA_VISIBLE_DEVICES=-1 python run_crslab.py --config config/crs/kgsf/redial.yaml
2023-02-24 03:34:20.942 | INFO | crslab.config.config:__init__:79 - [Dataset: ReDial tokenized in nltk]
2023-02-24 03:34:20.942 | INFO | crslab.config.config:__init__:81 - [Model: KGSF]
2023-02-24 03:34:20.942 | INFO | crslab.config.config:__init__:88 - [Config]
{
"dataset": "ReDial",
"tokenize": "nltk",
"embedding": "word2vec.npy",
"context_truncate": 256,
"response_truncate": 30,
"scale": 1,
"model": "KGSF",
"token_emb_dim": 300,
"kg_emb_dim": 128,
"num_bases": 8,
"n_heads": 2,
"n_layers": 2,
"ffn_size": 300,
"dropout": 0.1,
"attention_dropout": 0.0,
"relu_dropout": 0.1,
"learn_positional_embeddings": false,
"embeddings_scale": true,
"reduction": false,
"n_positions": 1024,
"pretrain": {
"epoch": 3,
"batch_size": 128,
"optimizer": {
"name": "Adam",
"lr": 0.001
}
},
"rec": {
"epoch": 9,
"batch_size": 128,
"optimizer": {
"name": "Adam",
"lr": 0.001
}
},
"conv": {
"epoch": 90,
"batch_size": 128,
"optimizer": {
"name": "Adam",
"lr": 0.001
},
"lr_scheduler": {
"name": "ReduceLROnPlateau",
"patience": 3,
"factor": 0.5
},
"gradient_clip": 0.1
},
"gpu": [
-1
],
"model_name": "KGSF"
}
Traceback (most recent call last):
  File "run_crslab.py", line 41, in <module>
    from crslab.quick_start import run_crslab
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/quick_start/__init__.py", line 1, in <module>
    from .quick_start import run_crslab
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/quick_start/quick_start.py", line 13, in <module>
    from crslab.system import get_system
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/system/__init__.py", line 18, in <module>
    from .inspired import InspiredSystem
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/system/inspired.py", line 12, in <module>
    from crslab.system.base import BaseSystem
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/system/base.py", line 30, in <module>
    from crslab.model import get_model
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/model/__init__.py", line 18, in <module>
    from .crs import *
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/model/crs/__init__.py", line 2, in <module>
    from .kbrd import *
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/model/crs/kbrd/__init__.py", line 1, in <module>
    from .kbrd import KBRDModel
  File "/home/tislab20/Desktop/lq/CRSLab-main/crslab/model/crs/kbrd/kbrd.py", line 26, in <module>
    from torch_geometric.nn import RGCNConv
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch_geometric/__init__.py", line 4, in <module>
    import torch_geometric.data
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch_geometric/data/data.py", line 20, in <module>
    from torch_sparse import SparseTensor
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch_sparse/__init__.py", line 34, in <module>
    from .storage import SparseStorage # noqa
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch_sparse/storage.py", line 5, in <module>
    from torch_scatter import segment_csr, scatter_add
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch_scatter/__init__.py", line 11, in <module>
    torch.ops.load_library(importlib.machinery.PathFinder().find_spec(
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/site-packages/torch/_ops.py", line 105, in load_library
    ctypes.CDLL(path)
  File "/home/tislab20/anaconda3/envs/d2l/lib/python3.8/ctypes/__init__.py", line 369, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.11.0: cannot open shared object file: No such file or directory

An exception occurred while running 'python run_crslab.py --config config/crs/kgsf/redial.yaml'

Hi!
Executing python run_crslab.py --config config/crs/kgsf/redial.yaml raises a MemoryError. How should I fix it?

File "F:\experiment\CRSLab-main\crslab\download.py", line 207, in untar
  shutil.unpack_archive(fullpath, path)
File "E:\Anaconda3\envs\CRSLab\lib\shutil.py", line 1241, in unpack_archive
  func(filename, extract_dir, **kwargs)
File "E:\Anaconda3\envs\CRSLab\lib\shutil.py", line 1154, in _unpack_zipfile
  data = zip.read(info.filename)
File "E:\Anaconda3\envs\CRSLab\lib\zipfile.py", line 1476, in read
  return fp.read()
File "E:\Anaconda3\envs\CRSLab\lib\zipfile.py", line 926, in read
  buf += self._read1(self.MAX_N)
MemoryError
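The traceback shows the archive member being read into memory in one piece (zip.read). One general way to keep memory use bounded is to copy each member to disk in chunks. The sketch below illustrates that technique with the standard library only; the function name is hypothetical, and it is not known whether this alone resolves the report above.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_streaming(archive: Path, dest: Path, chunk_size: int = 1 << 20) -> None:
    """Extract a zip archive member by member, copying in fixed-size chunks."""
    dest_root = dest.resolve()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            target = (dest_root / info.filename).resolve()
            # Refuse members that would land outside the destination folder.
            if target != dest_root and dest_root not in target.parents:
                raise ValueError(f"unsafe path in archive: {info.filename}")
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            # zf.open() gives a file-like object, so the copy proceeds chunk by chunk.
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst, chunk_size)


# Small self-contained demonstration with a throwaway archive.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    demo_archive = tmp_path / "demo.zip"
    with zipfile.ZipFile(demo_archive, "w") as zf:
        zf.writestr("data/train.txt", "hello")
    extract_streaming(demo_archive, tmp_path / "out")
    extracted_text = (tmp_path / "out" / "data" / "train.txt").read_text()
print(extracted_text)
```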

Trouble when running run_crslab.py

AssertionError: [ Checksum for redial_nltk.zip from
http://d0.ananas.chaoxing.com/download/417f6ac16282e4910fc93973e954ab42?fn=nltk
does not match the expected checksum. Please try again. ]

How can I fix it?
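When a download fails this check, it can help to compute the hash of the local file by hand and compare it with the expected value, for instance to tell an incomplete download from an outdated expected hash. The sketch below assumes SHA-256; the text above does not say which algorithm CRSLab's downloader uses, and the function name is hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large archives need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file; replace the path with the downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc")
    sample_hash = sha256_of_file(sample)
print(sample_hash)
```

An empty or truncated file will produce a different digest from the complete archive, which is one ordinary reason for a mismatch.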

How to manually check the testing result on each testing sample?

Dear authors,

Thank you for sharing this awesome project!

I am trying to look deeper into the evaluation results, that is, how exactly the model performs on different test samples.

I successfully ran the example code provided in README.md and got pretty good results. However, those results are limited to high-level metrics.

So I am trying to look more closely at the performance on each test sample, to uncover some clues about:

  1. What do the test samples actually look like to a human?
  2. How does the model perform on each test sample, and which movies does it recommend based on the dialog history?

Do you know how I can manually check what the model actually takes as input and produces as output for each test sample?

Thanks in advance!

Sincerely,

python run_crslab.py --config config/crs/kgsf/tgredial.yaml

Traceback (most recent call last):
  File "run_crslab.py", line 47, in <module>
    run_crslab(config, args.save_data, args.restore_data, args.save_system, args.restore_system, args.interact,
  File "/home/**/project/crs/CRSLab/crslab/quick_start/quick_start.py", line 73, in run_crslab
    CRS.fit()
  File "/home/**/project/crs/CRSLab/crslab/system/kgsf.py", line 186, in fit
    self.train_recommender()
  File "/home/**/project/crs/CRSLab/crslab/system/kgsf.py", line 144, in train_recommender
    metric = self.evaluator.rec_metrics['hit@1'] + self.evaluator.rec_metrics['hit@50']
  File "/home/**/project/crs/CRSLab/crslab/evaluator/metrics/base.py", line 214, in __getitem__
    return self.get(item)
  File "/home/**/project/crs/CRSLab/crslab/evaluator/metrics/base.py", line 211, in get
    raise
RuntimeError: No active exception to reraise
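The traceback ends at a bare raise statement. In Python, a bare raise outside an except block has no exception to re-raise and fails with a RuntimeError, which hides the real problem (here, presumably a metric that is missing). The sketch below reproduces that behaviour with made-up functions and shows an explicit alternative; it is not the code in base.py.

```python
def get_metric_bare_raise(metrics: dict, key: str) -> float:
    """Mimics the pattern in the traceback: a bare raise with nothing to re-raise."""
    if key in metrics:
        return metrics[key]
    raise  # no exception is being handled at this point


def get_metric_explicit(metrics: dict, key: str) -> float:
    """Hypothetical alternative that reports which metric is missing."""
    if key in metrics:
        return metrics[key]
    raise KeyError(f"metric {key!r} has not been computed")


errors = []
for getter in (get_metric_bare_raise, get_metric_explicit):
    try:
        getter({}, "hit@1")
    except (RuntimeError, KeyError) as exc:
        errors.append(type(exc).__name__)
print(errors)
```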

AssertionError: [ Checksum for redial_nltk.zip

When I run "!sh script/redial/train/redial_rec_train.sh" in Anaconda, I get this error. The same error occurs when running on Google Colab. Can anyone please help?

<frozen importlib._bootstrap>:219: RuntimeWarning: scipy._lib.messagestream.MessageStream size changed, may indicate binary incompatibility. Expected 56 from C header, got 64 from PyObject
2022-06-10 11:08:18.617 | INFO | crslab.config.config:__init__:80 - [CONFIG] C2CRS_Model, C2CRS_Model
2022-06-10 11:08:18.633 | INFO | crslab.config.config:build_path:137 - save_path = C:\Users\91790\c2crs\save\ReDial_C2CRS_Model2022-06-10-11-08-18
2022-06-10 11:08:18.633 | INFO | crslab.config.config:build_path:144 - restore_path = C:\Users\91790\c2crs\save\ReDial_C2CRS_ModelNone
Downloading redial_nltk.zip: 0.00B [00:00, ?B/s]
Traceback (most recent call last):
  File "run_crslab.py", line 79, in <module>
    run_crslab(config, args.save_data, args.restore_data, args.save_system, args.restore_system, args.interact,
  File "C:\Users\91790\c2crs\crslab\quick_start\quick_start.py", line 57, in run_crslab
    dataset = get_dataset(config, tokenize, restore_data, save_data)
  File "C:\Users\91790\c2crs\crslab\data\__init__.py", line 52, in get_dataset
    return dataset_register_table[dataset](opt, tokenize, restore, save)
  File "C:\Users\91790\c2crs\crslab\data\dataset\redial\redial.py", line 83, in __init__
    super().__init__(opt, dpath, resource, restore, save)
  File "C:\Users\91790\c2crs\crslab\data\dataset\base.py", line 44, in __init__
    build(dpath, dfile, version=resource['version'])
  File "C:\Users\91790\c2crs\crslab\download.py", line 274, in build
    downloadable_file.download_file(dpath)
  File "C:\Users\91790\c2crs\crslab\download.py", line 77, in download_file
    self.checksum(dpath)
  File "C:\Users\91790\c2crs\crslab\download.py", line 63, in checksum
    raise AssertionError(
AssertionError: [ Checksum for redial_nltk.zip from
http://d0.ananas.chaoxing.com/download/417f6ac16282e4910fc93973e954ab42?fn=nltk
does not match the expected checksum. Please try again. ]

error: 'KGSFModel' object has no attribute 'module'

Hi, thanks for sharing such an awesome project.

I have run a benchmark on the TG-ReDial dataset using KGSF. However, something went wrong with the code.

This is the command I used; all configurations are set to default.

python run_crslab.py --config config/crs/kgsf/tgredial.yaml --gpu 0

After pretraining for epochs 0-40 and training the recommender for epochs 0-19, it gives the following result:

2022-11-11 15:14:44.522 | INFO     | crslab.system.kgsf:train_recommender:147 - [Test]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9000/9000 [00:00<00:00, 11171.52it/s]
2022-11-11 15:14:45.329 | INFO     | crslab.data.dataloader.base:get_data:54 - [Finish dataset process before batchify]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:02<00:00,  8.21it/s]
2022-11-11 15:14:47.564 | INFO     | crslab.evaluator.standard:report:98 - 
     hit@1  hit@10  hit@50  info_loss   mrr@1  mrr@10  mrr@50  ndcg@1  ndcg@10  ndcg@50  rec_loss
   .005348  .03342  .08467      .6105 .005348  .01204  .01424 .005348   .01701   .02798     12.01

But when it moves on to the conversation training epochs, the code fails with:

Traceback (most recent call last):
  File "run_crslab.py", line 43, in <module>
    run_crslab(config, args.save_data, args.restore_data, args.save_system, args.restore_system, args.interact,
  File "/root/autodl-tmp/csrexp/CRSLab-main/crslab/quick_start/quick_start.py", line 73, in run_crslab
    CRS.fit()
  File "/root/autodl-tmp/csrexp/CRSLab-main/crslab/system/kgsf.py", line 186, in fit
    self.train_conversation()
  File "/root/autodl-tmp/csrexp/CRSLab-main/crslab/system/kgsf.py", line 158, in train_conversation
    self.model.module.freeze_parameters()
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'KGSFModel' object has no attribute 'module'

I hope you can find the time to reply. Thank you very much!
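
For context, in PyTorch a .module attribute exists on wrappers such as torch.nn.DataParallel, not on a bare model, so self.model.module fails when the model has not been wrapped, which appears to be the case in this run. One possible workaround, sketched below with plain-Python stand-ins so that it runs without PyTorch, is to unwrap only when a wrapper is present. FakeModel, FakeDataParallel and unwrap_model are hypothetical names and not part of CRSLab.

```python
class FakeModel:
    """Hypothetical stand-in for KGSFModel."""

    def __init__(self):
        self.frozen = False

    def freeze_parameters(self):
        self.frozen = True


class FakeDataParallel:
    """Hypothetical stand-in for torch.nn.DataParallel: it stores the model as .module."""

    def __init__(self, module):
        self.module = module


def unwrap_model(model):
    """Return model.module when the model is wrapped, otherwise the model itself."""
    return getattr(model, "module", model)


bare = FakeModel()
wrapped = FakeDataParallel(FakeModel())

# The same call now works in both the wrapped and the unwrapped case.
unwrap_model(bare).freeze_parameters()
unwrap_model(wrapped).freeze_parameters()
```

With real PyTorch models, a stricter variant would test isinstance(model, torch.nn.DataParallel) instead of relying on the attribute name, since a model could define its own submodule called module.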

run_crslab.py: error: argument -i/--interact: ignored explicit argument 't'

When I run the c2crs script "!sh script/redial/train/redial_rec_train.sh", the following error occurs. Please help.

usr/local/lib/python3.7/dist-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
usage: run_crslab.py [-h] [-c CONFIG] [-sd] [-rd] [-ss] [-rs] [-d] [-i]
run_crslab.py: error: argument -i/--interact: ignored explicit argument 't'
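
The exact command line inside the shell script is not shown, so the cause can only be guessed. As a hedged illustration, argparse produces this message when a flag that takes no value (action="store_true") is given one anyway, for instance --interact=t or -it. The parser below is a minimal sketch that mirrors only two of the options in the usage line; build_parser and try_parse are hypothetical helpers.

```python
import argparse
import contextlib
import io


def build_parser():
    """A reduced parser with one value option and one boolean flag."""
    parser = argparse.ArgumentParser(prog="run_crslab.py")
    parser.add_argument("-c", "--config", type=str)
    parser.add_argument("-i", "--interact", action="store_true")
    return parser


def try_parse(argv):
    """Return (namespace, error_text); namespace is None if argparse rejected argv."""
    stderr = io.StringIO()
    try:
        with contextlib.redirect_stderr(stderr):
            return build_parser().parse_args(argv), ""
    except SystemExit:
        # argparse reports usage errors on stderr and then exits.
        return None, stderr.getvalue()


# A bare flag is accepted.
ok, _ = try_parse(["-c", "cfg.yaml", "-i"])

# Giving the flag an explicit value is rejected.
rejected, message = try_parse(["-c", "cfg.yaml", "--interact=t"])
```

If the script passes a value to -i/--interact, dropping the value and keeping only the bare flag (or omitting the flag to leave it off) would avoid the error under these assumptions.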

Is it normal for the training phase to be so slow?

I just ran
python run_crslab.py --config config/crs/tgredial/tgredial.yaml --save_data --save_system -g 3
but it hangs at this step; it took almost 9 hours to process the data:

INFO | crslab.data.dataloader.base:get_data:54 - [Finish dataset process before batchify]

It is really slow at the training step. Does that mean I need a more powerful CPU or disk? Is anyone else seeing the same slowness?

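If I want to run CRSLab models (such as KBRD and KGSF) on my own dataset, how should I adapt the data for compatibility? Is there a specification or tutorial available? We noticed that the existing datasets in CRSLab all seem to have had some compatibility modifications.

If I want to run CRSLab models (such as KBRD and KGSF) on my own dataset, how should I adapt the data for compatibility, and is there a specification or tutorial available? We noticed that the existing datasets in CRSLab all seem to have had some compatibility modifications; for example, there are files such as durecdial.yaml and gorecdial.yaml under KGSF.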

About the evaluation metric dist@K

Thanks for your amazing project!

I noticed that the dist@k obtained with CRSLab is much larger than the dist@k reported in the original papers, for example for KGSF. What is the reason?

Looking forward to your reply.
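
One thing worth checking when comparing such numbers is how the metric is normalized, because implementations of distinct-n differ. The sketch below contrasts two common conventions: unique n-grams divided by the total number of n-grams, and unique n-grams divided by the number of responses. It is an assumption that a difference of this kind is involved here; the function names are hypothetical and the code is not taken from CRSLab or from the KGSF release.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_per_ngram(responses, n):
    """Unique n-grams divided by the total number of n-grams (always at most 1)."""
    all_ngrams = [gram for tokens in responses for gram in ngrams(tokens, n)]
    return len(set(all_ngrams)) / len(all_ngrams) if all_ngrams else 0.0


def distinct_per_response(responses, n):
    """Unique n-grams divided by the number of responses (can exceed 1)."""
    unique = {gram for tokens in responses for gram in ngrams(tokens, n)}
    return len(unique) / len(responses) if responses else 0.0


responses = [
    ["i", "like", "this", "movie"],
    ["i", "like", "that", "movie"],
]

# 5 unique unigrams over 8 tokens versus 5 unique unigrams over 2 responses.
ratio_by_ngrams = distinct_per_ngram(responses, 1)        # 0.625
ratio_by_responses = distinct_per_response(responses, 1)  # 2.5
```

The same generated text yields very different values under the two conventions, so figures are only comparable when the normalization matches.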

Preprocessing for Formatted Dataset

The preprocessing code that you used to unify the dataset format (specifically, entity linking and BPE) has not been released. Can you please let us know the algorithms/code used for these steps? In the original dataset (say, DuRecDial), the conversations do not contain the word and entity keys, which I assume come from HowNet and CN-DBPedia respectively.

Data Resources for GoRecDial is unavailable!

Hi,

I tried to retrieve the data using the link given in crslab->dataset->gorecdial->resources.py, but the link is no longer available. I have not encountered this problem with the other datasets so far.

Understanding of results

I have trained the inspired system successfully and got the results for both recommendation and conversational tasks. I would appreciate it if you could please clarify the following points.

The results

  1. For the recommendation task, the inspired results are listed under the BERT row. Is that correct?
  2. The conversational results are listed under the transformer row. Is that correct?

Finally, the original inspired system implemented two models, one with strategy and one without. Which of these models do my conversational-task results correspond to?

Looking forward to your urgent clarification.
Thanks in advance!

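Error when installing CRSLab

Hello, following the listed steps with Python 3.10, I installed PyTorch (version 1.12.0 + CUDA 11.6) and PyTorch Geometric (2.4.0), but the next step, installing CRSLab, failed with the following error:
[Screenshot: 微信图片_20231027145440]
How can I resolve this? Thank you!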
