
knn-box's Introduction

🗃️ kNN-box


kNN-box is an open-source toolkit for building kNN-MT models. We take inspiration from the code of kNN-LM and adaptive kNN-MT, and develop this more extensible toolkit on top of fairseq. With kNN-box, users can easily implement different kNN-MT baseline models and develop new ones.

Features

  • 🎯 easy-to-use: a few lines of code to deploy a kNN-MT model
  • 🔭 research-oriented: provides implementations of various papers
  • 🏗️ extensible: easy to develop new kNN-MT models with our toolkit
  • 📊 visualized: the whole translation process of a kNN-MT model can be visualized

Requirements and Installation

  • python >= 3.7
  • pytorch >= 1.10.0
  • faiss-gpu >= 1.7.3
  • sacremoses == 0.0.41
  • sacrebleu == 1.5.1
  • fastBPE == 0.1.0
  • streamlit >= 1.13.0
  • scikit-learn >= 1.0.2
  • seaborn >= 0.12.1

You can install this toolkit by running:

git clone git@github.com:NJUNLP/knn-box.git
cd knn-box
pip install --editable ./

Note: installing faiss with pip is not recommended. For stability, we recommend installing faiss with conda:

CPU version only:
conda install faiss-cpu -c pytorch

GPU version:
conda install faiss-gpu -c pytorch # For CUDA
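To confirm that faiss is importable and can build an index, a quick sanity check like the one below may help (a minimal sketch with arbitrary toy dimensions; get_num_gpus only exists in GPU builds, hence the getattr guard):

import numpy as np
import faiss

# Report how many GPUs faiss can see (0 for the CPU-only build).
print("GPUs visible to faiss:", getattr(faiss, "get_num_gpus", lambda: 0)())

d = 64                                      # toy key dimension
keys = np.random.rand(1000, d).astype("float32")
index = faiss.IndexFlatL2(d)                # exact L2 index
index.add(keys)
distances, ids = index.search(keys[:5], 8)  # 8 nearest neighbors for 5 queries
print(distances.shape, ids.shape)           # -> (5, 8) (5, 8)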

Overview

Basically, there are two steps to run a kNN-MT model: building the datastore and translating with the datastore. In this toolkit, we unify different kNN-MT variants into a single framework, even though they manipulate the datastore in different ways. Specifically, the framework consists of three modules (base classes):

  • datastore: save translation knowledge as key-value pairs
  • retriever: retrieve useful translation knowledge from the datastore
  • combiner: produce the final prediction based on the retrieval results and the NMT model

[Figure: overall framework design]
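For intuition, the toy sketch below walks through the same pipeline with plain numpy, faiss and torch: build a (hidden state -> target token) datastore, retrieve the nearest entries for a query, and combine the retrieval distribution with the NMT distribution. It is only an illustration of what each module is responsible for, not the toolkit's actual Datastore/Retriever/Combiner API:

import numpy as np
import faiss
import torch

VOCAB, DIM, K, LAMBDA, TEMP = 100, 16, 4, 0.7, 10.0

# --- datastore: store (decoder hidden state -> target token) pairs ----------
keys = np.random.rand(5000, DIM).astype("float32")   # stand-in hidden states
vals = np.random.randint(0, VOCAB, size=5000)        # stand-in target tokens
index = faiss.IndexFlatL2(DIM)
index.add(keys)

# --- retriever: look up the K nearest entries for the current query ---------
query = np.random.rand(1, DIM).astype("float32")     # current decoding state
distances, ids = index.search(query, K)
neighbor_tokens = torch.from_numpy(vals[ids[0]])     # (K,)
neighbor_dists = torch.from_numpy(distances[0])      # (K,)

# --- combiner: turn neighbors into a distribution and mix with the NMT one --
knn_weights = torch.softmax(-neighbor_dists / TEMP, dim=-1)
p_knn = torch.zeros(VOCAB).scatter_add_(0, neighbor_tokens, knn_weights)
p_nmt = torch.softmax(torch.randn(VOCAB), dim=-1)    # stand-in NMT output
p_final = LAMBDA * p_knn + (1 - LAMBDA) * p_nmt
print(float(p_final.sum()))                          # ~1.0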

Users can easily develop different kNN-MT models by customizing these three modules. The toolkit also provides example implementations of various popular kNN-MT models (listed below) and push-button scripts to run them, so that researchers can conveniently reproduce their experimental results:
Preparation: download pretrained models and dataset

You can prepare the pretrained model and dataset by executing the following commands:

cd knnbox-scripts
bash prepare_dataset_and_model.sh

Use bash instead of sh. If you still have problems running the script, you can manually download the WMT19 de-en single model and the multi-domain de-en dataset, and put them into the correct directories (refer to the paths in the script).

Base Neural Machine Translation Model (our baseline)

To translate using the base neural model, execute the following commands:
cd knnbox-scripts/base-nmt
bash inference.sh
Nearest Neighbor Machine Translation

Implementation of Nearest Neighbor Machine Translation (Khandelwal et al., ICLR'2021)

To translate using vanilla kNN-MT, execute the following commands:

cd knnbox-scripts/vanilla-knn-mt
# step 1. build datastore
bash build_datastore.sh
# step 2. inference
bash inference.sh
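For reference, vanilla kNN-MT combines the two distributions roughly as p(y) = λ · p_kNN(y) + (1 − λ) · p_NMT(y), where p_kNN(y) ∝ Σ exp(−d(q, k_i) / T) over the retrieved entries (k_i, v_i) whose value v_i equals y, q is the current decoder state and d is the faiss distance. The number of neighbors k, the interpolation weight λ and the temperature T correspond to the --knn-k, --knn-lambda and --knn-temperature options of the inference script.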
Adaptive Nearest Neighbor Machine Translation

Implementation of Adaptive Nearest Neighbor Machine Translation (Zheng et al., ACL'2021)

To translate using adaptive kNN-MT, execute the following commands:

cd knnbox-scripts/adaptive-knn-mt
# step 1. build datastore
bash build_datastore.sh
# step 2. train meta-k network
bash train_metak.sh
# step 3. inference
bash inference.sh
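Conceptually, the meta-k network trained in step 2 is a small feed-forward network that looks at retrieval features (e.g. the neighbor distances) and predicts how much to trust each candidate neighborhood size, including k = 0, which means falling back to the plain NMT prediction. The PyTorch sketch below only illustrates this idea; it is not the toolkit's exact architecture or feature set:

import torch
import torch.nn as nn

class ToyMetaK(nn.Module):
    """Toy meta-k network (illustration only): maps the k neighbor distances
    to a distribution over candidate neighborhood sizes {0, 1, 2, 4, 8},
    where 0 means 'trust the NMT model only'."""
    def __init__(self, k=8, hidden=32):
        super().__init__()
        self.candidates = [0, 1, 2, 4, 8]
        self.net = nn.Sequential(
            nn.Linear(k, hidden), nn.ReLU(), nn.Linear(hidden, len(self.candidates))
        )

    def forward(self, distances):              # distances: (batch, k)
        return torch.softmax(self.net(distances), dim=-1)

weights = ToyMetaK()(torch.rand(2, 8))          # (2, 5) mixture weights
print(weights.sum(dim=-1))                      # each row sums to ~1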
Learning Kernel-Smoothed Machine Translation with Retrieved Examples

Implementation of Learning Kernel-Smoothed Machine Translation with Retrieved Examples (Jiang et al., EMNLP'2021)

To translate using kernel-smoothed kNN-MT, execute the following commands:

cd knnbox-scripts/kernel-smoothed-knn-mt
# step 1. build datastore
bash build_datastore.sh
# step 2. train kster network
bash train_kster.sh
# step 3. inference
bash inference.sh
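Instead of a fixed temperature, the kernel-smoothed variant weights the retrieved neighbors with a kernel whose bandwidth is estimated from the query and the retrieved keys. The snippet below only shows the kernel-weighting idea with a Gaussian-style kernel and a hand-picked bandwidth; in KSTER the bandwidth (and the mixing weight) comes from the network trained in step 2:

import torch

query = torch.randn(16)                  # current decoder state (toy size)
neighbor_keys = torch.randn(8, 16)       # 8 retrieved keys
bandwidth = 5.0                          # KSTER predicts this per query

sq_dists = ((neighbor_keys - query) ** 2).sum(dim=-1)          # (8,)
kernel_weights = torch.softmax(-sq_dists / bandwidth, dim=-1)
print(kernel_weights)                    # closer keys receive larger weights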
Efficient Machine Translation Domain Adaptation

Implementation of Efficient Machine Translation Domain Adaptation (PH Martins et al., 2022)

To translate using Greedy Merge kNN-MT, execute the following commands:

cd knnbox-scripts/greedy-merge-knn-mt
# step 1. build datastore and prune using greedy merge method
bash build_datastore_and_prune.sh
# step 2. inference (use --enable-cache to decide whether to use the cache)
bash inference.sh
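As a rough picture of step 1: greedy merging scans the datastore and collapses an entry into a nearby entry that stores the same target token, so only one representative needs to be kept. The numpy/faiss sketch below is a simplified illustration of that idea, not the paper's exact procedure:

import numpy as np
import faiss

keys = np.random.rand(1000, 16).astype("float32")
vals = np.random.randint(0, 50, size=1000)

index = faiss.IndexFlatL2(16)
index.add(keys)
_, neighbors = index.search(keys, 2)          # column 0 is the entry itself

keep = np.ones(len(keys), dtype=bool)
for i, nn_id in enumerate(neighbors[:, 1]):
    # drop entry i if its nearest other entry is still kept and stores the
    # same target token (that neighbor "absorbs" it)
    if nn_id != i and keep[i] and keep[nn_id] and vals[i] == vals[nn_id]:
        keep[i] = False

print("kept", int(keep.sum()), "of", len(keys), "entries")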
Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

Implementation of Efficient Cluster-Based k-Nearest-Neighbor Machine Translation (Wang et al., 2022)

To translate using PCK kNN-MT, execute the following commands:

cd knnbox-scripts/pck-knn-mt
# step 1. build datastore 
bash build_datastore.sh
# step 2. train reduction network
bash train_reduct_network.sh
# step 3. reduce the datastore's key dimension using the trained network
bash reduct_datastore_dim.sh
# step 4. train meta-k network
bash train_metak.sh
# step 5. inference
bash inference.sh

[optional] In addition to reducing the key dimension, you can use the method from the paper to reduce the number of entries in the datastore.

(after step 1.)
bash prune_datastore_size.sh
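The reduction network trained in steps 2 and 3 compresses the high-dimensional decoder states into much smaller keys, so the datastore is cheaper to store and faster to search. A toy sketch of such a compressor is shown below; the real network is trained with the objectives described in the paper, and the 1024 -> 64 sizes here are just example numbers:

import torch
import torch.nn as nn

reducer = nn.Sequential(               # toy key-dimension reducer
    nn.Linear(1024, 256), nn.Tanh(), nn.Linear(256, 64)
)

full_keys = torch.randn(32, 1024)      # original decoder hidden states
small_keys = reducer(full_keys)        # compressed keys stored in the datastore
print(small_keys.shape)                # torch.Size([32, 64])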
Towards Robust k-Nearest-Neighbor Machine Translation

Implementation of Towards Robust k-Nearest-Neighbor Machine Translation (Jiang et al., EMNLP'2022)

To translate using robust kNN-MT, execute the following commands:

cd knnbox-scripts/robust-knn-mt
# step 1. build datastore
bash build_datastore.sh
# step 2. train meta-k network
bash train_metak.sh
# step 3. inference
bash inference.sh
What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation

Implementation of What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation (Zhu et al., 2022)

PLAC is a datastore pruning method based on the MT model's knowledge. To prune a full datastore (vanilla or dimension-reduced), execute the following commands:

cd knnbox-scripts/plac-knn-mt
# step 1. save MT-model predictions
bash save_mt_pred.sh
# step 2. save prunable indexes
bash save_drop_index.sh
# step 3. prune a full datastore and save the pruned datastore
bash prune_datastore.sh
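The intuition behind the three steps: an entry is prunable when the MT model's own prediction at that position already matches the stored target token, i.e. the datastore adds little knowledge there. The sketch below filters saved predictions with that simple agreement test; it is only a rough approximation of the paper's actual criterion:

import numpy as np

vals = np.random.randint(0, 100, size=10000)      # datastore target tokens
mt_preds = np.random.randint(0, 100, size=10000)  # saved MT-model predictions

drop_index = np.where(mt_preds == vals)[0]        # entries the model already "knows"
keep_index = np.where(mt_preds != vals)[0]
print("prunable:", len(drop_index), "kept:", len(keep_index))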
Simple and Scalable Nearest Neighbor Machine Translation

Implementation of Simple and Scalable Nearest Neighbor Machine Translation

To translate using SK-MT, execute the following commands:

cd knnbox-scripts/simple-scalable-knn-mt
# step 1. download elastic search
bash download_elasticsearch.sh
# step 2. start elastic search service on port 9200
./elasticsearch-8.6.1/bin/elasticsearch
# step 3. create elasticsearch index for corpus
bash create_elasticsearch_index.sh
# step 4. inference
bash inference.sh

If there is an elasticsearch-related error when executing the script, you may need to open ./elasticsearch-8.6.1/config/elasticsearch.yml and disable the security features:

xpack.security.enabled: false
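Before running step 3, you can quickly confirm that the Elasticsearch service is reachable on the default port 9200, for example with the requests package (this check is only a convenience and is not required by the scripts):

import requests

# With the security features disabled, the node answers plain HTTP.
response = requests.get("http://localhost:9200")
print(response.status_code)                   # 200 if the service is up
print(response.json()["version"]["number"])   # e.g. "8.6.1"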

Benchmark

Visualization

With kNN-box, you can even visualize the whole translation process of your kNN-MT model. You can launch the visualization service by running the following commands. Have fun with it!

cd knnbox-scripts/vanilla-knn-mt-visual
# step 1. build datastore for visualization (save more information for visualization)
bash build_datastore_visual.sh
# step 2. configure the model that you are going to visualize
vim model_configs.yml 
# step 3. launch the web page
bash start_app.sh

# Optional: register your own tokenizer handler function in src/tokenizer.py
# and then reference it as `--tokenizer` in model_configs.yml if necessary

Citation

We now have a paper you can cite for the 🗃️ kNN-box toolkit:

@misc{zhu2023knnbox,
      title={kNN-BOX: A Unified Framework for Nearest Neighbor Generation}, 
      author={Wenhao Zhu and Qianfeng Zhao and Yunzhe Lv and Shujian Huang and Siheng Zhao and Sizhe Liu and Jiajun Chen},
      year={2023},
      eprint={2302.13574},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

knn-box's People

Contributors

maxwell-lyu, owennju, sxysxy, zhaoqianfeng


knn-box's Issues

Error when running plac-knn-mt

Hi, when running plac-knn-mt we
first ran bash save_mt_pred.sh,
then bash save_drop_index.sh,
and then bash prune_datastore.sh to prune the datastore.
After that, inference from vanilla-knn-mt runs normally, but the resulting BLEU is 0.

Error during vanilla-knn-mt inference


During inference, errors like AttributeError: 'Namespace' object has no attribute ... keep appearing.

Also, the knnbox package cannot be imported: ModuleNotFoundError: No module named 'knnbox'.

Question about multilingual experiments in kNN-BOX paper

Hi, thanks for creating and sharing this codebase, it has been really helpful to me.

I'm interested in replicating multilingual experiments from your paper (Table 2) but I'm having some issues.

Are my following assumptions correct:

  • I should use the 418M M2M-100 model (not the larger versions).
  • I should create individual datastores per language direction, based on the TED training data. So for example, Cs->En M2M-100 BLEU score is 20.7, and when adding a Cs->En datastore based on the TED Cs->En training data the result will be improved to 22.3. (You did not use the M2M-100 training data, since this would be huge, correct? It would be helpful to add datastore sizes to the table so that readers can infer this.)

Are there any other details I should be aware of to reproduce your results?

fairseq version conflict with the base NMT model

Hello, I hit this error when running vanilla-knn-mt/build_datastore.sh with your code. My base NMT model was trained with the latest fairseq v0.12. When loading the model parameters, it reports:
AttributeError: 'NoneType' object has no attribute 'user_dir'
This error occurs because newer fairseq versions no longer use state["args"]; it has been replaced by state['cfgs'].

When I change the code to state['cfgs'], I get another error.
In 0.10, state["args"] is of type argparse.Namespace.
Fixing the code seems to require changes in many places. This is just one of the version-mismatch problems I have found, and there will probably be similar ones later. The toolkit does not seem very friendly to models trained with newer fairseq versions. Is there a good solution? Thanks.

The BLEU score is always 0

When I test the base model it gives me a BLEU score of 32.5, but when I use vanilla kNN-MT it is always 0.

To build the datastore I used this configuration:

CUDA_VISIBLE_DEVICES=1 python $PROJECT_PATH/knnbox-scripts/common/validate.py $DATA_PATH \
    --task translation \
    --path $BASE_MODEL \
    --source-lang en --target-lang ar \
    --model-overrides "{'eval_bleu': False, 'required_seq_len_multiple':1, 'load_alignments': False}" \
    --dataset-impl mmap \
    --valid-subset train \
    --skip-invalid-size-inputs-valid-test \
    --max-tokens 2048 \
    --bpe fastbpe \
    --user-dir $PROJECT_PATH/knnbox/models \
    --arch vanilla_knn_mt@transformer_wmt19_de_en \
    --knn-mode build_datastore \
    --knn-datastore-path $DATASTORE_SAVE_PATH

And to test the model I used this configuration:

CUDA_VISIBLE_DEVICES=1 python $PROJECT_PATH/knnbox-scripts/common/generate.py $DATA_PATH \
    --task translation \
    --path $BASE_MODEL \
    --dataset-impl mmap \
    --beam 4 --lenpen 0.6 --max-len-a 1.2 --max-len-b 10 --source-lang en --target-lang ar \
    --gen-subset test \
    --max-tokens 2048 \
    --encoder-embed-dim 768 \
    --decoder-embed-dim 768 \
    --dropout 0.2 \
    --attention-dropout 0.0 \
    --encoder-layerdrop 0 \
    --decoder-layerdrop 0 \
    --encoder-ffn-embed-dim 2048 \
    --decoder-ffn-embed-dim 2048 \
    --scoring sacrebleu \
    --tokenizer moses \
    --remove-bpe \
    --user-dir $PROJECT_PATH/knnbox/models \
    --arch vanilla_knn_mt@transformer_wmt19_de_en \
    --knn-mode inference \
    --knn-datastore-path $DATASTORE_LOAD_PATH \
    --knn-k 8 \
    --knn-lambda 0.7 \
    --knn-temperature 10.0

A question about running speed

The kNN-MT paper (i.e. vanilla kNN-MT) says that adding kNN retrieval slows translation down by about two orders of magnitude.

Using kNN-BOX on a single RTX 8000, I measured inference speed on the IT domain, following the commands in the README.

Baseline (NMT only): 14.1s

Vanilla kNN-MT: 17.7s

I have noticed that works built on the original kNN-MT code, such as Revised-Key-KNNMT, only reach about 25% overall GPU utilization when running. The authors explain this as "when querying the datastore, the vectors must first be moved to the CPU, and after retrieval the representations are moved back to the GPU". But https://github.com/NJUNLP/knn-box/blob/master/knnbox/retriever/utils.py seems to do the same thing, yet vanilla kNN-MT keeps GPU utilization stable at close to 100%. Does the original kNN-MT code at https://github.com/urvashik/knnmt suffer from low GPU utilization? (I simply could not get the original kNN-MT code to run, and I don't know what the HOME environment variable should be set to, so I can only shamelessly ask here.)

If the original kNN-MT is two orders of magnitude slower than the baseline because its code is poorly optimized, while something like kNN-BOX that can keep the GPU at 100% does not have this problem, is research on speeding up kNN-MT still useful?...

Missing 'transformer_wmt19_de_en' arch when trying to reproduce adaptive kNN-MT

Thanks for this repo gathering the kNN-series code. I have successfully reproduced vanilla kNN-MT following its guidance.
But an error in [step 2. train meta-k network] when trying to reproduce adaptive kNN-MT has been troubling me for a long time; the error info shows:

train.py: error: argument --arch/-a: invalid choice: 'transformer_wmt19_de_en' (choose from 'transformer_tiny', 'transformer', 'transformer_iwslt_de_en', 'transformer_wmt_en_de', 'transformer_vaswani_wmt_en_de_big', 'transformer_vaswani_wmt_en_fr_big', 'transformer_wmt_en_de_big' ... )

It seems that there is no arch named 'transformer_wmt19_de_en' to pass as the argument. I wonder whether it is an architecture you modified yourselves or one from an older fairseq version? (I noticed that you recommend fairseq 0.12.2 and I'm pretty sure I built it successfully.) From the list I can only find the arch 'wmt_en_de', which is also not compatible with the wmt19.de-en.ffn8192.pt file, while the 'de_en' arches only relate to IWSLT datasets... (T_T)

AssertionError: You should set pad mask first! when I try to run bash build_datastore.sh

I ran into a problem at the Run vanilla knn-mt step.

Before this step, I had successfully run the base neural machine translation model (our baseline). The reported error is "You should set pad mask first".
My environment is Windows 10 with torch 1.13.
The command I ran is:
python knnbox-scripts/common/validate.py data-bin/medical --task translation --path pretrain-models/wmt19.de-en.ffn8192.pt --model-overrides "{'eval_bleu': False, 'required_seq_len_multiple':1, 'load_alignments': False}" --dataset-impl mmap --valid-subset train --skip-invalid-size-inputs-valid-test --max-tokens 4096 --bpe fastbpe --user-dir knnbox/models --arch "adaptive_knn_mt@transformer_wmt19_de_en" --knn-mode "build_datastore" --knn-datastore-path datastore/vanilla/medical
Could you advise how to set the pad mask?

Multi-processing for huge datastore

Thanks for sharing the great tool.

I am wondering if the tool supports multi-processing when saving keys and values to the datastore (i.e. multi-GPU inference and saving of key-value pairs). It may help for huge datastore applications.

KeyError: 'keys'

What could be the cause of this problem?

2023-02-09 08:23:10 | INFO | fairseq.tasks.translation | /data/home/likai/NMT-offline/knn-box/knnbox-scripts/vanilla-knn-mt/../../data-bin/zh2en-ziyan-03 train zh-en 721 examples
2023-02-09 08:23:11 | INFO | train |  | valid on 'train' subset | loss 2.994 | nll_loss 1.346 | ppl 2.54 | wps 0 | wpb 16975 | bsz 721                                                                                                     
[vals.npy: (16975,) saved successfully ^_^ ]
|||  {'vals': <knnbox.common_utils.memmap.Memmap object at 0x7f9b55f62990>} <class 'knnbox.common_utils.memmap.Memmap'>
Traceback (most recent call last):
  File "/data/home/likai/NMT-offline/knn-box/knnbox-scripts/vanilla-knn-mt/../../knnbox-scripts/common/validate.py", line 252, in <module>
    cli_main()
  File "/data/home/likai/NMT-offline/knn-box/knnbox-scripts/vanilla-knn-mt/../../knnbox-scripts/common/validate.py", line 246, in cli_main
    distributed_utils.call_main(args, main, override_args=override_args)
  File "/data/home/likai/NMT-offline/knn-box/fairseq/distributed_utils.py", line 301, in call_main
    main(args, **kwargs)
  File "/data/home/likai/NMT-offline/knn-box/knnbox-scripts/vanilla-knn-mt/../../knnbox-scripts/common/validate.py", line 192, in main
    datastore.build_faiss_index("keys", use_gpu=(not args.build_faiss_index_with_cpu))   # build faiss index
  File "/data/home/likai/NMT-offline/knn-box/knnbox/datastore/datastore.py", line 177, in build_faiss_index
    if not isinstance(self.datas[name], Memmap):
KeyError: 'keys'

Mismatch between inference outputs on training data and the labels

As I understand it, kNN-MT has the trained model run inference over the training data in order to record the key-value pairs. But isn't it possible that the outputs at inference time differ from the ground-truth labels, leading to mismatched or differing numbers of keys and values in the datastore? Or is this already handled and I just haven't understood the code well enough?

Error when running vanilla-knn-mt-visual

When I run the visual kNN-MT, I get an argument-mismatch error between
the parameter list at line 363 and
the parameter list at line 1153.

KeyError when load_faiss_index is called on a dumped datastore

How to reproduce

  1. [OK] Build a datastore and its faiss_index using scripts under knnbox-scripts/vanilla-knn-mt
  2. [OK] Load this datastore in my code, dump it
  3. [Error] Load the dumped datastore, and load its faiss_index (called by any retriever)

Cause

The "dump" and "load" in knn-box is not symmetric when it comes to faiss_index

  • The build_faiss_index method saves faiss_index shape to config.json
  • The dump method does not
  • The load method tries to load faiss_index shape from config.json

Fix

faiss's Index class already saves the vector dimension and vector count in the faiss_index file, so knn-box does not need to save them.

Error Trace

Traceback (most recent call last):
  File "/data0/lvyz/knn-box/knnbox-scripts/plac-knn-mt/../../knnbox-scripts/plac-knn-mt/save_drop_index.py", line 39, in <module>
    mt_known    = retriever.retrieve(query=query, return_list=["mt_known"])["mt_known"]
  File "/data0/lvyz/knn-box/knnbox/retriever/retriever.py", line 21, in retrieve
    self.datastore.load_faiss_index("keys", move_to_gpu=True)
  File "/data0/lvyz/knn-box/knnbox/datastore/datastore.py", line 155, in load_faiss_index
    shape = config["data_infos"][filename]["faiss_index_shape"]
KeyError: 'faiss_index_shape'

Error when running adaptive kNN-MT

Hi, I'd like to ask a question.
When I run bash train_metak.sh for adaptive kNN-MT,
the run fails with the following error:
RuntimeError: Error in faiss::gpu::GpuIndex::GpuIndex(std::shared_ptrfaiss::gpu::GpuResources, int, faiss::MetricType, float, faiss::gpu::GpuIndexConfig) at /root/miniconda3/conda-bld/faiss-pkg_1669821591485/work/faiss/gpu/GpuIndex.cu:58: Error: 'config_.device < getNumDevices()' failed: Invalid GPU device 0
It looks like the GPU is not being recognized, but nvidia-smi shows GPUs 0 and 1.
Also, everything works fine when running vanilla kNN-MT.

Translation error: AssertionError: interactive mode, should have only one sentence

Traceback (most recent call last):
File "/home/nlp/anaconda3/envs/knn/lib/python3.7/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 564, in run_script
exec(code, module.__dict__)
File "/home/nlp/y2020/yzh/knn-box-master/knnbox-scripts/vanilla-knn-mt-visual/src/app.py", line 215, in
knn_main()
File "/home/nlp/y2020/yzh/knn-box-master/knnbox-scripts/vanilla-knn-mt-visual/src/app.py", line 126, in knn_main
k = 1, lambda_ = 0.0, temperature = 1.0
File "/home/nlp/y2020/yzh/knn-box-master/knnbox-scripts/vanilla-knn-mt-visual/../../knnbox-scripts/vanilla-knn-mt-visual/src/function.py", line 510, in translate_using_knn_model
assert len(results) == 1, "interactive mode, should have only one sentence"
AssertionError: interactive mode, should have only one sentence

The above problem appeared again after coming back to the toolkit after a while

Thank you for pointing out this problem. By comparing the checkpoint files we confirmed the behavior you described. You can apply a small modification to the checkpoint with the example code below so that it loads normally; the kNN-BOX code itself does not need to change:

import torch
new_version_ckpt = torch.load("<path to the checkpoint saved by the new fairseq version>")
new_version_ckpt["args"] = new_version_ckpt["cfg"]["model"]
torch.save(new_version_ckpt, "<path for the checkpoint converted for the old version>")

Fill in the file names in the angle brackets as needed. Some additional notes:
in the new version, fairseq saves more configuration information into the checkpoint; the model information that used to live in ckpt["args"] is now saved in ckpt["cfg"]["model"], still as an argparse.Namespace.

Originally posted by @Maxwell-Lyu in #18 (comment)
Thanks! After converting the model, running build_datastore.sh produced another error.

A question about timing

In previous work, I used fairseq.logging.meters.StopwatchMeter to time NMT prediction and kNN retrieval, with roughly this flow:

timerNMT.start()
x = NMTModel(input)
timerNMT.stop()
timerKNN.start()
knn_retrieve(x)
timerKNN.stop()

But the results were very strange: even with a tiny datastore, the total time spent on kNN retrieval with GPU-accelerated faiss accounted for most of the inference time. Later I found that retrieve_k_nearest in https://github.com/NJUNLP/knn-box/blob/master/knnbox/retriever/utils.py temporarily moves the tensor to the CPU; this operation first performs a CUDA synchronization and waits for all of the tensor's preceding operators to finish before its values can be copied to the CPU.
At that point I realized that, according to the PyTorch documentation, CUDA kernels are executed asynchronously by default. In other words, when the Python-side call returns, the CUDA operations may not have finished yet. So I changed the timing flow to:

torch.cuda.synchronize()
timerNMT.start()
x = NMTModel(input)
torch.cuda.synchronize()
timerNMT.stop()
torch.cuda.synchronize()
timerKNN.start()
knn_retrieve(x)
torch.cuda.synchronize()
timerKNN.stop()

The results then looked much more reliable.

Another, simpler example (run on my RTX 8000 GPU):

import torch
import time
a = torch.randn(10000,100000,device='cuda')
b = torch.randn(100000,10000,device='cuda')
#torch.cuda.synchronize()
start = time.perf_counter()
c = a@b
#torch.cuda.synchronize()
print(time.perf_counter() - start)

Output: 0.27494998497422785

After uncommenting the torch.cuda.synchronize() calls:

import torch
import time
a = torch.randn(10000,100000,device='cuda')
b = torch.randn(100000,10000,device='cuda')
torch.cuda.synchronize()
start = time.perf_counter()
c = a@b
torch.cuda.synchronize()
print(time.perf_counter() - start)

Output: 2.4091989540029317

In other words, if we want to measure an algorithm's running time, we should synchronize the CUDA stream both before starting and before stopping the timer, so that the measured time is the actual time taken by the code under test to run and produce its result.

So I modified fairseq's StopwatchMeter to add CUDA synchronization in start and stop:

class StopWatchTimer:
    '''
    Timer for time measurement.
    '''
    def __init__(self, cudaSyncOnEvents=False, cudaStream : torch.cuda.Stream=None) -> None:
        '''
        Args:
            cudaSyncOnEvents: Call cuda synchronize when start, stop, reset or elapsedTime is called
        '''
        self.startTime = None 
        self.totalTime = 0
        self.itemCount = 0
        
        if cudaSyncOnEvents and (not torch.cuda.is_available()):
            raise RuntimeError("cuda is not available")
        
        self.cudaSyncOnEvents = cudaSyncOnEvents
        self.cudaStream = cudaStream
        if cudaStream:
            self.cudaSyncFunction = cudaStream.synchronize
        else:
            self.cudaSyncFunction = torch.cuda.synchronize
        
    def __cudaSync(self): 
        if self.cudaSyncOnEvents:
            self.cudaSyncFunction()
        
    def start(self):
        self.__cudaSync()
        self.startTime = time.perf_counter()
        
    def stop(self, itemCount=0):
        if self.startTime is not None:
            self.__cudaSync()
            dtime = time.perf_counter() - self.startTime
            self.totalTime += dtime
            self.itemCount += itemCount
            
    def reset(self):
        self.itemCount = 0
        self.totalTime = 0
        self.start()
        
    def elapsedTime(self) -> float:
        if self.startTime is None: 
            return 0.0
        self.__cudaSync()
        return time.perf_counter() - self.startTime

Hmm, maybe this is actually a fairseq problem.....

Hopefully this is helpful for some work on speeding up kNN-MT.
