khuangaf / concrete


Official implementation of "CONCRETE: Improving Cross-lingual Fact Checking with Cross-lingual Retrieval" (COLING'22)

License: Apache License 2.0

Python 99.06% Shell 0.94%
cross-lingual-transfer fact-checking retrieval low-resource-languages multilinguality

concrete's Issues

ZeroDivisionError: float division by zero

When I run run_xict.sh, I get the following error:

Traceback (most recent call last):
File "run_xict.py", line 600, in <module>
main()
File "run_xict.py", line 590, in main
trainer.run_train()
File "run_xict.py", line 132, in run_train
self._train_epoch(scheduler, epoch, eval_step, train_iterator)
File "run_xict.py", line 363, in _train_epoch
self.validate_and_save(epoch, train_data_iterator.get_iteration(), scheduler)
File "run_xict.py", line 148, in validate_and_save
validation_loss = self.validate_nll()
File "run_xict.py", line 185, in validate_nll
total_loss = total_loss / batches
ZeroDivisionError: float division by zero

What should I do?
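The traceback shows that `batches` was 0 when line 185 ran, so the validation loop processed no batches at all. One plausible cause, which is an assumption and not something the thread confirms, is that the `--dev_file` path matched no file or an empty one. A minimal sketch of a guard that turns the bare ZeroDivisionError into a readable message; the helper name `mean_validation_loss` is hypothetical and not part of run_xict.py:

```python
def mean_validation_loss(total_loss: float, batches: int) -> float:
    """Average the accumulated loss, failing loudly if nothing was evaluated.

    Hypothetical helper: it stands in for the bare `total_loss / batches`
    shown in the traceback.
    """
    if batches == 0:
        # No batch was processed, so an average is undefined.
        raise ValueError(
            "validation ran over 0 batches; check that the --dev_file path "
            "exists and contains at least one sample"
        )
    return total_loss / batches
```

With a guard like this, an empty dev set is reported at the point where it matters instead of surfacing as an arithmetic error.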

KeyError: 'hard_negative_ctxs'

Running run_xict.sh gives the following error:

Traceback (most recent call last):
File "run_xict.py", line 600, in <module>
main()
File "run_xict.py", line 590, in main
trainer.run_train()
File "run_xict.py", line 132, in run_train
self._train_epoch(scheduler, epoch, eval_step, train_iterator)
File "run_xict.py", line 325, in _train_epoch
shuffle_positives=args.shuffle_positive_ctx
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/models/biencoder.py", line 171, in create_biencoder_input
hard_neg_ctxs = sample["hard_negative_ctxs"]
KeyError: 'hard_negative_ctxs'

The dataset "../../data/bbc_passages/all_ict_samples.jsonl_[0,1,2]" has no 'hard_negative_ctxs' field.
It looks like part of the code in dpr/models/biencoder.py is missing. Can you share it?
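To confirm which fields the samples actually carry, the keys can be tallied across a JSONL file. This is only a diagnostic sketch: the function name `count_keys` is hypothetical, and it assumes one JSON object per line.

```python
import json
from collections import Counter
from typing import Iterable


def count_keys(lines: Iterable[str]) -> Counter:
    """Count how often each top-level key appears across JSONL records."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        counts.update(json.loads(line).keys())
    return counts
```

It can be pointed at one of the training shards with `with open(path, encoding="utf-8") as f: print(count_keys(f))`. If 'hard_negative_ctxs' never appears in the tally, the data and the installed `create_biencoder_input` expect different schemas.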

KeyError: 'question'

Running run_xict.sh gives the following error:
Traceback (most recent call last):
File "run_xict.py", line 600, in <module>
main()
File "run_xict.py", line 590, in main
trainer.run_train()
File "run_xict.py", line 132, in run_train
self._train_epoch(scheduler, epoch, eval_step, train_iterator)
File "run_xict.py", line 325, in _train_epoch
shuffle_positives=args.shuffle_positive_ctx
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/models/biencoder.py", line 172, in create_biencoder_input
question = normalize_question(sample["question"])
KeyError: 'question'

Should I initialize question = [], or should I do something else?
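Defaulting `question` to an empty list would probably only hide the mismatch, since the encoder would then be trained on empty queries. A small sketch that reports which expected fields a sample lacks before it reaches `create_biencoder_input`. The two key names come from the tracebacks in this thread; the helper `missing_keys` and the constant `REQUIRED_KEYS` are hypothetical.

```python
from typing import Iterable, List, Mapping

# Keys that the installed create_biencoder_input reads, per the tracebacks above.
REQUIRED_KEYS = ("question", "hard_negative_ctxs")


def missing_keys(sample: Mapping, required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent from a sample, in order."""
    return [key for key in required if key not in sample]
```

If every sample reports the same missing keys, the likelier explanation is that the data was built for a different version of the loader, not that individual records are broken.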

Cannot find all_ict_samples.jsonl_dev

In run_xict.sh there is a command
python -m torch.distributed.launch \
--nproc_per_node 1 run_xict.py \
--max_grad_norm 2.0 \
--encoder_model_type hf_bert \
--pretrained_model_cfg bert-base-multilingual-uncased \
--seed 12345 --sequence_length 256 \
--warmup_steps 300 --batch_size 4 --do_lower_case \
--train_file "../../data/bbc_passages/all_ict_samples.jsonl_[0,1,2]" \
--dev_file ../../data/bbc_passages/all_ict_samples.jsonl_dev \
--output_dir xict_outputs \
--checkpoint_file_name xICT_biencoder.pt \
--learning_rate 2e-05 --num_train_epochs 40 \
--dev_batch_size 6 --val_av_rank_start_epoch 30
but I don't know where to find all_ict_samples.jsonl_dev.
I am using all_ict_samples-trans100.jsonl instead,
but it gives me the error described in
#4 (comment)
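If the dev file really is not distributed, one workaround is to hold out part of the training samples as a dev set. This is a sketch under a strong assumption: that the dev file uses the same one-record-per-line format as the training shards, which the thread does not confirm. The function name `split_train_dev` and the 10% default are hypothetical; the seed simply reuses the `--seed` value from the command above.

```python
import random
from typing import List, Tuple


def split_train_dev(
    lines: List[str], dev_fraction: float = 0.1, seed: int = 12345
) -> Tuple[List[str], List[str]]:
    """Deterministically shuffle JSONL lines and hold out a dev portion."""
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be strictly between 0 and 1")
    shuffled = list(lines)  # copy so the caller's list is not reordered
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]
```

The two returned lists can then be written out as the train and dev files. Results on such a split are not comparable to numbers obtained with the authors' own dev set.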

ImportError: cannot import name 'read_xict_samples_from_json_files' from 'dpr.utils.data_utils'

When I run run_xict.sh, it shows the following error. It looks like dpr is not included in this repository.

Traceback (most recent call last):
File "run_xict.py", line 32, in <module>
from dpr.utils.data_utils import ShardedDataIterator, read_xict_samples_from_json_files, Tensorizer
ImportError: cannot import name 'read_xict_samples_from_json_files' from 'dpr.utils.data_utils' (/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/utils/data_uti$
/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

FutureWarning,
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 40046) of binary: /scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/bin/python
Traceback (most recent call last):
File "/home/apps/DL-CondaPy3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/apps/DL-CondaPy3.7/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
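The first traceback shows `dpr` being resolved from site-packages, so the import is hitting an installed copy of the package. Whether a modified `dpr` that defines `read_xict_samples_from_json_files` exists elsewhere is an assumption here, not something the thread establishes. A short sketch for checking which file a module is loaded from and whether it defines a given name; the helper `locate_symbol` is hypothetical.

```python
import importlib
from typing import Optional, Tuple


def locate_symbol(module_name: str, symbol: str) -> Tuple[Optional[str], bool]:
    """Return (path of the imported module, whether it defines `symbol`)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None, False  # the module itself could not be imported
    return getattr(module, "__file__", None), hasattr(module, symbol)


if __name__ == "__main__":
    # Names taken from the ImportError reported in this issue.
    print(locate_symbol("dpr.utils.data_utils", "read_xict_samples_from_json_files"))
```

If the reported path points into site-packages and the second value is False, the interpreter is picking up a `dpr` that lacks the function, and the import path would need to be adjusted so that a version defining it comes first.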
