khuangaf / concrete

Official implementation of "CONCRETE: Improving Cross-lingual Fact Checking with Cross-lingual Retrieval" (COLING'22)

License: Apache License 2.0

Languages: Python 99.06%, Shell 0.94%
Topics: cross-lingual-transfer, fact-checking, retrieval, low-resource-languages, multilinguality

concrete's Introduction

What's up!

I am a Computer Science Ph.D. candidate at the University of Illinois Urbana-Champaign, advised by Prof. Heng Ji. My research interests lie in fact-checking, corrective explanations for misinformation, and factually consistent text generation. Previously, I got my Bachelor's degree from the Hong Kong University of Science and Technology and my Master's degree from the University of Southern California.

concrete's Issues

KeyError: 'hard_negative_ctxs'

When I run run_xict.sh, I get the following error:

Traceback (most recent call last):
File "run_xict.py", line 600, in <module>
main()
File "run_xict.py", line 590, in main
trainer.run_train()
File "run_xict.py", line 132, in run_train
self._train_epoch(scheduler, epoch, eval_step, train_iterator)
File "run_xict.py", line 325, in _train_epoch
shuffle_positives=args.shuffle_positive_ctx
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/models/biencoder.py", line 171, in create_biencoder_input
hard_neg_ctxs = sample["hard_negative_ctxs"]
KeyError: 'hard_negative_ctxs'

The dataset "../../data/bbc_passages/all_ict_samples.jsonl_[0,1,2]" contains no 'hard_negative_ctxs' key.
It looks like part of the code in dpr/models/biencoder.py is missing. Could you share it?
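
One possible workaround, offered only as a sketch: if the installed create_biencoder_input expects every sample to carry negative-context lists, the missing keys could be filled with empty lists before the samples reach it. The helper name with_default_contexts and the choice of keys are assumptions for illustration. Whether the repository's own DPR code handles this differently is not stated in this thread.

```python
from typing import Any, Dict


def with_default_contexts(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `sample` with missing negative-context lists set to [].

    Hypothetical helper: the key names follow the KeyError in the traceback
    above plus the commonly paired 'negative_ctxs' key (an assumption).
    """
    patched = dict(sample)  # shallow copy so the caller's dict is not mutated
    for key in ("negative_ctxs", "hard_negative_ctxs"):
        patched.setdefault(key, [])
    return patched
```

Applied to a list of loaded samples, this would look like samples = [with_default_contexts(s) for s in samples].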

ImportError: cannot import name 'read_xict_samples_from_json_files' from 'dpr.utils.data_utils'

When I run run_xict.sh, it shows the following error. It seems that dpr is not included in this repository.

Traceback (most recent call last):
File "run_xict.py", line 32, in <module>
from dpr.utils.data_utils import ShardedDataIterator, read_xict_samples_from_json_files, Tensorizer
ImportError: cannot import name 'read_xict_samples_from_json_files' from 'dpr.utils.data_utils' (/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/utils/data_uti$
/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

FutureWarning,
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 40046) of binary: /scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/bin/python
Traceback (most recent call last):
File "/home/apps/DL-CondaPy3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/apps/DL-CondaPy3.7/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
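
The import in the traceback above resolves dpr from site-packages, so Python may be picking up a separately installed copy of DPR that lacks read_xict_samples_from_json_files. A minimal diagnostic sketch for checking where a module would be loaded from (module_origin is a hypothetical helper, and nothing here assumes a particular layout for this repository):

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from.

    Returns None when the module cannot be found on the current sys.path.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Calling module_origin("dpr") from the directory where run_xict.py is launched shows which copy of the package wins on the import path.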

KeyError: 'question'

When I run run_xict.sh, I get the following error:
Traceback (most recent call last):
File "run_xict.py", line 600, in <module>
main()
File "run_xict.py", line 590, in main
trainer.run_train()
File "run_xict.py", line 132, in run_train
self._train_epoch(scheduler, epoch, eval_step, train_iterator)
File "run_xict.py", line 325, in _train_epoch
shuffle_positives=args.shuffle_positive_ctx
File "/scratch/22cs60r72/InformationRetrival/copy/CORA/mDPR/mool/lib/python3.7/site-packages/dpr/models/biencoder.py", line 172, in create_biencoder_input
question = normalize_question(sample["question"])
KeyError: 'question'

Should I initialize question = [], or should I do something else?
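
Before patching the loader, it may help to check which keys the sample file actually contains and compare them with what create_biencoder_input reads. A small sketch, assuming the file is JSON Lines with one JSON object per line (first_record_keys is a hypothetical helper, not part of the repository):

```python
import json
from typing import List


def first_record_keys(path: str) -> List[str]:
    """Return the sorted keys of the first non-empty JSON line in `path`.

    Returns an empty list if the file has no non-empty lines.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                return sorted(json.loads(line).keys())
    return []
```

If 'question' is absent from the result, the samples are probably in a different schema than the installed biencoder.py expects, and defaulting the field would hide that mismatch.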

Cannot find all_ict_samples.jsonl_dev

run_xict.sh contains the following command:
python -m torch.distributed.launch \
    --nproc_per_node 1 run_xict.py \
    --max_grad_norm 2.0 \
    --encoder_model_type hf_bert \
    --pretrained_model_cfg bert-base-multilingual-uncased \
    --seed 12345 --sequence_length 256 \
    --warmup_steps 300 --batch_size 4 --do_lower_case \
    --train_file "../../data/bbc_passages/all_ict_samples.jsonl_[0,1,2]" \
    --dev_file ../../data/bbc_passages/all_ict_samples.jsonl_dev \
    --output_dir xict_outputs \
    --checkpoint_file_name xICT_biencoder.pt \
    --learning_rate 2e-05 --num_train_epochs 40 \
    --dev_batch_size 6 --val_av_rank_start_epoch 30
However, I don't know where to find all_ict_samples.jsonl_dev.
I am using all_ict_samples-trans100.jsonl in its place,
but it gives me the error reported in #4 (comment).
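
If no dev file is available, one option is to hold out a slice of the training samples. The sketch below is an assumption about how such a split could be made, not the authors' procedure. split_lines, the 10% fraction, and reuse of seed 12345 from the command above are all illustrative choices.

```python
import random
from typing import List, Tuple


def split_lines(
    lines: List[str], dev_fraction: float = 0.1, seed: int = 12345
) -> Tuple[List[str], List[str]]:
    """Shuffle `lines` deterministically and return (train, dev).

    At least one line goes to dev whenever the input is non-empty.
    """
    shuffled = list(lines)  # copy so the caller's list keeps its order
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction)) if shuffled else 0
    return shuffled[n_dev:], shuffled[:n_dev]
```

The two returned lists can then be written out as separate JSON Lines files and passed to --train_file and --dev_file.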

ZeroDivisionError: float division by zero

When I run run_xict.sh, I get the following error:

Traceback (most recent call last):
File "run_xict.py", line 600, in <module>
main()
File "run_xict.py", line 590, in main
trainer.run_train()
File "run_xict.py", line 132, in run_train
self._train_epoch(scheduler, epoch, eval_step, train_iterator)
File "run_xict.py", line 363, in _train_epoch
self.validate_and_save(epoch, train_data_iterator.get_iteration(), scheduler)
File "run_xict.py", line 148, in validate_and_save
validation_loss = self.validate_nll()
File "run_xict.py", line 185, in validate_nll
total_loss = total_loss / batches
ZeroDivisionError: float division by zero

What should I do?
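
The division total_loss / batches fails only when batches is 0, which suggests the validation iterator produced no batches at all (for example, an empty or unreadable dev file). A hedged sketch of a guard that surfaces that cause instead of the bare ZeroDivisionError (mean_loss is a hypothetical helper, not the code in run_xict.py):

```python
def mean_loss(total_loss: float, batches: int) -> float:
    """Average a summed loss over the number of batches seen.

    Raises ValueError with a descriptive message when no batches were seen.
    """
    if batches == 0:
        raise ValueError(
            "validation produced no batches; check that the dev file exists "
            "and contains samples"
        )
    return total_loss / batches
```

The guard does not fix the underlying problem. It only points at the dev data as the likely place to look.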
