
stevezheng23 / xlnet_extension_tf


XLNet Extension in TensorFlow

License: Apache License 2.0

Python 94.56% Shell 5.44%
artificial-intelligence natural-language-understanding natural-language-processing deep-learning machine-learning xlnet xlnet-ner xlnet-nlu xlnet-mrc xlnet-extension

xlnet_extension_tf's People

Contributors

stevezheng23

xlnet_extension_tf's Issues

TypeError: expected str, bytes or os.PathLike object, not FlagValues

I am trying to execute the run_coqa.sh file with this command:

bash run_coqa.sh \
--gpudevice=0 \
--numgpus=1 \
--taskname=coqa \
--randomseed=100 \
--predicttag=xxxxx \
--modeldir=./model/xlnet_cased_L-24_H-1024_A-16 \
--datadir=./data \
--outputdir=./output_folder \
--numturn=100 \
--seqlen=512 \
--querylen=128 \
--answerlen=128 \
--batchsize=8 \
--learningrate=3e-5 \
--trainsteps=2000 \
--warmupsteps=120 \
--savesteps=500 \
--answerthreshold=128

Then I got this error:

I0819 21:56:00.241106 140032917555008 estimator.py:1147] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0819 21:56:00.242035 140032917555008 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
Traceback (most recent call last):
  File "run_coqa.py", line 1777, in <module>
    tf.app.run()
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run_coqa.py", line 1716, in main
    estimator.train(input_fn=train_input_fn, max_steps=FLAGS.train_steps)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py",
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py",
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py",
    saving_listeners)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py",ec
    scaffold=estimator_spec.scaffold)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/site-packages/tensorflow/python/training/basic_session_run_hooks.p
    self._save_path = os.path.join(checkpoint_dir, checkpoint_basename)
  File "/home/huytran/miniconda3/envs/TF/lib/python3.7/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not FlagValues
execution time was 57 s.

Do you guys know how to solve this?
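
The last frame of the traceback shows `os.path.join(checkpoint_dir, checkpoint_basename)` receiving a `FlagValues` object where a path string is expected. A minimal sketch that reproduces this class of error without TensorFlow follows. The `FlagValues` class here is a hypothetical stand-in for absl's flag container, and `model_dir` is an assumed attribute name, not necessarily a flag defined in run_coqa.py.

```python
import os


class FlagValues:
    """Hypothetical stand-in for absl's flag container (not the real class)."""

    def __init__(self, **flags):
        self.__dict__.update(flags)


def checkpoint_path(checkpoint_dir, basename="model.ckpt"):
    """Join the checkpoint directory and file name, as the saver hook does."""
    return os.path.join(checkpoint_dir, basename)


FLAGS = FlagValues(model_dir="./output_folder")

# Passing the whole flag container instead of one flag value fails.
try:
    checkpoint_path(FLAGS)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the string held by a single flag works.
good_path = checkpoint_path(FLAGS.model_dir)
```

If the same pattern applies here, checking which value run_coqa.py passes as the estimator's model directory would be a reasonable first step.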

What does label.vocab contain?

Just to test the implementation, I took the conll2003 data from this link https://github.com/synalp/NER/tree/master/corpus/CoNLL-2003 and then added a resource/label.vocab file in ${DATADIR}. My label.vocab file contains the following entities:
I-PER
B-PER
I-LOC
B-LOC
I-MISC
B-MISC
I-ORG
B-ORG

But when I run the training script, it gives me the following error:
Traceback (most recent call last):
  File "run_ner.py", line 855, in <module>
    tf.app.run()
  File "/home/falak/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/falak/.local/lib/python3.5/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/home/falak/.local/lib/python3.5/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "run_ner.py", line 786, in main
    train_features = example_converter.convert_examples_to_features(train_examples)
  File "run_ner.py", line 381, in convert_examples_to_features
    feature = self.convert_single_example(example, logging=(idx < 5))
  File "run_ner.py", line 327, in convert_single_example
    label_ids.append(self.label_map[labels[i]])
KeyError: 'O'

This error disappears if I add the following additional entries to label.vocab:

O
X
<sep>
<unk>
<s>
</s>
<cls>
<sep>
<pad>
<mask>
<eod>
<eop>

So, is it expected that these entries be added to the label.vocab file?
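
The `KeyError: 'O'` comes from the lookup `self.label_map[labels[i]]`, so every label that occurs in the data needs an entry in the label map. Below is a minimal sketch for checking coverage before training, assuming label.vocab holds one label per line. `load_label_vocab` and `missing_labels` are hypothetical helpers, not functions from run_ner.py.

```python
def load_label_vocab(lines):
    """Map each non-empty line to an index, keeping first occurrences only."""
    label_map = {}
    for line in lines:
        label = line.strip()
        if label and label not in label_map:
            label_map[label] = len(label_map)
    return label_map


def missing_labels(label_map, tagged_sentences):
    """Return the sorted labels used in the data but absent from the vocab."""
    used = {label for sentence in tagged_sentences for label in sentence}
    return sorted(used - set(label_map))


vocab_lines = ["I-PER", "B-PER", "I-LOC", "B-LOC",
               "I-MISC", "B-MISC", "I-ORG", "B-ORG"]
label_map = load_label_vocab(vocab_lines)

# Toy CoNLL-style label sequences; most tokens carry the outside tag "O".
data = [["I-ORG", "O", "I-MISC", "O"], ["I-PER", "I-PER", "O"]]
print(missing_labels(label_map, data))  # ['O']
```

This only covers labels that appear in the data. Whether the script also expects the special tokens to be listed is a separate question that the check does not answer.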

Question about InputFeature generation in coqa

Thanks for uploading such readable code, along with the experiment settings and results.

By the way, I have a question about CoQA InputFeature generation.
The InputFeature generation code seems to assume that a doc span always contains the rationale for a free-form answer.

That means that when a doc span contains no clue for answering a free-form question, it can be labeled incorrectly.
Is this intended, or is there something I have not understood?
Thanks for your work and have a good day!

About QuAC results

Hi, thank you so much for sharing the code. I ran it for QuAC but only got an F1 score of around 60, which is still a large gap from the result you mentioned. I used the same parameters you mentioned except for batch_size, since my machine can only support bs=4. Is there anything else I need to pay attention to? Can you give me some advice? Thanks! :P

How to train a model with randomly initialized weights

I want to train an NER model without pretrained weights. I removed --init_checkpoint from the command and ran it, but I got the error message "--init_checkpoint must have a value other than None".
What should I do? Thanks.

hard to get f1 value

I tried running the base-size XLNet with a sequence length of 128, a batch size of 32, and 2000 iterations, but I can only get 91.3 F1 with the Perl version of conlleval. Is that right?

Can't run CoQA train script in Google Colab

I tried using Google Colab CPU and GPU notebooks to train XLNet on CoQA, but they keep crashing because of out-of-memory issues. I tried reducing the batch size to 1, but the problem persists. Has anyone else faced similar issues and been able to solve them?

How do I run NER on other languages like Chinese?

I have pretrained XLNet on a large Chinese corpus, but how do I run ner.py, and what is label.vocab?
Here are the parameters I used to train the SentencePiece model:

spm_train \
	--input=data/wiki_all.txt \
	--model_prefix=sp10m.cased.v3 \
	--vocab_size=32000 \
	--character_coverage=0.9995 \
	--model_type=char \
	--control_symbols='<cls>,<sep>,<pad>,<mask>,<eod>' \
	--user_defined_symbols='<eop>,。' \
	--shuffle_input_sentence \
	--input_sentence_size=10000000

This is my pretraining result:

I0708 01:51:08.929747 140337454118720 train_gpu.py:300] [99500] | gnorm 5.37 lr 0.000000 | loss 2.08 | pplx    8.01, bpc  3.0017
I0708 01:52:52.577970 140337454118720 train_gpu.py:300] [99600] | gnorm 4.98 lr 0.000000 | loss 2.03 | pplx    7.60, bpc  2.9265
I0708 01:54:36.169189 140337454118720 train_gpu.py:300] [99700] | gnorm 5.21 lr 0.000000 | loss 2.04 | pplx    7.73, bpc  2.9500
I0708 01:56:19.727979 140337454118720 train_gpu.py:300] [99800] | gnorm 5.06 lr 0.000000 | loss 2.05 | pplx    7.79, bpc  2.9625
I0708 01:58:03.187680 140337454118720 train_gpu.py:300] [99900] | gnorm 5.06 lr 0.000000 | loss 2.01 | pplx    7.47, bpc  2.9009
I0708 01:59:46.560450 140337454118720 train_gpu.py:300] [100000] | gnorm 5.51 lr 0.000000 | loss 2.00 | pplx    7.38, bpc  2.8840

So should label.vocab look like this?

<cls>
<sep>
<pad>
<mask>
<eod>
B-AnatomyPart
I-AnatomyPart
B-Diagnosis
I-Diagnosis
B-Drug
I-Drug
B-Lab
I-Lab
B-Procedure
I-Procedure
B-Radiology
I-Radiology
O
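
For a BIO scheme like the list above, the entity labels can be generated from the entity type names. Below is a minimal sketch. Whether the special tokens must also be listed in label.vocab is not confirmed here, so `extra_labels` is left as an explicit, hypothetical parameter, and `build_bio_labels` is a hypothetical helper rather than part of the repository.

```python
def build_bio_labels(entity_types, extra_labels=()):
    """Return extra labels, then B-/I- pairs per entity type, then "O"."""
    labels = list(extra_labels)
    for entity_type in entity_types:
        labels.append("B-" + entity_type)
        labels.append("I-" + entity_type)
    labels.append("O")
    return labels


entity_types = ["AnatomyPart", "Diagnosis", "Drug", "Lab", "Procedure", "Radiology"]
special = ["<cls>", "<sep>", "<pad>", "<mask>", "<eod>"]
labels = build_bio_labels(entity_types, extra_labels=special)

# One label per line, the layout used in the issue.
label_vocab_text = "\n".join(labels)
```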

AttributeError: 'module' object has no attribute 'XLNetConfig'

Thanks for your work on the XLNet extension! It is quite impressive how quickly this has been done.
I have a question about importing the xlnet package. When I tried running the NER experiment, I got an error at line 753 of run_ner.py,
saying "AttributeError: 'module' object has no attribute 'XLNetConfig'".
Do you have any idea why xlnet.XLNetConfig is not successfully imported here?

Thanks in advance for your answer!
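
One common cause of this kind of AttributeError is that the name `xlnet` resolves to a different module or package than the one that defines `XLNetConfig`. Below is a minimal sketch for checking what an import name resolves to. `describe_module` is a hypothetical helper, and the example uses the standard-library `json` module so that it runs anywhere.

```python
import importlib


def describe_module(module_name, attribute):
    """Report where a module was loaded from and whether it has an attribute."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


info = describe_module("json", "JSONDecoder")
missing = describe_module("json", "XLNetConfig")
```

Calling `describe_module("xlnet", "XLNetConfig")` from the directory where run_ner.py is launched would show which file is actually imported.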

Evaluation and Prediction

Hi,

I was trying to run the NER task on a customized dataset. The training process was successful. However, when it reached the evaluation and prediction step, the program got stuck at INFO:tensorflow:Done running local_init_op. and did not move forward. Is there any potential fix for this problem?

Here is the log

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2020-04-14 20:00:54.378068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-04-14 20:00:54.378117: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-14 20:00:54.378127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2020-04-14 20:00:54.378135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2020-04-14 20:00:54.378210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 23077 MB memory) -> physical GPU (device: 0, name: Quadro P6000, pci bus id: 0000:02:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from output/ner/i2b2/checkpoint/model.ckpt-100
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.

ResourceExhaustedError: OOM when allocating tensor of shape [32000,768]

I got an OOM error when running with the conll2003 dataset on my GPU with 12 GB of memory. How can I solve this problem?
ResourceExhaustedError (see above for traceback): OOM when allocating tensor of shape [32000,768] and type float [[node model/transformer/word_embedding/lookup_table/Adam/Initializer/zeros (defined at xlnet/model_utils.py:164) = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [32000,768] values: [0 0 0...]...>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]
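
For scale, the tensor named in the error is modest on its own. Here is a quick calculation, assuming 4 bytes per float32 element. The failing node is an Adam slot for the word embedding table, and Adam keeps two such slots per variable in addition to the variable itself. The helper name `tensor_mib` is hypothetical.

```python
def tensor_mib(shape, bytes_per_element=4):
    """Size of a dense tensor in MiB, assuming float32 by default."""
    elements = 1
    for dim in shape:
        elements *= dim
    return elements * bytes_per_element / 2**20


embedding_mib = tensor_mib([32000, 768])
# The variable plus two Adam slots (first and second moment estimates).
with_adam_mib = 3 * embedding_mib

print(embedding_mib)   # 93.75
print(with_adam_mib)   # 281.25
```

Since this is far below 12 GB, the failing allocation is probably just the one that happened to hit the limit. The total footprint (model weights, optimizer state, activations for the chosen sequence length and batch size, or another process holding GPU memory) is the more likely cause.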

Using GPU

Hi,

I'm trying to run the NER task, but I think the model training is running on the CPU instead of the GPU. Is there any way to train on the GPU? The only choice I saw in the flags is turning the TPU on or off.

NER problem

Hi Steve, thanks for this great repo.
I am wondering whether you add a BiLSTM + CRF layer after XLNet for the NER task.
