xuezhemax / neuronlp2
Deep neural models for core NLP tasks (PyTorch version)
License: GNU General Public License v3.0
mldl@ub1604:~/ub16_prj/NeuroNLP2$ bash examples/run_ner_ger.sh
loading embedding: sskip from data/sskip/sskip.ger.64.gz
Traceback (most recent call last):
File "examples/NERCRF.py", line 248, in <module>
main()
File "examples/NERCRF.py", line 94, in main
embedd_dict, embedd_dim = utils.load_embedding_dict(embedding, embedding_path)
File "./neuronlp2/utils.py", line 69, in load_embedding_dict
with gzip.open(embedding_path, 'r') as file:
File "/usr/lib/python3.5/gzip.py", line 53, in open
binary_file = GzipFile(filename, gz_mode, compresslevel)
File "/usr/lib/python3.5/gzip.py", line 163, in __init__
fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'data/sskip/sskip.ger.64.gz'
Hi,
I'm having trouble reproducing the results of the paper and am getting much lower scores. I was wondering whether you happen to have the gold-tag scores on the 3.3.0-converted PTB for the BiAF re-impl. It seems likely I'm also getting worse scores on gold-tag StackPtr, but it takes longer to run and verify.
Thanks!
mldl@ub1604:~/ub16_prj/NeuroNLP2$ bash examples/run_analyze.sh
2018-09-03 09:59:58,293 - Analyzer - INFO - punctuations(5): . `` : '' ,
2018-09-03 09:59:58,293 - Create Alphabets - INFO - Creating Alphabets: models/parsing/stack_ptr/alphabets/
train_path is None
Traceback (most recent call last):
File "examples/analyze.py", line 486, in <module>
main()
File "examples/analyze.py", line 61, in main
stackptr(model_path, model_name, test_path, punct_set, use_gpu, logger, args)
File "examples/analyze.py", line 198, in stackptr
type_alphabet = conllx_stacked_data.create_alphabets(alphabet_path, None, data_paths=[None, None], max_vocabulary_size=50000, embedd_dict=None)
File "./neuronlp2/io/conllx_data.py", line 89, in create_alphabets
with open(train_path, 'r') as file:
TypeError: invalid file: None
Hi, @XuezheMax
Thanks for sharing your code here.
The POS tagger is trained on WSJ data from the PTB, but I could not find sections 23-24 in package/treebank_3/tagged/pos/wsj; I did find sections 0-24 in treebank_3/parsed/mrg/wsj.
Do you use "parsed/mrg" for the POS tagger?
Another question: for the merged WSJ data, I computed some statistics, and the token count matches your paper, but the sentence count is different. Do you have any idea why?
train: 36386
test: 5104
Thanks
Hi Dear Xuezhe,
I am trying to reproduce the results in your paper "End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF". Are all the hyper-parameters pre-set in the code? Is there anything special I need to pay attention to when running your code in order to reproduce it?
Thank you!
Hi Max
Is NERCRF.py the same as bi_lstm_cnn_crf.py in LasagneNLP?
I assume the BiAF re-impl scores are reported with predicted tags. Where did the predicted tags used in the paper come from? Did they come from the tagger trained by run_posCRFTagger.sh? I did not find a tool that writes the predictions of a trained POS tagger model to a new CoNLL file.
mldl@ub1604:~/ub16_prj/NeuroNLP2$ bash examples/run_analyze.sh
2018-09-04 12:37:51,549 - Analyzer - INFO - punctuations(5): , : '' . ``
Traceback (most recent call last):
File "examples/analyze.py", line 480, in <module>
main()
File "examples/analyze.py", line 61, in main
stackptr(model_path, model_name, test_path, punct_set, device, logger, args)
File "examples/analyze.py", line 193, in stackptr
type_alphabet = conllx_stacked_data.create_alphabets(alphabet_path, None, data_paths=[None, None], max_vocabulary_size=50000, embedd_dict=None)
File "./neuronlp2/io/conllx_data.py", line 140, in create_alphabets
word_alphabet.load(alphabet_directory)
File "./neuronlp2/io/alphabet.py", line 138, in load
self.__from_json(json.load(open(os.path.join(input_directory, loading_name + ".json"))))
FileNotFoundError: [Errno 2] No such file or directory: 'models/parsing/stack_ptr/alphabets/word.json'
Dear all,
I'm trying to run run_ner_crf.sh on CoNLL-2003 (English) for NER. The error I get is:
loading embedding: glove from data/glove/glove.6B/glove.6B.100d.gz
2018-05-13 11:40:13,149 - NERCRF - INFO - Creating Alphabets
2018-05-13 11:40:13,174 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 23598 (8122)
2018-05-13 11:40:13,174 - Create Alphabets - INFO - Character Alphabet Size: 86
2018-05-13 11:40:13,174 - Create Alphabets - INFO - POS Alphabet Size: 47
2018-05-13 11:40:13,174 - Create Alphabets - INFO - Chunk Alphabet Size: 19
2018-05-13 11:40:13,174 - Create Alphabets - INFO - NER Alphabet Size: 9
2018-05-13 11:40:13,174 - NERCRF - INFO - Word Alphabet Size: 23598
2018-05-13 11:40:13,174 - NERCRF - INFO - Character Alphabet Size: 86
2018-05-13 11:40:13,174 - NERCRF - INFO - POS Alphabet Size: 47
2018-05-13 11:40:13,174 - NERCRF - INFO - Chunk Alphabet Size: 19
2018-05-13 11:40:13,174 - NERCRF - INFO - NER Alphabet Size: 9
2018-05-13 11:40:13,174 - NERCRF - INFO - Reading Data
Reading data from data/conll2003/NeuroNLP2_sep=s_eng_train
Total number of data: 1
Reading data from data/conll2003/NeuroNLP2_sep=s_eng_testa
Total number of data: 1
Reading data from data/conll2003/NeuroNLP2_sep=s_eng_testb
Total number of data: 1
oov: 339
2018-05-13 11:40:18,594 - NERCRF - INFO - constructing network...
/home/jayhsu/miniconda2/envs/py27/lib/python2.7/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
"num_layers={}".format(dropout, num_layers))
2018-05-13 11:40:25,038 - NERCRF - INFO - Network: LSTM, num_layer=1, hidden=200, filter=30, tag_space=128, crf=bigram
2018-05-13 11:40:25,038 - NERCRF - INFO - training: l2: 0.000000, (#training data: 0, batch: 10, unk replace: 0.00)
2018-05-13 11:40:25,038 - NERCRF - INFO - dropout(in, out, rnn): (0.33, 0.50, (0.33, 0.5))
Epoch 1 (LSTM(std), learning rate=0.0150, decay rate=0.0500 (schedule=1)):
Traceback (most recent call last):
File "examples/NERCRF.py", line 250, in <module>
main()
File "examples/NERCRF.py", line 179, in main
word, char, _, _, labels, masks, lengths = conll03_data.get_batch_tensor(data_train, batch_size, unk_replace=unk_replace)
File "/home/jayhsu/NeuroNLP2/neuronlp2/io/conll03_data.py", line 382, in get_batch_tensor
buckets_scale = [sum(bucket_sizes[:i + 1]) / total_size for i in range(len(bucket_sizes))]
ZeroDivisionError: float division by zero
And my setting is:
python 2.7.15 | Anaconda, Inc.|
pytorch 0.4.0
gensim 3.4.0
I also switched to the pytorch4.0 branch of this repo,
and the CoNLL-2003 data was reformatted to:
1 EU NNP I-NP I-ORG
2 rejects VBZ I-VP O
3 German JJ I-NP I-MISC
4 call NN I-NP O
5 to TO I-VP O
6 boycott VB I-VP O
7 British JJ I-NP I-MISC
8 lamb NN I-NP O
9 . . O O
10 Peter NNP I-NP I-PER
Is there anything I did wrong?
How can I run this successfully? Thanks in advance.
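For reference, the ZeroDivisionError comes from the bucket-scale computation: "Total number of data: 1" for a 14k-sentence corpus suggests the sentences were not separated by blank lines, so every bucket stays empty and total_size is 0. Below is a minimal sketch of that computation with a guard added; the function name and error message are illustrative, simplified from conll03_data.get_batch_tensor.

```python
def compute_bucket_scales(bucket_sizes):
    # simplified version of the computation in conll03_data.get_batch_tensor
    total_size = float(sum(bucket_sizes))
    if total_size == 0:
        # fail with a clear message instead of a ZeroDivisionError
        raise ValueError("no instances were bucketed; check that sentences in the "
                         "data file are separated by blank lines")
    # cumulative fraction of instances up to and including each bucket
    return [sum(bucket_sizes[:i + 1]) / total_size for i in range(len(bucket_sizes))]
```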
How can I use ELMo or BERT?
I now have pretrained ELMo and BERT models,
and I am running the BiAffine and StackPointer models.
I need your help.
XuezheMax,
Could you upload the input (train,dev,test) files for run_ner_crf.sh?
I read the post on issue #9, and added the word ids accordingly for each sentence of the CoNLL-2003 eng.train and eng.testa/b datasets.
i.e:
18 people NNS I-NP O
19 that IN I-SBAR O
20 I PRP I-NP O
21 grew VBD I-VP O
but I got the following two issues:
@XuezheMax It occurs when the program constructs the network. Can you help me? Thanks!
loading embedding: glove from data/glove/glove.6B/glove.6B.100d.gz
2018-11-25 19:55:40,169 - NERCRF - INFO - Creating Alphabets
2018-11-25 19:55:40,197 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 23598 (8122)
2018-11-25 19:55:40,197 - Create Alphabets - INFO - Character Alphabet Size: 86
2018-11-25 19:55:40,197 - Create Alphabets - INFO - POS Alphabet Size: 47
2018-11-25 19:55:40,197 - Create Alphabets - INFO - Chunk Alphabet Size: 19
2018-11-25 19:55:40,197 - Create Alphabets - INFO - NER Alphabet Size: 10
2018-11-25 19:55:40,197 - NERCRF - INFO - Word Alphabet Size: 23598
2018-11-25 19:55:40,197 - NERCRF - INFO - Character Alphabet Size: 86
2018-11-25 19:55:40,197 - NERCRF - INFO - POS Alphabet Size: 47
2018-11-25 19:55:40,197 - NERCRF - INFO - Chunk Alphabet Size: 19
2018-11-25 19:55:40,197 - NERCRF - INFO - NER Alphabet Size: 10
2018-11-25 19:55:40,197 - NERCRF - INFO - Reading Data
./neuronlp2/io/conll03_data.py:363: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
words = Variable(torch.from_numpy(wid_inputs), volatile=volatile)
./neuronlp2/io/conll03_data.py:364: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
chars = Variable(torch.from_numpy(cid_inputs), volatile=volatile)
./neuronlp2/io/conll03_data.py:365: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
pos = Variable(torch.from_numpy(pid_inputs), volatile=volatile)
./neuronlp2/io/conll03_data.py:366: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
chunks = Variable(torch.from_numpy(chid_inputs), volatile=volatile)
./neuronlp2/io/conll03_data.py:367: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
ners = Variable(torch.from_numpy(nid_inputs), volatile=volatile)
./neuronlp2/io/conll03_data.py:368: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
masks = Variable(torch.from_numpy(masks), volatile=volatile)
./neuronlp2/io/conll03_data.py:369: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
single = Variable(torch.from_numpy(single), volatile=volatile)
Reading data from data/conll2003/english/eng.train.bio.conll
reading data: 10000
Total number of data: 14987
Reading data from data/conll2003/english/eng.dev.bio.conll
Total number of data: 3466
Reading data from data/conll2003/english/eng.test.bio.conll
Total number of data: 3684
oov: 339
2018-11-25 19:55:48,096 - NERCRF - INFO - constructing network...
Traceback (most recent call last):
File "examples/NERCRF.py", line 248, in <module>
main()
File "examples/NERCRF.py", line 146, in main
tag_space=tag_space, embedd_word=word_table, p_in=p_in, p_out=p_out, p_rnn=p_rnn, bigram=bigram, initializer=initializer)
File "./neuronlp2/models/sequence_labeling.py", line 201, in __init__
p_in=p_in, p_out=p_out, p_rnn=p_rnn, initializer=initializer)
File "./neuronlp2/models/sequence_labeling.py", line 16, in __init__
self.word_embedd = Embedding(num_words, word_dim, init_embedding=embedd_word)
File "./neuronlp2/nn/modules/sparse.py", line 53, in __init__
self.reset_parameters(init_embedding)
File "./neuronlp2/nn/modules/sparse.py", line 60, in reset_parameters
assign_tensor(self.weight, init_embedding)
File "./neuronlp2/nn/init.py", line 17, in assign_tensor
assign_tensor(tensor.data, val)
[the two lines above repeat until Python's recursion limit is reached; traceback truncated]
Hello. I'm trying to run NERCRF.py, and when it hits the instruction self.__source_file = open(file_path, 'r') in the method __init__ (line 104), it runs without problems, but the variable self.__source_file is reported as "not defined", throwing an error at the instruction self.__source_file.close() in the method close(self) (line 112). It should work, but somehow this source-file variable is not defined when it is used later. Also, the path is correct; I've already checked that.
Any idea how to fix it?
mldl@ub1604:~/ub16_prj/NeuroNLP2$ bash examples/run_posCRFTagger.sh
embedding_path is data/glove/glove.6B/glove.6B.100d.gz
loading embedding: glove from data/glove/glove.6B/glove.6B.100d.gz
2018-08-29 17:03:57,003 - POSCRFTagger - INFO - Creating Alphabets
2018-08-29 17:03:57,003 - Create Alphabets - INFO - Creating Alphabets: data/alphabets/pos_crf/
Traceback (most recent call last):
File "examples/posCRFTagger.py", line 220, in <module>
main()
File "examples/posCRFTagger.py", line 82, in main
max_vocabulary_size=50000, embedd_dict=embedd_dict)
File "./neuronlp2/io/conllx_data.py", line 88, in create_alphabets
with open(train_path, 'r') as file:
IOError: [Errno 2] No such file or directory: 'data/POS-penn/wsj/split1/wsj1.train.original'
Can you tell me how to evaluate dependency parsing accuracy relative to sentence length, dependency length, and distance to root?
Hello,
I noticed that the outputs of NERCRF.py are not in the same order as the inputs: the sentences get shuffled, so the test results in the tmp folder end up jumbled. Is there any way to get the outputs in the same order as the inputs?
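As far as I can tell there is no built-in option for this, but a common workaround is to carry each sentence's original index through batching and sort the collected outputs before writing. A hedged sketch (the pair layout is my own, not the project's API):

```python
def restore_order(indexed_outputs):
    """indexed_outputs: list of (original_sentence_index, output_block) pairs
    collected while iterating over possibly shuffled batches."""
    return [block for _, block in sorted(indexed_outputs, key=lambda pair: pair[0])]
```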
Hi,
Is there any script or way to load a trained parsing model and then use it to parse a new file?
Thank you.
Best,
Dat.
Hi, I'm trying to feed some self-defined input data into the NER model. To fit the original format, I adjusted it to the following:
1 We PRP I-NP O
2 consider VBP I-NP O
3 a DT I-NP O
4 variational JJ I-NP B-ALG1
5 method NN I-NP I-ALG1
6 to TO I-NP O
7 solve VB I-NP O
8 the DT I-NP O
9 optical JJ I-NP O
10 flow NN I-NP O
11 problem NN I-NP O
12 with IN I-NP O
13 varying VBG I-NP O
14 illumination NN I-NP O
15 . . I-NP O
1 Using VBG I-NP O
But I got the following error:
(py27) jayhsu@hpcuda:~/NeuroNLP2$ ./examples/run_ner_crf.sh
loading embedding: glove from data/glove/glove.6B/glove.6B.100d.gz
2018-05-17 10:10:38,480 - NERCRF - INFO - Creating Alphabets
2018-05-17 10:10:38,502 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 23598 (8122)
2018-05-17 10:10:38,502 - Create Alphabets - INFO - Character Alphabet Size: 86
2018-05-17 10:10:38,502 - Create Alphabets - INFO - POS Alphabet Size: 47
2018-05-17 10:10:38,502 - Create Alphabets - INFO - Chunk Alphabet Size: 19
2018-05-17 10:10:38,502 - Create Alphabets - INFO - NER Alphabet Size: 9
2018-05-17 10:10:38,502 - NERCRF - INFO - Word Alphabet Size: 23598
2018-05-17 10:10:38,502 - NERCRF - INFO - Character Alphabet Size: 86
2018-05-17 10:10:38,502 - NERCRF - INFO - POS Alphabet Size: 47
2018-05-17 10:10:38,502 - NERCRF - INFO - Chunk Alphabet Size: 19
2018-05-17 10:10:38,502 - NERCRF - INFO - NER Alphabet Size: 9
2018-05-17 10:10:38,502 - NERCRF - INFO - Reading Data
Reading data from data/conll2003/NeuroNLP2_train_pos.txt
Traceback (most recent call last):
File "examples/NERCRF.py", line 249, in <module>
main()
File "examples/NERCRF.py", line 110, in main
data_train = conll03_data.read_data_to_tensor(train_path, word_alphabet, char_alphabet, pos_alphabet, chunk_alphabet, ner_alphabet, device=device)
File "/home/jayhsu/NeuroNLP2/neuronlp2/io/conll03_data.py", line 313, in read_data_to_tensor
max_size=max_size, normalize_digits=normalize_digits)
File "/home/jayhsu/NeuroNLP2/neuronlp2/io/conll03_data.py", line 156, in read_data
inst = reader.getNext(normalize_digits)
File "/home/jayhsu/NeuroNLP2/neuronlp2/io/reader.py", line 171, in getNext
ner_ids.append(self.__ner_alphabet.get_index(ner))
File "/home/jayhsu/NeuroNLP2/neuronlp2/io/alphabet.py", line 64, in get_index
raise KeyError("instance not found: %s" % instance)
KeyError: u'instance not found: B-ALG1'
So, for the fifth column, i.e. the tags, why can't it be something like 'B-ALG1', a self-defined label ALG1 with a B- prefix?
※ I read issue #11, so the chunk column is a dummy. (But it still has to be in a valid format such as I-NP, otherwise you may hit a similar error.)
※ For anyone new here: I previously posted #14, where I used CoNLL-2003 and fixed the format problem [success].
I know about way 2 in issue #11, but how can I fix this problem with the easy and fast solution described as way 1?
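For context, this KeyError usually means the NER alphabet was built (or loaded from a cached data/alphabets/ directory) before 'B-ALG1' existed and was then frozen, so deleting the cached alphabet directory and letting it be rebuilt from your training file is the quick fix. Below is a toy model of the behaviour; it is a simplification for illustration, not the project's Alphabet class:

```python
class ToyAlphabet:
    """Simplified model of an alphabet: new labels only get ids while growing."""

    def __init__(self):
        self.instance2index = {}
        self.keep_growing = True   # set to False once alphabets are saved/loaded

    def get_index(self, instance):
        if instance in self.instance2index:
            return self.instance2index[instance]
        if self.keep_growing:
            index = len(self.instance2index)
            self.instance2index[instance] = index
            return index
        # a frozen alphabet rejects unseen labels, which is the error above
        raise KeyError("instance not found: %s" % instance)
```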
Thank you for your nice and neat re-implementation of the Stanford deep biaffine parser.
I notice that you use a Conv layer at the beginning of the class BiRecurrentConvBiAffine.
Since the original paper does not use a Conv layer, I am wondering what its function is here. Is it just for feature extraction, or does it actually play an important role in your code?
Hi Max,
could you share the data used in the examples,
such as the data under data/conll2003/english/ or data/POS-penn/wsj/split1/?
Thanks.
Hello
I am trying to achieve the same results as the "End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF" paper, but my scores don't match the results the paper reports after 50 epochs. I've also read issue #8.
Because I'm using Windows, I took the hyper-parameters from the .sh script and wrote them directly into the NERCRF.py code.
After 50 epochs, using the 100-dimensional GloVe embeddings and the CoNLL-2003 corpus (which I downloaded from this repository), I've only managed an 84.76% F1 score on my dev data and an 80.32% F1 score on my test data. Are the hyper-parameters right? Did you use eng.testa as dev data and eng.testb as test data, or did you use different files? Should I pay attention to anything else?
Thanks.
Hi! Xuezhe
Your code and paper are great. I have two questions I would like to ask you.
English data LOC MISC ORG PER
Training set 7140 3438 6321 6600
Development set 1837 922 1341 1842
Test set 1668 702 1661 1617
I can never reach the F1 score you reported, and I don't know whether it's related to a difference in the data.
I have been confused for a long time and I hope to get your help.
Thanks!
Hi @XuezheMax,
In his LSTM-CRF architecture, Lample has a parameter that lowercases words before they are looked up in the embedding table. This is particularly useful for my Portuguese training because I'm using pre-trained embeddings that contain only lowercase words, so a word that starts with an uppercase letter is not found in the embedding table and ends up hitting the UNK vector.
Do you support this in your model? If not, can you point me to where I should make changes to add it?
Thanks!
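I don't see such a flag exposed in this repo, but conceptually the change belongs wherever words are looked up in the embedding table: fall back to the lowercased form before falling back to UNK. A hedged sketch (the function and argument names are illustrative, not the project's API):

```python
def lookup_embedding(word, embedd_dict, unk_vector):
    # try the surface form first, then the lowercased form, then UNK
    if word in embedd_dict:
        return embedd_dict[word]
    lowered = word.lower()
    if lowered in embedd_dict:
        return embedd_dict[lowered]
    return unk_vector
```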
Hi
Thank you for sharing your code!
I'm trying to run the stack pointer parser on my own dataset and I need your help to fix this error:
===========
loading embedding: polyglot from data/polyglot-ar.pkl
2021-05-27 10:15:42,679 - Parsing - INFO - Creating Alphabets
2021-05-27 10:15:42,679 - Create Alphabets - INFO - Creating Alphabets: models/parsing/stackptr/alphabets
2021-05-27 10:15:43,070 - Create Alphabets - INFO - Total Vocabulary Size: 13914
2021-05-27 10:15:43,070 - Create Alphabets - INFO - Total Singleton Size: 6034
2021-05-27 10:15:43,073 - Create Alphabets - INFO - Total Vocabulary Size (w.o rare words): 12077
2021-05-27 10:15:43,206 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 13275 (4197)
2021-05-27 10:15:43,206 - Create Alphabets - INFO - Character Alphabet Size: 75
2021-05-27 10:15:43,206 - Create Alphabets - INFO - POS Alphabet Size: 4
2021-05-27 10:15:43,206 - Create Alphabets - INFO - Type Alphabet Size: 11
2021-05-27 10:15:43,206 - Parsing - INFO - Word Alphabet Size: 13275
2021-05-27 10:15:43,207 - Parsing - INFO - Character Alphabet Size: 75
2021-05-27 10:15:43,207 - Parsing - INFO - POS Alphabet Size: 4
2021-05-27 10:15:43,207 - Parsing - INFO - Type Alphabet Size: 11
2021-05-27 10:15:43,207 - Parsing - INFO - punctuations(5): `` , : '' .
word OOV: 970
2021-05-27 10:15:43,232 - Parsing - INFO - constructing network...
Traceback (most recent call last):
File "parsing.py", line 651, in <module>
train(args)
File "parsing.py", line 225, in train
assert word_dim == hyps['word_dim']
AssertionError
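The assertion compares the width of the loaded pretrained vectors with the word_dim hyper-parameter in the JSON config, so a polyglot file with, say, 64-dimensional vectors trips it if the config says something else. A sketch of the same check with a clearer message (the function name and message are mine):

```python
def check_word_dim(embedd_dict, hyps):
    # width of the first pretrained vector in the loaded dictionary
    word_dim = len(next(iter(embedd_dict.values())))
    if word_dim != hyps['word_dim']:
        raise ValueError(
            "embedding dim %d != word_dim %d in the config; "
            "edit 'word_dim' in the JSON config to match the embedding file"
            % (word_dim, hyps['word_dim']))
    return word_dim
```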
I am training a NER system, but the results change from one run to another with the same configuration.
Do you have any idea how to fix this variability, please?
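Run-to-run variance usually comes from unseeded random number generators (Python, NumPy, PyTorch) plus nondeterministic cuDNN kernels. A minimal sketch of pinning the seeds; the torch/numpy lines are shown as comments since they depend on your installation:

```python
import random

def set_seed(seed):
    random.seed(seed)
    # with the full stack you would also pin the other RNGs, e.g.:
    # numpy.random.seed(seed)
    # torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
    # torch.backends.cudnn.deterministic = True

set_seed(13)
first = [random.random() for _ in range(3)]
set_seed(13)
second = [random.random() for _ in range(3)]
```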
Thank you for your open source work.
Could you please explain the parsing orders in the code: left2right, deep-first, shallow-first, and inside-out?
Looking forward to your reply.
Hi,
I'm trying to test this tool, but training a Stack-Pointer parser fails even though I follow the instructions and set the paths for the data and embeddings.
#!/usr/bin/env bash
CUDA_VISIBLE_DEVICES=0 OMP_NUM_THREADS=4 python -u parsing.py --mode train --config configs/parsing/stackptr.json \
--num_epochs 600 --batch_size 32 \
--opt adam --learning_rate 0.001 --lr_decay 0.999997 --beta1 0.9 --beta2 0.9 --eps 1e-4 --grad_clip 5.0 \
--loss_type token --warmup_steps 40 --reset 20 --weight_decay 0.0 --unk_replace 0.5 --beam 10 \
--word_embedding sskip --word_path "data/SSKIP/sskip.eng.100.gz" --char_embedding random \
--punctuation '.' '``' "''" ':' ',' \
--train "data/UD/en_ewt-ud-train.conll" \
--dev "data/UD/en_ewt-ud-dev.conll" \
--test "data/UD/en_ewt-ud-test.conll" \
--model_path "models/parsing/stack-pointer/"
But when I try to execute the script I get:
$ ./scripts/run_stackptr.sh
Traceback (most recent call last):
File "parsing.py", line 22, in <module>
from neuronlp2.nn.utils import total_grad_norm
File "/home/iago/Escritorio/NeuroNLP2/neuronlp2/nn/__init__.py", line 4, in <module>
from neuronlp2.nn.crf import ChainCRF, TreeCRF
File "/home/iago/Escritorio/NeuroNLP2/neuronlp2/nn/crf.py", line 7, in <module>
from neuronlp2.nn.modules import BiAffine
File "/home/iago/Escritorio/NeuroNLP2/neuronlp2/nn/modules.py", line 82, in <module>
class BiAffine(nn.Module):
File "/home/iago/Escritorio/NeuroNLP2/neuronlp2/nn/modules.py", line 156, in BiAffine
@overrides
File "/home/iago/Escritorio/NeuroNLP2/env/lib/python3.6/site-packages/overrides/overrides.py", line 88, in overrides
return _overrides(method, check_signature, check_at_runtime)
File "/home/iago/Escritorio/NeuroNLP2/env/lib/python3.6/site-packages/overrides/overrides.py", line 114, in _overrides
_validate_method(method, super_class, check_signature)
File "/home/iago/Escritorio/NeuroNLP2/env/lib/python3.6/site-packages/overrides/overrides.py", line 135, in _validate_method
ensure_signature_is_compatible(super_method, method, is_static)
File "/home/iago/Escritorio/NeuroNLP2/env/lib/python3.6/site-packages/overrides/signature.py", line 82, in ensure_signature_is_compatible
ensure_return_type_compatibility(super_type_hints, sub_type_hints, method_name)
File "/home/iago/Escritorio/NeuroNLP2/env/lib/python3.6/site-packages/overrides/signature.py", line 266, in ensure_return_type_compatibility
f"{method_name}: return type `{sub_return}` is not a `{super_return}`."
TypeError: BiAffine.extra_repr: return type `None` is not a `<class 'str'>`.
I get the same error with:
./scripts/run_stackptr.sh
and ./scripts/run_deepbiaf.sh
What am I doing wrong?
Greetings.
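This looks like an incompatibility between newer versions of the overrides package and the un-annotated extra_repr override: nn.Module.extra_repr is annotated to return str, so the overriding method must carry the same annotation (alternatively, pin overrides to an older version). A minimal illustration with a stand-in base class:

```python
class Module:                        # stand-in for torch.nn.Module
    def extra_repr(self) -> str:
        return ''

class BiAffine(Module):
    def extra_repr(self) -> str:    # adding '-> str' satisfies the overrides check
        return 'key_dim=?, query_dim=?'
```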
Hi,
I am getting following error, Any suggestions?
Thanks
TypeError Traceback (most recent call last)
in ()
11 word, char, _, _, labels, masks, lengths = conll03_data.get_batch_variable(data_train, batch_size)
12 optim.zero_grad()
---> 13 loss = network.loss(word, char, labels, mask=masks)
14 loss.backward()
15 optim.step()
~/NeuroNLP2/neuronlp2/models/sequence_labeling.py in loss(self, input_word, input_char, target, mask, length, hx)
291 def loss(self, input_word, input_char, target, mask=None, length=None, hx=None):
292 # output from rnn [batch, length, tag_space]
--> 293 output, _, mask, length = self._get_rnn_output(input_word, input_char, mask=mask, length=length, hx=hx)
294
295 if length is not None:
~/NeuroNLP2/neuronlp2/models/sequence_labeling.py in _get_rnn_output(self, input_word, input_char, mask, length, hx)
252
253 # [batch, length, char_length, char_dim]
--> 254 char = self.char_embedd(input_char)
255 char_size = char.size()
256 # first transform to [batch *length, char_length, char_dim]
/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
222 for hook in self._forward_pre_hooks.values():
223 hook(self, input)
--> 224 result = self.forward(*input, **kwargs)
225 for hook in self._forward_hooks.values():
226 hook_result = hook(self, input, result)
~/NeuroNLP2/neuronlp2/nn/modules/sparse.py in forward(self, input)
68 if input.dim() > 2:
69 num_inputs = np.prod(input_size[:-1])
---> 70 input = input.view(num_inputs, input_size[-1])
71
72 output_size = input_size + (self.embedding_dim, )
/anaconda/lib/python3.6/site-packages/torch/autograd/variable.py in view(self, *sizes)
508
509 def view(self, *sizes):
--> 510 return View.apply(self, sizes)
511
512 def view_as(self, tensor):
/anaconda/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py in forward(ctx, i, sizes)
94 ctx.new_sizes = sizes
95 ctx.old_size = i.size()
---> 96 result = i.view(*sizes)
97 ctx.mark_shared_storage((i, result))
98 return result
TypeError: view received an invalid combination of arguments - got (numpy.int64, int), but expected one of:
mldl@ub1604:~/ub16_prj/NeuroNLP2$ bash examples/run_posCRFTagger.sh
embedding_path is data/glove/glove.6B/glove.6B.100d.gz
loading embedding: glove from data/glove/glove.6B/glove.6B.100d.gz
2018-09-03 09:43:35,189 - POSCRFTagger - INFO - Creating Alphabets
2018-09-03 09:43:35,190 - Create Alphabets - INFO - Creating Alphabets: data/alphabets/pos_crf/
2018-09-03 09:43:38,950 - Create Alphabets - INFO - Total Vocabulary Size: 38406
2018-09-03 09:43:38,950 - Create Alphabets - INFO - Total Singleton Size: 16938
2018-09-03 09:43:38,973 - Create Alphabets - INFO - Total Vocabulary Size (w.o rare words): 35154
2018-09-03 09:43:40,167 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 39266 (13686)
2018-09-03 09:43:40,167 - Create Alphabets - INFO - Character Alphabet Size: 88
2018-09-03 09:43:40,167 - Create Alphabets - INFO - POS Alphabet Size: 48
2018-09-03 09:43:40,167 - Create Alphabets - INFO - Type Alphabet Size: 15
2018-09-03 09:43:40,169 - POSCRFTagger - INFO - Word Alphabet Size: 39266
2018-09-03 09:43:40,169 - POSCRFTagger - INFO - Character Alphabet Size: 88
2018-09-03 09:43:40,169 - POSCRFTagger - INFO - POS Alphabet Size: 48
2018-09-03 09:43:40,169 - POSCRFTagger - INFO - Reading Data
reading data: 10000
reading data: 20000
reading data: 30000
Total number of data: 38218
Reading data from data/POS-penn/wsj/split1/wsj1.dev.original
Total number of data: 5527
./neuronlp2/io/conllx_data.py:371: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
words = Variable(torch.from_numpy(wid_inputs), volatile=volatile)
./neuronlp2/io/conllx_data.py:372: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
chars = Variable(torch.from_numpy(cid_inputs), volatile=volatile)
./neuronlp2/io/conllx_data.py:373: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
pos = Variable(torch.from_numpy(pid_inputs), volatile=volatile)
./neuronlp2/io/conllx_data.py:374: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
heads = Variable(torch.from_numpy(hid_inputs), volatile=volatile)
./neuronlp2/io/conllx_data.py:375: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
types = Variable(torch.from_numpy(tid_inputs), volatile=volatile)
./neuronlp2/io/conllx_data.py:376: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
masks = Variable(torch.from_numpy(masks), volatile=volatile)
./neuronlp2/io/conllx_data.py:377: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
single = Variable(torch.from_numpy(single), volatile=volatile)
Reading data from data/POS-penn/wsj/split1/wsj1.test.original
Total number of data: 5462
oov: 954
2018-09-03 09:43:58,409 - POSCRFTagger - INFO - constructing network...
Traceback (most recent call last):
File "examples/posCRFTagger.py", line 220, in
main()
File "examples/posCRFTagger.py", line 127, in main
tag_space=tag_space, embedd_word=word_table, bigram=bigram, p_in=p_in, p_out=p_out, p_rnn=p_rnn, initializer=initializer)
File "./neuronlp2/models/sequence_labeling.py", line 201, in init
p_in=p_in, p_out=p_out, p_rnn=p_rnn, initializer=initializer)
File "./neuronlp2/models/sequence_labeling.py", line 16, in init
self.word_embedd = Embedding(num_words, word_dim, init_embedding=embedd_word)
File "./neuronlp2/nn/modules/sparse.py", line 53, in init
self.reset_parameters(init_embedding)
File "./neuronlp2/nn/modules/sparse.py", line 60, in reset_parameters
assign_tensor(self.weight, init_embedding)
File "./neuronlp2/nn/init.py", line 17, in assign_tensor
assign_tensor(tensor.data, val)
File "./neuronlp2/nn/init.py", line 17, in assign_tensor
assign_tensor(tensor.data, val)
File "./neuronlp2/nn/init.py", line 17, in assign_tensor
assign_tensor(tensor.data, val)
File "./neuronlp2/nn/init.py", line 16, in assign_tensor
if isinstance(tensor, Variable):
RuntimeError: maximum recursion depth exceeded
mldl@ub1604:~/ub16_prj/NeuroNLP2$
I'm getting a stack overflow (maximum recursion depth) error, which I think others have reported. Does anyone have a fix or tweak to make this work on PyTorch 0.4.0 and up?
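The recursion comes from neuronlp2/nn/init.py: after the Tensor/Variable merge in PyTorch 0.4, `isinstance(tensor, Variable)` is always true, so `assign_tensor` keeps recursing on `tensor.data`. A minimal replacement sketch (assuming the function's only job is to copy a pretrained table into a parameter; this is not the author's fix):

```python
import torch

def assign_tensor(tensor, val):
    # In PyTorch >= 0.4, Tensor and Variable are merged, so the old
    # `isinstance(tensor, Variable)` check is always true and the function
    # recurses on `tensor.data` forever. Copy in place under no_grad instead.
    with torch.no_grad():
        tensor.copy_(val)
    return tensor

# e.g. loading a pretrained table into an nn.Embedding weight:
emb = torch.nn.Embedding(3, 4)
table = torch.arange(12, dtype=torch.float32).view(3, 4)
assign_tensor(emb.weight, table)
```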
Hi, thanks for your nice project.
I encountered the following error when trying to train the NeuroMST model.
Below are some error logs:
./scripts/run_neuromst.sh
Namespace(amsgrad=False, batch_size=32, beam=1, beta1=0.9, beta2=0.9, char_embedding='random', char_path=None, config='configs/parsing/neuromst.json', cuda=True, dev='data/ptb/dev.conllx', eps=0.0001, freeze=False, grad_clip=5.0, learning_rate=0.001, loss_type='token', lr_decay=0.999995, mode='train', model_path='models/parsing/neuromst/', num_epochs=400, optim='adam', punctuation=['.', '``', "''", ':', ','], reset=20, test='data/ptb/test.conllx', train='data/ptb/train.conllx', unk_replace=0.5, warmup_steps=40, weight_decay=0.0, word_embedding='sskip', word_path='data/glove.6B.100d.gz')
loading embedding: sskip from data/glove.6B.100d.gz
2020-05-23 17:16:35,234 - Parsing - INFO - Creating Alphabets
2020-05-23 17:16:35,265 - Create Alphabets - INFO - Word Alphabet Size (Singleton): 37377 (14112)
2020-05-23 17:16:35,265 - Create Alphabets - INFO - Character Alphabet Size: 83
2020-05-23 17:16:35,265 - Create Alphabets - INFO - POS Alphabet Size: 48
2020-05-23 17:16:35,265 - Create Alphabets - INFO - Type Alphabet Size: 48
2020-05-23 17:16:35,266 - Parsing - INFO - Word Alphabet Size: 37377
2020-05-23 17:16:35,266 - Parsing - INFO - Character Alphabet Size: 83
2020-05-23 17:16:35,266 - Parsing - INFO - POS Alphabet Size: 48
2020-05-23 17:16:35,266 - Parsing - INFO - Type Alphabet Size: 48
2020-05-23 17:16:35,266 - Parsing - INFO - punctuations(5): . `` '' , :
word OOV: 1017
2020-05-23 17:16:35,356 - Parsing - INFO - constructing network...
2020-05-23 17:16:42,994 - Parsing - INFO - Network: NeuroMST-FastLSTM, num_layer=3, hidden=512, act=elu
2020-05-23 17:16:42,994 - Parsing - INFO - dropout(in, out, rnn): variational(0.33, 0.33, [0.33, 0.33])
2020-05-23 17:16:42,994 - Parsing - INFO - # of Parameters: 22298677
2020-05-23 17:16:42,994 - Parsing - INFO - Reading Data
Reading data from data/ptb/train.conllx
reading data: 10000
reading data: 20000
reading data: 30000
Total number of data: 39832
Reading data from data/ptb/dev.conllx
Total number of data: 1700
Reading data from data/ptb/test.conllx
Total number of data: 2416
2020-05-23 17:16:55,055 - Parsing - INFO - training: #training data: 39831, batch: 32, unk replace: 0.50
Epoch 1 (adam, betas=(0.9, 0.900), eps=1.0e-04, amsgrad=False, lr=0.000000, lr decay=0.999995, grad clip=5.0, l2=0.0e+00):
CUDA runtime error: an illegal instruction was encountered (73) in magma_dgetrf_batched at /opt/conda/conda-bld/magma-cuda92_1564975048006/work/src/dgetrf_batched.cpp:213
CUDA runtime error: an illegal instruction was encountered (73) in magma_dgetrf_batched at /opt/conda/conda-bld/magma-cuda92_1564975048006/work/src/dgetrf_batched.cpp:214
CUDA runtime error: an illegal instruction was encountered (73) in magma_dgetrf_batched at /opt/conda/conda-bld/magma-cuda92_1564975048006/work/src/dgetrf_batched.cpp:215
CUDA runtime error: an illegal instruction was encountered (73) in magma_queue_destroy_internal at /opt/conda/conda-bld/magma-cuda92_1564975048006/work/interface_cuda/interface.cpp:944
CUDA runtime error: an illegal instruction was encountered (73) in magma_queue_destroy_internal at /opt/conda/conda-bld/magma-cuda92_1564975048006/work/interface_cuda/interface.cpp:945
CUDA runtime error: an illegal instruction was encountered (73) in magma_queue_destroy_internal at /opt/conda/conda-bld/magma-cuda92_1564975048006/work/interface_cuda/interface.cpp:946
Traceback (most recent call last):
File "parsing.py", line 651, in <module>
train(args)
File "parsing.py", line 360, in train
loss_arc, loss_type = network.loss(words, chars, postags, heads, types, mask=masks)
File "neuronlp2/models/parsing.py", line 281, in loss
loss_arc = self.treecrf.loss(arc[0], arc[1], heads, mask=mask)
File "neuronlp2/nn/crf.py", line 269, in loss
z = torch.logdet(L)
RuntimeError: CUDA error: an illegal instruction was encountered
I wonder what went wrong.
Looking at the outputs, I found that at the first epoch the Laplacian matrix has the following form:
[ x -1 -1
-1 x -1
-1 -1 x]
where x corresponds to a sum of exponentials. Is this prone to numerical overflow when computing the determinant?
I would appreciate any suggestions.
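For what it's worth, one standard mitigation (a sketch based on the Matrix-Tree theorem, not the repo's TreeCRF code) is to subtract the maximum log-potential before exponentiating, so the Laplacian entries stay bounded. Since every spanning tree has exactly n - 1 edges, the subtracted constant re-enters the log-partition as a linear term:

```python
import torch

def stable_log_partition(scores):
    # scores: [n, n] log-potentials s[h, m] for head h -> modifier m, with
    # node 0 as the root. Illustrative sketch, not the repo's API.
    n = scores.size(0)
    c = scores.max()
    A = torch.exp(scores - c) * (1.0 - torch.eye(n))  # bounded weights, zero diag
    L = torch.diag(A.sum(dim=0)) - A                  # Laplacian (in-degree form)
    # every spanning tree has n - 1 edges, so the shift re-enters linearly
    return torch.logdet(L[1:, 1:]) + (n - 1) * c

torch.manual_seed(0)
s = 50.0 * torch.ones(4, 4) + torch.randn(4, 4)  # naive exp would overflow float32
z = stable_log_partition(s)
```

On scores this large, exponentiating directly would push the determinant past float32 range, while the shifted version stays finite.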
Hello, I saw in #9 that you used data with 4 columns for NER. I am trying to run it on a corpus with 2 columns, like in this pic:
So my data consists of one column with the word and another with the tag only. Is there any way to parameterize the script to support this kind of data, or will I have to adapt the code for my use case? For instance, I would have to change conll03_data
to read tokens[0] instead of tokens[1] as the word, and handle the pos, chunk, and ner alphabets. Anything else I should know?
Thanks.
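For reference, a minimal reader for such a file might look like this (a sketch for a hypothetical two-column "word tag" layout, not the repo's conll03_data API):

```python
def iter_sentences(path):
    # Minimal sketch for a two-column "word tag" file (hypothetical layout);
    # the repo's conll03_data reader expects index/word/pos/chunk/ner columns.
    words, tags = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:                 # blank line = sentence boundary
                if words:
                    yield words, tags
                words, tags = [], []
                continue
            word, tag = line.split()
            words.append(word)
            tags.append(tag)
    if words:                            # file may not end with a blank line
        yield words, tags
```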
Dear Max,
Thank you so much for making your code available.
I am running your stacked pointer network but I cannot find the train/dev/test datasets
CUDA_VISIBLE_DEVICES=0 OMP_NUM_THREADS=4 python -u parsing.py --mode train --config configs/parsing/stackptr.json --num_epochs 600 --batch_size 32 \
--opt adam --learning_rate 0.001 --lr_decay 0.999997 --beta1 0.9 --beta2 0.9 --eps 1e-4 --grad_clip 5.0 \
--loss_type token --warmup_steps 40 --reset 20 --weight_decay 0.0 --unk_replace 0.5 --beam 10 \
--word_embedding sskip --word_path "data/sskip/sskip.eng.100.gz" --char_embedding random \
--punctuation '.' '``' "''" ':' ',' \
--train "data/PTB3.0/PTB3.0-Stanford_dep/ptb3.0-stanford.auto.cpos.train.conll" \
--dev "data/PTB3.0/PTB3.0-Stanford_dep/ptb3.0-stanford.auto.cpos.dev.conll" \
--test "data/PTB3.0/PTB3.0-Stanford_dep/ptb3.0-stanford.auto.cpos.test.conll" \
--model_path "models/parsing/stackptr/"
Could you tell me where to find the dataset? Thank you!
Hi,
For the results in the paper, "Stack-Pointer Networks for Dependency Parsing", which version of the Stanford dependency converter was used to convert PTB 3.0? Was the same version used in all experiments including BIAF: re-impl ?
Thanks, Ian
For the data used for POS tagging and Dependency Parsing, our data format follows the CoNLL-X format. Following is an example:
1 No _ RB RB _ 7 discourse _ _
2 , _ , , _ 7 punct _ _
3 it _ PR PRP _ 7 nsubj _ _
4 was _ VB VBD _ 7 cop _ _
5 n't _ RB RB _ 7 neg _ _
6 Black _ NN NNP _ 7 nn _ _
7 Monday _ NN NNP _ 0 root _ _
8 . _ . . _ 7 punct _ _
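As a sketch, one row of this format can be parsed like so (column names follow the CoNLL-X shared task; this is illustrative, not the repo's reader):

```python
def parse_conllx_row(line):
    # One CoNLL-X row: ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL
    # PHEAD PDEPREL, where '_' marks an empty field. Illustrative only.
    cols = line.split()
    return {
        "id": int(cols[0]),
        "form": cols[1],
        "cpos": cols[3],
        "pos": cols[4],
        "head": int(cols[6]),    # 0 means the artificial root
        "deprel": cols[7],
    }

row = parse_conllx_row("7 Monday _ NN NNP _ 0 root _ _")
```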
For the data used for NER, our data format is similar to that used in the CoNLL 2003 shared task, with a slight difference. An example follows:
1 EU NNP I-NP I-ORG
2 rejects VBZ I-VP O
3 German JJ I-NP I-MISC
4 call NN I-NP O
5 to TO I-VP O
6 boycott VB I-VP O
7 British JJ I-NP I-MISC
8 lamb NN I-NP O
9 . . O O
1 Peter NNP I-NP I-PER
2 Blackburn NNP I-NP I-PER
3 BRUSSELS NNP I-NP I-LOC
4 1996-08-22 CD I-NP O
...
where we add a column at the beginning storing the index of each word.
The original CoNLL-03 data can be downloaded here:
https://github.com/glample/tagger/tree/master/dataset
Make sure to convert the original tagging scheme to the standard BIO (or the more expressive BIOES).
Here is the code I used to convert it to BIO:
def transform(ifile, ofile):
    # Convert CoNLL-03 IOB1 tags to standard BIO: the first token of an
    # entity gets a 'B-' prefix whenever it follows 'O' or a different type.
    with open(ifile, 'r') as reader, open(ofile, 'w') as writer:
        prev = 'O'
        for line in reader:
            line = line.strip()
            if len(line) == 0:
                # blank line: sentence boundary, reset the previous tag
                prev = 'O'
                writer.write('\n')
                continue
            tokens = line.split()
            label = tokens[-1]
            if label != 'O' and label != prev:
                if prev == 'O' or label[2:] != prev[2:]:
                    label = 'B-' + label[2:]
            writer.write(" ".join(tokens[:-1]) + " " + label + '\n')
            prev = tokens[-1]
In neuronlp2/models/parsing.py, BiRecurrentConvBiAffine uses Dropout2d as its dropout layer. Could you please explain why you chose that?
Say the output is [32, 35, 512]: you first transpose(1, 2), then apply the dropout, and then transpose back. Could you tell me why you transpose before the dropout?
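To illustrate the effect being asked about (a sketch, not the repo's exact code): with the feature axis moved to the channel position, Dropout2d zeroes an entire feature across all timesteps, rather than independent entries as nn.Dropout would.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Encoder output: [batch, seq_len, features]. Transposing makes the feature
# axis the channel axis, so Dropout2d drops a whole feature at every timestep
# (variational-style) instead of independent positions like nn.Dropout.
# Unsqueezing a dummy spatial dim keeps the documented 4-D [N, C, H, W] input.
x = torch.ones(4, 5, 8)
drop = nn.Dropout2d(p=0.5)
drop.train()
y = drop(x.transpose(1, 2).unsqueeze(-1)).squeeze(-1).transpose(1, 2)
# For each (sample, feature) pair, the column over timesteps is either
# all zeros or all scaled by 1 / (1 - p).
```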
Hey, just wondering if there is any documentation on running this?
Hi, @XuezheMax
I have noticed that the loss function in NER.py takes a parameter named 'leading_symbolic' (value = 1).
In the source code of the loss function, you use this parameter as follows:
_, preds = torch.max(output[:, :, leading_symbolic:], dim=2)
preds += leading_symbolic
Could you explain why you slice the output of the network in this way?
Looking forward to your reply!
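To illustrate what that slicing does (a toy example with made-up scores, not the repo's data): if index 0 is a reserved symbol such as PAD, removing it before the argmax guarantees it is never predicted, and adding the offset back maps the winners to the full alphabet's indices.

```python
import torch

# Hypothetical toy scores: label index 0 is a reserved symbol (e.g. PAD)
# that the tagger must never predict.
leading_symbolic = 1
output = torch.tensor([[[9.0, 1.0, 2.0],    # PAD scores highest here...
                        [0.0, 3.0, 5.0]]])  # [batch=1, seq=2, num_labels=3]
# Slice the reserved symbols off before argmax, then shift the winning
# indices back into the full alphabet's index space.
_, preds = torch.max(output[:, :, leading_symbolic:], dim=2)
preds += leading_symbolic
# preds == [[2, 2]]: PAD was excluded even though it scored highest
```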