luheng / deep_srl


Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next

License: Apache License 2.0

Python 90.18% Shell 9.82%

nlp theano srl tagging deep-learning lstm


deep_srl's Issues

error

Hi, thanks for your work. In practice I ran into some problems; please help me.

Thanks a lot!

constraints

In the paper, you mentioned the constraints were used. In the code, where can I find them?
Thanks!

Model taking more than 4 hours to run

I am getting the following output on an Ubuntu machine:

Model building and loading duration was 0:00:28.

Running model duration was 3:24:52.
Task: srl
Allow new words in test data: True
Embedding size=100
Read 295330 sentences.
Data loading duration was 0:00:30.
Using 2 feature types, projected output dim=200.
('lstm_0_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa7c08d110>
('lstm_1_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa7c08d7d0>
('lstm_2_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68df910>
('lstm_3_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68df350>
('lstm_4_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68dfcd0>
('lstm_5_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68f7c50>
('lstm_6_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68f76d0>
('lstm_7_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68f7b90>
46989 32054
46989 2
Loaded model from: /home/ubuntu/deep_srl-master/resources/conll05_ensemble/model1.npz
Model building and loading duration was 0:00:03.

A single model takes about 4 hours to run. Any idea what to do?

Originally posted by @Nasidiqi in #21 (comment)

CoNLL 2012 Data

@luheng
Did you use the complete OntoNotes 5.0 dataset or the CoNLL 2012 train/dev/test split?

Which treebank version

Dear Lvheng,
Which treebank version did you use when running the fetch_and_make_conll05_data.sh script?
I tried treebank2, but it does not seem to be the right one.

ImportError: cannot import name inplace_increment

(.venv) ub16hp@UB16HP:/ub16_prj/deep_srl$ bash scripts/run_train.sh 0
Traceback (most recent call last):
  File "python/train.py", line 4, in <module>
    from neural_srl.theano.tagger import BiLSTMTaggerModel
  File "/home/ub16hp/ub16_prj/deep_srl/python/neural_srl/theano/tagger.py", line 1, in <module>
    from optimizer import *
  File "/home/ub16hp/ub16_prj/deep_srl/python/neural_srl/theano/optimizer.py", line 2, in <module>
    import theano
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/__init__.py", line 80, in <module>
    from theano.scan_module import (scan, map, reduce, foldl, foldr, clone,
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/scan_module/__init__.py", line 41, in <module>
    from theano.scan_module import scan_opt
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/scan_module/scan_opt.py", line 60, in <module>
    from theano import tensor, scalar
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/tensor/__init__.py", line 9, in <module>
    from theano.tensor.subtensor import *
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/tensor/subtensor.py", line 27, in <module>
    from cutils_ext.cutils_ext import inplace_increment
ImportError: cannot import name inplace_increment
(.venv) ub16hp@UB16HP:/ub16_prj/deep_srl$

Cannot run deep_srl

I was following the instructions in the README file. However, when I try to run the interactive console with python python/interactive.py --model conll05_model/ --pidmodel conll05_propid_model, I get the following error:

Embedding size=100
Using 1 feature types, projected output dim=100.
('lstm_0_rdrop', 0.1, True)
Traceback (most recent call last):
  File "python/interactive.py", line 61, in <module>
    pid_model, pid_data = load_model(args.pidmodel, 'propid')
  File "python/interactive.py", line 42, in load_model
    model = BiLSTMTaggerModel(data, config=config, fast_predict=True)
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/tagger.py", line 41, in __init__
    prefix='lstm_{}'.format(l))
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 197, in __init__
    self._init_dropout_layers(input_dropout_prob, recurrent_dropout_prob)
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 141, in _init_dropout_layers
    prefix='{}_rdrop'.format(self.prefix))
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 402, in __init__
    self.rng = MRG_RandomStreams(seed=RANDOM_SEED, use_cuda=True)
TypeError: __init__() got an unexpected keyword argument 'use_cuda'

This is probably some Theano issue (although I already installed it as suggested in your tutorial).

Any idea of how this can be fixed?
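One possible workaround, sketched under the assumption that the installed Theano release simply no longer accepts the `use_cuda` keyword: retry the constructor without the rejected argument. The helper name `construct_with_fallback` and the stand-in class `FakeStreams` are hypothetical and only illustrate the pattern; in layer.py the class in question would be `MRG_RandomStreams`.

```python
def construct_with_fallback(cls, optional_kwargs, **kwargs):
    """Try cls(**kwargs + optional_kwargs); on TypeError retry with kwargs only."""
    merged = dict(kwargs)
    merged.update(optional_kwargs)
    try:
        return cls(**merged)
    except TypeError:
        # The constructor rejected at least one optional keyword, so drop them all.
        return cls(**kwargs)


class FakeStreams(object):
    """Hypothetical stand-in for a random-stream class that only takes `seed`."""

    def __init__(self, seed=12345):
        self.seed = seed


# The optional keyword is rejected, so the helper falls back to seed only.
rng = construct_with_fallback(FakeStreams, {'use_cuda': True}, seed=42)
print(rng.seed)
```

Note that this pattern also hides a TypeError raised for any other reason inside the constructor, so pinning the Theano version that the README asks for is probably the cleaner fix.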

Error when running run_end2end.sh

Traceback (most recent call last):
  File "python/predict.py", line 161, in <module>
    args.input)
  File "python/predict.py", line 69, in get_scores
    model = BiLSTMTaggerModel(data, config=config, fast_predict=True)
  File "/home/mario/git/deep_srl/python/neural_srl/theano/tagger.py", line 66, in __init__
    self.is_train)
  File "/home/mario/git/deep_srl/python/neural_srl/theano/layer.py", line 221, in connect
    return LSTMLayer.connect(self, inputs, mask, is_train)
  File "/home/mario/git/deep_srl/python/neural_srl/theano/layer.py", line 178, in connect
    n_steps=max_length) # scan steps
  File "/home/mario/.local/lib/python2.7/site-packages/theano/scan_module/scan.py", line 1048, in scan
    local_op = scan_op.Scan(inner_inputs, new_outs, info)
  File "/home/mario/.local/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 216, in __init__
    [])
  File "/home/mario/.local/lib/python2.7/site-packages/theano/gof/cc.py", line 1300, in cmodule_key_variables
    c_compiler)
  File "/home/mario/.local/lib/python2.7/site-packages/theano/gof/cc.py", line 1350, in cmodule_key_
    np.core.multiarray._get_ndarray_c_version())
AttributeError: 'module' object has no attribute '_get_ndarray_c_version'

I get the error above when trying to run run_end2end.sh.

About Training Data

Hi,
Can we create our own training data?
Where in the code is the training data passed in?
Can I look at the dataset?
Are only pre-trained models provided here?
I want to add some more data to the existing data and train a new model.
Please advise.

Can't run run_end2end.sh. Error in description

Using single model.
Task: propid
Allow new words in test data: True
Embedding size=100
Read 0 sentences.
Data loading duration was 0:00:15.
Traceback (most recent call last):
  File "python/predict.py", line 161, in <module>
    args.input)
  File "python/predict.py", line 66, in get_scores
    test_data = data.get_test_data(test_sentences, batch_size=config.dev_batch_size)
  File "/Users/maazfarooqi/Downloads/deep_srl-master/python/neural_srl/shared/tagger_data.py", line 75, in get_test_data
    max_len = max([len(s[0]) for s in test_sentences])
ValueError: max() arg is an empty sequence
Using an ensemble of 5 models
Task: srl
Allow new words in test data: True
Data loading duration was 0:00:00.
Traceback (most recent call last):
  File "python/predict.py", line 161, in <module>
    args.input)
  File "python/predict.py", line 49, in get_scores
    allow_new_words)
  File "/Users/maazfarooqi/Downloads/deep_srl-master/python/neural_srl/shared/reader.py", line 236, in get_srl_test_data
    samples = get_srl_sentences(filepath, config.use_se_marker)
  File "/Users/maazfarooqi/Downloads/deep_srl-master/python/neural_srl/shared/reader.py", line 48, in get_srl_sentences
    predicate = int(lefthand_input[0])
ValueError: invalid literal for int() with base 10: '{\\rtf1\\ansi\\ansicpg1252\\cocoartf1671'
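The `{\rtf1\ansi...` string in the last error suggests that the input file was saved as Rich Text (for example by a Mac text editor) rather than as plain text, which would also explain why no sentences were read. That is a guess from the log, not something confirmed by the authors. A minimal check, with the hypothetical helper name `looks_like_rtf`, could look like this:

```python
import io
import os
import tempfile


def looks_like_rtf(path):
    """Return True if the file starts with the five-byte RTF signature."""
    with io.open(path, 'rb') as f:
        return f.read(5) == b'{\\rtf'


# Demonstration on two temporary files (contents are made up for the example).
tmp_dir = tempfile.mkdtemp()
rtf_path = os.path.join(tmp_dir, 'saved_as_rich_text.txt')
txt_path = os.path.join(tmp_dir, 'plain.txt')
with io.open(rtf_path, 'wb') as f:
    f.write(b'{\\rtf1\\ansi\\ansicpg1252\\cocoartf1671 ...}')
with io.open(txt_path, 'wb') as f:
    f.write(b'The cat loves hats .\n')

print(looks_like_rtf(rtf_path))
print(looks_like_rtf(txt_path))
```

If the check returns True for your input, re-saving the file as plain text before running the script is worth trying.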

I can not extract *.tar.gz file

This is the problem I ran into:
tar -zxvf conll2012_model.tar.gz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

This problem really confuses me. I would appreciate it if someone could help.
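The message "not in gzip format" usually means the downloaded file is not gzip data at all, for instance an HTML page saved in place of the archive. A gzip stream always starts with the two bytes 0x1f 0x8b, so the download can be checked before extraction. The sketch below uses a hypothetical helper name, `is_gzip_file`, and made-up demo files:

```python
import gzip
import io
import os
import tempfile

GZIP_MAGIC = b'\x1f\x8b'


def is_gzip_file(path):
    """Return True if the file begins with the two-byte gzip magic number."""
    with io.open(path, 'rb') as f:
        return f.read(2) == GZIP_MAGIC


# Demonstration: one real gzip file and one HTML page with a .tar.gz name.
tmp_dir = tempfile.mkdtemp()
good = os.path.join(tmp_dir, 'good.tar.gz')
bad = os.path.join(tmp_dir, 'bad.tar.gz')
with gzip.open(good, 'wb') as f:
    f.write(b'payload')
with io.open(bad, 'wb') as f:
    f.write(b'<!DOCTYPE html><html>...</html>')

print(is_gzip_file(good))
print(is_gzip_file(bad))
```

If the check fails for conll2012_model.tar.gz, opening the file in a text editor may show what was actually downloaded.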

sudo apt-get install tcsh

I think “sudo apt-get install tsch” should be “sudo apt-get install tcsh”

Error while running endtoend.sh, and it takes too long on a big dataset

I am getting the errors below, and the model takes approximately 4 hours to complete one iteration.

RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
Traceback (most recent call last):
  File "python/predict.py", line 20, in <module>
    from neural_srl.theano.tagger import BiLSTMTaggerModel
  File "/home/ubuntu/deep_srl-master/python/neural_srl/theano/tagger.py", line 1, in <module>
    from optimizer import *
  File "/home/ubuntu/deep_srl-master/python/neural_srl/theano/optimizer.py", line 2, in <module>
    import theano
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/__init__.py", line 80, in <module>
    from theano.scan_module import (scan, map, reduce, foldl, foldr, clone,
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/scan_module/__init__.py", line 41, in <module>
    from theano.scan_module import scan_opt
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/scan_module/scan_opt.py", line 60, in <module>
    from theano import tensor, scalar
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/__init__.py", line 9, in <module>
    from theano.tensor.subtensor import *
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/subtensor.py", line 26, in <module>
    import theano.gof.cutils # needed to import cutils_ext
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cutils.py", line 320, in <module>
    compile_cutils()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cutils.py", line 285, in compile_cutils
    preargs=args)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cmodule.py", line 2325, in compile_str
    return dlimport(lib_filename)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cmodule.py", line 302, in dlimport
    rval = __import__(module_name, {}, {}, [module_name])
ImportError: numpy.core.multiarray failed to import
Using an ensemble of 5 models
Task: srl
Allow new words in test data: True
Embedding size=100
Read 295330 sentences.
Data loading duration was 0:00:28.
Using 2 feature types, projected output dim=200.
('lstm_0_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa8ec34110>
('lstm_1_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faadb29fb10>
('lstm_2_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faadb29ffd0>
('lstm_3_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faab2ea9a50>
('lstm_4_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faab2ea9f10>
('lstm_5_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faab2ea9290>
('lstm_6_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa8ebd4650>
('lstm_7_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa8ebd4b10>
47031 32056
47031 2

AssertionError when processing CoNLL-2012 data

Hi,

When running the pre-processing for the OntoNotes CoNLL-2012 data, I run into an AssertionError at line 64 of process_conll2012.py:
assert tags[t][props[t]] in {"B-V", "B-I"}
It is triggered via the if-statements at line 149. I believe it happens when the tag passed in includes a nested verb label, such as the one below, where the word is labeled as a verb and is also the first word of a continued ARG2 span.

Any suggestions on how to handle this?

AssertionError when running predict.py

Error message:
in File "/root/test-deepsrl/python/neural_srl/shared/conll_utils.py", line 29, in print_sentence_to_conll
assert len(label_column) == len(tokens)

model: conll_2012_model
input file: see attachment.
temp.txt

BUG in interactive.py

Thank you for your great work.
I found some problems in interactive.py:

    s0 = string_sequence_to_ids(tokenized_sent, pid_data.word_dict, True)
    l0 = [0 for _ in s0]
    x, _, _, weights = pid_data.get_test_data([(s0, l0)], batch_size=None)
    pid_pred, scores0 = pid_pred_function(x, weights)

    s1 = []
    predicates = []
    for i,p in enumerate(pid_pred[0]):
      if pid_data.label_dict.idx2str[p] == 'V':
        #print 'Predicate:', tokenized_sent[i]
        predicates.append(i)
        feats = [1 if j == i else 0 for j in range(num_tokens)]
        s1.append((s0, feats, l0))

    if len(s1) == 0:
      continue

    # Semantic role labeling.
    x, _, _, weights = srl_data.get_test_data(s1, batch_size=None)
    srl_pred, scores = srl_pred_function(x, weights)

I think it is wrong to pass s1 to srl_data.get_test_data(), because the dictionaries pid_data.word_dict and srl_data.word_dict are different (compare with predicate.py).
The input should be something like:

s1 = string_sequence_to_ids(tokenized_sent, srl_data.word_dict, True)
r1 = []
...
r1.append((s1, feats, l0))
x, _, _, weights = srl_data.get_test_data(r1, batch_size=None)
srl_pred, scores = srl_pred_function(x, weights)
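To illustrate why the choice of dictionary matters, here is a toy sketch. The two vocabularies are invented stand-ins for pid_data.word_dict and srl_data.word_dict; whether the real dictionaries actually differ is the claim made in this report, not something the sketch proves.

```python
def to_ids(tokens, word_dict, unk_id=0):
    """Map tokens to integer ids using one specific vocabulary."""
    return [word_dict.get(tok, unk_id) for tok in tokens]


# Two toy vocabularies that assign different ids to the same words.
pid_vocab = {'<unk>': 0, 'the': 1, 'cat': 2, 'loves': 3, 'hats': 4}
srl_vocab = {'<unk>': 0, 'hats': 1, 'loves': 2, 'the': 3, 'cat': 4}

tokens = ['the', 'cat', 'loves', 'hats']
ids_for_pid = to_ids(tokens, pid_vocab)
ids_for_srl = to_ids(tokens, srl_vocab)

# Reading the predicate-model ids through the SRL vocabulary gives other words.
srl_id_to_word = {i: w for w, i in srl_vocab.items()}
misread = [srl_id_to_word[i] for i in ids_for_pid]

print(ids_for_pid)  # [1, 2, 3, 4]
print(ids_for_srl)  # [3, 4, 2, 1]
print(misread)      # ['hats', 'loves', 'the', 'cat']
```

With ids built from the wrong vocabulary, the SRL model would look up embeddings for a different sentence than the one typed in, which is why each model needs ids from its own dictionary.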

Hardcoded path :(

I don't think you should hardcode the path (for example, in run_end2end.sh). It makes the code confusing to use in different environments. Many people already have this path set in their bashrc or in some other path configuration, and overriding it with a hardcoded value can cause problems.

Cannot install deep_srl

Hello! I am doing doctoral research on the use of SRL for critical discourse analysis. Currently I am using an old Python application, practNLPtools, which works fine but is outdated. I read your paper and liked the idea of a Theano-based SRL system, but I was unable to run your code. I have cloned the repository and downloaded what I believe are the necessary dependencies.
I have checked the cloned repo, and all files are there. I am probably making a mistake when configuring files, since I am working by trial and error. Could you give me some hints?
Thank you in advance for your attention.

issueSRL.txt

About the CoNLL-2012 data

Dear author:
I have some questions about the data. The CoNLL-2012 data downloaded from the website you pointed to (the V12 release) could not be processed by its own processing scripts because of missing files, like this:

I have downloaded the data several times, with the same result. Could you please help me verify the data? Thanks very much!

Training does not start

Hi, Luheng:
Thanks for your great work! I encountered some strange behavior during training. I used the following command to start training your model:
python python/train.py --config=./config/srl_config.json --model=./output --train=./sample_data/sentences_with_gold.txt --dev=./sample_data/sentences_with_gold.txt --task=srl

And I got these outputs in the terminal:

/scratch/users/duxi/miniconda3/envs/deep_srl/lib/python2.7/site-packages/theano/gpuarray/dnn.py:135: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to version 5.1.
warnings.warn("Your cuDNN version is more recent than "
Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX TITAN X (0000:04:00.0)
Task: srl
Embedding size=100
Extracting features
Extraced 19 words and 9 tags
Max training sentence length: 9
Max development sentence length: 9
Warning: not using official gold predicates. Not for formal evaluation.
Dev data has 1 batches.
Data loading duration was 0:00:14.
[WARNING] Log directory ./output is not empty, previous checkpoints might be overwritten
Preparation duration was 0:00:00.
Using 2 feature types, projected output dim=200.
('lstm_0_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe5782bab50>
('lstm_1_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe5782620d0>
('lstm_2_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe56bf82c10>
('lstm_3_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe570087f90>
('lstm_4_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe5781912d0>
('lstm_5_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe570090f90>
('lstm_6_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe57809b590>
('lstm_7_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe578203f90>
embedding_0 embedding_0 [ 19 100]
embedding_1 embedding_1 [ 2 100]
lstm_0_W lstm_0_W [ 200 1800]
lstm_0_U lstm_0_U [ 300 1500]
lstm_0_b lstm_0_b [1800]
lstm_1_W lstm_1_W [ 300 1800]
lstm_1_U lstm_1_U [ 300 1500]
lstm_1_b lstm_1_b [1800]
lstm_2_W lstm_2_W [ 300 1800]
lstm_2_U lstm_2_U [ 300 1500]
lstm_2_b lstm_2_b [1800]
lstm_3_W lstm_3_W [ 300 1800]
lstm_3_U lstm_3_U [ 300 1500]
lstm_3_b lstm_3_b [1800]
lstm_4_W lstm_4_W [ 300 1800]
lstm_4_U lstm_4_U [ 300 1500]
lstm_4_b lstm_4_b [1800]
lstm_5_W lstm_5_W [ 300 1800]
lstm_5_U lstm_5_U [ 300 1500]
lstm_5_b lstm_5_b [1800]
lstm_6_W lstm_6_W [ 300 1800]
lstm_6_U lstm_6_U [ 300 1500]
lstm_6_b lstm_6_b [1800]
lstm_7_W lstm_7_W [ 300 1800]
lstm_7_U lstm_7_U [ 300 1500]
lstm_7_b lstm_7_b [1800]
softmax_W softmax_W [300 9]
softmax_b softmax_b [9]

After this output, nothing more is printed to the terminal, and the file "./output/checkpoints.tsv" remains empty even after training has been running for a long time. Training does not seem to make any progress at all. I am not sure whether this is a GPU-specific issue: I am using cuda8.0 + cudnn8.0, and here is my Theano configuration file:

[global]
device = cuda
floatX = float64
mode = FAST_RUN

[cuda]
root=/usr/local/cuda-8.0/

[dnn]
enable=True
include_path=/usr/local/cuda-8.0/include
library_path=/usr/local/cuda-8.0/lib64

[lib]
cnmem = 0.8

[nvcc]
fastmath = True

[gcc]
cxxflags=-Wno-narrowing

Could you give me any ideas about the potential cause?

make_conll2012_data.sh issue

The script make_conll2012_data.sh does not run successfully; I think something is missing. It should look like this:
line 10: python ../preprocess/process_conll2012.py \
line 17: python ../preprocess/process_conll2012.py \
line 24: python ../preprocess/process_conll2012.py \

Generating BIO data

Hello, how can I generate data in BIO format?
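As a rough illustration of what BIO tagging means (this is not the repository's preprocessing script), the sketch below turns labeled argument spans into one tag per token: the first token of a span gets `B-<label>`, the following tokens get `I-<label>`, and everything else gets `O`. The function name `spans_to_bio` and the span representation are assumptions made for this example.

```python
def spans_to_bio(num_tokens, spans):
    """Convert labeled spans to BIO tags.

    spans: iterable of (start, end, label) with inclusive token indices.
    Tokens outside every span get 'O'. Spans are assumed not to overlap.
    """
    tags = ['O'] * num_tokens
    for start, end, label in spans:
        tags[start] = 'B-' + label
        for i in range(start + 1, end + 1):
            tags[i] = 'I-' + label
    return tags


tokens = ['The', 'cat', 'love', 'hats', '.']
spans = [(0, 1, 'A0'), (2, 2, 'V'), (3, 3, 'A1')]
tags = spans_to_bio(len(tokens), spans)
print(' '.join(tags))  # B-A0 I-A0 B-V B-A1 O
```

For the CoNLL data themselves, the scripts under the repository's preprocessing directory are the place to look.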

About the sample data

1. Why are the files sentences_with_predicates.txt and sentences_without_predicates.txt the same?
2. What does the word 'gold' mean (as in data/srl/conll05.devel.props.gold.txt)? Does it refer to predicates?

ValueError

x, y, _, weights = batched_tensor
Can anyone help with the issue above? Thanks!

The dataset format confuses me

I would appreciate it if you could give a full example of the dataset format.
You said the training data format looks like this:

2 The cat love hats . ||| B-A0 I-A0 B-V B-A1 O

Are the DEV_PATH data and the GOLD_PATH data in the same format?

And what is the GOLD_PATH data used for?
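Reading only the example line above, the format appears to be: a predicate index, the tokens, the separator `|||`, then one BIO label per token. The index looks zero-based, since index 2 points at "love", which carries the `B-V` label. A small parser written under those assumptions (the name `parse_srl_line` is hypothetical and this is not the repository's reader):

```python
def parse_srl_line(line):
    """Split 'pred_index tok ... ||| label ...' into index, tokens, labels."""
    left, right = line.strip().split('|||')
    left_fields = left.split()
    predicate_index = int(left_fields[0])
    tokens = left_fields[1:]
    labels = right.split()
    # Every token needs exactly one label.
    if len(tokens) != len(labels):
        raise ValueError('%d tokens but %d labels' % (len(tokens), len(labels)))
    if not 0 <= predicate_index < len(tokens):
        raise ValueError('predicate index %d out of range' % predicate_index)
    return predicate_index, tokens, labels


idx, tokens, labels = parse_srl_line(
    '2 The cat love hats . ||| B-A0 I-A0 B-V B-A1 O')
print('%d %s %s' % (idx, tokens[idx], labels[idx]))  # 2 love B-V
```

Running a check like this over a custom training file catches lines where the number of labels and tokens disagree before training starts.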

Random Seed

Hello,

First, thank you very much for this contribution!

However, when I train the model from scratch, I get lower performance (about 1 point of F-score).
I am able to download your model and reproduce the results in your paper, but I am not able to reach them by retraining.

I believe this may be due to a difference in the random seed, a different hyper-parameter configuration, or a difference in software/hardware.

Are the configuration and random seed provided in this repo the same as those used in the paper?

Kind regards,
Gabriel M

Running a single model or the end-to-end ensemble

Hi All,

Does it affect the results whether we run a single model or the end-to-end ensemble on a dataset?

Thanks
Nadia

AssertionError

I am getting this error. Any idea how to fix it?

Traceback (most recent call last):
  File "python/predict.py", line 227, in <module>
    evaluator.evaluate(predictions)
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/evaluation.py", line 26, in evaluate
    self.compute_accuracy(predictions)
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/evaluation.py", line 97, in compute_accuracy
    print_to_conll(self.pred_labels, self.pred_props_file, temp_output)
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/conll_utils.py", line 45, in print_to_conll
    print_sentence_to_conll(fout, tokens_buf, pred_labels[seq_ptr:seq_ptr+num_props_for_sentence])
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/conll_utils.py", line 26, in print_sentence_to_conll
    assert len(label_column) == len(tokens)
AssertionError