
luheng / deep_srl


Code and pre-trained model for: Deep Semantic Role Labeling: What Works and What's Next

License: Apache License 2.0

Python 90.18% Shell 9.82%
nlp theano srl tagging deep-learning lstm

deep_srl's People

Contributors

luheng

deep_srl's Issues

error

Hi, thanks for your work. In practice, I ran into a problem; please help me.

image

Thanks a lot!

constraints

In the paper, you mention that constraints were used. Where can I find them in the code?
Thanks!

Model taking more than 4 hours to run

I am receiving this error on an Ubuntu machine:

Model building and loading duration was 0:00:28.

Running model duration was 3:24:52.
Task: srl
Allow new words in test data: True
Embedding size=100
Read 295330 sentences.
Data loading duration was 0:00:30.
Using 2 feature types, projected output dim=200.
('lstm_0_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa7c08d110>
('lstm_1_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa7c08d7d0>
('lstm_2_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68df910>
('lstm_3_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68df350>
('lstm_4_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68dfcd0>
('lstm_5_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68f7c50>
('lstm_6_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68f76d0>
('lstm_7_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faac68f7b90>
46989 32054
46989 2
Loaded model from: /home/ubuntu/deep_srl-master/resources/conll05_ensemble/model1.npz
Model building and loading duration was 0:00:03.

One model takes about 4 hours to run. Any idea what to do?

Originally posted by @Nasidiqi in #21 (comment)

Which treebank version

Hi Dear Lvheng,
Which treebank version did you use when running the fetch_and_make_conll05_data.sh script?
I tried to use treebank2, but it does not seem to be correct.

ImportError: cannot import name inplace_increment

(.venv) ub16hp@UB16HP:~/ub16_prj/deep_srl$ bash scripts/run_train.sh 0
Traceback (most recent call last):
  File "python/train.py", line 4, in <module>
    from neural_srl.theano.tagger import BiLSTMTaggerModel
  File "/home/ub16hp/ub16_prj/deep_srl/python/neural_srl/theano/tagger.py", line 1, in <module>
    from optimizer import *
  File "/home/ub16hp/ub16_prj/deep_srl/python/neural_srl/theano/optimizer.py", line 2, in <module>
    import theano
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/__init__.py", line 80, in <module>
    from theano.scan_module import (scan, map, reduce, foldl, foldr, clone,
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/scan_module/__init__.py", line 41, in <module>
    from theano.scan_module import scan_opt
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/scan_module/scan_opt.py", line 60, in <module>
    from theano import tensor, scalar
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/tensor/__init__.py", line 9, in <module>
    from theano.tensor.subtensor import *
  File "/home/ub16hp/ub16_prj/deep_srl/.venv/local/lib/python2.7/site-packages/theano/tensor/subtensor.py", line 27, in <module>
    from cutils_ext.cutils_ext import inplace_increment
ImportError: cannot import name inplace_increment
(.venv) ub16hp@UB16HP:~/ub16_prj/deep_srl$

Cannot run deep_srl

I was following the instructions in the README file. However, when I try to run the interactive console with python python/interactive.py --model conll05_model/ --pidmodel conll05_propid_model, I get the following error:

Embedding size=100
Using 1 feature types, projected output dim=100.
('lstm_0_rdrop', 0.1, True)
Traceback (most recent call last):
  File "python/interactive.py", line 61, in <module>
    pid_model, pid_data = load_model(args.pidmodel, 'propid')
  File "python/interactive.py", line 42, in load_model
    model = BiLSTMTaggerModel(data, config=config, fast_predict=True)
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/tagger.py", line 41, in __init__
    prefix='lstm_{}'.format(l))
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 197, in __init__
    self._init_dropout_layers(input_dropout_prob, recurrent_dropout_prob)
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 141, in _init_dropout_layers
    prefix='{}_rdrop'.format(self.prefix))
  File "/home/gkiril/Documents/workspace/deep_srl/deep_srl/python/neural_srl/theano/layer.py", line 402, in __init__
    self.rng = MRG_RandomStreams(seed=RANDOM_SEED, use_cuda=True)
TypeError: __init__() got an unexpected keyword argument 'use_cuda'

This is probably some Theano issue (although I already installed it as suggested in your tutorial).

Any idea of how this can be fixed?
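
The TypeError above indicates that the installed Theano's MRG_RandomStreams constructor does not declare a use_cuda argument. One possible workaround, sketched here under assumptions, is to pass only the keyword arguments that the installed constructor actually declares. FakeRandomStreams is a hypothetical stand-in for such a class, and construct_with_supported_kwargs is a helper name invented for this illustration. The sketch is written for Python 3 (inspect.signature), whereas the repository targets Python 2.7.

```python
import inspect


def construct_with_supported_kwargs(cls, **kwargs):
    """Instantiate cls, dropping keyword arguments its __init__ does not declare."""
    params = inspect.signature(cls.__init__).parameters
    # If __init__ takes **kwargs, everything can be passed through unchanged.
    accepts_any = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_any:
        supported = kwargs
    else:
        supported = {k: v for k, v in kwargs.items() if k in params}
    return cls(**supported)


class FakeRandomStreams(object):
    """Hypothetical stand-in for a random-stream class without `use_cuda`."""

    def __init__(self, seed=12345):
        self.seed = seed


# `use_cuda` is silently dropped because FakeRandomStreams does not declare it.
rng = construct_with_supported_kwargs(FakeRandomStreams, seed=42, use_cuda=True)
print(rng.seed)
```

Dropping an argument changes behavior silently, so pinning the Theano version that the README recommends may be the safer route where that is possible.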

Error running run_end2end.sh

Traceback (most recent call last):
  File "python/predict.py", line 161, in <module>
    args.input)
  File "python/predict.py", line 69, in get_scores
    model = BiLSTMTaggerModel(data, config=config, fast_predict=True)
  File "/home/mario/git/deep_srl/python/neural_srl/theano/tagger.py", line 66, in __init__
    self.is_train)
  File "/home/mario/git/deep_srl/python/neural_srl/theano/layer.py", line 221, in connect
    return LSTMLayer.connect(self, inputs, mask, is_train)
  File "/home/mario/git/deep_srl/python/neural_srl/theano/layer.py", line 178, in connect
    n_steps=max_length) # scan steps
  File "/home/mario/.local/lib/python2.7/site-packages/theano/scan_module/scan.py", line 1048, in scan
    local_op = scan_op.Scan(inner_inputs, new_outs, info)
  File "/home/mario/.local/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 216, in __init__
    [])
  File "/home/mario/.local/lib/python2.7/site-packages/theano/gof/cc.py", line 1300, in cmodule_key_variables
    c_compiler)
  File "/home/mario/.local/lib/python2.7/site-packages/theano/gof/cc.py", line 1350, in cmodule_key_
    np.core.multiarray._get_ndarray_c_version())
AttributeError: 'module' object has no attribute '_get_ndarray_c_version'

I get the error above when trying to run run_end2end.sh.

About Training Data

Hi,
Can we create our own training data?
Where in the code is the training data supplied?
Can I look at the dataset?
Are only pre-trained models provided here?
I want to add more data to the existing data and train a model.
Please advise.

Can't run run_end2end.sh (error in description)

Using single model.
Task: propid
Allow new words in test data: True
Embedding size=100
Read 0 sentences.
Data loading duration was 0:00:15.
Traceback (most recent call last):
  File "python/predict.py", line 161, in <module>
    args.input)
  File "python/predict.py", line 66, in get_scores
    test_data = data.get_test_data(test_sentences, batch_size=config.dev_batch_size)
  File "/Users/maazfarooqi/Downloads/deep_srl-master/python/neural_srl/shared/tagger_data.py", line 75, in get_test_data
    max_len = max([len(s[0]) for s in test_sentences])
ValueError: max() arg is an empty sequence
Using an ensemble of 5 models
Task: srl
Allow new words in test data: True
Data loading duration was 0:00:00.
Traceback (most recent call last):
  File "python/predict.py", line 161, in <module>
    args.input)
  File "python/predict.py", line 49, in get_scores
    allow_new_words)
  File "/Users/maazfarooqi/Downloads/deep_srl-master/python/neural_srl/shared/reader.py", line 236, in get_srl_test_data
    samples = get_srl_sentences(filepath, config.use_se_marker)
  File "/Users/maazfarooqi/Downloads/deep_srl-master/python/neural_srl/shared/reader.py", line 48, in get_srl_sentences
    predicate = int(lefthand_input[0])
ValueError: invalid literal for int() with base 10: '{\\rtf1\\ansi\\ansicpg1252\\cocoartf1671'
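
The last ValueError shows that the first token read from the input was an RTF header ('{\rtf1\ansi...'), which suggests the input file was saved as Rich Text rather than plain text. A minimal sketch of a pre-flight check follows. The function name check_plain_text_input and the sample sentence are made up for this illustration and are not part of deep_srl.

```python
import os
import tempfile


def check_plain_text_input(path):
    """Return a list of problems found in an input file (empty list if none)."""
    problems = []
    with open(path, "rb") as f:
        data = f.read()
    # RTF documents begin with the five bytes `{\rtf`.
    if data[:5] == b"{\\rtf":
        problems.append("file starts with an RTF header; save it as plain text")
    if not any(line.strip() for line in data.splitlines()):
        problems.append("file contains no non-empty lines")
    return problems


# Usage with two temporary files: one RTF, one plain text.
with tempfile.TemporaryDirectory() as tmp:
    rtf_path = os.path.join(tmp, "input.rtf")
    txt_path = os.path.join(tmp, "input.txt")
    with open(rtf_path, "wb") as f:
        f.write(b"{\\rtf1\\ansi John told Pat to cut off the tree .}")
    with open(txt_path, "wb") as f:
        f.write(b"John told Pat to cut off the tree .\n")
    rtf_problems = check_plain_text_input(rtf_path)
    txt_problems = check_plain_text_input(txt_path)

print(rtf_problems)
print(txt_problems)
```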

I can not extract *.tar.gz file

This is the problem I encountered:
tar -zxvf conll2012_model.tar.gz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

This problem really confuses me. I would appreciate it if someone could help me.
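
A gzip file always starts with the two bytes 0x1f 0x8b, so "not in gzip format" means the downloaded file begins with something else. One possible cause (an assumption, not confirmed in this thread) is that the download saved an HTML page instead of the archive. The sketch below inspects the first bytes of a file; describe_download is a helper name invented for this illustration.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"


def describe_download(path):
    """Guess what a downloaded file is from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(512)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"


# Usage: a real gzip file versus an HTML page saved under a .tar.gz name.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "model.tar.gz")
    bad = os.path.join(tmp, "error_page.tar.gz")
    with gzip.open(good, "wb") as f:
        f.write(b"payload")
    with open(bad, "wb") as f:
        f.write(b"<!DOCTYPE html><html><body>Not found</body></html>")
    good_kind = describe_download(good)
    bad_kind = describe_download(bad)

print(good_kind, bad_kind)
```

If the result is "html", opening the file in a text editor usually shows what the server returned in place of the archive.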

Error while running endtoend.sh, and it takes too long on a big dataset

I am receiving these errors, and the model takes approximately 4 hours to complete one iteration:

RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
RuntimeError: module compiled against API version 0xc but this version of numpy is 0xb
Traceback (most recent call last):
  File "python/predict.py", line 20, in <module>
    from neural_srl.theano.tagger import BiLSTMTaggerModel
  File "/home/ubuntu/deep_srl-master/python/neural_srl/theano/tagger.py", line 1, in <module>
    from optimizer import *
  File "/home/ubuntu/deep_srl-master/python/neural_srl/theano/optimizer.py", line 2, in <module>
    import theano
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/__init__.py", line 80, in <module>
    from theano.scan_module import (scan, map, reduce, foldl, foldr, clone,
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/scan_module/__init__.py", line 41, in <module>
    from theano.scan_module import scan_opt
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/scan_module/scan_opt.py", line 60, in <module>
    from theano import tensor, scalar
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/__init__.py", line 9, in <module>
    from theano.tensor.subtensor import *
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/subtensor.py", line 26, in <module>
    import theano.gof.cutils # needed to import cutils_ext
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cutils.py", line 320, in <module>
    compile_cutils()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cutils.py", line 285, in compile_cutils
    preargs=args)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cmodule.py", line 2325, in compile_str
    return dlimport(lib_filename)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/cmodule.py", line 302, in dlimport
    rval = __import__(module_name, {}, {}, [module_name])
ImportError: numpy.core.multiarray failed to import
Using an ensemble of 5 models
Task: srl
Allow new words in test data: True
Embedding size=100
Read 295330 sentences.
Data loading duration was 0:00:28.
Using 2 feature types, projected output dim=200.
('lstm_0_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa8ec34110>
('lstm_1_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faadb29fb10>
('lstm_2_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faadb29ffd0>
('lstm_3_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faab2ea9a50>
('lstm_4_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faab2ea9f10>
('lstm_5_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faab2ea9290>
('lstm_6_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa8ebd4650>
('lstm_7_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7faa8ebd4b10>
47031 32056
47031 2

AssertionError in processing CONLL2012 data

Hi,

When running the pre-processing for the OntoNotes CoNLL-2012 data, I run into an AssertionError at line 64 of the process_conll2012.py file:
assert tags[t][props[t]] in {"B-V", "B-I"}
due to the if-statements at line 149. I believe it occurs when the tag passed in includes a nested verb label, such as the one below, where the word is labeled as a verb and is also the first word in a continued ARG2 span.

deep_srl_nestedcarg2v

Any suggestions on how to handle this?

Assertion error when running predict.py

Error message:
in File "/root/test-deepsrl/python/neural_srl/shared/conll_utils.py", line 29, in print_sentence_to_conll
assert len(label_column) == len(tokens)

model: conll_2012_model
input file: see attachment.
temp.txt

BUG in interactive.py

Thank you for your great work.
I found some problems in interactive.py:

    s0 = string_sequence_to_ids(tokenized_sent, pid_data.word_dict, True)
    l0 = [0 for _ in s0]
    x, _, _, weights = pid_data.get_test_data([(s0, l0)], batch_size=None)
    pid_pred, scores0 = pid_pred_function(x, weights)

    s1 = []
    predicates = []
    for i,p in enumerate(pid_pred[0]):
      if pid_data.label_dict.idx2str[p] == 'V':
        #print 'Predicate:', tokenized_sent[i]
        predicates.append(i)
        feats = [1 if j == i else 0 for j in range(num_tokens)]
        s1.append((s0, feats, l0))

    if len(s1) == 0:
      continue

    # Semantic role labeling.
    x, _, _, weights = srl_data.get_test_data(s1, batch_size=None)
    srl_pred, scores = srl_pred_function(x, weights)

I think it is wrong to pass s1 to srl_data.get_test_data(), because pid_data.word_dict and srl_data.word_dict are different dictionaries (compare with predict.py).
The input should be something like:

s1 = string_sequence_to_ids(tokenized_sent, srl_data.word_dict, True)
r1 = []
...
r1.append((s1, feats, l0))
x, _, _, weights = srl_data.get_test_data(r1, batch_size=None)
srl_pred, scores = srl_pred_function(x, weights)
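
The point about mismatched dictionaries can be illustrated with a toy example. This is only a sketch: tokens_to_ids is a hypothetical, simplified stand-in for string_sequence_to_ids, and both vocabularies are invented. It shows that ids produced with one word dictionary refer to different words when read against another.

```python
def tokens_to_ids(tokens, word_to_id, unknown_id=0):
    """Hypothetical simplified stand-in for string_sequence_to_ids."""
    return [word_to_id.get(tok.lower(), unknown_id) for tok in tokens]


def ids_to_tokens(ids, word_to_id):
    """Map ids back to words using the given vocabulary."""
    id_to_word = {i: w for w, i in word_to_id.items()}
    return [id_to_word[i] for i in ids]


# Toy vocabularies; in practice each model ships with its own dictionary.
pid_vocab = {"<unk>": 0, "the": 1, "cat": 2, "loves": 3, "hats": 4}
srl_vocab = {"<unk>": 0, "hats": 1, "loves": 2, "the": 3, "cat": 4}

sentence = ["The", "cat", "loves", "hats"]
pid_ids = tokens_to_ids(sentence, pid_vocab)
srl_ids = tokens_to_ids(sentence, srl_vocab)

print(pid_ids)  # ids meaningful only to the predicate-identification model
print(srl_ids)  # ids meaningful only to the SRL model

# Reading predicate-model ids against the SRL vocabulary scrambles the sentence.
misread = ids_to_tokens(pid_ids, srl_vocab)
print(misread)
```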

hardcoded path :(

I don't think you should hardcode the path (for example, in run_end2end.sh). It makes the code confusing to use in different places. Many people probably already have this set in their bashrc or have some other path configuration, and hard-setting it can create issues.

Cannot install deep_srl

Hello! I am doing doctoral research on the use of SRL for critical discourse analysis. Currently I am using an older Python application, practNLPtools, which works fine but is outdated. I read your paper and liked your idea of a Theano-based SRL system; however, I was unable to run your code. I cloned the repository and downloaded the necessary dependencies (as far as I can tell).
I checked the cloned repo, and all files are there. I am probably making a mistake when configuring the files, since I am working by trial and error. Could you give me some hints?
Thank you in advance for your attention.

issueSRL.txt

About the conll-2012 data

Dear author:
I have some questions about the data. The CoNLL-2012 data downloaded from the website (the V12 release you linked) cannot be processed by its own processing scripts because files are missing, like this:
image
I downloaded the data several times, with the same result. Could you please help me verify the data? Thanks very much!

Training does not start

Hi, Luheng:
Thanks for your great work! I encountered some strange behavior during training. I used the following command to start training your model:
python python/train.py --config=./config/srl_config.json --model=./output --train=./sample_data/sentences_with_gold.txt --dev=./sample_data/sentences_with_gold.txt --task=srl

And I got these outputs in the terminal:

/scratch/users/duxi/miniconda3/envs/deep_srl/lib/python2.7/site-packages/theano/gpuarray/dnn.py:135: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to version 5.1.
warnings.warn("Your cuDNN version is more recent than "
Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX TITAN X (0000:04:00.0)
Task: srl
Embedding size=100
Extracting features
Extraced 19 words and 9 tags
Max training sentence length: 9
Max development sentence length: 9
Warning: not using official gold predicates. Not for formal evaluation.
Dev data has 1 batches.
Data loading duration was 0:00:14.
[WARNING] Log directory ./output is not empty, previous checkpoints might be overwritten
Preparation duration was 0:00:00.
Using 2 feature types, projected output dim=200.
('lstm_0_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe5782bab50>
('lstm_1_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe5782620d0>
('lstm_2_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe56bf82c10>
('lstm_3_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe570087f90>
('lstm_4_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe5781912d0>
('lstm_5_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe570090f90>
('lstm_6_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe57809b590>
('lstm_7_rdrop', 0.1, True)
<neural_srl.theano.layer.HighwayLSTMLayer object at 0x7fe578203f90>
embedding_0 embedding_0 [ 19 100]
embedding_1 embedding_1 [ 2 100]
lstm_0_W lstm_0_W [ 200 1800]
lstm_0_U lstm_0_U [ 300 1500]
lstm_0_b lstm_0_b [1800]
lstm_1_W lstm_1_W [ 300 1800]
lstm_1_U lstm_1_U [ 300 1500]
lstm_1_b lstm_1_b [1800]
lstm_2_W lstm_2_W [ 300 1800]
lstm_2_U lstm_2_U [ 300 1500]
lstm_2_b lstm_2_b [1800]
lstm_3_W lstm_3_W [ 300 1800]
lstm_3_U lstm_3_U [ 300 1500]
lstm_3_b lstm_3_b [1800]
lstm_4_W lstm_4_W [ 300 1800]
lstm_4_U lstm_4_U [ 300 1500]
lstm_4_b lstm_4_b [1800]
lstm_5_W lstm_5_W [ 300 1800]
lstm_5_U lstm_5_U [ 300 1500]
lstm_5_b lstm_5_b [1800]
lstm_6_W lstm_6_W [ 300 1800]
lstm_6_U lstm_6_U [ 300 1500]
lstm_6_b lstm_6_b [1800]
lstm_7_W lstm_7_W [ 300 1800]
lstm_7_U lstm_7_U [ 300 1500]
lstm_7_b lstm_7_b [1800]
softmax_W softmax_W [300 9]
softmax_b softmax_b [9]

After this output, I never got any further terminal output, and the file "./output/checkpoints.tsv" remains empty even long after training was started. It seems the training does not make any progress at all. I am not sure if this is a GPU-specific issue: I am using cuda8.0 + cudnn8.0, and here is my Theano configuration file:

[global]
device = cuda
floatX = float64
mode = FAST_RUN

[cuda]
root=/usr/local/cuda-8.0/

[dnn]
enable=True
include_path=/usr/local/cuda-8.0/include
library_path=/usr/local/cuda-8.0/lib64

[lib]
cnmem = 0.8

[nvcc]
fastmath = True

[gcc]
cxxflags=-Wno-narrowing

Could you give me any ideas about the potential cause?

make_conll2012_data.sh issue

The script make_conll2012_data.sh cannot run successfully; I think something is missing. It should look like this:
line 10: python ../preprocess/process_conll2012.py \
line 17: python ../preprocess/process_conll2012.py \
line 24: python ../preprocess/process_conll2012.py \

about the sample data

1. Why are the files sentences_with_predicates.txt and sentences_without_predicates.txt the same?
2. What does the word 'gold' mean (as in data/srl/conll05.devel.props.gold.txt)? Does it mean predicates?

ValueError

x, y, _, weights = batched_tensor
Who can solve the above issues? Thanks!

The dataset format confuses me

I would appreciate it if you could give a full example of the dataset format.
You said the training data format is like this:

2 The cat love hats . ||| B-A0 I-A0 B-V B-A1 O

Are the DEV_PATH data and the GOLD_PATH data in the same format?

And what is the GOLD_PATH data used for?
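
As a reading aid for the sample line quoted above, here is a minimal sketch of a parser for that format. It assumes the leading number is a zero-based index of the predicate token, which is inferred only from the example (index 2 lines up with "love" and the B-V label). parse_srl_line is a name invented for this illustration, not a function from the repository.

```python
def parse_srl_line(line):
    """Split '<predicate index> <tokens> ||| <labels>' into its three parts."""
    left, right = line.split("|||")
    left_fields = left.split()
    predicate_index = int(left_fields[0])  # assumed zero-based (see lead-in)
    tokens = left_fields[1:]
    labels = right.split()
    if len(tokens) != len(labels):
        raise ValueError(
            "%d tokens but %d labels" % (len(tokens), len(labels))
        )
    if not 0 <= predicate_index < len(tokens):
        raise ValueError("predicate index %d out of range" % predicate_index)
    return predicate_index, tokens, labels


index, tokens, labels = parse_srl_line(
    "2 The cat love hats . ||| B-A0 I-A0 B-V B-A1 O"
)
print(tokens[index], labels[index])
```

Running such a check over a data file before training or prediction can reveal lines where the number of labels differs from the number of tokens.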

Random Seed

Hello,

First thank you very much for this contribution!

However, when I try to train the model from scratch, I get lower performance (about 1 point of F-score).
I'm able to download your model and reproduce the results in your paper, but I'm not able to retrain it.

I believe this may be due to a difference in the random seed, a different hyperparameter configuration, or a difference in software/hardware.

Are the configurations and random seed provided in this repo the same as those used in the paper?

Kind regards,
Gabriel M

AssertionError

I am receiving this error. Any idea how to fix it?

Traceback (most recent call last):
  File "python/predict.py", line 227, in <module>
    evaluator.evaluate(predictions)
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/evaluation.py", line 26, in evaluate
    self.compute_accuracy(predictions)
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/evaluation.py", line 97, in compute_accuracy
    print_to_conll(self.pred_labels, self.pred_props_file, temp_output)
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/conll_utils.py", line 45, in print_to_conll
    print_sentence_to_conll(fout, tokens_buf, pred_labels[seq_ptr:seq_ptr+num_props_for_sentence])
  File "/home/ubuntu/deep_srl-master/python/neural_srl/shared/conll_utils.py", line 26, in print_sentence_to_conll
    assert len(label_column) == len(tokens)
AssertionError
