sheng-z / stog
AMR Parsing as Sequence-to-Graph Transduction
License: MIT License
Hello, I have a question about the code: could you tell me what source_dynamic_vocab_size and target_dynamic_vocab_size are? Thanks a lot.
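For context (my own reading, not the authors' definition): in pointer-generator-style parsers, `source_dynamic_vocab_size` is typically the size of a per-instance vocabulary built from the source tokens so the decoder can copy them, and `target_dynamic_vocab_size` plays the same role for previously generated target nodes. A minimal illustrative sketch of such a per-instance vocabulary (names are hypothetical, not stog's actual code):

```python
def build_dynamic_vocab(tokens):
    """Map each distinct token in one instance to a per-instance copy index.

    Index 0 is reserved for "no copy" (an illustrative convention); the
    dynamic vocab size is then the number of distinct tokens plus that
    reserved slot. This is a sketch, not the stog implementation.
    """
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab) + 1  # 0 is reserved for "no copy"
    return vocab, len(vocab) + 1

# Duplicated tokens share one copy index, so the size varies per sentence.
vocab, size = build_dynamic_vocab(["the", "boy", "wants", "the", "girl"])
```

Because the vocabulary is rebuilt per instance, these "sizes" are dynamic: they change from sentence to sentence rather than being fixed model hyperparameters.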
At the prediction step, if the input file contains only one sentence, it results in an error. For example, after all the preprocessing was applied to the AMR 2.0 (LDC2017T10) data (following the steps in README.md), there is a preprocessed test data file test.preproc in the data/AMR/amr_2.0/ directory. However, if I removed every "paragraph" (that is, a pair of comment lines (id, snt, tags, etc.) and an AMR graph) except the first one from the file, and then ran the "6. Prediction" phase, the program showed the following error:
Traceback (most recent call last):
File ".../anaconda/envs/stog/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File ".../anaconda/envs/stog/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File ".../stog/stog/commands/predict.py", line 252, in <module>
_predict(args)
File ".../stog/stog/commands/predict.py", line 208, in _predict
manager.run()
File ".../stog/stog/commands/predict.py", line 182, in run
for model_input_instance, result in zip(batch, self._predict_instances(batch)):
File ".../stog/stog/commands/predict.py", line 143, in _predict_instances
results = [self._predictor.predict_instance(batch_data[0])]
File ".../stog/stog/predictors/predictor.py", line 46, in predict_instance
outputs = self._model.forward_on_instance(instance)
File ".../stog/stog/models/model.py", line 117, in forward_on_instance
raise NotImplementedError
NotImplementedError
This error occurs whenever an input file with only one AMR paragraph is given to the --input-file option of the stog.commands.predict module. If the input file contains two or more paragraphs, the error does not seem to occur.
I tried to run the evaluation script (tools/amr-evaluation-tool-enhanced/evaluation.sh), but when it is run in an environment without Python 2 (that is, no python2 executable exists in PATH), it fails with "python2: command not found". This message was also shown during training. Could you note in README.md that the evaluation requires Python 2?
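A small guard at the top of evaluation.sh would also make the failure obvious. This is a suggestion, not part of the repo; `require_cmd` is a hypothetical helper name:

```shell
# require_cmd NAME MSG: return 0 if NAME is on PATH, else print MSG and return 1.
# evaluation.sh could call `require_cmd python2 "evaluation.sh requires Python 2"`
# before doing any work, so the script fails fast with a clear message.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "$2" >&2
  return 1
}
```

Called as `require_cmd python2 "evaluation.sh requires Python 2" || exit 1`, this turns the cryptic "python2: command not found" into an explicit error.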
Hi Sheng,
Thanks for your nice work. The conversion to dependency parsing is really insightful.
One question: I find your named entity metric to be low, and yet the wikification accuracy is higher than the named entity score. Do you know what is happening? Is the named entity recall low, or are there issues with surface string processing?
Chunchuan
Hi!
With your statement in the README, "Make sure that you have at least two GeForce GTX TITAN X GPUs to train the full model.", do you mean that you require this setup specifically, or do you just require GPU-memory equivalent to, or greater than, two GeForce GTX TITAN X GPUs? Or is there some other interpretation/requirement completely?
Thanks in advance,
Simon
Hi Sheng,
Please see the attached log for more details.
(stog) lfsong@c47:tool.amr_parsing_stog$ ./scripts/annotate_features.sh data/AMR/amr_2.0
[2019-09-05 06:18:53,869 INFO] Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
[2019-09-05 06:18:53,885 INFO] Processing data/AMR/amr_2.0/test.txt
Traceback (most recent call last):
File "/data/home/lfsong/anaconda3/envs/stog/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/data/home/lfsong/anaconda3/envs/stog/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/data2/lfsong/tool.amr_parsing_stog/stog/data/dataset_readers/amr_parsing/preprocess/feature_annotator.py", line 205, in <module>
annotation = annotator(amr.sentence)
File "/data2/lfsong/tool.amr_parsing_stog/stog/data/dataset_readers/amr_parsing/preprocess/feature_annotator.py", line 75, in __call__
annotation = self.annotate(text)
File "/data2/lfsong/tool.amr_parsing_stog/stog/data/dataset_readers/amr_parsing/preprocess/feature_annotator.py", line 63, in annotate
tokens = self.nlp.annotate(text.strip(), self.nlp_properties)['sentences'][0]['tokens']
TypeError: string indices must be integers
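Not the authors' code, but a guess at the cause: CoreNLP wrappers commonly return the server's raw error text (a string) instead of parsed JSON when the server is unreachable or times out, so indexing the result with `['sentences']` raises exactly this TypeError. A defensive wrapper might look like this (`annotate` here stands for any CoreNLP-style callable; this helper is hypothetical, not part of stog):

```python
def safe_annotate(annotate, text, properties):
    """Call a CoreNLP-style annotate() and fail loudly if it returns raw text.

    `annotate` is any callable that returns parsed JSON (a dict) on success
    and the server's error string on failure -- the behaviour implied by the
    TypeError above. Hypothetical helper, not part of stog.
    """
    result = annotate(text.strip(), properties)
    if not isinstance(result, dict):
        raise RuntimeError(
            "CoreNLP did not return JSON -- is the server running? "
            "Server said: %r" % (result[:200],))
    return result["sentences"][0]["tokens"]
```

If this is the cause, checking that the CoreNLP server is up (and reachable at the configured port) before running annotate_features.sh should make the error go away.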
Currently it is: ./script/postprocess_2.0.sh test.pred.txt
It should be: ./scripts/postprocess_2.0.sh test.pred.txt
In the README it's ./scripts/annotate_features.sh, but the actual file name is annnotate_features.sh.
Excuse me,
I want to parse another file, which includes some sentences, for example:
Can you do push-ups ?
Of course I can . It's a piece of cake ! Believe it or not , I can do 30 push-ups a minute .
Really ? I think that's impossible !
You mean 30 push-ups ?
It's easy . If you do exercise everyday , you can make it , too .
I would really appreciate it if you could give me some advice! Thanks!
Is there any chance you will release a trained version of the model?
Hi, thanks for your nice work. I want to accomplish a simple task.
If I just have a file as follows:
sentence1
sentence2
....
How can I use the pre-trained model to generate AMRs for these sentences directly?
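There is no official plain-text interface that I know of, but one workaround is to wrap each plain sentence in the AMR corpus file format with a placeholder graph, so the existing preprocessing and prediction pipeline (Steps 3-6) accepts the file. A sketch (the `::id`/`::snt` fields follow AMR corpus conventions; the dummy graph and helper name are my own):

```python
def sentences_to_amr_stub(sentences):
    """Format plain sentences as AMR-corpus-style blocks with dummy graphs.

    Each block has the metadata lines the preprocessing scripts expect
    (# ::id, # ::snt) followed by a placeholder graph that prediction
    will overwrite. This is a workaround sketch, not part of stog.
    """
    blocks = []
    for i, snt in enumerate(sentences):
        blocks.append(
            "# ::id sent.{i}\n# ::snt {snt}\n(d / dummy)".format(i=i, snt=snt.strip())
        )
    return "\n\n".join(blocks) + "\n"
```

Writing the result to a file and feeding it through feature annotation, preprocessing, and prediction should then work the same as for the corpus test set.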
Hi,
During prediction I got this error; have you seen it before?
File "/home/yaosw/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 250, in encode
return self._encode_penman(g, top=top)
File "/home/yaosw/anaconda3/envs/stog/lib/python3.6/site-packages/penman.py", line 493, in _encode_penman
raise EncodeError('Invalid graph; possibly disconnected.')
penman.EncodeError: Invalid graph; possibly disconnected.
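Not a fix inside stog, but for debugging it can help to check whether a predicted graph's triples form a single connected component before handing them to penman. A generic sketch with no penman dependency (triples are `(source, role, target)`; attribute constants are treated as plain nodes here for simplicity):

```python
from collections import defaultdict

def is_connected(triples):
    """Return True if the undirected graph over (src, role, tgt) triples
    is a single connected component -- the property penman's
    'Invalid graph; possibly disconnected' error complains about.
    Illustrative check, not penman's actual implementation."""
    nodes = set()
    adj = defaultdict(set)
    for src, _role, tgt in triples:
        nodes.add(src)
        nodes.add(tgt)
        adj[src].add(tgt)
        adj[tgt].add(src)
    if not nodes:
        return True
    seen = set()
    stack = [next(iter(nodes))]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(adj[n] - seen)
    return seen == nodes
```

Filtering out (or logging) predictions that fail this check lets the rest of the batch encode normally instead of crashing.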
Can anyone provide me with a script for using the pre-trained models? What do you mean by "To use them for prediction, simply download & unzip them, and then run Step 6-8"?
Has anyone tried to train on the new, larger, AMR 3.0 corpus? Any plans to do so? I'm curious if this improves the overall score. I'm tempted to dig into this but I only have a single TitanX card to train on and the corpus isn't readily available to me. If someone else is already working on this I may prefer to wait to hear your results rather than dig into it myself. Let us know.
Hi Sheng,
Thanks for your nice work. Can you offer the scripts or methods to generate the "amr_2.0_utils" on the other dataset? Thank you very much!
Let me ask a question. Is there any way to predict an AMR graph for the sentences not in AMR corpus? The README describes how to run predictions on the test sentences in AMR corpus, but I couldn't find ways to apply it to another sentence, using the trained parameters. As far as I can see, the prediction consists of several steps: given any AMR files, do feature annotations (step 3.), apply preprocess to the graph (step 4.), and do the prediction (step 6.). However, according to these steps, it seems that a sentence needs to be accompanied with AMR graph. Am I misunderstanding something? Is there any way to do prediction on any other sentences, either on command line or calling the Python codes?
I'm trying to train the AMR parser on subsets of AMR 2.0 in order to see how performance scales with the number of training examples. Normally, it works just fine. But for some subsets, I end up encountering errors. Below is the output I get when training on just 75% of the AMR 2.0 training data. I've attached a list of the training example ids in case it is helpful.
# Step 1
python -u -m stog.commands.train data/subsets/train_75_s0/config.yaml &> RUN75_STEP1.log
# Step 2
python -u -m stog.commands.predict --archive-file data/subsets/train_75_s0/ckpt-dir \
    --weights-file data/subsets/train_75_s0/ckpt-dir/best.th \
    --input-file data/subsets/train_75_s0/test.txt.features.preproc \
    --batch-size 32 \
    --use-dataset-reader \
    --cuda-device 0 \
    --output-file data/subsets/train_75_s0/test.pred.txt \
    --silent \
    --beam-size 5 \
    --predictor STOG
# Step 3
./scripts/postprocess_2.0.sh data/subsets/train_75_s0/test.pred.txt
# Step 4
./scripts/compute_smatch.sh data/subsets/train_75_s0/test.pred.txt data/AMR/amr_2.0/test.txt
# Step 4 output
Error in parsing AMR (vv1 / beyond:domain (vv2 / suppose-02:ARG1 (vv3 / i):ARG2 (vv4 / keep-01:ARG0 vv3:ARG1 (vv6 / horse:mod 1/
Traceback (most recent call last):
File "smatch/smatch.py", line 837, in
main(args)
File "smatch/smatch.py", line 737, in main
amr1.rename_node(prefix1)
AttributeError: 'NoneType' object has no attribute 'rename_node'
Smatch -> P: , R: , F:
Error in parsing AMR (vv1 / beyond:label (vv2 / suppose-02:label (vv3 / i):label (vv4 / keep-01:label vv3:label (vv6 / horse:label 1/
Traceback (most recent call last):
File "smatch/smatch.py", line 837, in
main(args)
File "smatch/smatch.py", line 737, in main
amr1.rename_node(prefix1)
AttributeError: 'NoneType' object has no attribute 'rename_node'
Unlabeled -> P: , R: , F:
Error in parsing AMR (vv1 / beyond:domain (vv2 / suppose-01:ARG1 (vv3 / i):ARG2 (vv4 / keep-01:ARG0 vv3:ARG1 (vv6 / horse:mod 1/
Traceback (most recent call last):
File "smatch/smatch.py", line 837, in
main(args)
File "smatch/smatch.py", line 737, in main
amr1.rename_node(prefix1)
AttributeError: 'NoneType' object has no attribute 'rename_node'
No WSD -> P: , R: , F:
Error in parsing AMR (vv1 / beyond :domain (vv2 / suppose-02 :ARG1 (vv3 / i) :ARG2 (vv4 / keep-01 :ARG0 vv3 :ARG1 (vv6 / horse :mod 1/
Traceback (most recent call last):
File "scores.py", line 121, in
dict_pred = var2concept(amr_pred)
File "scores.py", line 103, in var2concept
for n, v in zip(amr.nodes, amr.node_values):
AttributeError: 'NoneType' object has no attribute 'nodes'
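The AMRs in the error messages above are truncated (the parentheses never close), which is why smatch's parser returns None. A quick sanity check before running the evaluation can flag such predictions early; this helper is illustrative, not part of smatch or stog:

```python
def amr_parens_balanced(amr_str):
    """Return True if every '(' in an AMR string has a matching ')'.

    Truncated predictions like the ones in the smatch errors above fail
    this check, so they can be reported (or repaired) before evaluation
    instead of crashing smatch with a NoneType error.
    """
    depth = 0
    for ch in amr_str:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '('
                return False
    return depth == 0
```

Running this over each predicted graph and counting failures would show whether the smaller training subsets are producing malformed outputs, or whether something in postprocessing is cutting graphs short.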
Hi, I noticed you're using the Penman package in your code. I expect that the next version will break some backward compatibility, so I recommend you pin the current latest version in your requirements.txt:
penman==0.6.2
If not, how can one add the src-side and tgt-side attention probability distributions together?
Wrong post
First, thanks for sharing; it's really helpful. Could I ask how to generate the semantic graphs for other datasets like WMT14 or IWSLT2014? Thanks in advance!