
pariajm / joint-disfluency-detector-and-parser

46 stars · 1 watcher · 11 forks · 367 KB

Improving Disfluency Detection by Self-Training a Self-Attentive Model

Home Page: https://www.aclweb.org/anthology/2020.acl-main.346/

License: MIT License

Languages: Python 77.10%, C 20.69%, Scilab 2.12%, Perl 0.05%, Makefile 0.03%
Topics: constituency-parsing, switchboard-trees, pretrained-models, speech-transcripts, disfluency-detection, bert-based-parser, elmo-based-parser, self-attentive-disfluency-detector, transformer-disfluency-detection, fisher-trees


joint-disfluency-detector-and-parser's Issues

Trying to parse new sentences gives runtime error

When I run the parser on the default sentences in best_models/raw_sentences.txt, it parses them correctly. But when I make any change to those sentences or add any new sentence, I get a runtime error:
return self.layer_norm(outputs + residual), attns_padded
RuntimeError: The size of tensor a (51) must match the size of tensor b (38) at non-singleton dimension 0

PyTorch version: 1.8.1+cu101
Python version: 3.7.10

Any idea what the issue is? Can't we use the pretrained model (swbd_fisher_bert_Edev.0.9078) to parse our own sentences?

Runtime error

When I run:
python src/main.py parse --input-path best_models/raw_sentences.txt --output-path best_models/parsed_sentences.txt --model-path-base best_models/swbd_fisher_bert_Edev.0.9078.pt >best_models/out.log

I get the error:
File "src/main.py", line 657, in
main()
File "src/main.py", line 653, in main
args.callback(args)
File "src/main.py", line 530, in run_parse
predicted, _ = parser.parse_batch(subbatch_sentences)
File "/mnt/home/v_shizhan03/joint-disfluency-detector-and-parser/src/parse_nk.py", line 1027, in parse_batch
annotations, _ = self.encoder(emb_idxs, batch_idxs, extra_content_annotations=extra_content_annotations)
File "/mnt/home/v_shizhan03/anaconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/home/v_shizhan03/joint-disfluency-detector-and-parser/src/parse_nk.py", line 615, in forward
res, current_attns = attn(res, batch_idxs)
File "/mnt/home/v_shizhan03/anaconda3/envs/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/home/v_shizhan03/joint-disfluency-detector-and-parser/src/parse_nk.py", line 358, in forward
return self.layer_norm(outputs + residual), attns_padded
RuntimeError: The size of tensor a (102) must match the size of tensor b (82) at non-singleton dimension 0

Any clues?

How to parse a sentence faster?

When I run the parser on a GPU, a single sentence takes about 6 seconds, yet parsing 5,000 sentences takes only about 40 seconds. Is there any way to parse a single sentence faster?
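
Most of that 6 seconds is likely fixed overhead (loading the checkpoint and initializing CUDA on every invocation of src/main.py), which is why 5,000 sentences in one run take only about 40 seconds. A workaround is to load the model once and keep it in memory, then call parse_batch on batches of sentences. The sketch below illustrates the idea; the torch.load/from_spec loading step and the input format are assumptions based on the self-attentive parser this repo builds on, while parse_batch itself is the method visible in the traceback above.

    import torch
    import parse_nk  # from the repo's src/ directory

    # Load the checkpoint once and keep the parser resident in memory.
    # (from_spec and the checkpoint keys are assumptions; check src/main.py's
    # run_parse for the exact loading code used by this repo.)
    info = torch.load("best_models/swbd_fisher_bert_Edev.0.9078.pt")
    parser = parse_nk.NKChartParser.from_spec(info["spec"], info["state_dict"])

    def parse_sentences(sentences, batch_size=64):
        # sentences: a list of tokenized sentences, each a list of (tag, word)
        # pairs (this input format is also an assumption).
        trees = []
        for start in range(0, len(sentences), batch_size):
            subbatch = sentences[start:start + batch_size]
            # parse_batch is the call shown in the traceback above.
            predicted, _ = parser.parse_batch(subbatch)
            trees.extend(predicted)
        return trees

With the model kept resident like this, per-sentence latency should be dominated by a single forward pass and chart decode rather than by model loading.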

Calling cuda() with async results in SyntaxError

The error occurs on the .cuda(async=True) call in this block:

if use_cuda:
    torch_t = torch.cuda
    def from_numpy(ndarray):
        return torch.from_numpy(ndarray).pin_memory().cuda(async=True)

The reason for this error is that async became a reserved keyword in Python 3.7, so it can no longer be used as an argument name. The signature of the cuda() method has changed as well; it now looks like this:

cuda(device=None, non_blocking=False)

async=True can be replaced by non_blocking=True.

non_blocking (bool):
If True and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. Default: False.
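
With that change, the corrected snippet would look like this (the same code as above, with async=True replaced by non_blocking=True):

    if use_cuda:
        torch_t = torch.cuda
        def from_numpy(ndarray):
            # non_blocking=True replaces the removed async=True argument; the copy
            # is asynchronous only because the source tensor is in pinned memory.
            return torch.from_numpy(ndarray).pin_memory().cuda(non_blocking=True)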

parse_nk.py does not support BERT

I was not able to run the given scripts with a BERT model. It seems that parse_nk.py does not support BERT models yet. Please let me know if that is the case and whether the repo will be updated.
