Disfluency Detection using Auto-Correlational Neural Networks (ACNN)
This repository implements the Auto-Correlational Neural Network (ACNN) for detecting disfluencies in speech transcripts, as proposed in the EMNLP 2018 paper "Disfluency Detection using Auto-Correlational Neural Networks" (see Citation).
Contents
Basic Overview
Task
ACNN Model
Requirements
- Python 3
- TensorFlow > 0.12
- NumPy
$ git clone https://github.com/pariajm/deep-disfluency-detector
$ cd deep-disfluency-detector
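Before training or prediction, it can help to confirm that the requirements listed above are available. The following is a minimal sketch, not part of the repository: the helper name `check_requirements` is hypothetical, and it only checks that the packages can be found, not that the installed TensorFlow version is greater than 0.12.

```python
import importlib.util
import sys


def check_requirements(packages=("tensorflow", "numpy")):
    """Return a dict mapping each requirement to True (available) or False."""
    # The README asks for Python 3, so only the major version is checked.
    status = {"python3": sys.version_info.major == 3}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for requirement, ok in check_requirements().items():
        print(f"{requirement}: {'found' if ok else 'MISSING'}")
```

Using `find_spec` avoids importing TensorFlow just to see whether it is installed, which keeps the check fast.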
Data
Training
To train a new ACNN model from scratch:
$ python3 train.py --data_path=/path/to/train_and_test_files --checkpoint_dir=/dir/to/save/checkpoints_and_summaries
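If you prefer to launch training from a Python script, the command above can be wrapped as in the sketch below. The helper names `build_train_command` and `run_training` are hypothetical and not part of the repository; only the `--data_path` and `--checkpoint_dir` flags come from the command above. The sketch assumes that `data_path` is a directory, that it is run from the repository root, and that creating the checkpoint directory in advance is harmless.

```python
import subprocess
import sys
from pathlib import Path


def build_train_command(data_path, checkpoint_dir):
    """Assemble the train.py invocation as an argument list."""
    return [
        sys.executable,  # the current interpreter, standing in for `python3`
        "train.py",
        f"--data_path={data_path}",
        f"--checkpoint_dir={checkpoint_dir}",
    ]


def run_training(data_path, checkpoint_dir):
    """Check the data directory, create the checkpoint directory, then train."""
    # Assumption: data_path is a directory holding the train and test files.
    if not Path(data_path).is_dir():
        raise FileNotFoundError(f"data_path does not exist: {data_path}")
    Path(checkpoint_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if train.py exits with an error.
    subprocess.run(build_train_command(data_path, checkpoint_dir), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.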
Prediction
To predict disfluency labels for your own data with the pretrained ACNN model, download the released checkpoint files into model/checkpoints and then run prediction.py:
$ cd model/checkpoints
$ wget https://github.com/pariajm/deep-disfluency-detection/releases/download/v1/model-84893.data-00000-of-00001
$ wget https://github.com/pariajm/deep-disfluency-detection/releases/download/v1/model-84893.index
$ wget https://github.com/pariajm/deep-disfluency-detection/releases/download/v1/model-84893.meta
$ cd ../..
$ python3 prediction.py --input_path=/path/to/input/file --checkpoint_dir=./model --output_path=/path/to/output/file
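An interrupted download can leave one of the three checkpoint files missing. The sketch below checks for that before you run prediction.py. The function name `missing_checkpoint_files` is hypothetical; the file names and the `model/checkpoints` directory are taken from the download commands above. It only checks that the files exist, not that they are complete.

```python
from pathlib import Path

# File names taken from the download commands above.
CHECKPOINT_FILES = (
    "model-84893.data-00000-of-00001",
    "model-84893.index",
    "model-84893.meta",
)


def missing_checkpoint_files(checkpoint_dir="model/checkpoints"):
    """Return the expected checkpoint files that are absent from checkpoint_dir."""
    directory = Path(checkpoint_dir)
    return [name for name in CHECKPOINT_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files()
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("All checkpoint files are present.")
```

Run it from the repository root so that the default relative path resolves to the directory used in the commands above.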
Citation
@InProceedings{jamshidlou2018,
author = {Jamshid Lou, Paria and Anderson, Peter and Johnson, Mark},
title = {Disfluency Detection using Auto-Correlational Neural Networks},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP2018)},
year = {2018},
pages = {4610--4619},
address = {Brussels, Belgium},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/D18-1490.pdf}
}
Credits
The baseline CNN code is a modified version of Denny's code.
Contact
Paria Jamshid Lou