lorenlugosch / end-to-end-slu

PyTorch code for end-to-end spoken language understanding (SLU) with ASR-based transfer learning

License: Apache License 2.0

Python 100.00%

end-to-end-slu's Issues

Save the optimizer status for future retraining

I think it would be useful to also save the optimizer state, so that training can later be resumed from a checkpoint:

torch.save(optimizer.state_dict(), path_to_optimizer)
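
A minimal sketch of the suggestion above, assuming a generic model and an Adam optimizer rather than the repository's actual training loop. The helper names `save_checkpoint` and `load_checkpoint`, the dictionary keys, and the stand-in `torch.nn.Linear` model are all hypothetical; only `torch.save`, `torch.load`, `state_dict()` and `load_state_dict()` are PyTorch calls.

```python
import io

import torch


def save_checkpoint(model, optimizer, epoch, target):
    """Write model weights, optimizer state, and the epoch to a path or file object."""
    torch.save(
        {
            "epoch": epoch,
            "model_state": model.state_dict(),
            "optimizer_state": optimizer.state_dict(),
        },
        target,
    )


def load_checkpoint(model, optimizer, source):
    """Restore both state dicts in place and return the stored epoch."""
    checkpoint = torch.load(source)
    model.load_state_dict(checkpoint["model_state"])
    optimizer.load_state_dict(checkpoint["optimizer_state"])
    return checkpoint["epoch"]


# Stand-in model and optimizer (assumptions, not the repository's Model class).
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step so the optimizer holds moment estimates worth saving.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

buffer = io.BytesIO()  # in-memory file; a path string works the same way
save_checkpoint(model, optimizer, epoch=3, target=buffer)

# Later: rebuild the objects, then load the saved state into them.
buffer.seek(0)
restored_model = torch.nn.Linear(4, 2)
restored_optimizer = torch.optim.Adam(restored_model.parameters(), lr=1e-3)
resumed_epoch = load_checkpoint(restored_model, restored_optimizer, buffer)
```

Bundling everything in one file keeps the weights and the optimizer state from the same epoch together. Adam keeps running moment estimates per parameter, so resuming without them restarts those estimates from zero.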

inference error

Hi, I ran the inference code, but I get a different result every time I run it. Thank you in advance.

Inference Example

Possible to provide an inference example? I'm having trouble figuring out how to evaluate once I've done the training.

import data
import models
import soundfile as sf
import torch

config = data.read_config('/home/n.folkman/git/pretrain_speech_model/experiments/unfreeze_word_layers/experiment.cfg')
data.get_SLU_datasets(config)
model = models.Model(config)

[signal, fs] = sf.read('/home/n.folkman/models/fluent_speech_commands_dataset/wavs/speakers/zZezMeg5XvcbRdg3/ff4ae930-45d9-11e9-81ce-69b74fd7e64e.wav')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
sample = torch.tensor(signal, device=device).float()

model.predict_intents(sample)

RuntimeError: Expected 3-dimensional input for 3-dimensional weight [80, 1, 401], but got 2-dimensional input of size [62805, 1] instead

"synthetic_data.csv"

How do I get the "synthetic_data.csv" file? The dataset I downloaded does not have this file, and I can't find a way to generate this file.

what's the format of train_data_seq2seq.csv?

When training in "seq2seq" mode, the file "xx_data_seq2seq.csv" is needed, but this file is not included in the downloaded dataset. What is the format of "xx_data_seq2seq.csv"?

Once I know the format, I can create the file myself.

I can't run inference successfully because of lacking of your synthetic data

May I have your synthetic_data.csv?

/home/a/anaconda3/bin/python3.7 /home/a/git/slu_1/end-to-end-SLU/test.py
no seq2seq hyperparameters
Traceback (most recent call last):
  File "/home/a/anaconda3/lib/python3.7/configparser.py", line 788, in get
    value = d[option]
  File "/home/a/anaconda3/lib/python3.7/collections/__init__.py", line 916, in __getitem__
    return self.__missing__(key) # support subclasses that define __missing__
  File "/home/a/anaconda3/lib/python3.7/collections/__init__.py", line 908, in __missing__
    raise KeyError(key)
KeyError: 'real_dataset_subset_percentage'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/a/git/slu_1/end-to-end-SLU/test.py", line 7, in <module>
    config = data.read_config("experiments/no_unfreezing.cfg"); _,_,_=data.get_SLU_datasets(config)
  File "/home/a/git/slu_1/end-to-end-SLU/data.py", line 95, in read_config
    config.real_dataset_subset_percentage=float(parser.get("training", "real_dataset_subset_percentage"))
  File "/home/a/anaconda3/lib/python3.7/configparser.py", line 791, in get
    raise NoOptionError(option, section)
configparser.NoOptionError: No option 'real_dataset_subset_percentage' in section: 'training'
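
The traceback above says that the `[training]` section of the .cfg file has no `real_dataset_subset_percentage` option. The sketch below reproduces that situation with the standard-library `configparser` and shows one possible way to tolerate the missing key. The helper name `read_subset_percentage`, the sample config text, and the default of 1.0 are assumptions made for illustration; this is not the repository's code, and the value it actually expects is not stated in the issue.

```python
import configparser

# A config whose [training] section lacks the option, as in the traceback.
CONFIG_WITHOUT_OPTION = """
[training]
lr = 0.001
"""

# The same config with the option set explicitly in the file.
CONFIG_WITH_OPTION = """
[training]
lr = 0.001
real_dataset_subset_percentage = 0.5
"""


def read_subset_percentage(config_text, default=1.0):
    """Return the option as a float, or `default` (an assumed value) if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # getfloat(..., fallback=...) returns the fallback instead of raising NoOptionError.
    return parser.getfloat("training", "real_dataset_subset_percentage", fallback=default)


missing = read_subset_percentage(CONFIG_WITHOUT_OPTION)
present = read_subset_percentage(CONFIG_WITH_OPTION)
print(missing, present)
```

The other route, which needs no code change, is to add the option to the `[training]` section of the .cfg file, as in the second config string.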

Where is the librispeech and the synthesized data?

I think that to run many of these experiments we need to know where the pre-trained LibriSpeech model is, no? I don't see it anywhere obvious. Can you help?

Also where is the synthesized data you used?

Thanks
Michael

Questions about seq2seq model

Hi, thank you for sharing the code and knowledge via the 2 papers! My team and I have been trying to implement your E2E SLU models (both seq2seq and non-seq2seq) and have some questions we would like to ask.

Randomness in results
Even when we run the same model with the same parameters, the results do not always come out the same; accuracy differs by 1-5%. We found the random seed in your code and in the cfg files, so we are wondering whether this variation in accuracy is expected.

Cross-validation
We want to know how you did cross-validation for the SNIPS dataset. We found 5 folds in the SNIPS data, so we combined fold_0 to fold_3 as the training data and split fold_4 in half, using the first half as the validation set and the second half as the test set. We repeated the same procedure 5 times (fold_1 to fold_4 as training, one half of fold_0 as validation and the other half of fold_0 as test, and so on). We just want to make sure this matches what you did.

Accuracy calculation
In the second paper, you mention the “best test accuracy”, and we would like to ask for clarification. Does it refer to the best validation accuracy achieved during the 40 epochs for each fold, averaged over the 5 folds? Also, as far as we understand, if a model is prone to overfitting, it is better to choose a more conservative measure than the “best” one.
Excerpt from the paper: “The model is able to overfit the dataset without synthetic speakers; we therefore record the best test accuracy achieved over the course of training for each fold, instead of the final test accuracy.”

Saving model state
I notice that the code in main.py does not save model states based on validation performance but based on epochs. Does this mean that the model run on the test set is the one saved at the last training epoch (even if it is not the best one)?
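
The fold rotation described under "Cross-validation" above can be sketched as follows. This illustrates the procedure the question describes, not the authors' confirmed method. The function name `make_splits` and the placeholder utterance IDs are hypothetical.

```python
def make_splits(folds):
    """For each fold: train on all other folds, split the held-out fold in half.

    The first half of the held-out fold becomes the validation set and the
    second half becomes the test set, as described in the question.
    """
    splits = []
    for held_out in range(len(folds)):
        train = [
            item
            for index, fold in enumerate(folds)
            if index != held_out
            for item in fold
        ]
        half = len(folds[held_out]) // 2
        splits.append(
            {
                "train": train,
                "valid": folds[held_out][:half],
                "test": folds[held_out][half:],
            }
        )
    return splits


# Placeholder data: 5 folds of 4 made-up utterance IDs each.
folds = [[f"fold{i}_utt{j}" for j in range(4)] for i in range(5)]
splits = make_splits(folds)

for number, split in enumerate(splits):
    print(number, len(split["train"]), len(split["valid"]), len(split["test"]))
```

Each held-out fold is split by position here; whether the halves should instead be drawn at random or balanced by speaker is a choice the question leaves open.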

the link for the dataset is not working

The dataset link ( http://fluent.ai:2052/jf8398hf30f0381738rucj3828chfdnchs.tar.gz ) given in the paper and on fluent.ai is not working; it returns error 521. Maybe the server is no longer running.

FileNotFoundError on 'fluent_speech_commands_dataset/data/synthetic_data.csv'

Hi there,

I tried to reproduce the results of the paper "Speech Model Pre-training for End-to-End Spoken Language Understanding". However, when I trained the model on an SLU dataset with the following command:

python main.py --train --config_path=<path to .cfg>

It shows
FileNotFoundError: [Errno 2] No such file or directory: '/export/corpora6/fluent_speech_commands_dataset/data/synthetic_data.csv'

I checked the SLU dataset on our server, and there are four files in the data folder: train_data.csv, valid_data.csv, test_data.csv and speaker_demographics.csv. I am not sure where I can find the synthetic_data.csv file or the correct dataset used for the paper. Could you please help me find it so that I can reproduce the results?

Many thanks in advance.

Kind regards,
Tianyu Cao
