end-to-end-slu's People

Contributors

lorenlugosch, mravanelli

end-to-end-slu's Issues

inference error

Hi, I ran the inference code, but I get a different result every time I run it. Thank you in advance.
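Two common causes of run-to-run differences at inference time are a model left in training mode (dropout stays active) and trained weights that were never loaded, so each run starts from a fresh random initialisation. The issue does not say which applies here, so the following is only a minimal PyTorch sketch of the first cause, using a stand-in `torch.nn.Dropout` layer rather than the repository's model:

```python
import torch

# Stand-in for a network that contains dropout; NOT the repository's model.
layer = torch.nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()                 # training mode: a new random mask on every call
train_a, train_b = layer(x), layer(x)

layer.eval()                  # evaluation mode: dropout is a no-op
with torch.no_grad():
    eval_a, eval_b = layer(x), layer(x)

print("train-mode outputs identical:", torch.equal(train_a, train_b))
print("eval-mode outputs identical: ", torch.equal(eval_a, eval_b))
```

Assuming the real model is a regular `torch.nn.Module`, calling `.eval()` on it and loading the saved weights with `load_state_dict` before predicting are the first things to check.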

Inference Example

Possible to provide an inference example? I'm having trouble figuring out how to evaluate once I've done the training.

import data
import models
import soundfile as sf
import torch

config = data.read_config('/home/n.folkman/git/pretrain_speech_model/experiments/unfreeze_word_layers/experiment.cfg')
data.get_SLU_datasets(config)
model = models.Model(config)

[signal, fs] = sf.read('/home/n.folkman/models/fluent_speech_commands_dataset/wavs/speakers/zZezMeg5XvcbRdg3/ff4ae930-45d9-11e9-81ce-69b74fd7e64e.wav')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
sample = torch.tensor(signal, device=device).float()

model.predict_intents(sample)

RuntimeError: Expected 3-dimensional input for 3-dimensional weight [80, 1, 401], but got 2-dimensional input of size [62805, 1] instead
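The error message suggests that the first layer is a 1-D convolution with weight shape [80, 1, 401], which wants input shaped (batch, channel, time). A size of [62805, 1] is what a 1-D waveform of 62805 samples looks like after a channel axis is added but no batch axis. A minimal sketch of the shape fix, assuming the model adds the channel axis itself and only needs a leading batch axis; `first_layer` and `to_batch` are hypothetical stand-ins, not names from the repository:

```python
import torch

# Stand-in for the layer implied by the error message: weight [80, 1, 401].
first_layer = torch.nn.Conv1d(in_channels=1, out_channels=80, kernel_size=401)


def to_batch(signal):
    """Turn a mono 1-D waveform of T samples into a (1, T) batch of one utterance."""
    wave = torch.as_tensor(signal, dtype=torch.float32)
    if wave.dim() != 1:
        raise ValueError(f"expected a mono 1-D waveform, got shape {tuple(wave.shape)}")
    return wave.unsqueeze(0)


signal = [0.0] * 1000                        # placeholder for the array returned by sf.read
batch = to_batch(signal)                     # shape (1, 1000)
features = first_layer(batch.unsqueeze(1))   # input shape (batch, channel, time) = (1, 1, 1000)
print(tuple(batch.shape), tuple(features.shape))
```

In the snippet from the issue, the equivalent change would be `sample = torch.tensor(signal, device=device).float().unsqueeze(0)` before calling the model.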

"synthetic_data.csv"

How do I get the "synthetic_data.csv" file? The dataset I downloaded does not have this file, and I can't find a way to generate this file.

What's the format of train_data_seq2seq.csv?

When training in "seq2seq" mode, the file "xx_data_seq2seq.csv" is needed, but it is not included in the downloaded dataset. So what is the format of "xx_data_seq2seq.csv"?

Once I know the format, I can create this file myself.

Running model error

/home/a/anaconda3/bin/python3.7 /home/a/git/slu_1/end-to-end-SLU/test.py
no seq2seq hyperparameters
Traceback (most recent call last):
  File "/home/a/anaconda3/lib/python3.7/configparser.py", line 788, in get
    value = d[option]
  File "/home/a/anaconda3/lib/python3.7/collections/__init__.py", line 916, in __getitem__
    return self.__missing__(key)            # support subclasses that define __missing__
  File "/home/a/anaconda3/lib/python3.7/collections/__init__.py", line 908, in __missing__
    raise KeyError(key)
KeyError: 'real_dataset_subset_percentage'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/a/git/slu_1/end-to-end-SLU/test.py", line 7, in <module>
    config = data.read_config("experiments/no_unfreezing.cfg"); _,_,_=data.get_SLU_datasets(config)
  File "/home/a/git/slu_1/end-to-end-SLU/data.py", line 95, in read_config
    config.real_dataset_subset_percentage=float(parser.get("training", "real_dataset_subset_percentage"))
  File "/home/a/anaconda3/lib/python3.7/configparser.py", line 791, in get
    raise NoOptionError(option, section)
configparser.NoOptionError: No option 'real_dataset_subset_percentage' in section: 'training'
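The traceback says that the `[training]` section of the .cfg file has no `real_dataset_subset_percentage` option, which usually means the config file predates the code reading it. A minimal sketch with the standard-library `configparser` that reproduces the failure and shows two ways around it; the `batch_size` entry and the value `1.0` are assumptions for illustration, not the repository's defaults:

```python
import configparser

# Hypothetical .cfg contents that lack the option named in the traceback.
OLD_CFG = """
[training]
batch_size = 16
"""

parser = configparser.ConfigParser()
parser.read_string(OLD_CFG)

# Reproduce the failure: get() raises NoOptionError for a missing option.
try:
    parser.get("training", "real_dataset_subset_percentage")
    raised = False
except configparser.NoOptionError:
    raised = True

# Option 1: read with a fallback (1.0 is an assumed default, not the repo's).
percentage = parser.getfloat("training", "real_dataset_subset_percentage", fallback=1.0)

# Option 2: add the option so that unmodified reading code finds it.
parser.set("training", "real_dataset_subset_percentage", "1.0")
patched = float(parser.get("training", "real_dataset_subset_percentage"))

print(raised, percentage, patched)
```

The second option corresponds to adding a `real_dataset_subset_percentage = ...` line under `[training]` in the .cfg file, which leaves `data.py` untouched.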

Where is the librispeech and the synthesized data?

I think to run many of these experiments we need to know where the pre-trained librispeech model is, no? I don't obviously see it, can you help?

Also where is the synthesized data you used?

Thanks
Michael

Questions about seq2seq model

Hi, thank you for sharing the code and knowledge via the 2 papers! My team and I have been trying to implement your E2E SLU models (both seq2seq and non-seq2seq) and have some questions we would like to ask.

  1. Randomness in results
    Even when we run the same models with the same parameters, the results are not always the same; accuracy differs by 1-5% between runs. We found the random seed in your code and in the cfg files, so we are wondering whether this variation in accuracy is expected.

  2. Cross-validation
    We would like to know how you did cross-validation for the SNIPS dataset. We found five folds in the SNIPS data, so we combined fold_0 to fold_3 as the training data and split fold_4 in half, using the first half as the validation set and the second half as the test set. We repeated this procedure for each of the 5 folds (fold_1 to fold_4 as training, half of fold_0 as validation and the other half as test, and so on). We just want to make sure this matches what you did.

  3. Accuracy calculation
    In the 2nd paper, you mention the “best test accuracy”, and we would like to ask for clarification. Is it possible that it refers to the best validation accuracy achieved during the 40 epochs for each fold, averaged over the 5 folds? Also, as far as we understand, if a model is able to overfit, it is better to choose a more conservative measure than the “best” one.
    Excerpt from the paper -- “The model is able to overfit the dataset without synthetic speakers; we therefore record the best test accuracy achieved over the course of training for each fold, instead of the final test accuracy.”

  4. Saving model state
    I notice that the code in main.py does not save model states based on validation performance but based on epochs. Does this mean that the model run on the test set is the model saved on the last training epoch (even if it’s not the best one)?
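The fold rotation from item 2 and the more conservative measure hinted at in items 3 and 4 can be sketched as follows. This follows the procedure as described in this post and is not a confirmation of what the paper's authors did; `rotate_folds`, `accuracy_at_best_valid`, and all the numbers are made up for illustration:

```python
def rotate_folds(folds):
    """Yield (train, valid, test) for each held-out fold.

    The held-out fold is split in half: the first half is used for
    validation, the second half for test; the other folds form the training set.
    """
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        half = len(held_out) // 2
        yield train, held_out[:half], held_out[half:]


def accuracy_at_best_valid(history):
    """Return the test accuracy from the epoch with the highest validation accuracy.

    `history` is a list of (valid_accuracy, test_accuracy) pairs, one per epoch.
    """
    _, test_at_best = max(history, key=lambda pair: pair[0])
    return test_at_best


# Toy data: 5 folds of 4 utterance ids each (made-up values).
folds = [[f"fold{i}_utt{k}" for k in range(4)] for i in range(5)]
splits = list(rotate_folds(folds))

# Made-up per-epoch accuracies for one fold.
history = [(0.70, 0.68), (0.82, 0.80), (0.79, 0.83)]
print(accuracy_at_best_valid(history))      # test accuracy at the best validation epoch
print(max(test for _, test in history))     # best test accuracy over all epochs
```

Selecting the epoch by validation accuracy never looks at the test set when choosing, which is why it tends to give a lower and more conservative number than the best test accuracy over all epochs.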

FileNotFoundError on 'fluent_speech_commands_dataset/data/synthetic_data.csv'

Hi there,

I tried to reproduce the results in the paper "Speech Model Pre-training for End-to-End Spoken Language Understanding". However, when I trained the model on an SLU dataset with the following command:

python main.py --train --config_path=<path to .cfg>

it fails with:
FileNotFoundError: [Errno 2] No such file or directory: '/export/corpora6/fluent_speech_commands_dataset/data/synthetic_data.csv'

I checked the SLU dataset on our server, and the data folder contains four files: train_data.csv, valid_data.csv, test_data.csv and speaker_demographics.csv. I am not sure where to find synthetic_data.csv or the exact dataset used for the paper. Could you please help me find it so that I can reproduce the results?
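A quick way to see which CSV files are absent before starting a training run is to check the data folder up front. A minimal sketch using `pathlib`; the file names come from this issue, while the exact set the training code needs and the helper `missing_files` are assumptions:

```python
from pathlib import Path

# File names taken from this issue; the set the training code really needs is an assumption.
EXPECTED = ["train_data.csv", "valid_data.csv", "test_data.csv", "synthetic_data.csv"]


def missing_files(data_dir, names=EXPECTED):
    """Return the names from `names` that do not exist as files in `data_dir`."""
    data_dir = Path(data_dir)
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location; replace with the real dataset folder.
    print(missing_files("fluent_speech_commands_dataset/data"))
```

With the four files listed above, this would report only `synthetic_data.csv` as missing, which matches the FileNotFoundError.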

Many thanks in advance.

Kind regards,
Tianyu Cao
