twerkmeister / ilid

automatic spoken language identification

License: MIT License

Languages: Python 60.43%, JavaScript 17.37%, Shell 16.91%, Gnuplot 3.76%, HTML 0.83%, CSS 0.70%

ilid's Issues

IndexError in evaluation/predict.py (classifier.py)

Hello twerkmeister, sorry if this is trivial; it is my first experience with Caffe. I succeeded in preprocessing the audio files and trained the Berlin_net model on them, but at the evaluation step I get:

Traceback (most recent call last):
  File "predict.py", line 47, in <module>
    predict(args.input, args.proto, args.model, args.output)
  File "predict.py", line 20, in predict
    raw_scale=255 # convert 0..255 values into range 0..1
  File "/home/sylvain/caffe/python/caffe/classifier.py", line 29, in __init__
    in_ = self.inputs[0]
IndexError: list index out of range

I have attached the complete log. I would appreciate any help. Thanks.
output.log
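
A possible cause, offered as an assumption because the attached log is not reproduced here: caffe's classifier.py reads `self.inputs[0]`, and that list is empty when the prototxt passed to predict.py declares no net-level input, as is typical for a training definition that reads from data layers. The sketch below is a plain-text heuristic, not a protobuf parser, for checking which inputs a prototxt declares. The function name `declared_inputs` and the sample net text are hypothetical.

```python
import re


def declared_inputs(prototxt_text):
    """Return the input blob names that a Caffe net definition declares.

    Heuristic only: scans the text for legacy top-level `input: "name"`
    fields and for layers whose type is "Input".
    """
    # Legacy deploy files declare inputs as top-level fields: input: "data"
    names = re.findall(r'^\s*input\s*:\s*"([^"]+)"', prototxt_text,
                       flags=re.MULTILINE)
    # Newer deploy files use a layer of type "Input"; its `top` blobs are the inputs.
    chunks = re.split(r'\blayers?\s*\{', prototxt_text)[1:]
    for chunk in chunks:
        if re.search(r'\btype\s*:\s*"Input"', chunk):
            names.extend(re.findall(r'\btop\s*:\s*"([^"]+)"', chunk))
    return names


# Hypothetical training-style definition: data comes from a Data layer.
train_net = '''
name: "example_net"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
}
'''

# Hypothetical deploy-style definition: the net declares its own input.
deploy_net = '''
name: "example_net"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 39
input_dim: 600
'''

print(declared_inputs(train_net))   # no declared inputs
print(declared_inputs(deploy_net))  # one declared input
```

If the check returns an empty list for the file given as the proto argument, a deploy version of the network definition is probably what the evaluation step expects.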

Is there a reported accuracy and latency for predicting a wav file?

Could someone tell me the accuracy of iLID and the latency for decoding a 10-second wav file?
I would like to have these figures as a reference.
Regards,
Luke

What does the training data look like?

Hi, the link to the training data repo is dead; could you fix that? I can't run train.py because I don't know what is in trainingData.csv.

Is there other advice for preprocessing audio files?

Hi, twerkmeister. I've been following your excellent work on iLID recently. The approach performs well when tested on a dataset of clean audio recordings. However, when tested on audio recorded in natural environments, it does not perform as well. In your project, I've seen the loudness normalization step. Is there any other advice for preprocessing the audio to make it cleaner?

Many thanks.
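
For readers unfamiliar with the loudness normalization mentioned above, here is a minimal sketch of one common variant, RMS normalization to a target level. It is not iLID's actual implementation; the function names, the target of -20 dBFS, and the assumption that samples are floats in the range -1..1 are all illustrative. Note that this only equalizes level and does nothing to remove background noise.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_dbfs=-20.0):
    """Scale samples so their RMS level matches target_dbfs.

    Samples are assumed to lie in -1..1; values pushed outside that
    range by the gain are clipped.
    """
    current = rms(samples)
    if current == 0.0:
        # Pure silence: there is nothing to scale.
        return list(samples)
    target = 10 ** (target_dbfs / 20.0)  # convert dBFS to linear amplitude
    gain = target / current
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# Hypothetical input: a loud square-like signal with an RMS of 0.5.
loud = [0.5, -0.5, 0.5, -0.5]
quiet = normalize_rms(loud)
print(rms(loud), rms(quiet))
```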

ilid's Issues

ValueError: need more than 0 values to unpack

umlaut

How are silences treated?
