Comments (4)
Hello @pranoot, I'm glad that my code is helpful to you.
In fact, this error is caused by one-hot encoding. It has been solved by adding np.argmax(labels, axis=1). Contact me if you still have questions.
Thanks for your report.
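For anyone hitting the same error, here is a minimal sketch of what that call does. The label values and shapes below are made up for illustration and are not taken from the repository.

```python
import numpy as np

# Hypothetical one-hot labels: 3 samples, 4 speakers (illustrative values only).
labels = np.array([
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
])

# argmax along axis=1 returns, for each row, the column index of the 1,
# turning one-hot rows into integer class ids.
class_ids = np.argmax(labels, axis=1)
print(class_ids)  # [2 0 3]
```

This is the form that losses and metrics expecting integer class ids usually want, as opposed to one-hot vectors.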
from speaker-recognition-papers.
Hello @vzxxbacq,
Thanks for the reply and the fix. Both were helpful, and the script now runs without any bugs.
Now I am trying to train the same model on my own dataset.
I have a few questions. Could you please help me with them?
- For preprocessing and feature extraction in Deep Speaker, I used ext_fbank_feature as specified in the paper. It has a SLIDE_WINDOW parameter. Is it required, and if so, what shape should it have?
- The shape of the input placeholder is (None, 100, 64, 1). Is 100 the number of frames in an individual clip, and 64 the number of f-bank coefficients?
- What should the duration of each training clip be?
- Should the clips contain silence along with the utterances, or should they have no silence at all?
Thank you very much! :)
Hi @pranoot,
Actually, once you understand the slide_window function, all of these questions are answered. In the paper, the author uses fixed-length audio (10ms), but we have many datasets with variable-length audio, so I wrote slide_window to apply our model to those datasets. The slide_window parameter is a list [l, r], and the function does feature[i] = feature[i-l : i+r]. So 100 means feature[i] = feature[i-49 : i+50], that is, slide_window=[49, 50]. And 64 is the feature dimension used in the paper.
I didn't write a VAD method; I used another toolkit for that part.
If you still have questions, feel free to contact me.
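To make the windowing concrete, here is a rough sketch of the idea, not the repository's actual implementation. The function name slide_window_sketch, the edge padding at the start and end of the clip, and the reading of the range i-49 .. i+50 as inclusive (so each window has 49 + 50 + 1 = 100 frames) are all assumptions made for illustration.

```python
import numpy as np

def slide_window_sketch(feature, left, right):
    """Stack a context window of frames around every frame.

    feature: array of shape (num_frames, feat_dim).
    Returns an array of shape (num_frames, left + right + 1, feat_dim).
    Assumption: the clip is padded by repeating its first and last frame
    so that windows near the edges keep the full width.
    """
    num_frames = feature.shape[0]
    width = left + right + 1
    padded = np.pad(feature, ((left, right), (0, 0)), mode="edge")
    # Window i covers original frames i - left .. i + right (inclusive).
    return np.stack([padded[i:i + width] for i in range(num_frames)])

# Hypothetical clip: 250 frames of 64-dimensional f-bank features.
fbank = np.random.rand(250, 64).astype(np.float32)
windows = slide_window_sketch(fbank, 49, 50)

# Add the trailing channel axis expected by a (None, 100, 64, 1) placeholder.
batch = windows[..., np.newaxis]
print(batch.shape)  # (250, 100, 64, 1)
```

Under these assumptions, a clip of any length yields one 100 x 64 window per frame, which is how variable-length audio can be fed to a fixed-size input.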
Oh cool, got it!
Thanks a lot!! :D