Open Speaker Verification

News

[11/24/2020] Upcoming major updates will focus on the VoxSRC2020 tech reports. Training scripts for large-scale identity sets (1M+ identities) and more model benchmarks will be added later :)

[10/25/2020] I have released my baseline system, ResNet34-AM-VoxCeleb2, built on mmclassification. Thanks to mmclassification, I can use all of the functions and modules it provides without paying much attention to the training process.

Introduction

This repo is intended to provide an open speaker verification tool. Currently the project provides only the training and extraction processes. More fundamental functions such as feature extraction, post-processing, scoring backends, and augmentation research will be added later. The project is based on the mmclassification codebase; please refer to the mmclassification README for installation and running scripts. The code is tested with PyTorch 1.6.0 and CUDA 10.2.

NOTE: The pretrained model was saved with PyTorch 1.6.0, so if you are using an older version, you may need to upgrade to PyTorch 1.6.0+ to load the released model.

Data preparation

All features used in our training framework are extracted with Kaldi and cepstral mean normalized (CMN); no VAD is applied.
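
For readers unfamiliar with CMN, the idea can be sketched in a few lines of plain Python. This is only an illustration, assuming the mean is taken over the whole utterance (Kaldi also offers sliding-window variants, and this README does not say which one was used); the function name `cmn` is hypothetical and not part of this repo.

```python
def cmn(frames):
    """Subtract the per-dimension mean over time from every frame.

    `frames` is a list of feature vectors (one list of floats per frame).
    Assumption: per-utterance mean, not a sliding window.
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    # Mean of each feature dimension across all frames of the utterance.
    means = [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Toy utterance: 3 frames, 2 feature dimensions.
utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cmn(utterance)
```

After normalization every feature dimension has zero mean over the utterance, which removes constant channel offsets from the features.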

Preparation scripts will be released soon.

Attribute

dataset

Instructions for converting Kaldi-format files to vaex format (replacing the normal speaker dataset with the vaex-based speaker dataset is highly recommended):

python tools/python/feats2csv.py --feat_scp YOUR_FEAT --utt2spk YOUR_UTT2SPK --out_dir YOUR_DEST --data_type train

data_type can also be valid, test, or any other name you like.
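
To make the two inputs more concrete: a Kaldi feats.scp maps each utterance ID to a feature location, and utt2spk maps each utterance ID to a speaker ID. The sketch below joins the two into CSV rows with an integer class label per speaker. It is not the code in tools/python/feats2csv.py; the column names (`utt`, `feat`, `speaker`, `label`) and helper names are assumptions made up for illustration, and the real script's output layout may differ.

```python
import csv
import io


def read_kaldi_table(lines):
    """Parse Kaldi-style '<key> <value>' lines into a dict."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, value = line.split(maxsplit=1)
        table[key] = value
    return table


def join_feats_and_speakers(feat_scp_lines, utt2spk_lines):
    """Build one row per utterance that has both a feature entry and a speaker."""
    feats = read_kaldi_table(feat_scp_lines)
    utt2spk = read_kaldi_table(utt2spk_lines)
    # Hypothetical labeling scheme: speakers sorted by ID, numbered from 0.
    spk2label = {spk: idx for idx, spk in enumerate(sorted(set(utt2spk.values())))}
    rows = []
    for utt, feat_path in feats.items():
        if utt not in utt2spk:
            continue  # skip utterances without a speaker entry
        spk = utt2spk[utt]
        rows.append({"utt": utt, "feat": feat_path, "speaker": spk, "label": spk2label[spk]})
    return rows


# Toy inputs in Kaldi's text format (paths and IDs are made up).
feat_scp = ["id001-utt1 feats.ark:12", "id002-utt1 feats.ark:3456"]
utt2spk = ["id001-utt1 id001", "id002-utt1 id002"]
rows = join_feats_and_speakers(feat_scp, utt2spk)

# Write the rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["utt", "feat", "speaker", "label"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```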

pipeline

backbone

Pooling Method

Metric

Released Model Benchmark

NOTE: The test set is Vox1-O (cleaned) and the training set is VoxCeleb2-dev. The backend is cosine similarity scoring. The minDCF criterion is the same as in VoxSRC2020.
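
As a rough illustration of the scoring described in the note, the sketch below computes a cosine score between two embeddings and then derives EER and a normalized minDCF from a list of trial scores. It is written in plain Python for readability and is not this repo's scoring code. The cost parameters (`p_target=0.05`, `c_miss=1`, `c_fa=1`) and the normalization by the default cost are assumptions about the VoxSRC2020 setting; please check the challenge description before relying on them.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def error_rates(scores, labels, threshold):
    """Return (miss rate, false-alarm rate) when accepting scores >= threshold."""
    targets = [s for s, l in zip(scores, labels) if l == 1]
    nontargets = [s for s, l in zip(scores, labels) if l == 0]
    p_miss = sum(s < threshold for s in targets) / len(targets)
    p_fa = sum(s >= threshold for s in nontargets) / len(nontargets)
    return p_miss, p_fa


def eer_and_min_dcf(scores, labels, p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Sweep all candidate thresholds; simple O(n^2) version for clarity."""
    thresholds = sorted(set(scores)) + [max(scores) + 1.0]
    default_cost = min(c_miss * p_target, c_fa * (1.0 - p_target))
    best_gap, eer, min_dcf = float("inf"), None, float("inf")
    for t in thresholds:
        p_miss, p_fa = error_rates(scores, labels, t)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # EER is where miss and false-alarm rates are (closest to) equal.
            best_gap, eer = gap, (p_miss + p_fa) / 2.0
        dcf = (c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)) / default_cost
        min_dcf = min(min_dcf, dcf)
    return eer, min_dcf


# Toy trial list: label 1 = same speaker, 0 = different speakers.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
eer, min_dcf = eer_and_min_dcf(scores, labels)
```

In practice each score would come from `cosine_score` applied to the two embeddings of a trial pair, and the EER would be multiplied by 100 to report it as a percentage.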

| Model | Backbone | Metric | Feature | Batch size | Config | Raw EER (%) on Vox1-O (cleaned) | Raw minDCF | Raw EER (%) on Vox1-H | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet34-AM-VoxCeleb2 | ResNet34 | AMSoftmax, scale=30, margin=0.2 | 81 FBANK (including energy) | 128 | conf | 1.207 | 0.0738 | 2.44 | ckpt |
| ResNet34-AM-VoxCeleb2-syncBN | ResNet34 | AMSoftmax, scale=30, margin=0.2 | 81 FBANK (including energy) | 128 | conf | 1.196 | 0.0791 | - | ckpt |
| SEResNet34-AM-VoxCeleb2 | SEResNet34 | AMSoftmax, scale=30, margin=0.2 | 81 FBANK (including energy) | 100 | conf | 1.121 | 0.0771 | 2.43 | ckpt |
| SEResNet34-AM-VoxCeleb2-syncBN | SEResNet34 | AMSoftmax, scale=30, margin=0.2 | 81 FBANK (including energy) | 100 | conf | 1.175 | 0.0745 | - | released soon |
| SEResNet34-AM-VoxCeleb2 (with checkpointed) | SEResNet34 | AMSoftmax, scale=30, margin=0.2 | 81 FBANK (including energy) | 128 | conf | 1.07 | 0.0747 | 2.43 | - |
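
All models above are trained with AMSoftmax using scale=30 and margin=0.2. As a rough illustration of what those two numbers control, here is a sketch of the additive-margin softmax loss for a single training example in plain Python. It is not the repo's implementation (training runs through mmclassification on PyTorch), and the function names are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def am_softmax_loss(embedding, class_weights, target, scale=30.0, margin=0.2):
    """Additive-margin softmax loss for one example.

    The logit of each class is the cosine between the normalized embedding
    and the normalized class weight; the margin is subtracted from the
    target-class cosine only, then all logits are multiplied by the scale.
    """
    emb = l2_normalize(embedding)
    cosines = [sum(a * b for a, b in zip(emb, l2_normalize(w))) for w in class_weights]
    logits = [
        scale * (cos - margin) if idx == target else scale * cos
        for idx, cos in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp for the cross-entropy.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


# Toy example: 2-dimensional embedding, 2 speaker classes.
embedding = [1.0, 0.2]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
loss_with_margin = am_softmax_loss(embedding, class_weights, target=0)
loss_without_margin = am_softmax_loss(embedding, class_weights, target=0, margin=0.0)
```

The margin makes the target class harder to satisfy, so the loss with margin is larger than without it for the same embedding; this pushes embeddings of the same speaker closer together in angle.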
