It would be nice to also have some kind of leaderboard associated with each of the listed datasets.
One (bad) way of doing it would be to use numbers reported in papers. It is bad for several reasons:
- one cannot be sure the exact same evaluation protocol was used
- one cannot be sure the exact same metric was used
A better way of doing it would be to ask authors to provide their output files and run the evaluation for them (possibly automatically, on each pull request), though this would not solve the first problem. We could use pyannote.metrics for that, as in the sketch below.
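A minimal sketch of what that evaluation could look like, assuming submissions come as diarization output that we load into pyannote.core annotations; the segments, labels, and collar value here are purely illustrative.

```python
from pyannote.core import Annotation, Segment
from pyannote.metrics.diarization import DiarizationErrorRate

# reference annotation (ground truth for one file)
reference = Annotation()
reference[Segment(0.0, 10.0)] = "spk1"
reference[Segment(10.0, 20.0)] = "spk2"

# hypothesis built from the output file submitted by the authors
hypothesis = Annotation()
hypothesis[Segment(0.0, 12.0)] = "A"
hypothesis[Segment(12.0, 20.0)] = "B"

# the same metric configuration would be applied to every submission
metric = DiarizationErrorRate(collar=0.25, skip_overlap=False)
der = metric(reference, hypothesis)
print(f"DER = {100 * der:.2f}%")
```

Running the same script on every pull request would at least guarantee that the metric and its configuration are identical across submissions, even if the underlying evaluation protocol still depends on how each system was trained and tuned.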
An even better way would be to ask them to provide runnable pre-trained systems and run those for them, but this would require a lot of work, both from the authors and to set up on our side.
A utopian way would be to ask them to provide trainable systems.
Anyway, maybe this is too much to ask, and existing challenges like DIHARD and Albayzin are probably enough...