ishine / leaderboard

This project forked from speechcolab/leaderboard

The largest-ever Automatic Speech Recognition (ASR) leaderboard, which periodically benchmarks SOTA commercial ASR APIs from Alibaba, Baidu, Google, IFlytek, Microsoft, and others.

SpeechColab ASR leaderboard

1. Overview

"If you can’t measure it, you can’t improve it." -- Peter Drucker

Regarding the current state of Automatic Speech Recognition (ASR), the term "State-Of-The-Art" (SOTA) is vague in the sense that:

  • For industry, there is no objective and quantitative benchmark of how commercial APIs perform in real-life scenarios, at least in the public domain.
  • For academia, it is becoming harder to compare ASR models due to the fragmentation of research toolkits and ecosystems.
  • How are academic SOTA and industrial SOTA related?

[Figure: Overview]

As the figure above shows, the SpeechIO leaderboard serves as an ASR benchmarking platform by providing three components:

  1. TestSet Zoo: a collection of test sets covering a wide range of speech recognition scenarios.
  2. Model Zoo: a collection of models, including commercial APIs and open-source pretrained models.
  3. An automated benchmarking pipeline:
    • It defines the simplest possible specification for the recognition interface, the format of input test sets, and the format of output recognition results.
    • As long as model submitters conform to this specification, a fully automated pipeline takes care of the rest (e.g. data preparation -> recognition invocation -> text post-processing -> WER/CER/SER evaluation).
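
The evaluation step at the end of the pipeline can be illustrated with a minimal sketch. This is not the leaderboard's actual scoring code; the function names `edit_distance` and `score` are hypothetical, and the sketch assumes references and hypotheses have already been through text post-processing. It computes a token error rate (WER when splitting on whitespace, CER when comparing characters) together with the sentence error rate (SER).

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion (ref token missing in hyp)
                            curr[j - 1] + 1,      # insertion (extra hyp token)
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score(pairs: List[Tuple[str, str]], unit: str = "word") -> Tuple[float, float]:
    """Return (token error rate, sentence error rate) for (reference, hypothesis) pairs.

    unit="word" splits on whitespace (WER); unit="char" removes whitespace and
    compares single characters (CER), the usual choice for Chinese text.
    Assumes at least one pair and at least one reference token.
    """
    def tokenize(text: str) -> List[str]:
        return text.split() if unit == "word" else list("".join(text.split()))

    errors = ref_tokens = wrong_sentences = 0
    for ref, hyp in pairs:
        r, h = tokenize(ref), tokenize(hyp)
        d = edit_distance(r, h)
        errors += d
        ref_tokens += len(r)
        wrong_sentences += d > 0  # a sentence counts as wrong if it has any error
    return errors / ref_tokens, wrong_sentences / len(pairs)
```

For example, `score([("the cat sat", "the cat sat"), ("hello world", "hello word")])` returns `(0.2, 0.5)`: one substituted word out of five reference words, and one of two sentences containing an error.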

With the SpeechIO leaderboard, anyone can benchmark, reproduce, and compare others' systems on a local machine, as long as those systems are published in the Model Zoo and the test sets are published in the TestSet Zoo.


2. TestSet Zoo

Test Sets From Public Academic Datasets

| Unlocked | TEST_SET_ID | DESCRIPTION | LANGUAGE |
|---|---|---|---|
| | AISHELL1_TEST | test set of AISHELL-1 | zh |
| | AISHELL2_IOS_TEST | test set of AISHELL-2 (iOS channel) | zh |
| | AISHELL2_ANDROID_TEST | test set of AISHELL-2 (Android channel) | zh |
| | AISHELL2_MIC_TEST | test set of AISHELL-2 (Microphone channel) | zh |

SpeechIO Test Sets (ZH)

SpeechIO test sets are carefully curated by the SpeechIO authors. They are crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), cover a variety of well-known acoustic scenarios (AM) and content domains (LM & vocabulary), and are labeled by professional annotators.

| Unlocked | TEST_SET_ID | Name | Scenario | Topic Domain | Hours | Difficulty (1-5) |
|---|---|---|---|---|---|---|
| | SPEECHIO_ASR_ZH00000 | 接入调试集 (for leaderboard submitter debugging) | Video conference & forum speech | Economy, currency, finance | 1.0 | ★★☆ |
| | SPEECHIO_ASR_ZH00001 | 新闻联播 | TV News | News & politics | 9 | |
| | SPEECHIO_ASR_ZH00002 | 鲁豫有约 | TV interview | Celebrity & film & music & daily | 3 | ★★☆ |
| | SPEECHIO_ASR_ZH00003 | 天下足球 | TV program | Sports & Football & World Cup | 2.7 | ★★☆ |
| | SPEECHIO_ASR_ZH00004 | 罗振宇跨年演讲 | Stadium Public Speech | Society & Culture & Business Trend | 2.7 | ★★ |
| | SPEECHIO_ASR_ZH00005 | 李永乐老师在线讲堂 | Online Education | Popular Science | 4.4 | ★★★ |
| | SPEECHIO_ASR_ZH00006 | 张大仙 & *白 王者荣耀直播 | Live Broadcasting | Game | 1.6 | ★★★☆ |
| | SPEECHIO_ASR_ZH00007 | 李佳琪 & 薇娅 直播带货 | Live Broadcasting | Makeup & Online shopping/advertising | 0.9 | ★★★★☆ |
| | SPEECHIO_ASR_ZH00008 | 老罗语录 | Offline lecture | Life & Purpose & Ethics | 1.3 | ★★★★☆ |
| | SPEECHIO_ASR_ZH00009 | 故事FM | Podcast | Ordinary Life Story Telling | 4.5 | ★★☆ |
| | SPEECHIO_ASR_ZH00010 | 创业内幕 | Podcast | Startup & Entrepreneur & Product & Investment | 4.2 | ★★☆ |
| | SPEECHIO_ASR_ZH00011 | 罗翔 刑法法考培训讲座 | Online Education | Law & Lawyer Qualification Exams | 3.4 | ★★☆ |
| | SPEECHIO_ASR_ZH00012 | 张雪峰 考研线上小讲堂 | Online Education | University & Graduate School Entrance Exams | 3.4 | ★★★☆ |
| | SPEECHIO_ASR_ZH00013 | 谷阿莫&牛叔说电影 | VLog | Movie Cuts | 1.8 | ★★★ |
| | SPEECHIO_ASR_ZH00014 | 贫穷料理 & 琼斯爱生活 | VLog | Food & Cooking & Gourmet | 1 | ★★★☆ |
| | SPEECHIO_ASR_ZH00015 | 单田芳 白眉大侠 | Traditional Podcast | Kung Fu Fiction | 2.2 | ★★☆ |
| | SPEECHIO_ASR_ZH00016 | 德云社相声演出 | Theater Crosstalk Show | Funny Stories | 1 | ★★★ |
| | SPEECHIO_ASR_ZH00017 | 吐槽大会 | Standup Comedy | Celebrity Jokes | 1.8 | ★★☆ |
| | SPEECHIO_ASR_ZH00018 | 小猪佩奇 & 熊出没 | Children Cartoon | Fairy Tale | 0.9 | ★☆ |
| | SPEECHIO_ASR_ZH00019 | CCTV5 NBA 比赛转播 | Sports Game Live | NBA Game | 0.7 | ★★★ |
| | SPEECHIO_ASR_ZH00020 | 篮球人物 | Documentary | NBA Super Stars' Life & History | 2.2 | ★★ |
| | SPEECHIO_ASR_ZH00021 | 汽车之家 车辆评测 | VLog | Car benchmarks, Road driving test | 1.7 | ★★★☆ |
| | SPEECHIO_ASR_ZH00022 | 小艾大叔 豪宅带看 | VLog | Real estate, Mansion tour | 1.7 | ★★★ |
| | SPEECHIO_ASR_ZH00023 | 无聊开箱 & Zealer评测 | VLog | Unboxing | 2 | ★★★ |
| | SPEECHIO_ASR_ZH00024 | 付老师种植技术 | VLog | Agriculture, Planting | 2.7 | ★★★☆ |
| | SPEECHIO_ASR_ZH00025 | 石国鹏讲古希腊哲学 | Offline lecture | History, Greek philosophy | 1.3 | ★★☆ |
| | SPEECHIO_ASR_ZH00026 | 张震鬼故事 | Broadcasting Program | Horror Stories | 2.4 | ★★★ |
| | SPEECHIO_ASR_ZH00027 | 华语辩论世界杯 | Debate Contest | Hobby, Skill, Growth | 1.4 | ★★★ |
| | SPEECHIO_ASR_ZH00028 | 时政现场同传 | Simultaneous Translation | News & Events on Public Governance | 2.1 | ★★★☆ |
| | SPEECHIO_ASR_ZH00029 | 港台明星访谈 (周杰伦、曾志伟、张家辉、陈小春、周星驰) | Accent (Hong Kong & Taiwan) | Entertainment, Acting, Music | 1.5 | ★★★☆ |
| | SPEECHIO_ASR_ZH00030 | 世界青年说 | Accent (Foreigner) | Cultural Difference | 2 | ★★★☆ |

To pull an unlocked test set from the cloud into your local dataset zoo at leaderboard/datasets/*:

ops/pull dataset <TEST_SET_ID>
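
To fetch several test sets in one go, the command above can be wrapped in a small script. The following is only a sketch: `pull_command` and `pull_all` are hypothetical helper names, not part of the repository, and it assumes the script is run from the repository root so that `ops/pull` resolves. By default it only prints the commands (dry run) instead of executing them.

```python
import subprocess
from typing import List


def pull_command(kind: str, item_id: str) -> List[str]:
    """Build the argv for `ops/pull <kind> <ID>` as shown in this README."""
    if kind not in ("dataset", "model"):
        raise ValueError(f"unsupported kind: {kind!r}")
    return ["ops/pull", kind, item_id]


def pull_all(kind: str, item_ids: List[str], dry_run: bool = True) -> List[List[str]]:
    """Print (dry run) or execute one pull command per ID; return the commands."""
    commands = [pull_command(kind, item_id) for item_id in item_ids]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Assumption: the current working directory is the repository root.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    pull_all("dataset", ["AISHELL1_TEST", "SPEECHIO_ASR_ZH00000"])
```

Passing `dry_run=False` runs each command and stops at the first failure, because `check=True` raises `subprocess.CalledProcessError` on a non-zero exit code.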

3. Model Zoo

Cloud API Models

API models are usually small (basically client programs), so we normally keep them in this GitHub repo.

| Unlocked | MODEL_ID | Type | Model author/owner | Description | Service URL |
|---|---|---|---|---|---|
| | aispeech_api_zh | Cloud API | AISpeech | AISpeech Open Platform | https://cloud.aispeech.com |
| | aliyun_api_zh | Cloud API | Alibaba | Alibaba Cloud | https://ai.aliyun.com/nls/asr |
| | baidu_pro_api_zh | Cloud API | Baidu | Baidu AI Cloud (express edition) | https://cloud.baidu.com/product/speech/asr |
| deprecated | | Cloud API | IFlyTek | IFlyTek Open Platform (dictation) | https://www.xfyun.cn/services/voicedictation |
| | iflytek_lfasr_api_zh | Cloud API | IFlyTek | IFlyTek Open Platform (transcription) | https://www.xfyun.cn/services/lfasr |
| deprecated | microsoft_rest_api_zh | Cloud API | Microsoft | Azure | https://azure.microsoft.com/zh-cn/services/cognitive-services/speech-services/ |
| | microsoft_sdk_zh | Cloud API | Microsoft | Azure | https://azure.microsoft.com/zh-cn/services/cognitive-services/speech-services/ |
| deprecated | sogou_api_zh | Cloud API | Sogou | AI Open Platform | https://ai.sogou.com/product/one_recognition/ |
| | tencent_api_zh | Cloud API | Tencent | Tencent Cloud | https://cloud.tencent.com/product/asr |
| | yitu_api_zh | Cloud API | YituTech | Yitu Speech Open Platform | https://speech.yitutech.com |

Local Engine (Open-sourced Pretrained ASR Models)

Local models/engines are normally too large for GitHub, so we store them in the cloud.

| Unlocked | MODEL_ID | Type | Model author/owner | Description |
|---|---|---|---|---|
| | speechio_kaldi_multicn | pretrained model | Xingyu NA(那兴宇) | Kaldi multi_cn recipe |
| | wenet_multi_cn | pretrained model | Binbin Zhang(张彬彬)@wenet-e2e | WeNet multi_cn recipe |
| | vosk_model_cn | batteries-included local engine | alphacephei | Chinese engine of Vosk |
| | wenet_wenetspeech | pretrained model | Binbin Zhang(张彬彬)@wenet-e2e | WeNet wenetspeech recipe |

To pull an unlocked model from the cloud into your local model zoo at leaderboard/models/*:

ops/pull model <MODEL_ID>

4. Benchmarking Pipeline

To submit your model to the leaderboard and have it benchmarked over all test sets, follow the "How to submit" specification.

You can also pull unlocked models and test sets, and trigger the benchmarking pipeline on your local machine via:

ops/leaderboard_runner requests/request.yaml

The content of request.yaml is described in the specification above.


5. Latest Leaderboard Report

[Figure: result]


Contacts

Email: [email protected]
