homink / deepspeech.pytorch.ko

License: MIT License

Python 83.86% Shell 7.81% Perl 8.33%

deepspeech.pytorch.ko's People

cyaai akakakakakaa lazy-juns all-iswell somjang jongsix poveteen redlasha nborggren wonwooo steadymingha

deepspeech.pytorch.ko's Issues

nikl.py

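Thank you for sharing this great repo.
Training on the other datasets (an4, ...) runs without problems, but training on the Korean datasets does not proceed at all.
I suspect this is because I'm a beginner.

First, I downloaded the archive files from the National Institute of Korean Language, but I'm not sure exactly which directory they should be extracted into.
For the other datasets, the archives are stored and processed under deepspeech/data, so I put the nikl dataset in deepspeech/data as well. When I then run python nikl.py,

an error like this comes up.
clean_corpus.sh does exist in deepspeech/data/local/, yet an OSError is raised and I don't know why.

Separately, when I extract the archives myself, create a manifest.csv similar to the an4 one, and start training, I also get an OSError saying the wav files cannot be found, even though the directories are specified correctly...
The same error also occurs with the original deepspeech repo.

The other datasets train normally.

I would appreciate any advice.
Thank you.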
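
One way to narrow down a "wav file not found" OSError like the one described above is to check every path in the manifest before training starts. The following is a minimal sketch, not part of the repo: it assumes the manifest is a CSV with one `wav_path,txt_path` pair per line (the layout used for an4 in deepspeech.pytorch), and the function name `find_missing_manifest_entries` is hypothetical.

```python
import csv
import os


def find_missing_manifest_entries(manifest_path):
    """Return (line_number, path) pairs for manifest entries that are not existing files."""
    missing = []
    base_dir = os.path.dirname(os.path.abspath(manifest_path))
    with open(manifest_path, newline="", encoding="utf-8") as handle:
        for line_number, row in enumerate(csv.reader(handle), start=1):
            for entry in row:
                entry = entry.strip()
                if not entry:
                    continue
                # Assumption: relative entries are resolved against the manifest's folder.
                # A training script may instead resolve them against the current
                # working directory, which is a common source of "not found" errors.
                if os.path.isabs(entry):
                    candidate = entry
                else:
                    candidate = os.path.join(base_dir, entry)
                if not os.path.isfile(candidate):
                    missing.append((line_number, entry))
    return missing
```

If the function returns an empty list but training still cannot find the files, the manifest probably contains relative paths that are being resolved from a different working directory; writing absolute paths into the manifest removes that ambiguity.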

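Hello, a question about run.pl

Hello! I'm a researcher trying to do speech recognition with DeepSpeech.

When I run nikl.py,

I keep getting an error related to line 93 of clean_corpus.sh in the local folder.

The error says that ./run.pl does not exist,

so I changed it to local/run.pl, but then the program makes no progress.

I can't work out how the code should be modified, so I'm asking here.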
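
A relative reference such as ./run.pl inside a shell script is resolved against the current working directory of the process, not against the folder that contains the script. The sketch below illustrates that point only; it is not how nikl.py actually invokes clean_corpus.sh, and the helper name `run_script_from_its_directory` and its `shell` parameter are hypothetical.

```python
import os
import subprocess


def run_script_from_its_directory(script_path, *args, shell="bash"):
    """Run a shell script with the working directory set to the script's own folder."""
    script_path = os.path.abspath(script_path)
    script_dir = os.path.dirname(script_path)
    # With cwd set to the script's folder, a reference like ./run.pl inside the
    # script points at a file sitting next to the script.
    return subprocess.run(
        [shell, script_path, *args],
        cwd=script_dir,
        check=False,
        capture_output=True,
        text=True,
    )
```

Inspecting `returncode` and `stderr` on the returned object shows whether the script itself failed. This addresses only the path lookup; it says nothing about why the job stalls after the path is changed to local/run.pl.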

A question about labels

Hello.
The default labels are the English alphabet. Is it correct to switch to label_ko.json when running on Korean data?
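
Whichever label file is used, it has to cover every character that appears in the training transcripts. A quick way to check this is sketched below. It assumes, without having inspected label_ko.json, that the label file is a JSON array of single-character strings, as labels.json is in the upstream deepspeech.pytorch; the function name `find_uncovered_characters` is hypothetical.

```python
import json


def find_uncovered_characters(labels_path, transcripts):
    """Return the sorted characters used in the transcripts but absent from the label file."""
    with open(labels_path, encoding="utf-8") as handle:
        # Assumption: the file holds a JSON list of single-character labels.
        labels = set(json.load(handle))
    seen = set()
    for text in transcripts:
        seen.update(text)
    return sorted(seen - labels)
```

Running this with the English label file against Korean transcripts would list essentially every Hangul syllable as uncovered, which is the situation the question describes.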

usage

I was following the speech.ko repo for preprocessing, and the deepspeech.pytorch repo with preprocess and preparemetafile.
I was wondering whether this repo can be run on its own with only the Korean open data *.zip files (the same data as speech.ko).

After running this repo on just the raw zipped datasets, I hit some $home/corpus and .txt path problems (files not found) simply from running data/nikl.py.
I would like to know the directory structure this repo expects.

With datasets already prepared, I wonder where to pick up in this repo and which steps to skip. Also, do I have to run the original deepspeech.pytorch repo with a Korean frontend process that has Korean-cleaners?
Thanks!

Transcribe.py output problem

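Hello. I'm Kim Mansu from Kyung Hee University.

I'm interested in STT and found this repo while searching, so I tried training with it.

I installed the required packages and completed training, and deepspeech_final.pth.tar was generated successfully.

After that, when I ran transcribe.py, I got the following output: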

{"output": [{"transcription": " "}], "_meta": {"language_model": {"name": null}, "acoustic_model": {"version": "0.0.1", "hidden_layers": 5, "name": "deepspeech_final.pth.tar", "hidden_size": 800, "rnn_type": "gru"}, "decoder": {"alpha": null, "beta": null, "lm": false, "type": "greedy"}}}

I can't work out why this happens, so I'm posting the question here.
I would appreciate a reply.
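
For context on the `"type": "greedy"` decoder reported in the output above, greedy CTC decoding takes the most likely label for each frame, collapses consecutive repeats, and drops the blank label. The sketch below is a plain-Python illustration of that procedure, not the repo's decoder; the function name, the example label list, and the choice of index 0 as the blank are assumptions.

```python
def greedy_ctc_decode(frame_label_indices, labels, blank_index=0):
    """Collapse repeated per-frame labels and drop blanks, as in greedy CTC decoding."""
    decoded = []
    previous = None
    for index in frame_label_indices:
        # Emit a character only when the label changes and is not the blank.
        if index != previous and index != blank_index:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


# Hypothetical label set: index 0 is the CTC blank, index 3 is the space character.
example_labels = ["_", "a", "b", " "]
print(repr(greedy_ctc_decode([1, 1, 0, 1, 2, 2, 0], example_labels)))
print(repr(greedy_ctc_decode([0, 0, 0, 0], example_labels)))
print(repr(greedy_ctc_decode([3, 3, 3, 3], example_labels)))
```

A transcription that is empty or a single space is what this procedure produces when the per-frame prediction is almost always the blank or the space label. The sketch does not identify why a trained model would behave that way; a mismatch between the label file used for training and the characters in the transcripts is one thing worth ruling out.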

download urls in readme outdated

Hi,

Thank you for all the work. It's hard to find data in Korean.
I'm glad I found this on GitHub; I'm very interested in using it for self-education.
Unfortunately, the links seem to be dead or outdated:

Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language

http://www.korean.go.kr/front/board/boardStandardView.do?board_id=4&mn_id=17&b_seq=464

https://ithub.korean.go.kr/user/corpus/referenceManager.do
