crispengari / speech-to-text-python-ibm_watson

speech-to-text-python-ibm_watson's Introduction

Speech To Text (stt)

This is a simple artificial intelligence application that converts audio to text using ibm_watson.

Application capabilities

This app is capable of:

  • reading an audio file from the audios folder
  • converting the audio to text using ibm_watson SpeechToTextV1()
  • writing the transcript to an external file, speech.txt, in the files folder

Getting started

Installation

First, install ibm_watson:

$ pip install ibm_watson

Second, install PyJWT==1.7.1:

$ pip install PyJWT==1.7.1

Then you are ready to go
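
A quick way to confirm that both packages are importable before running the app is sketched below. The helper name missing_modules is hypothetical, and the sketch assumes the import names are ibm_watson and jwt (the module that the PyJWT distribution installs).

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# "jwt" is the import name provided by the PyJWT distribution.
missing = missing_modules(["ibm_watson", "jwt"])
print("Missing:", ", ".join(missing) if missing else "none")
```

If the list is not empty, rerun the pip commands above for the missing packages.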

Getting an API key and service URL

To get the API key and service URL, go to IBM Watson:

  • Create an account, or log in if you already have one
  • Go to service
  • Go to AI
  • Look for Speech To Text and click it
  • Create a new project
  • Look for the API key in the docs

Importing packages
from ibm_watson import SpeechToTextV1, ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import json
Key variables
url = "URL"
api_key = "API_KEY"

Setting the authentication
try:
    auth = IAMAuthenticator(api_key)
    stt = SpeechToTextV1(authenticator=auth)
    stt.set_service_url(url)
except ApiException as e:
    print(e)
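
Rather than hardcoding them in the script, the key and URL passed to IAMAuthenticator and set_service_url could be read from environment variables. The following is a minimal sketch, not part of the original app; the function name load_credentials and the variable names WATSON_STT_API_KEY and WATSON_STT_URL are assumptions made for illustration.

```python
import os


def load_credentials(env=None):
    """Read the API key and service URL from environment variables.

    WATSON_STT_API_KEY and WATSON_STT_URL are hypothetical names chosen
    for this sketch; they are not defined by ibm_watson.
    """
    env = os.environ if env is None else env
    api_key = env.get("WATSON_STT_API_KEY")
    url = env.get("WATSON_STT_URL")
    if not api_key or not url:
        raise RuntimeError("Set WATSON_STT_API_KEY and WATSON_STT_URL first")
    return api_key, url
```

With both variables set, api_key, url = load_credentials() yields the two values used in the authentication step, and the credentials stay out of version control.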
Converting audio to text
with open("audios/long.mp3", "rb") as audio:
    res = stt.recognize(audio=audio, content_type="audio/mp3", model="en-AU_NarrowbandModel", continuous=True).get_result()
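
The content_type argument has to match the audio file that is opened. As a sketch, a small helper could derive it from the file extension. The helper name guess_content_type and the mapping below are assumptions for illustration; check the Speech To Text docs for the formats the service accepts.

```python
from pathlib import Path

# Extension-to-content-type map (an assumption for this sketch);
# extend it for other formats you use.
CONTENT_TYPES = {
    ".mp3": "audio/mp3",
    ".wav": "audio/wav",
    ".flac": "audio/flac",
    ".ogg": "audio/ogg",
}


def guess_content_type(path):
    """Return the content type for an audio file based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported audio extension: {suffix!r}")
    return CONTENT_TYPES[suffix]
```

The result can then be passed to recognize in place of the literal string, for example content_type=guess_content_type("audios/long.mp3").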
Writing the transcript to a text file
sentences = res["results"]
sentence_list = []
for sentence in sentences:
    # adding a sentence with confidence that is greater than 50%
    sentence_list.append(str(sentence["alternatives"][0]["transcript"]).strip() if sentence["alternatives"][0]["confidence"] > 0.5 else "")

# print(json.dumps(sentence_list, indent=2))

with open("files/speech.txt", "w") as writer:
    for line in sentence_list:
        if line == "%HESITATION":
            writer.write(",")
        else:
            writer.write(line + " ")

print("DONE")
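
The filtering step can also be written as a small function, which makes it easy to try out without calling the service. This is a sketch under stated assumptions: build_transcript is a hypothetical name, the sample dictionary is hand-written in the shape the code above reads (not real service output), and unlike the code above it skips low-confidence results entirely and does not treat %HESITATION specially.

```python
def build_transcript(response, min_confidence=0.5):
    """Join confident transcripts from a recognize() result into one string."""
    parts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives") or []
        if not alternatives:
            continue
        best = alternatives[0]
        # Skip alternatives at or below the confidence threshold.
        if best.get("confidence", 0.0) <= min_confidence:
            continue
        text = best.get("transcript", "").strip()
        if text:
            parts.append(text)
    return " ".join(parts)


# Hand-written sample in the shape the code above expects (not real output).
sample = {
    "results": [
        {"alternatives": [{"transcript": "hello world ", "confidence": 0.9}]},
        {"alternatives": [{"transcript": "noise ", "confidence": 0.2}]},
        {"alternatives": [{"transcript": "goodbye ", "confidence": 0.8}]},
    ]
}
print(build_transcript(sample))  # prints: hello world goodbye
```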
All the code in one file main.py

# importing packages
from ibm_watson import SpeechToTextV1, ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import json

# service credentials
url = "URL"
api_key = "API_KEY"

# Setting the authentication
try:
    auth = IAMAuthenticator(api_key)
    stt = SpeechToTextV1(authenticator=auth)
    stt.set_service_url(url)
except ApiException as e:
    print(e)

# converting audio to text
with open("audios/long.mp3", "rb") as audio:
    res = stt.recognize(audio=audio, content_type="audio/mp3", model="en-AU_NarrowbandModel", continuous=True).get_result()


"""
* The response contains a Python list of results
* We loop through them and collect the transcripts
"""

sentences = res["results"]
sentence_list = []
for sentence in sentences:
    # adding a sentence with confidence that is greater than 50%
    sentence_list.append(str(sentence["alternatives"][0]["transcript"]).strip() if sentence["alternatives"][0]["confidence"] > 0.5 else "")

# print(json.dumps(sentence_list, indent=2))

with open("files/speech.txt", "w") as writer:
    for line in sentence_list:
        if line == "%HESITATION":
            writer.write(",")
        else:
            writer.write(line + " ")

print("DONE")
Changes

There is a list of models, and you can change the code depending on what you want to achieve.

  • Models URL HERE
  • Speech To Text Docs HERE
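
To pick a different model, it can help to list the names available for a language. The helper below is a sketch that filters a models dictionary; model_names is a hypothetical name, and the sample data is hand-written, assuming each entry carries "name" and "language" keys.

```python
def model_names(models_result, language=None):
    """Return sorted model names, optionally filtered by language code."""
    names = [
        model["name"]
        for model in models_result.get("models", [])
        if language is None or model.get("language") == language
    ]
    return sorted(names)


# Hand-written sample, assuming each entry has "name" and "language" keys.
sample = {
    "models": [
        {"name": "en-US_BroadbandModel", "language": "en-US"},
        {"name": "en-AU_NarrowbandModel", "language": "en-AU"},
        {"name": "en-AU_BroadbandModel", "language": "en-AU"},
    ]
}
print(model_names(sample, language="en-AU"))
# prints: ['en-AU_BroadbandModel', 'en-AU_NarrowbandModel']
```

With a configured client, stt.list_models().get_result() should return a dictionary of this general shape, which can be passed to model_names in place of the sample.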

Why this simple application?

This program was built for practical purposes.

Credits:
