olaviinha / neuraldialogueaudiolizer

Jupyter notebook for turning textual dialogue into voice audio.

Neural Dialogue Audiolizer

Neural Dialogue Audiolizer is a ".txt to .wav converter" that turns textual dialogue (e.g. an interview, a chat) between two individuals into audio dialogue with two freely selectable voices, currently by using any of the following APIs: Google Cloud TTS, Amazon Polly TTS or Microsoft Azure TTS.

It was made to run in Google Colaboratory (i.e. your browser), using your Google Drive as data source and storage.

Audio demos

Source text      | Google Cloud TTS | Amazon Polly TTS | Microsoft Azure TTS
gpt-3_chat-1.txt | WAV (loser)      | WAV              | WAV (winner)

API access

Valid access keys are required to use any of the provided TTS APIs. More information on obtaining access:

Note that neural voices are available only in specific regions in all of these services. Select location accordingly when enabling the service/API where necessary.

Note that costs may apply. At the time of writing, to the best of my knowledge, creating an account for any of these services and limited monthly usage of these TTS APIs are free of charge, even if billing/credit card information is required upon registration. You should also be aware that each line in each text file you audiolize consumes one TTS API call. TODO: consume only 2 API calls and slice+merge returned audio files in Colab.
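Since each line costs one API call, it can help to count lines before audiolizing a file. The following is only a minimal sketch, not code from the notebook: the helper name `count_tts_calls` is hypothetical, and it assumes that empty separator lines are not sent to the API.

```python
def count_tts_calls(dialogue_text: str) -> int:
    """Estimate the number of TTS API calls for a dialogue.

    Assumption: one call per non-empty line; empty lines are skipped.
    """
    return sum(1 for line in dialogue_text.splitlines() if line.strip())


# Three spoken lines separated by empty lines.
sample = "How are you?\n\nFine, thanks.\n\nGood to hear."
print(count_tts_calls(sample))
```

Compare the result against the free monthly quota of the service you plan to use.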

Input text

Input should be the path to a .txt file located in your Google Drive, containing the dialogue in one of the following formats, with no other text. If your input material is a copy-paste from the interwebs, make sure to clean it up first so that it strictly follows one of these formats.

  1. question_and_answer expects an empty line every time the speaker changes. See example
  2. dialogue_with_names expects Name: (e.g. John: Hello Bob! How are you?) every time the speaker changes. The speaker is switched regardless of the name at the beginning, i.e. if there are two consecutive lines beginning with John:, the notebook will still interpret the second as Bob, and your result is messed up. This will be improved in the distant future, perhaps. See example
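The two formats above can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's actual parser: the function names and the `NAME_PREFIX` pattern are hypothetical, speakers are numbered 1 and 2, and a line without a `Name:` prefix is assumed to continue the previous turn.

```python
import re

# Hypothetical pattern for a leading "Name:" prefix (up to 40 characters).
NAME_PREFIX = re.compile(r"^(\w[\w .'-]{0,39}):\s*(.*)$")


def parse_question_and_answer(text):
    """Split on empty lines; speakers alternate, starting with speaker 1."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    return [(i % 2 + 1, " ".join(b.split())) for i, b in enumerate(blocks)]


def parse_dialogue_with_names(text):
    """Start a new turn at every 'Name:' line and alternate speakers.

    The name itself is ignored, which mirrors the caveat described above:
    two consecutive 'John:' lines still end up with different speakers.
    """
    turns = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = NAME_PREFIX.match(line)
        if match:
            turns.append((len(turns) % 2 + 1, match.group(2)))
        elif turns:
            # No name prefix: treat the line as a continuation of the turn.
            speaker, spoken = turns[-1]
            turns[-1] = (speaker, spoken + " " + line)
    return turns


print(parse_question_and_answer("How are you?\n\nFine, thanks."))
print(parse_dialogue_with_names("John: Hello Bob! How are you?\nBob: Fine."))
```

Because speakers simply alternate, a file with a missing empty line or a repeated name shifts every later line to the wrong voice, which is why the input needs to be cleaned up first.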

Languages

This notebook has only English and Finnish voices by default. To add other languages, add the corresponding voice names to the p1_voice and p2_voice menus from the Google Cloud TTS voice list, Amazon Polly TTS voice list or Microsoft Azure TTS voice list.
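In Colab, dropdown menus like these are typically declared with `#@param` form annotations, so adding a voice usually means appending its name to the option list. The snippet below is a hedged sketch and not the notebook's actual cell: the option lists are illustrative Google Cloud TTS voice names, and the helper `language_code_of` is hypothetical.

```python
# Hypothetical form fields; the real notebook's option lists may differ.
# "de-DE-Wavenet-B" stands in for a newly added non-default language.
p1_voice = "en-US-Wavenet-D"  #@param ["en-US-Wavenet-D", "fi-FI-Wavenet-A", "de-DE-Wavenet-B"]
p2_voice = "en-US-Wavenet-F"  #@param ["en-US-Wavenet-F", "fi-FI-Wavenet-A", "de-DE-Wavenet-B"]


def language_code_of(voice_name: str) -> str:
    """Derive a code such as 'de-DE' from a Google-style voice name.

    Assumption: the voice name begins with '<language>-<region>-'.
    """
    return "-".join(voice_name.split("-")[:2])


print(language_code_of(p1_voice))
```

Amazon Polly and Microsoft Azure name their voices differently, so a helper like this would need a per-service variant.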


⇨ Run NeuralDialogueAudiolizer.ipynb
