TextReader

A project that uses the Azure AI Text to Speech service to build a text reader for news sites, Wikipedia, and similar pages.

Intro

What is Azure Speech Service?

The Speech service provides speech to text and text to speech capabilities with a Speech resource. You can transcribe speech to text with high accuracy, produce natural-sounding text to speech voices, translate spoken audio, and use speaker recognition during conversations.

Frontend

The frontend is a Chrome extension containing a popup.html that displays the audio player, plus 3 layers of scripts:

  i. content.js - collects the article text from the page.
  ii. background.js - service worker that handles messages from content.js and popup.js.
  iii. popup.js - fetches the audio from the backend and serves the audio file to popup.html.
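The message flow between the three scripts could look roughly like the sketch below. It is not the project's actual code: the message types `ARTICLE_TEXT` and `GET_ARTICLE_TEXT`, the `tabId` field, and the `createMessageRouter` helper are hypothetical names chosen for illustration. The handler is written as a plain function so it can be exercised outside the browser.

```javascript
// background.js (sketch): remember the latest article text for each tab.
// Message shapes and names are hypothetical, not taken from the project.
function createMessageRouter() {
  const textByTab = new Map();

  return function handleMessage(message, sender, sendResponse) {
    if (message.type === 'ARTICLE_TEXT') {
      // Sent by content.js; sender.tab identifies the originating tab.
      textByTab.set(sender.tab.id, message.text);
      sendResponse({ ok: true });
    } else if (message.type === 'GET_ARTICLE_TEXT') {
      // Sent by popup.js, which has no sender.tab, so it passes tabId itself.
      sendResponse({ text: textByTab.get(message.tabId) ?? null });
    } else {
      sendResponse({ error: 'unknown message type' });
    }
  };
}

// In the extension this would be registered once in the service worker:
// chrome.runtime.onMessage.addListener(createMessageRouter());
```

Note that an extension service worker can be suspended by the browser, in which case an in-memory Map like this one is lost; persistent state would need a different store.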

Challenges

  1. Currently popup.js re-triggers the fetch every time the popup is closed and reopened. We need a way to persist the audio for as long as the tab is open, to reduce the number of calls to the backend.
  2. Create a more robust scraper for the frontend. Currently it simply looks for the first article tag.
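One possible direction for both challenges is sketched below, under stated assumptions. `createAudioCache` keeps one audio request per tab so that reopening the popup reuses it. `extractArticleText` tries a few selectors before falling back to the page body. The function names, the `fetchAudio` callback, and the selector list are all hypothetical and not part of the project.

```javascript
// Challenge 1: keep one in-flight or finished audio request per tab.
// fetchAudio is a hypothetical function: (text) => Promise<audio>.
function createAudioCache(fetchAudio) {
  const audioByTab = new Map();
  return {
    get(tabId, text) {
      if (!audioByTab.has(tabId)) {
        const pending = fetchAudio(text);
        // Do not keep failed requests, so a later call can retry.
        pending.catch(() => audioByTab.delete(tabId));
        audioByTab.set(tabId, pending);
      }
      return audioByTab.get(tabId);
    },
    // Call when the tab closes or navigates to a new page.
    clear(tabId) {
      audioByTab.delete(tabId);
    },
  };
}

// Challenge 2: try several selectors instead of only the first <article>.
// The selector list is an illustrative guess.
const CANDIDATE_SELECTORS = ['article', 'main', '[role="main"]', '#content'];

function extractArticleText(root) {
  for (const selector of CANDIDATE_SELECTORS) {
    const element = root.querySelector(selector);
    const text = element ? (element.innerText || '').trim() : '';
    if (text.length > 0) return text;
  }
  // Last resort: the text of the whole page body.
  return root.body ? (root.body.innerText || '').trim() : '';
}
```

In content.js the scraper would be called as `extractArticleText(document)`. Where the cache should live is an open design question: the popup's own script context is destroyed whenever the popup closes, so the cache would have to sit somewhere that outlives it.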

Backend

The backend is built on Express.js to serve requests from the frontend. Currently there is only one service: it accepts text and responds with a Readable stream of WAV audio.

  1. An OPTIONS request to "/audio" handles pre-flight requests from the Chrome extension to prevent CORS errors.

  2. A POST request to "/audio" calls the generateAudioFile function.

i. generateAudioFile creates a new instance of Azure speech configuration:

const speechConfig = sdk.SpeechConfig.fromSubscription(
  process.env.SPEECH_KEY,
  process.env.SPEECH_REGION
);

ii. A new instance of SpeechSynthesizer is created to perform the speech synthesis.

var synthesizer = new sdk.SpeechSynthesizer(speechConfig);

iii. speakTextAsync produces a result whose audioData is an ArrayBuffer. This ArrayBuffer is converted to a Buffer with Buffer.from(arrayBuffer). A Readable stream is then created, the Buffer is pushed to it with stream.push(buffer), and the end of the stream is signalled with stream.push(null).
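The backend steps above can be sketched as follows. This is a minimal illustration, not the project's actual implementation. The helper names (`handlePreflight`, `arrayBufferToStream`, `synthesizeToStream`, `createAudioHandler`, `createSynthesizer`), the JSON request body `{ text }`, and the permissive CORS headers are assumptions. The handlers are plain functions that only rely on the Express `req`/`res` interface.

```javascript
const { Readable } = require('stream');

// OPTIONS /audio: answer the CORS pre-flight request from the extension.
// The allowed origin and headers are assumptions; tighten them for real use.
function handlePreflight(req, res) {
  res.set({
    'Access-Control-Allow-Origin': '*',
    'Access-Control-Allow-Methods': 'POST, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
  });
  res.sendStatus(204);
}

// ArrayBuffer -> Buffer -> Readable stream, as described above.
function arrayBufferToStream(arrayBuffer) {
  const buffer = Buffer.from(arrayBuffer);
  const stream = new Readable({ read() {} }); // data is pushed manually below
  stream.push(buffer);
  stream.push(null); // signal the end of the stream
  return stream;
}

// Wrap the callback-style speakTextAsync in a Promise.
// `synthesizer` is an sdk.SpeechSynthesizer created as in steps i and ii.
function synthesizeToStream(synthesizer, text) {
  return new Promise((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      (result) => {
        synthesizer.close();
        resolve(arrayBufferToStream(result.audioData));
      },
      (error) => {
        synthesizer.close();
        reject(error);
      }
    );
  });
}

// POST /audio: assumes a JSON body { text: "..." } parsed by express.json().
// createSynthesizer is a hypothetical factory that performs steps i and ii.
function createAudioHandler(createSynthesizer) {
  return async function handleAudio(req, res) {
    try {
      const stream = await synthesizeToStream(createSynthesizer(), req.body.text);
      res.set({ 'Access-Control-Allow-Origin': '*', 'Content-Type': 'audio/wav' });
      stream.pipe(res);
    } catch (error) {
      res.status(500).json({ error: String(error) });
    }
  };
}

// Wiring (requires Express):
// app.options('/audio', handlePreflight);
// app.post('/audio', express.json(), createAudioHandler(createSynthesizer));
```

The sketch does not inspect the synthesis result for cancellation or error details; a fuller version would check the result before streaming it to the client.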

Contributors

nandini92

