migushthe2nd / msedgetts


A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API

Home Page: https://migushthe2nd.github.io/MsEdgeTTS/

License: MIT License

TypeScript 100.00%

free-tts speech speech-synthesis speech-to-text text-to-speech tts

msedgetts's Introduction

My Projects

JeatApp

Everything your group needs, in one app!

ThemezerNX

All kinds of theming tools for the Nintendo Switch

And some more:

MsEdgeTTS
An Azure Speech Service module that uses the Microsoft Edge Read Aloud API.
NxSysupdate
Detects, downloads, and runs hooks when a new Nintendo Switch system update has been detected. Posts output to a Discord webhook.

Of which archived:

Larry
A simple Discord bot that interacts with the OpenAI GPT-3 API.
Lacu
A bot for Discord similar to music bots, but with video support on an external website.
filmkever
A movie streaming project that indexes videos from certain external websites.

msedgetts's Issues

support proxy

I would like to connect to the Edge TTS service through a proxy. Would you consider adding a proxy option?
Does this library support being imported as a CommonJS module?

No browser support?

First of all, thank you for this great TTS lib. It's one of the best (free) I've seen so far 👍🏼

I'm trying to use it from the browser, but it does not work.

setMetadata tries to create a WebSocket instance (initClient) and accesses a "client" method which does not exist :/

I have no issues in Node.js. I didn't see in the docs that this is a Node.js-only package; I just want to make sure it is before moving on.

Are the role and styledegree fields supported?

const readable = tts.toStream(voiceOver, {
    rate: `${formatN(opt.rate)}%`,
    pitch: `${formatN(opt.pitch)}st`,
    volume: `${formatN(opt.volume)}%`,
});

Like this:

const readable = tts.toStream(voiceOver, {
    rate: `${formatN(opt.rate)}%`,
    pitch: `${formatN(opt.pitch)}st`,
    volume: `${formatN(opt.volume)}%`,
    role,
    styledegree,
});

Reserved Characters in SSML

If the text contains reserved characters (<>&"') it won't work. Using a simple escape function solves the issue:
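
The escape function itself did not survive in the issue text. A minimal sketch of what such a helper might look like follows; the name escapeXml is hypothetical and not part of msedge-tts. It replaces the five XML reserved characters with their predefined entities in a single pass, so an ampersand is never escaped twice.

```typescript
// Predefined XML entities for the five reserved characters.
const XML_ENTITIES: Record<string, string> = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&apos;",
};

// Hypothetical helper (not part of msedge-tts): escape text before it is
// embedded in SSML. One regex pass avoids double-escaping "&".
function escapeXml(text: string): string {
    return text.replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch] ?? ch);
}
```

The escaped string could then be passed as the text argument, for example to the toFile call shown elsewhere in these issues. Whether a given version of the library already escapes input internally is not stated here, so check before escaping twice.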

reuse a wss connection

How to reuse a wss connection?
It takes a lot of time to connect to wss every time.
Is it possible to use the same wss for multiple tts requests?
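
One general way to avoid reconnecting for every request is to create the client once and hand the same instance to every caller. The sketch below is a generic pattern, not a feature of msedge-tts: shareConnection is a hypothetical helper, and whether the library keeps its WebSocket open between requests is an assumption that would need to be verified.

```typescript
// Hypothetical helper: run `connect` at most once and let every caller
// await the same promise. If connecting fails, the cached promise is
// cleared so that a later call can retry.
function shareConnection<T>(connect: () => Promise<T>): () => Promise<T> {
    let pending: Promise<T> | undefined;
    return () => {
        if (pending === undefined) {
            pending = connect().catch((err) => {
                pending = undefined; // allow a retry after a failed connect
                throw err;
            });
        }
        return pending;
    };
}
```

In this thread's terms, the connect function would wrap the new MsEdgeTTS() and setMetadata(...) calls, and each TTS request would first await the shared getter instead of constructing a new client.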

With this approach (Write to Stream), how do I keep the audio chunks in the correct order?
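
One possible way to keep the output ordered when several texts are synthesized concurrently is to collect each result by its input position and concatenate only once everything has arrived. In the sketch below, synthesize is a hypothetical stand-in for whatever produces the audio bytes for one text (it is not a msedge-tts function), and synthesizeInOrder is likewise an illustrative name.

```typescript
// Synthesize all texts concurrently, then join the results in input order.
// `synthesize` is a hypothetical callback supplied by the caller.
async function synthesizeInOrder(
    texts: string[],
    synthesize: (text: string) => Promise<Uint8Array>,
): Promise<Uint8Array> {
    // Promise.all yields results in the order of the input array,
    // regardless of which request finishes first.
    const parts = await Promise.all(texts.map((t) => synthesize(t)));

    // Concatenate the parts into one buffer.
    const total = parts.reduce((n, p) => n + p.length, 0);
    const out = new Uint8Array(total);
    let offset = 0;
    for (const part of parts) {
        out.set(part, offset);
        offset += part.length;
    }
    return out;
}
```

Plain concatenation only makes sense for formats without a per-file header, such as raw PCM or MP3 frames; RIFF/WAV output carries a header per file and cannot simply be appended.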

TypeError: websocket_1.client is not a constructor

I use it in TypeScript, but it reports:
TypeError: websocket_1.client is not a constructor

Version 1.2.1 Does Not Generate Audio

Hi there,

I recently tried updating this package from version 1.1.4 to the latest version (1.2.1). However, upon making this update, I noticed that when I tried to generate audio, no audio was actually being generated. Are there any changes I am supposed to make to my code that occurred between version 1.1.4 and version 1.2.1? Or does the latest version of the package just not work correctly? I had trouble finding a changelog so I'm not sure what changes were made that may cause the problem I am experiencing. Any help is greatly appreciated.

RIFF_16KHZ_16BIT_MONO_PCM cannot generate WAV format files

import {MsEdgeTTS, OUTPUT_FORMAT} from "msedge-tts";

(async () => {
    const tts = new MsEdgeTTS();
    await tts.setMetadata("en-US-AriaNeural", OUTPUT_FORMAT.RIFF_16KHZ_16BIT_MONO_PCM);
    const filePath = await tts.toFile("./wav.wav", "Hi, how are you?");  
})();

The format RIFF_16KHZ_16BIT_MONO_PCM does not produce WAV files
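
To narrow down a report like this, it can help to check whether the generated file actually starts with a RIFF/WAVE header. A WAV file begins with the ASCII bytes "RIFF", a 4-byte size field, and then "WAVE". The helper below is a hypothetical diagnostic, not part of msedge-tts, and it does not explain why the library would produce a different container.

```typescript
// Hypothetical diagnostic: returns true if the buffer starts with a
// RIFF/WAVE header ("RIFF" at bytes 0-3, "WAVE" at bytes 8-11).
function isRiffWave(bytes: Uint8Array): boolean {
    if (bytes.length < 12) {
        return false;
    }
    // Decode four bytes starting at `start` as ASCII.
    const tag = (start: number): string =>
        String.fromCharCode(...Array.from(bytes.subarray(start, start + 4)));
    return tag(0) === "RIFF" && tag(8) === "WAVE";
}
```

Reading the first 12 bytes of ./wav.wav and passing them to this function would show whether the output is a WAV container or some other format saved under a .wav name.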

Not able to fine-tune anymore

No audio is returned from the API when sending text with tags that fine-tune pitch, pronunciation, speaking rate, etc.

Text without any adjustment will still work fine.

README needs to be fixed

Hi.
The README says to use MsEdgeTTS.OUTPUT_FORMATS, which throws a type error. After the update, OUTPUT_FORMATS was renamed to OUTPUT_FORMAT.

Off-topic:
Could you document the usage limits of MsEdgeTTS? For example, how many words can a single text contain?
Could you make a Google neural TTS module (without an API key)?