Comments (5)
One important point relevant to speech recognition: Chrome/Chromium currently records the user and sends the recording to a server, without any notification that clearly indicates that the user of the browser is being recorded and that their voice is being sent to an external server. It is also unclear whether the user's biometric data (their voice) is stored (forever) by the service; see https://bugs.chromium.org/p/chromium/issues/detail?id=816095
from speech-api.
@guest271314, thank you. By including these client-side, server-side and third-party scenarios in the design of standard APIs, and by integrating such standards and APIs more tightly with WebRTC, we can: (1) provide users with notifications and permissions covering which client-side, server-side and third-party components and services are accessing their microphones and their text, SSML, hypertext and audio streams, (2) produce efficient call graphs (see also: https://youtu.be/EPBWR_GNY9U?t=2m from 2:00 to 4:12), (3) reduce latency for real-time translation scenarios, (4) improve quality for real-time translation scenarios.
I'm hoping to inspire interest in post-text speech technology (speech-to-X1 and X2-to-speech) as well as interest in round-tripping where we can utilize acoustic measures and metrics to compare the audio input to and output from speech-to-X1-to-X2-to-speech.
X1, X2 could be SSML (1.0, 1.1 or 2.0), hypertext or new formats.
X1-to-X2 machine translation is also topical.
In the video "Real Time Translation in WebRTC", the speaker indicates (at 7:48) that a major issue he would like to see solved is that users have to pause their speech before speech recognition and translation occur.
To reduce latency, we can consider real-time online speech recognition algorithms that, instead of processing natural language sentence by sentence and outputting X1, process it lexeme by lexeme and produce event streams. In these low-latency approaches, speech recognition components and services process speech audio in real time and produce event streams. Machine translation components consume those streams and produce event streams of their own, which speech synthesis components consume in turn to produce the resulting speech audio.
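As a rough illustration of the lexeme-by-lexeme idea, here is a minimal TypeScript sketch of three chained stages that pass events along one at a time. Everything in it is an assumption made for illustration: the `LexemeEvent` shape, the function names, and the dictionary-lookup "translation" are hypothetical stand-ins, not part of the Web Speech API or WebRTC. A real pipeline would be asynchronous and a real translator could not simply map words one-to-one.

```typescript
// Hypothetical event shape: one recognized or translated lexeme.
interface LexemeEvent {
  text: string;
  isFinal: boolean; // true for the last lexeme of the utterance
}

// Stand-in recognizer: emits one event per lexeme instead of one result per sentence.
function* recognize(lexemes: string[], trace: string[]): Generator<LexemeEvent> {
  for (let i = 0; i < lexemes.length; i++) {
    trace.push(`recognized:${lexemes[i]}`);
    yield { text: lexemes[i], isFinal: i === lexemes.length - 1 };
  }
}

// Stand-in translator: a toy dictionary lookup applied to each incoming event.
function* translate(
  events: Iterable<LexemeEvent>,
  dictionary: Map<string, string>,
  trace: string[]
): Generator<LexemeEvent> {
  for (const event of events) {
    const text = dictionary.get(event.text) ?? event.text;
    trace.push(`translated:${text}`);
    yield { text, isFinal: event.isFinal };
  }
}

// Stand-in synthesizer: "speaks" each translated lexeme as soon as it arrives.
function* synthesize(events: Iterable<LexemeEvent>, trace: string[]): Generator<string> {
  for (const event of events) {
    trace.push(`spoken:${event.text}`);
    yield event.text;
  }
}

// Chains the three stages and records the order in which work happened.
function runPipeline(
  lexemes: string[],
  dictionary: Map<string, string>
): { spoken: string[]; trace: string[] } {
  const trace: string[] = [];
  const recognized = recognize(lexemes, trace);
  const translated = translate(recognized, dictionary, trace);
  const spoken = Array.from(synthesize(translated, trace));
  return { spoken, trace };
}
```

Because the stages are lazy generators, the first lexeme is recognized, translated and spoken before the second one is recognized, which is the property the low-latency approach relies on. Calling `runPipeline(["hello", "world"], new Map([["hello", "bonjour"], ["world", "monde"]]))` and inspecting `trace` shows that interleaving.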
This issue pertains to Issue 1 in the Web Speech API specification.
Issue 1: The group has discussed whether WebRTC might be used to specify selection of audio sources and remote recognizers. See Interacting with WebRTC, the Web Audio API and other external sources thread on [email protected].