drorm / leah


Leah combines voice recognition, voice synthesis and ChatGPT to provide an environment where you can improve your foreign language skills.

Home Page: https://drorm.github.io/leah/

License: MIT License

TypeScript 78.12% CSS 1.07% HTML 18.91% SCSS 0.53% Shell 1.37%
chatgpt esl foreign-language text-to-speech voice-recognition

leah's People

Contributors

dependabot[bot], drorm

leah's Issues

Auto detect ChatGPT language

We can auto-detect languages using these options:

With both options, the solution is relatively simple when the conversation is in one language. In some cases, however, we want ChatGPT to switch languages on the fly. The challenge is how to do this in a way that keeps the text clean, since there's an obvious advantage to having the user be able to both see and hear the text.
It's easy when it does something like this:

English: How is it going?
French: Comment ça va?

but how do we handle the following:

In French the 'C' in "Ça va" is pronounced like an 'S'.
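
The easy case above can be sketched as follows. This is only a rough illustration: the `LANGUAGE_TAGS` table, the `Segment` type and the `splitByLanguage` function are hypothetical names, not part of the project, and the sketch assumes ChatGPT puts a `Label:` prefix at the start of each line.

```typescript
// Hypothetical mapping from a language label to a BCP 47 tag.
const LANGUAGE_TAGS: Record<string, string> = {
  English: "en-US",
  French: "fr-FR",
  Spanish: "es-ES",
};

interface Segment {
  lang: string; // BCP 47 tag to hand to the speech synthesizer
  text: string; // text to display and speak, with the label stripped
}

// Split a reply into per-language segments. Lines without a known
// "Label:" prefix keep the language of the previous line.
function splitByLanguage(reply: string, defaultLang: string): Segment[] {
  const segments: Segment[] = [];
  let current = defaultLang;
  for (const line of reply.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    const match = /^([A-Za-z]+):\s*(.*)$/.exec(trimmed);
    const tag = match ? LANGUAGE_TAGS[match[1] ?? ""] : undefined;
    if (match && tag !== undefined) {
      current = tag;
      segments.push({ lang: tag, text: match[2] ?? "" });
    } else {
      segments.push({ lang: current, text: trimmed });
    }
  }
  return segments;
}
```

Note that the mixed sentence about "Ça va" has no label, so this sketch returns it as a single segment in the default language. It does not solve the inline-switch problem; it only shows where a solution would plug in.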

Provide a way to create and update prompts

Right now only "system" prompts are available. The only way for the user to update or create the prompts is to go into the developer tools and change the JSON of the settings.

Provide a way to have full CRUD of prompts.
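
As a starting point, full CRUD could look roughly like the in-memory store below. The `Prompt` shape and the `PromptStore` class are assumptions made for illustration; the actual settings JSON may be structured differently, and persistence is left out.

```typescript
// Hypothetical prompt shape; the real settings JSON may differ.
interface Prompt {
  id: string;
  title: string;
  text: string;
  system: boolean; // true for built-in "system" prompts
}

class PromptStore {
  private prompts = new Map<string, Prompt>();
  private nextId = 1;

  // Create a user prompt and return it with its generated id.
  create(title: string, text: string): Prompt {
    const prompt: Prompt = { id: `user-${this.nextId++}`, title, text, system: false };
    this.prompts.set(prompt.id, prompt);
    return prompt;
  }

  read(id: string): Prompt | undefined {
    return this.prompts.get(id);
  }

  list(): Prompt[] {
    return Array.from(this.prompts.values());
  }

  // Update title and/or text; returns undefined if the id is unknown.
  update(id: string, changes: { title?: string; text?: string }): Prompt | undefined {
    const existing = this.prompts.get(id);
    if (!existing) return undefined;
    const updated: Prompt = {
      ...existing,
      title: changes.title ?? existing.title,
      text: changes.text ?? existing.text,
    };
    this.prompts.set(id, updated);
    return updated;
  }

  // Returns true if a prompt was removed.
  delete(id: string): boolean {
    return this.prompts.delete(id);
  }
}
```

A settings screen could call these four operations instead of asking the user to edit JSON in the developer tools.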

Add shortcuts to support vision impaired

As @freetimekate mentioned in C-Nedelcu/talk-to-chatgpt#25, voice apps can help folks who are vision impaired.

Provide shortcuts for:

Description                        | Shortcut      | Mouse equivalent
Microphone listen/mute             | CTRL-SPACE    | Click on the microphone icon
Bot pause/resume speaking answer   | CTRL-SHIFT-S  | Click on "New Chat" top left
Bot stop speaking / abort answer   | CTRL-SHIFT-A  | Click on "New Chat" top left
New ChatGPT chat                   | TBD           | Click on "New Chat" top left

Should we use the CTRL and SHIFT combinations above, or just use the bare keys ('S', 'A', SPACE, etc.) and require that the cursor not be in the typing textarea? That would make things simpler and avoid the possibility of conflicts with the browser (less of an issue since we only support Chrome) or the OS.

In C-Nedelcu/talk-to-chatgpt#25 (comment) there are some implementation suggestions, but we should also look at how screen readers currently handle things, and in general the state of online tools for the visually impaired.
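
One possible way to wire up the CTRL and SHIFT variant is sketched below. The `Action` names, the `KeyInfo` interface and the `matchShortcut` function are hypothetical; `KeyInfo` mirrors the `code`, `ctrlKey` and `shiftKey` fields of the browser's `KeyboardEvent` so the matcher can be tested without a DOM. The "New ChatGPT chat" shortcut is left out because it is still TBD.

```typescript
type Action = "toggleMicrophone" | "pauseResumeSpeech" | "abortSpeech";

// The subset of KeyboardEvent fields the matcher needs.
interface KeyInfo {
  code: string; // KeyboardEvent.code, e.g. "Space", "KeyS"
  ctrlKey: boolean;
  shiftKey: boolean;
}

// Map a key press to an action, or undefined if it is not a shortcut.
function matchShortcut(e: KeyInfo): Action | undefined {
  if (!e.ctrlKey) return undefined;
  if (e.code === "Space" && !e.shiftKey) return "toggleMicrophone";
  if (e.code === "KeyS" && e.shiftKey) return "pauseResumeSpeech";
  if (e.code === "KeyA" && e.shiftKey) return "abortSpeech";
  return undefined;
}
```

In the browser, a `keydown` listener on `document` could pass the event to `matchShortcut` and call `preventDefault()` when an action is returned. Switching to bare keys would mean dropping the `ctrlKey` check and ignoring events whose target is the textarea.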

Have the bot speak as text is generated: read each sentence.

It can get annoying to wait until the text is done generating before the speech starts. It's better to speak one sentence at a time.
Since it's practically impossible to do good sentence detection programmatically using regex or similar techniques, the best solution is to have ChatGPT itself tell us where a sentence starts and ends.
Options:

  1. The easiest approach is to ask it to start each sentence on a new line.
  2. A more elegant solution is to ask it to introduce characters that are invisible to the user, but are detectable programmatically, such as 3 consecutive spaces. Since we're only dealing with text generated by one entity, we don't need to worry too much about edge cases.
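
Option 1 could be handled with a small buffer like the one below. This is a sketch under the assumption that the reply arrives as arbitrary text chunks and that ChatGPT follows the one-sentence-per-line instruction; `SentenceBuffer` is a hypothetical name.

```typescript
class SentenceBuffer {
  private pending = "";

  // Add a streamed chunk and return any sentences it completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last part has no terminating newline yet, so keep it for later.
    this.pending = parts.pop() ?? "";
    return parts.map((s) => s.trim()).filter((s) => s.length > 0);
  }

  // Call when the stream ends to get whatever text is left.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each sentence returned by `push` can be queued for speech right away while the rest of the answer is still being generated. Option 2 would only change the separator passed to `split`.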

Provide a way to add a prefix to every question

ChatGPT easily "forgets" the initial instructions. When using the "translate" prompt, for instance, if you say:

tell me a joke

rather than translating to Spanish, it tells you a joke in Spanish.
On the other hand,

translate: tell me a joke

works correctly.
So adding a "prefix", in this case "translate:" looks like it'll solve this issue.
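
A minimal sketch of the idea, assuming the prefix is stored alongside the prompt; the `withPrefix` function is a hypothetical helper, not existing code:

```typescript
// Prepend the prompt's prefix to the user's question before sending it.
function withPrefix(prefix: string, question: string): string {
  const p = prefix.trim();
  const q = question.trim();
  if (p === "") return q;
  // Avoid "translate: translate: ..." when the user already typed the prefix.
  if (q.toLowerCase().startsWith(p.toLowerCase())) return q;
  return `${p} ${q}`;
}
```

With the "translate" prompt, `withPrefix("translate:", "tell me a joke")` produces the second form shown above, so the user never has to type the prefix.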

Figure out a way to generalize the prompts by using something like ${language}

Looking at our prompts, many of them specify a behavior for a specific language. Now that users can change prompts, they can go ahead and change the language on their own, but it'd be cleaner if, once the user has chosen a language, all the prompts would just work out of the box, grabbing the language from the user's choice.
The challenge is that we have the language in several places:

  • In the prompt itself, something like "I want you to be a French teacher ..."
  • In the choice that the user made: fr-fr
  • Possibly other places.

We need to see how one choice can affect all of these.
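
One way this could work is to treat the user's locale choice as the single source of truth and derive the language name from it. The sketch below assumes a hypothetical `LANGUAGE_NAMES` lookup table and `fillPrompt` function; neither exists in the project.

```typescript
// Hypothetical lookup from the user's locale choice to a language name.
const LANGUAGE_NAMES: Record<string, string> = {
  "en-us": "English",
  "es-es": "Spanish",
  "fr-fr": "French",
};

// Replace every ${language} placeholder in a prompt template.
function fillPrompt(template: string, locale: string): string {
  const language = LANGUAGE_NAMES[locale.toLowerCase()];
  if (language === undefined) {
    throw new Error(`Unknown locale: ${locale}`);
  }
  // A replacer function avoids special handling of "$" in the replacement.
  return template.replace(/\$\{language\}/g, () => language);
}
```

A stored prompt such as "I want you to be a ${language} teacher ..." would then follow the user's choice without any manual editing.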
