

Listening-based language learning

Home Page: http://omnilingo.github.io

License: GNU Affero General Public License v3.0

Languages: Python 2.55% · HTML 0.78% · JavaScript 92.26% · CSS 4.42%

Topics: language-learning · language-learning-game · languages · languages-spoken · web-application

omnilingo's Introduction

OmniLingo


What is this?

The goal of the project is to help you practice listening comprehension.

It works by giving you random sentences in the language you're learning and asking you to fill in the gaps. The sentences were submitted by contributors to the Mozilla Common Voice platform.

The project aims to not require any knowledge of a meta language in order to start learning. If you are interested in a more traditional course creation project, check out LibreLingo.

The game orders the questions by difficulty and gives them to you in batches of five, with a random task for each question. When you successfully answer a batch of five in less time than the audio takes to play, you advance a level and get a new batch of five.
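The progression rule described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the `Clip` fields, the `difficulty` score, and the reading of "less time than the audio takes to play" as the summed length of the five recordings are all assumptions.

```python
from dataclasses import dataclass

BATCH_SIZE = 5  # the README describes batches of five


@dataclass
class Clip:
    sentence: str
    audio_seconds: float  # playback length of the recording
    difficulty: float     # hypothetical score; lower means easier


def make_batches(clips, batch_size=BATCH_SIZE):
    """Sort clips from easiest to hardest and cut them into fixed-size batches."""
    ordered = sorted(clips, key=lambda clip: clip.difficulty)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def advances_level(batch, all_correct, seconds_taken):
    """Advance only if every answer was right and the player beat the audio length.

    Assumption: the time limit is the total playback time of the batch.
    """
    total_audio = sum(clip.audio_seconds for clip in batch)
    return all_correct and seconds_taken < total_audio
```

With this reading, a batch of five two-second clips would have to be answered correctly in under ten seconds to move up a level.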

Tasks

  • Fill in the blanks: A cloze-style task
  • Drag and drop: Get a set of tiles and click on them to build a word or sentence
  • Pick the right one: Get two options and choose the right one
  • Spot the word: Get a set of six tiles and click on the ones that appear in the audio
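As an illustration of the first task, a cloze question could be generated along these lines. The function names are hypothetical, and the sketch assumes that words are separated by whitespace, which does not hold for languages such as Chinese, Japanese or Thai; punctuation attached to a word is also left in place.

```python
import random


def make_cloze(sentence, rng=random):
    """Hide one randomly chosen word and return (prompt, answer)."""
    words = sentence.split()
    if not words:
        raise ValueError("sentence has no words")
    index = rng.randrange(len(words))
    answer = words[index]
    # Replace the hidden word with one underscore per character.
    prompt = " ".join(
        "_" * len(word) if i == index else word for i, word in enumerate(words)
    )
    return prompt, answer


def check_answer(guess, answer):
    """Compare the guess with the answer, ignoring case and outer whitespace."""
    return guess.strip().casefold() == answer.casefold()
```

Passing a seeded `random.Random` instance as `rng` makes the choice of gap reproducible, which can be handy for testing.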

Keys

  • Space: Play the recording
  • Enter:
    1. Submit and check if you got it right
    2. If already submitted, move to the next recording

Data

The data comes from the Common Voice dataset releases.

Target audience

This system is designed with two main user groups in mind:

  • People who want to learn a new language
  • People who want to learn how to write their native language

The system endeavours to be audio first, with knowledge of writing built up by hearing.

Contact

Talk to us

  • IRC: irc.freenode.net #OmniLingo
  • Matrix: #OmniLingo:matrix.org (access via Element)
  • Telegram: OmniLingo


Available languages

All of the languages available in the Common Voice 6.1 dataset.

Abkhaz · Arabic · Assamese · Breton · Catalan · Hakha Chin · Czech · Chuvash · Welsh · German · Dhivehi · Greek · English · Esperanto · Spanish · Estonian · Basque · Persian · Finnish · French · Frisian · Irish · Hindi · Upper Sorbian · Hungarian · Interlingua · Indonesian · Italian · Japanese · Georgian · Kabyle · Kyrgyz · Luganda · Lithuanian · Latvian · Mongolian · Maltese · Dutch · Odia · Punjabi · Polish · Portuguese · Romansh Sursilvan · Romansh Vallader · Romanian · Russian · Kinyarwanda · Sakha · Slovenian · Swedish · Tamil · Thai · Turkish · Tatar · Ukrainian · Vietnamese · Votic · Chinese (China) · Chinese (Hong Kong) · Chinese (Taiwan)

If you want to work with a language that is not yet in Common Voice, we highly recommend getting it set up there; in the meantime, you can check out the format guidelines.

Releases

  • 0.1.0 Functional proof of concept
  • 0.2.0 Partial prototype with level progression

Deployment

For deployment information, check out our blog post on the IPFS blog.

To add more languages, download a dataset from Common Voice and put it in cv-corpus-6.1-2020-12-11/.
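To check that a downloaded dataset is readable, something like the following may help. It is a sketch under assumptions that are not stated in this README: that each language directory contains a tab-separated `validated.tsv` file, and that the file has `path` and `sentence` columns. The function name `read_clips` is hypothetical.

```python
import csv


def read_clips(tsv_path):
    """Return (audio file name, sentence) pairs from a Common Voice style TSV file.

    Assumption: the file has a header row with 'path' and 'sentence' columns.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [(row["path"], row["sentence"]) for row in reader]
```

If the assumption about the layout holds, calling `read_clips` on the `validated.tsv` of a language directory under `cv-corpus-6.1-2020-12-11/` would list the recordings and transcripts available for that language.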

Happy hacking! :)

Dependencies

On Debian/Ubuntu, the following dependencies can be installed through the package manager:

  • python3-mutagen: audio metadata editing library (Python 3)
  • python3-jieba: Jieba Chinese text segmenter (Python 3)
  • python3-flask: micro web framework based on Werkzeug and Jinja2 (Python 3.x)

Acknowledgements

omnilingo's People

Contributors

alunduil · d33tah · ftyers · harikalarkutusu · jonorthwash · nlhowell · oddaaron00


omnilingo's Issues

Get user permission to play

At the moment the user has to explicitly press the play button in order to give permission, which is annoying when trying to add a keyboard-only mode.

Add keyboard-only mode

There should be a mode that only uses keyboard input.

  • Play the audio (or replay)
  • Type
  • Submit
  • Next clip

We will probably need to use some option key for play/next clip.

Add basic feedback

This is related to some other issues, e.g. #2 and #7, but it would be good to have some basic feedback while we work out what will actually work. I'll implement a tick and a cross for each batch of 10.

Add good multiple language support

Currently it allows choosing a different language at startup, but it would be cool if the same server could serve multiple languages.

Improve text input

  • Make it more obvious where to type, maybe have an actual box
  • Allow pressing return to submit

Set up a project-specific domain

This is blocked by #1, but I think it could prove useful in many ways. There are lots of domains that cost less than a typical coffee per year and the renewal is just as cheap.

BiDi layout

Make sure we have a layout that works for bidirectional text.

A task for identifying N words in the audio

The task:

  • Play an audio clip
  • Get presented with e.g. 6 words (or Chinese characters)
    • 3 are in the audio, 3 are not (distractors)
  • You have to click on the ones that are in the audio, without clicking on the ones that aren't.
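The task outlined above might be built roughly like this. It is a sketch, not the project's implementation: the function names are hypothetical, the distractors are assumed to come from some separate word list (`vocabulary`), and words are assumed to be whitespace-separated.

```python
import random


def make_spot_the_word(sentence, vocabulary, n_targets=3, n_distractors=3, rng=random):
    """Return (shuffled tiles, set of tiles that really occur in the audio)."""
    in_audio = sorted(set(sentence.lower().split()))
    # Distractors must not occur in the sentence.
    not_in_audio = sorted({word.lower() for word in vocabulary} - set(in_audio))
    if len(in_audio) < n_targets or len(not_in_audio) < n_distractors:
        raise ValueError("not enough words to build the task")
    targets = rng.sample(in_audio, n_targets)
    distractors = rng.sample(not_in_audio, n_distractors)
    tiles = targets + distractors
    rng.shuffle(tiles)
    return tiles, set(targets)


def is_correct(clicked, targets):
    """The answer is right only if exactly the target tiles were clicked."""
    return set(clicked) == targets
```

Requiring an exact match means that clicking a distractor fails the task even when all three correct tiles were also clicked.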

Thanks to @JacobSchmitt for the idea!

Improve difficulty ranking

Ideas:

  • Use LM perplexity
  • Use compression ratio

E.g. we need to think about: (1) short sentences with rare words and (2) someone speaking slowly but with a lot of background noise.

We could try a linear interpolation of all three, or it could be configurable.
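As a starting point for discussion, the compression-ratio idea and a configurable linear interpolation could look something like this. The function names and the feature names used as dictionary keys are hypothetical, and the sketch assumes each individual score has already been scaled to a comparable range.

```python
import zlib


def compression_ratio(text):
    """Compressed size divided by raw size; repetitive text scores lower."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("empty text")
    return len(zlib.compress(raw)) / len(raw)


def interpolate(scores, weights):
    """Weighted average of named scores; the weights make the mix configurable."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must use the same names")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * scores[name] for name in scores) / total
```

One caveat: zlib adds a few bytes of fixed overhead, so the ratio for a single short sentence can exceed 1 and may say little on its own. Compressing a sentence together with other text from the same language might give a more useful signal, but that is untested here.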

Scramble mode

You get the audio and the sentence as tiles, where each tile is a syllable (or morph). You have to build up the sentence by dragging the tiles into place.

Choose or type based on image

It would be cool to also allow people to learn by typing the word for an image or by choosing between N words for a given image. To avoid the huge data collection bottleneck, we could use Wikidata: https://www.wikidata.org/wiki/Q506.

Another cool thing (but a long-term plan) would be to use these as primers for the audio. E.g. the system sets up an image selection task and then, 3-4 tasks later, presents the same word with audio.

Writing system learning mode

It would be great to give people the opportunity to learn a writing system too. Here is my idea:

  • Have the audio
  • Take the transcript, split into characters
  • Scramble the characters and present them as tiles
  • Give the user slots to drag the characters into
  • Slots turn green when the user gets it right, otherwise red
    • If red then the user can drag it back or drag it to another slot
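The non-visual part of this idea could be sketched as below; the drag-and-drop itself would live in the browser. The function names are hypothetical. Note that iterating over a Python string yields code points, so scripts that rely on combining marks would probably need proper grapheme segmentation instead.

```python
import random


def make_tiles(transcript, rng=random):
    """Split the transcript into characters (skipping spaces) and shuffle a copy."""
    slots = [ch for ch in transcript if not ch.isspace()]
    tiles = slots[:]
    rng.shuffle(tiles)
    return slots, tiles


def slot_colours(slots, placed):
    """Colour each slot; placed[i] is the tile dropped in slot i, or None if empty."""
    colours = []
    for expected, tile in zip(slots, placed):
        if tile is None:
            colours.append("empty")
        elif tile == expected:
            colours.append("green")
        else:
            colours.append("red")
    return colours
```

Comparing by character rather than by tile identity means that two tiles carrying the same character are interchangeable, which seems like the behaviour a learner would expect.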

This could be useful info for the implementation: https://daily-dev-tips.com/posts/vanilla-javascript-drag-n-drop/
