
ai4bharat / indicnlp-transliteration


Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/IndicXlit

Home Page: https://transliteration.ai4bharat.org

License: Apache License 2.0

Python 73.21% Jupyter Notebook 26.79%
indian-languages nlp transliteration konkani maithili sinhala urdu tamil malayalam kannada

indicnlp-transliteration's Introduction

IndianNLP-Transliteration

Project Website | Demo UI | Python Library

The main goal of this project is to create open source input tools for content creation in under-represented languages in India.
It started in collaboration with Story Weaver, a non-profit working towards foundational literary education for children, supported by Google's AI for Social Good initiative.

Most languages in India lack a digital presence because of an underdeveloped ecosystem. One of the major bottlenecks in content creation and language adoption is the difficulty of entering text in several native Indian languages. The lack of stable input tools for underserved languages is a huge barrier to creating digital content and NLP datasets in these languages.

Supported Languages

  • Bengali - বাংলা
  • Gujarati - ગુજરાતી
  • Hindi - हिंदी
  • Kannada - ಕನ್ನಡ
  • Konkani Goan - कोंकणी
  • Maithili - मैथिली
  • Malayalam - മലയാളം
  • Marathi - मराठी
  • Panjabi Eastern - ਪੰਜਾਬੀ
  • Sindhi - سنڌي‎
  • Sinhala - සිංහල
  • Telugu - తెలుగు
  • Tamil - தமிழ்
  • Urdu - اُردُو

Repository Usage

For Attributions and Contributions lists, check here 🖖

Training Procedures

This repository is designed to make it easy to experiment with different network architectures and reformulated objectives with minimal effort. It is meant to be highly tinkerable rather than an off-the-shelf library.

A condensed, standalone version covering simple model training, inference, and accuracy computation is available as a Jupyter notebook.
Notebook
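As a rough illustration of the accuracy computation step, the sketch below computes top-k accuracy for word-level transliteration. It is not taken from the notebook: the function name `topk_accuracy`, the dictionary layout, and the sample word pairs are all assumptions made for this example. A source word counts as correct when any of its first k candidates matches any accepted reference spelling.

```python
from typing import Dict, Sequence

# Hypothetical sample data (not from the project's datasets).
# Each romanized source word maps to one or more accepted native spellings.
REFERENCES: Dict[str, Sequence[str]] = {
    "namaste": ["नमस्ते"],
    "dilli": ["दिल्ली"],
}

# Hypothetical model output: ranked candidates, best first.
PREDICTIONS: Dict[str, Sequence[str]] = {
    "namaste": ["नमस्ते", "नमस्टे"],
    "dilli": ["डिल्ली", "दिल्ली"],
}


def topk_accuracy(
    predictions: Dict[str, Sequence[str]],
    references: Dict[str, Sequence[str]],
    k: int = 1,
) -> float:
    """Return the fraction of reference words whose top-k candidates
    contain at least one accepted reference spelling."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not references:
        return 0.0

    hits = 0
    for source, accepted in references.items():
        # Words the model produced no output for count as misses.
        candidates = list(predictions.get(source, []))[:k]
        if set(candidates) & set(accepted):
            hits += 1
    return hits / len(references)
```

With the sample data above, `topk_accuracy(PREDICTIONS, REFERENCES, k=1)` counts only "namaste" as correct, while `k=2` also accepts "dilli" because its second candidate matches the reference.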

Pythonic Library

The Pythonic transliteration library is available on the Python Package Index (PyPI) and also under GitHub releases.
For usage, see the apps README.


NeuralNet Models

Transliteration models for the supported languages are available as releases, in an easily deployable form.

All the NN models (along with metadata) of Xlit - Transliteration are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

CC BY SA 4.0


Datasets

Datasets created as part of the project for Maithili, Konkani, and Hindi are available as JSON files under downloads.

Xlit - Transliteration Datasets by Story Weaver & AI4Bharat are licensed under a Creative Commons Attribution 4.0 International License.

CC BY 4.0

Kindly attribute the source if you use the dataset in your research or products.


Contact

If you have benefited from our datasets, models, or services, or were motivated by our work, we would like to hear from you.

email: [email protected]


indicnlp-transliteration's People

Contributors

gokulnc, josephgeobenjamin


indicnlp-transliteration's Issues

Didn't understand how to use the UI.

This might seem like a silly question, but does it translate text written in English into Indian languages, or do you type a word from, say, Hindi in English letters and it offers the native-script form of it? Please do help me out. TIA!

Support for Batch Processing and Inference on GPU

Hi, thanks for the transliteration api. It's helpful.

  1. Is it possible to transliterate words in batches rather than one at a time?
  2. Also, can there be a flag in API where the model can use GPU if available?
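One possible user-side workaround for the first question is sketched below, assuming the only thing available is a function that transliterates one word at a time. Everything here is hypothetical: `transliterate_batch` and the stand-in `demo_transliterate_word` are names invented for this example and are not part of the library. The wrapper does not batch at the model level; it still calls the per-word function in a loop and only skips repeated words.

```python
from typing import Callable, Dict, Iterable, List


def transliterate_batch(
    words: Iterable[str],
    transliterate_word: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Apply a per-word transliteration function to many words,
    calling it only once for each distinct word."""
    results: Dict[str, List[str]] = {}
    for word in words:
        if word not in results:
            results[word] = transliterate_word(word)
    return results


def demo_transliterate_word(word: str) -> List[str]:
    """Stand-in for a real per-word call (hypothetical).
    It just upper-cases the input so the wrapper can be run offline."""
    return [word.upper()]
```

For example, `transliterate_batch(["aam", "ghar", "aam"], demo_transliterate_word)` invokes the stand-in twice rather than three times. Real speedups from batching or GPU use would need support inside the model code itself.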
