This project forked from jeffw16/unihan-to-pinyin

A Chrome extension that converts Unihan characters into pinyin, plus a tool that generates a character-to-pinyin mapping from the Unihan database.

unihan-to-pinyin

This repo consists of a simple script that generates character-to-pinyin and character-to-jyutping mappings in JSON from the Unihan database, and a Chrome extension, located under chrome_extension, which uses the script's output to replace or supplement Chinese characters with pinyin. The Chrome extension can be installed by downloading the hanzi_to_pinyin.crx file.

Running the generation script

The generation script is generate_mappings.py and runs on Python 3.7 or later. It generates both pinyin and jyutping output files. The output format is JSON, and the respective outputs are placed into hanzi_to_pinyin.json and hanzi_to_jyutping.json.

Data is taken from the Unihan database. It is available for download at http://www.unicode.org/Public/UCD/latest/ucd/Unihan.zip

Extract Unihan_Readings.txt from the downloaded Unihan.zip archive and place it in the base directory of your local copy of this repo.
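
To give a feel for what the generation step involves, here is a minimal sketch of reading Unihan_Readings.txt. It is not the code in generate_mappings.py. It assumes the standard Unihan layout of tab-separated lines (code point, field name, value) with `#` comment lines, that the kMandarin and kCantonese fields hold the pinyin and jyutping readings, and that only the first listed reading is kept. The function name parse_readings is hypothetical.

```python
import json


def parse_readings(lines, field):
    """Map each character to its first reading for one Unihan field.

    Hypothetical helper: the real generate_mappings.py may work differently.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != field:
            continue
        codepoint, _, value = parts
        # "U+4E00" -> the character with that code point.
        char = chr(int(codepoint[2:], 16))
        # A field may list several readings; keep only the first (assumption).
        mapping[char] = value.split()[0]
    return mapping


# A few lines in the Unihan_Readings.txt layout, used here instead of the file.
sample = [
    "# comment line",
    "U+4E00\tkMandarin\tyī",
    "U+4E00\tkCantonese\tjat1",
    "U+4E2D\tkMandarin\tzhōng",
]

pinyin = parse_readings(sample, "kMandarin")
jyutping = parse_readings(sample, "kCantonese")

# ensure_ascii=False keeps the characters and tone marks readable in the JSON.
print(json.dumps(pinyin, ensure_ascii=False))
print(json.dumps(jyutping, ensure_ascii=False))
```

To run this against the real data, pass an open file handle for Unihan_Readings.txt (opened with encoding="utf-8") in place of the sample list.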

Using the data

Using the data is as simple as loading the JSON file into memory and substituting the reading for each character.

An example that reads input from stdin and outputs into stdout is located at hanzi_to_pinyin.py.
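
The substitution described above can be sketched as follows. This is an illustration, not the contents of hanzi_to_pinyin.py. It assumes the JSON maps each single character to one reading string; the function name annotate and its replace parameter are hypothetical. An inline JSON string stands in for hanzi_to_pinyin.json so the sketch is self-contained.

```python
import json

# Stand-in for the contents of hanzi_to_pinyin.json (assumed shape:
# one character per key, one reading string per value).
mapping = json.loads('{"中": "zhōng", "文": "wén"}')


def annotate(text, mapping, replace=True):
    """Replace each mapped character with its reading, or append the reading.

    Characters that are not in the mapping are passed through unchanged.
    """
    pieces = []
    for ch in text:
        reading = mapping.get(ch)
        if reading is None:
            pieces.append(ch)
        elif replace:
            # Replace mode: emit the reading followed by a space.
            pieces.append(reading + " ")
        else:
            # Supplement mode: keep the character and add the reading.
            pieces.append(f"{ch}({reading})")
    return "".join(pieces)


print(annotate("中文!", mapping))
print(annotate("中文!", mapping, replace=False))
```

With the real file, the mapping would instead be loaded with json.load on hanzi_to_pinyin.json or hanzi_to_jyutping.json opened with encoding="utf-8".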

RSH

The 'hanzi_to_pinyin.py' file has been modified to update the English field of the 'RSH.txt' file so that it includes the Jyutping reading of each character. The resulting 'RSH+jyutping.txt' file can be loaded into Anki to create flashcards for those who want to learn Cantonese.

Contributors

jeffw16, paperpotato
