
nemo-nullius / opencc


This project forked from byvoid/opencc


Conversion between Traditional and Simplified Chinese

Home Page: https://opencc.byvoid.com/

License: Apache License 2.0

CMake 5.61% Makefile 0.99% Python 8.39% C++ 80.78% JavaScript 2.80% TypeScript 0.36% Batchfile 0.55% Shell 0.52%

opencc's Introduction

Open Chinese Convert 開放中文轉換


Introduction 介紹


Open Chinese Convert (OpenCC, 開放中文轉換) is an open-source project for converting between Traditional Chinese, Simplified Chinese and Japanese Kanji (Shinjitai). It supports character-level and phrase-level conversion, character variant conversion, and conversion of regional idioms among Mainland China, Taiwan and Hong Kong. It is not a translation tool between Mandarin and Cantonese, etc.

Discussion (Telegram): https://t.me/open_chinese_convert

Features 特點

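  • Strictly distinguishes "one Simplified character to multiple Traditional characters" from "one Simplified character to multiple variant characters".
  • Fully compatible with variant characters, which can be substituted dynamically.
  • One-to-many Simplified-to-Traditional entries are strictly reviewed, on the principle "keep characters distinct wherever they can be distinguished".
  • Supports conversion of variant characters and regional word usage among Mainland China, Taiwan and Hong Kong, such as 「裏」 and 「裡」, or 「鼠標」 and 「滑鼠」.
  • Dictionaries and the function library are fully separated, so dictionaries can be freely modified, imported and extended.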

Installation 安裝

See Download.

Usage 使用

Online demo 線上轉換展示

Warning: This is NOT an API. You will be banned if you make calls programmatically.

https://opencc.byvoid.com/

Node.js

npm install opencc

JavaScript

const OpenCC = require('opencc');
const converter = new OpenCC('s2t.json');
converter.convertPromise("汉字").then(converted => {
  console.log(converted);  // 漢字
});

TypeScript

import { OpenCC } from 'opencc';
async function main() {
  const converter: OpenCC = new OpenCC('s2t.json');
  const result: string = await converter.convertPromise('汉字');
  console.log(result);  // 漢字
}

main();

See demo.js and ts-demo.ts.

Python

pip install opencc (Windows, Linux, Mac)

import opencc
converter = opencc.OpenCC('s2t.json')
converter.convert('汉字')  # 漢字
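When several strings need converting, the same converter object can be reused for all of them. A minimal sketch, assuming only the `convert` method shown above; `convert_all` is a hypothetical helper for illustration and is not part of the opencc package.

```python
from typing import Iterable, List


def convert_all(converter, texts: Iterable[str]) -> List[str]:
    """Convert each string with an already-constructed converter.

    `converter` is any object exposing `convert(str) -> str`, such as the
    `opencc.OpenCC('s2t.json')` instance created in the example above.
    Hypothetical helper, not part of the opencc package.
    """
    # One converter instance handles every input; nothing is re-created here.
    return [converter.convert(text) for text in texts]
```

With the converter from the example above, `convert_all(converter, ['汉字'])` would return a one-element list holding the converted string.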

C++

#include "opencc.h"

int main() {
  const opencc::SimpleConverter converter("s2t.json");
  converter.Convert("汉字");  // 漢字
  return 0;
}

C

#include <string.h>  /* strlen */
#include "opencc.h"

int main() {
  opencc_t opencc = opencc_open("s2t.json");
  const char* input = "汉字";
  char* converted = opencc_convert_utf8(opencc, input, strlen(input));  // 漢字
  opencc_convert_utf8_free(converted);
  opencc_close(opencc);
  return 0;
}

Document 文檔: https://byvoid.github.io/OpenCC/

Command Line

  • opencc --help
  • opencc_dict --help
  • opencc_phrase_extract --help

Others (Unofficial)

Configurations 配置文件

Preset configuration files

  • s2t.json Simplified Chinese to Traditional Chinese 簡體到繁體
  • t2s.json Traditional Chinese to Simplified Chinese 繁體到簡體
  • s2tw.json Simplified Chinese to Traditional Chinese (Taiwan Standard) 簡體到臺灣正體
  • tw2s.json Traditional Chinese (Taiwan Standard) to Simplified Chinese 臺灣正體到簡體
  • s2hk.json Simplified Chinese to Traditional Chinese (Hong Kong variant) 簡體到香港繁體
  • hk2s.json Traditional Chinese (Hong Kong variant) to Simplified Chinese 香港繁體到簡體
  • s2twp.json Simplified Chinese to Traditional Chinese (Taiwan Standard) with Taiwanese idiom 簡體到繁體(臺灣正體標準)並轉換爲臺灣常用詞彙
  • tw2sp.json Traditional Chinese (Taiwan Standard) to Simplified Chinese with Mainland Chinese idiom 繁體(臺灣正體標準)到簡體並轉換爲中國大陸常用詞彙
  • t2tw.json Traditional Chinese (OpenCC Standard) to Taiwan Standard 繁體(OpenCC 標準)到臺灣正體
  • hk2t.json Traditional Chinese (Hong Kong variant) to Traditional Chinese 香港繁體到繁體(OpenCC 標準)
  • t2hk.json Traditional Chinese (OpenCC Standard) to Hong Kong variant 繁體(OpenCC 標準)到香港繁體
  • t2jp.json Traditional Chinese Characters (Kyūjitai) to New Japanese Kanji (Shinjitai) 繁體(OpenCC 標準,舊字體)到日文新字體
  • jp2t.json New Japanese Kanji (Shinjitai) to Traditional Chinese Characters (Kyūjitai) 日文新字體到繁體(OpenCC 標準,舊字體)
  • tw2t.json Traditional Chinese (Taiwan Standard) to Traditional Chinese 臺灣正體到繁體(OpenCC 標準)
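The preset names above follow a `<source>2<target>` pattern, with a trailing `p` on the two presets that also convert regional idioms. A minimal sketch of building and validating a preset file name from that pattern; `PRESETS` is copied from the list above, and `preset_config` is a hypothetical helper, not part of OpenCC.

```python
# Preset configuration names, copied from the list above (without ".json").
PRESETS = {
    "s2t", "t2s", "s2tw", "tw2s", "s2hk", "hk2s", "s2twp", "tw2sp",
    "t2tw", "hk2t", "t2hk", "t2jp", "jp2t", "tw2t",
}


def preset_config(source: str, target: str, phrases: bool = False) -> str:
    """Return the preset file name for a conversion direction.

    `source` and `target` are the short codes used in the preset names
    ("s", "t", "tw", "hk", "jp"). `phrases=True` selects the variant that
    also converts regional idioms (the trailing "p").
    Hypothetical helper, not part of OpenCC.
    """
    name = f"{source}2{target}" + ("p" if phrases else "")
    if name not in PRESETS:
        # Reject combinations that have no preset, e.g. hk -> tw.
        raise ValueError(f"no preset configuration named {name}.json")
    return name + ".json"
```

For example, `preset_config("s", "tw", phrases=True)` returns `"s2twp.json"`, while `preset_config("hk", "tw")` raises `ValueError` because no such preset is listed.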

Build 編譯

Build with CMake

Linux & Mac OS X

g++ 4.6+ or clang 3.2+ is required.

make

Windows Visual Studio:

build.cmd

Test 測試

Linux & Mac OS X

make test

Windows Visual Studio:

test.cmd

Benchmark 基準測試

make benchmark

Example results (from Travis CI):

1: ------------------------------------------------------------------
1: Benchmark                        Time             CPU   Iterations
1: ------------------------------------------------------------------
1: BM_Initialization/hk2s     1587215 ns      1587485 ns          435
1: BM_Initialization/hk2t      126112 ns       125976 ns         5384
1: BM_Initialization/jp2t      245646 ns       245414 ns         2847
1: BM_Initialization/s2hk    26017749 ns     26017390 ns           27
1: BM_Initialization/s2t     26298084 ns     26296375 ns           27
1: BM_Initialization/s2tw    26483120 ns     26482164 ns           27
1: BM_Initialization/s2twp   26455564 ns     26454666 ns           26
1: BM_Initialization/t2hk       44759 ns        44636 ns        15733
1: BM_Initialization/t2jp      143401 ns       143227 ns         4876
1: BM_Initialization/t2s      1374298 ns      1373979 ns          510
1: BM_Initialization/tw2s     1443389 ns      1443701 ns          464
1: BM_Initialization/tw2sp    1699645 ns      1699823 ns          399
1: BM_Initialization/tw2t       76294 ns        76083 ns         9229
1: BM_Convert                     581 ms          581 ms            1
1/1 Test #1: BenchmarkTest ....................   Passed   14.49 sec
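In the example results above, initializing the s2t, s2tw, s2twp and s2hk configurations takes roughly 26 ms each (about 26,000,000 ns), far longer than the other presets. One way to avoid paying that cost repeatedly is to construct each converter once and reuse it. A minimal sketch, assuming the Python package shown under Usage; `get_converter` is a hypothetical helper, not part of OpenCC, and the timings come from a single CI run, so they will differ on other machines.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_converter(config: str):
    """Return a shared converter for `config`, building it on first use.

    Hypothetical helper: it only wraps the `opencc.OpenCC(config)` call
    from the Python usage example in a cache keyed by the config name.
    """
    import opencc  # imported lazily; requires `pip install opencc`

    return opencc.OpenCC(config)
```

Repeated calls such as `get_converter('s2t.json')` then return the same instance instead of re-initializing it. The sketch makes no claim about thread safety.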

Projects using OpenCC 使用 OpenCC 的項目

License 許可協議

Apache License 2.0

Third Party Library 第三方庫

All these libraries are statically linked.

Change History 版本歷史

Links 相關鏈接

Contributors 貢獻者

Please update this list if you have contributed to OpenCC.

