foosoft / yomichan-import

External dictionary importer for Yomichan.

Home Page: https://foosoft.net/projects/yomichan-import/

License: MIT License

Go 99.16% Shell 0.84%
yomichan japanese epwing translation edict enamdict

yomichan-import's Issues

Request: support 英辞郎 (Eijirou).

This is an awesome dictionary for Japanese phrases and collocations. It'd be nice if you could support it so that it can be converted to many other formats.

大辞泉 Error (255)

When I try to import my Daijisen file, zero-epwing.exe stops responding and I get:
conversion process failed: exit status 255

image

I managed to get Daijirin working properly with the same steps, though.

Conversion error

I'm on Windows 10 64-bit, trying to import an EPWING dictionary, and I get this error:

2018/04/30 02:08:18 conversion process failed: failed to find compatible extractor for 'NHK 日本語発音アクセント辞典'

The path itself and all the files within are written in normal characters, but the actual dictionary is named as above, and I'm not sure how to safely change that information (but I assume it's coming from within the CATALOGS file), or if that is what is causing the problem.

Here is the relevant part of my dictionary source path: NHK Accent\NHKACT\NHKACT\CATALOGS
And I made up this path for the zip (it doesn't exist yet): yomichan_dicts\nhkaccent.zip

Various 故事ことわざの辞典 bugs

Hello, here are some things I've noticed with the 故事ことわざの辞典 importer:

  1. There is a \n\n\n\n at the end of each definition, which should be removed
  2. I found this entry: 愛して〔も〕その悪を知り、憎みて〔も〕その善を知る. Looks like there isn't any handling for those 〔〕 brackets. I'm guessing it means you can do with or without the thing inside. Didn't check if there were other entries with this issue
    image
  3. Something is broken at the start of 愛縁奇縁
    image
    image
    image

Thanks -Tyler

Bad parses for daijisen

Below is an excerpt of some expression -> glossary maps where the expression is clearly not parsed correctly.

"【一往": "いち‐おう【一応】‐オウ・【一往】‐ワウ㊀[名]①一度。一回。「―も二応も」「今―篤(とく)と考えて見まして」〈二葉亭・浮雲〉②一度行くこと。「―の新賓なれば感思おさへがたし」〈海道記・序〉㊁[副]①十分ではないが、ひととおり。大略。「これで―でき上がりだ」②ほぼそのとおりと思われるが、念のために。「―見直しましょう」◆本来は「一往」と書く。",
"【一族郎等": "いちぞく‐ろうどう【一族郎党】‐ラウダウ・【一族郎等】‐ラウドウ《「いちぞくろうとう」とも》①一家一族。家族。②同族と家来。③一族とその関係者。「選挙に―を総動員する」",
"【下り湯": "おり‐ゆ【△居り湯】をり‐・【下り湯】おり‐別に沸かした湯を湯船に移し入れて使う風呂。のちには据え風呂と混同された。",
"【不充分": "ふ‐じゅうぶん【不十分】‐ジフブン・【不充分】‐ジユウブン[名・形動]足りないところのあること。完全でないこと。また、そのさま。「―な明るさ」「証拠―」",
"【不恰好": "ぶ‐かっこう【不格好】‐カクカウ・【不×恰好】‐カツカウ[名・形動]格好の悪いこと。みっともないこと。また、そのさま。「―なズボン」[派生]ぶかっこうさ[名]",
"【中ぶらり": "ちゅう‐ぶらり【宙ぶらり】チウ‐・【中ぶらり】チユウ‐[名・形動]「宙ぶらりん」に同じ。「議案が―になる」",
"【中ぶらりん": "ちゅう‐ぶらりん【宙ぶらりん】チウ‐・【中ぶらりん】チユウ‐[名・形動]①空中にぶらさがっていること。また、そのさま。「台風で電線が―になる」②どっちつかずで中途半端であること。また、そのさま。「―な(の)立場」「計画が―になる」",
"【中飯": "ちゅう‐はん【昼飯】チウ‐・【中飯】チユウ‐ひるめし。昼食。「漸(やつ)と諸君の―が了(おわ)り」〈独歩・湯ヶ原ゆき〉",

Please comment the fields of the database format

The database format has changed since the early releases, and I have a custom dictionary that no longer loads in the latest release. Please comment and describe what each field means (and whether it can be blank) so that I can generate a compatible one with correct values in each field.

Request: yomichan to EDICT/JMDICT #29

I solved my last request (#29) for EPWING conversion using dedicated EPWING software, but when I wrote that I had also been looking into EDICT/JMDICT2 format conversion.

Since you're probably well versed in both formats, I figured this is the place to ask. Could you add functionality for converting files that are already in yomichan format (or perhaps in a more general JSON format) to the EDICT or JMDICT format?

Pretty much every Japanese study tool ever created (if it includes a dictionary) uses EDICT/JMDICT2. A converter from yomichan format would be very handy in plugging into these types of tools.

Incorrect merging of entries for German entries

Currently, glossaries for different readings of the same kanji get merged together, for example 言質:
2021-02-18-07:48:16
The glossaries for げんち and げんしつ are merged, resulting in doubled entries and an annotation stating that the correct reading is wrong. However, this does not happen for all words; for example, 人 works just fine.

(This report assumes that the automatically generated dictionaries at https://foosoft.net/projects/yomichan/ also use yomichan-import.)

Error when trying to import Kenkyusha EPWING

I am on Mac OS X Sierra (10.12.3) trying to import the EPWING version of Kenkyusha 5th edition. Zero-epwing processing seems to go well, but I can never get it to start the server.

On my first try, I got this error:

2017/03/14 07:27:54 converting 'Kenkyusha/5' to '/var/folders/0r/fpk8v_nn6z70bqlrkxw_7q0r0000gn/T/yomichan_tmp_557844224' in 'epwing' format...
2017/03/14 07:27:54 invoking zero-epwing from '/Users/user/Documents/bin/darwin/zero-epwing'...
2017/03/14 07:28:27 completed zero-epwing processing
runtime: stat overflow: val 33838, n 65536

I tried again, replacing the program and bin folder with fresh files, and received this error.

2017/03/14 07:33:06 converting 'Kenkyusha/5' to '/var/folders/0r/fpk8v_nn6z70bqlrkxw_7q0r0000gn/T/yomichan_tmp_098786236' in 'epwing' format...
2017/03/14 07:33:06 invoking zero-epwing from '/Users/user/Documents/bin/darwin/zero-epwing'...
2017/03/14 07:33:25 completed zero-epwing processing
mach error semaphore_signal: 15
fatal error: mach error

runtime stack:
runtime.throw(0x358f48, 0xa)
	/usr/lib/go-1.6/src/runtime/panic.go:547 +0x79
runtime.macherror(0xf, 0x36ca20, 0x10)
	/usr/lib/go-1.6/src/runtime/os1_darwin.go:209 +0x96
runtime.mach_semrelease.func1()
	/usr/lib/go-1.6/src/runtime/os1_darwin.go:459 +0x2b
runtime.systemstack(0x10733ee0)
	/usr/lib/go-1.6/src/runtime/asm_386.s:329 +0x77
runtime.mach_semrelease(0x0)
	/usr/lib/go-1.6/src/runtime/os1_darwin.go:459 +0x37
runtime.semawakeup(0xfe9d9d0)
	/usr/lib/go-1.6/src/runtime/os1_darwin.go:22 +0x15
runtime.unlock(0x4e9c70)
	/usr/lib/go-1.6/src/runtime/lock_sema.go:109 +0x134
runtime.incidlelocked(0xffffffff)
	/usr/lib/go-1.6/src/runtime/proc.go:3353 +0x4d
runtime.retake(0xb2879b90, 0x14ab91cd, 0x14ab91cd)
	/usr/lib/go-1.6/src/runtime/proc.go:3576 +0x164
runtime.sysmon()
	/usr/lib/go-1.6/src/runtime/proc.go:3511 +0x1ee
runtime.mstart1()
	/usr/lib/go-1.6/src/runtime/proc.go:1098 +0xc6
runtime.mstart()
	/usr/lib/go-1.6/src/runtime/proc.go:1068 +0x53

goroutine 1 [runnable]:
syscall.Syscall(0xa, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/lib/go-1.6/src/syscall/asm_darwin_386.s:17 +0x5
syscall.Close(0xa, 0x0, 0x0)
	/usr/lib/go-1.6/src/syscall/zsyscall_darwin_386.go:404 +0x3f
os.(*file).close(0x150a4020, 0x0, 0x0)
	/usr/lib/go-1.6/src/os/file_unix.go:140 +0x4a
os.(*File).Close(0x23dfe798, 0x0, 0x0)
	/usr/lib/go-1.6/src/os/file_unix.go:132 +0x45
main.writeDb.func2(0x33cfe0, 0x4, 0x21abc000, 0x3cb07, 0x4baaa, 0x19, 0x0, 0x0)
	/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/common.go:158 +0x49c
main.writeDb(0x107122d0, 0x47, 0x107142d0, 0x2a, 0x33d2b0, 0x6, 0x21abc000, 0x3cb07, 0x4baaa, 0x0, ...)
	/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/common.go:180 +0x18d
main.epwingExportDb(0xbffffd06, 0xb, 0x107122d0, 0x47, 0x107142d0, 0x2a, 0x2710, 0x0, 0x0, 0x0)
	/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/epwing.go:201 +0x1300
main.exportDb(0xbffffd06, 0xb, 0x107122d0, 0x47, 0x33b6b0, 0x6, 0x0, 0x0, 0x2710, 0x0, ...)
	/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/main.go:57 +0x423
main.main()
	/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/main.go:112 +0x57c

Any ideas on how to fix this?

Binaries in release should be more portable

I just tried downloading the latest version of yomichan-import, and I got the following error:

./yomichan: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./yomichan)

With a little snooping, I found google/cadvisor#3155, where they dealt with this same issue by disabling cgo (I'm not very familiar with Go, so I don't know if I've totally missed the boat here).

I think it would make sense to distribute a static binary rather than a dynamically linked one.

Conversion process failed: exit status 1

I would like to convert an EPWING Japanese dictionary using Yomichan Import, but I get this error:
conversion process failed: exit status 1

Screenshot at 2020-03-26 17-15-28

This is the structure of the folder:

Screenshot at 2020-03-26 17-15-51

Request: yomichan to EPWING

Any chance of a reverse yomichan json -> EPWING tool being made? Things like EBPocket and OCRMangaReader for Android support EPWING but not yomichan, so even if it's your own custom EPWING format, having something supported would be pretty dang useful.

On a related note, iirc EPWING is undocumented, but if making a yomichan->EPWING converter would take more time than you're willing to put in, do you at least have any detailed documentation about the EPWING and yomichan formats that some other willing soul might be able to take and use to make a converter?

Conversion error

Hello,

I'm trying to convert any of the supported dictionaries but to no avail. Here's the screenshot.

image

Request: Separate Daijirin's J-J and J-E versions

The Daijirin EPWING dictionary comes with both J-J and J-E definitions. Ideally, Yomichan Import should split these into two separate dictionaries so users can choose to add either only the J-J version or only the J-E version to Yomichan.

Alternatively, if the dictionary can't be converted into two separate versions at once, the user should be given the option to strip one version out during the conversion process, leaving them with either only a J-J version or only a J-E version.

Error when trying to import 新和英大辞典

Error log is:

QuotaExceededError: The current transaction exceeded its quota limitations. (3)

Access to IndexedDB appears to be restricted. Firefox seems to require that the history preference is set to "Remember history" before IndexedDB use of any kind is allowed. (6)

ConstraintError: A mutation operation in the transaction failed because a constraint was not satisfied. (439)

Error: Dictionary may not have been imported properly: 448 errors reported.

Firefox was set to "Remember history" when this happened.
Any idea about how to fix it?

Build documentation needs to be improved

The pre-built executables don't work on many platforms (e.g. Ubuntu 20.04) due to missing dependencies, so it would be good to have some general instructions on building from source in the README.

Request: Strip example sentences from Kenkyusha dictionary

At present, the Kenkyusha EPWING dictionary is practically unusable in Yomichan because the huge amount of example sentences clutters the popup beyond all reason. Yomichan Import should strip these out when converting the dictionary for use in Yomichan.

(I can't recall if the various J-J EPWING dictionaries are also affected by this issue, but they probably are, albeit to a lesser extent because they contain fewer example sentences.)

Various 広辞苑 bugs

  1. over a thousand entries with �� or as the headword
  2. some headwords need this boxed A thing removed
    image
  3. same bug as number 3 in issue #27
    image
  4. there are a lot of broken looking entries with a ○ at the start of the headword

What is the purpose of each element in a dictionary entry

For an example entry like this

[
        "休学",
        "きゅうがく",
        "",
        "",
        0,
        [
            "きゅうがく【休学】\ntemporary absence from school.~する have\u003ca term's\u003eleave of absence from school.\n"
        ],
        54840,
        ""
],

I can gather the first two elements are the word, the sixth is definitions, and the seventh is the id.
But what are the third, fourth, fifth, and eighth for?
If someone could enlighten me on this, I would be grateful.

Request: support 新辞林

新辞林 is a J-J dictionary from 三省堂 (who made 大辞林) that has very concise definitions and consistent formatting.

Daijisen or 研究社 新和英大辞典第5版 import issue.

Hello,

It's great that you made it possible to import EPWING dictionaries into Yomichan. Daijirin imports fine, but when I try the dictionaries mentioned above I get the error "failed to find compatible extractor for '大辞'" in the case of Daijisen, or "failed to find compatible extractor for '研究社 新和英大辞典 第5版'" for the latter.

How can this issue be resolved?

Request: Add 英辞郎 (Eijiro) parser

This might be a lot of work because Eijiro is huge, but it's a very powerful dictionary for translation (which is what I mostly use Yomichan for). I don't have any experience to help you out here, but I do have it in its text format.

Application output: 2017/06/15 01:32:53 exit status 1

The local server port doesn't seem to start in order to import the local dictionary. It shows me a time stamp followed by 'exit status 1' as if I already stopped it on the localhost.

I'm using Windows 10. I have tried changing the port number a few times. Nothing changed. Do you know how I could get the server port running?

[feature request] ignore English or alphanumeric entries

It would be great if I could pass in a command-line flag to ignore English or alphanumeric entries.

The yomichan extension can get annoying when I type into a text box and hit Shift while my cursor happens to be hovering over an English word contained in an EPWING entry.
