foosoft / yomichan-import
External dictionary importer for Yomichan.
Home Page: https://foosoft.net/projects/yomichan-import/
License: MIT License
This is an awesome dictionary for Japanese phrases and collocations. It'd be nice if you could support it so that it can be converted to many other formats.
At least one kanji, '剝', gets dropped when converting from kanjidic2.xml to the ZIP file. The kanji is clearly present in kanjidic2.xml, but it is missing from the converted ZIP file.
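One way to check for other dropped kanji is to compare the `<literal>` elements in kanjidic2.xml against the converted rows. The following is a minimal sketch, not the importer's own code. It assumes each converted kanji row starts with the kanji character itself; the function names and the sample row are hypothetical.

```python
import xml.etree.ElementTree as ET


def kanjidic_literals(xml_text: str) -> set[str]:
    """Collect the text of every <literal> element in a KANJIDIC2 document."""
    root = ET.fromstring(xml_text)
    return {lit.text for lit in root.iter("literal") if lit.text}


def missing_kanji(xml_text: str, kanji_rows: list[list]) -> set[str]:
    """Return kanji present in the XML but absent from the converted rows.

    Assumption: the first element of each converted row is the kanji itself.
    """
    converted = {row[0] for row in kanji_rows}
    return kanjidic_literals(xml_text) - converted


# Tiny stand-in for kanjidic2.xml; the real file has many more elements.
sample_xml = (
    "<kanjidic2>"
    "<character><literal>剝</literal></character>"
    "<character><literal>人</literal></character>"
    "</kanjidic2>"
)
# Placeholder row, not real converter output.
sample_rows = [["人", "", "", "", [], {}]]

print(missing_kanji(sample_xml, sample_rows))
```

For a real check, the XML text would come from the kanjidic2.xml file and the rows from the JSON files inside the converted ZIP.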
I'm on windows 10 64-bit, trying to import an EPWING, and I get this error:
2018/04/30 02:08:18 conversion process failed: failed to find compatible extractor for 'NHK 日本語発音アクセント辞典'
The path itself and all the files within are written in normal characters, but the actual dictionary is named as above, and I'm not sure how to safely change that information (but I assume it's coming from within the CATALOGS file), or if that is what is causing the problem.
Here is the relevant part of my dictionary source path: NHK Accent\NHKACT\NHKACT\CATALOGS
And I made up this path for the zip (it doesn't exist yet): yomichan_dicts\nhkaccent.zip
Hello, here are some things I've noticed with the 故事ことわざの辞典 importer:
\n\n\n\n appears at the end of each definition and should be removed.
愛して〔も〕その悪を知り、憎みて〔も〕その善を知る: it looks like there isn't any handling for those 〔〕 brackets. I'm guessing they mean the text inside is optional. I didn't check whether other entries have this issue.
愛縁奇縁
Thanks -Tyler
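The two cleanups described above could look roughly like this. It is a sketch under the reporter's guess that 〔〕 marks optional text, so each bracketed part is either kept or dropped. The helper names are hypothetical and this is not how the importer currently behaves.

```python
import itertools
import re

# Matches one optional segment such as 〔も〕 and captures its contents.
OPTIONAL = re.compile(r"〔([^〔〕]*)〕")


def clean_definition(text: str) -> str:
    """Remove the run of newlines left at the end of a definition."""
    return text.rstrip("\n")


def expand_optional(expression: str) -> list[str]:
    """List every variant of an expression with each 〔〕 part kept or dropped."""
    parts = OPTIONAL.split(expression)
    literals = parts[0::2]   # text outside the brackets
    optionals = parts[1::2]  # text inside the brackets
    variants = []
    for choice in itertools.product((True, False), repeat=len(optionals)):
        pieces = [literals[0]]
        for keep, optional, literal in zip(choice, optionals, literals[1:]):
            if keep:
                pieces.append(optional)
            pieces.append(literal)
        variants.append("".join(pieces))
    return variants


for variant in expand_optional("愛して〔も〕その悪を知り、憎みて〔も〕その善を知る"):
    print(variant)
```

Registering every variant as a headword is only one option; keeping just the longest and shortest forms would also be defensible.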
Below is an excerpt of some expression -> glossary maps where the expression is clearly not parsed correctly.
"【一往": "いち‐おう【一応】‐オウ・【一往】‐ワウ㊀[名]①一度。一回。「―も二応も」「今―篤(と
く)と考えて見まして」〈二葉亭・浮雲〉②一度行くこと。「―の新賓なれば感思おさへがたし」〈海道記・
序〉㊁[副]①十分ではないが、ひととおり。大略。「これで―でき上がりだ」②ほぼそのとおりと思われるが
、念のために。「―見直しましょう」◆本来は「一往」と書く。",
"【一族郎等": "いちぞく‐ろうどう【一族郎党】‐ラウダウ・【一族郎等】‐ラウドウ《「いちぞくろう
とう」とも》①一家一族。家族。②同族と家来。③一族とその関係者。「選挙**に―を総動員する」",
"【下り湯": "おり‐ゆ【△居り湯】をり‐・【下り湯】おり‐別に沸かした湯を湯船に移し入れて使う風呂
。のちには据え風呂と混同された。",
"【不充分": "ふ‐じゅうぶん【不十分】‐ジフブン・【不充分】‐ジユウブン[名・形動]足りないとこ
ろのあること。完全でないこと。また、そのさま。「―な明るさ」「証拠―」",
"【不恰好": "ぶ‐かっこう【不格好】‐カクカウ・【不×恰好】‐カツカウ[名・形動]格好の悪いこと。
みっともないこと。また、そのさま。「―なズボン」[派生]ぶかっこうさ[名]",
"【中ぶらり": "ちゅう‐ぶらり【宙ぶらり】チウ‐・【中ぶらり】チユウ‐[名・形動]「宙ぶらりん」
に同じ。「議案が―になる」",
"【中ぶらりん": "ちゅう‐ぶらりん【宙ぶらりん】チウ‐・【中ぶらりん】チユウ‐[名・形動]①空中にぶらさがっていること。また、そのさま。「台風で電線が―になる」②どっちつかずで中途半端であること。
また、そのさま。「―な(の)立場」「計画が―になる」",
"【中飯": "ちゅう‐はん【昼飯】チウ‐・【中飯】チユウ‐ひるめし。昼食。「漸(やつ)と諸君の―が了(おわ)り」〈独歩・湯ヶ原ゆき〉",
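In every entry above the heading carries two bracketed spellings separated by ・, and the broken key is the second one with its opening bracket still attached. A sketch of a more tolerant heading parser follows. It assumes that headings always wrap spellings in 【】 and that △ and × are annotation marks that can be dropped; the function names are hypothetical.

```python
import re

# Captures the text between each 【 and 】 pair.
BRACKETED = re.compile(r"【([^【】]+)】")
# Annotation marks seen inside the brackets in the excerpt (assumed removable).
MARKS = str.maketrans("", "", "△×")


def extract_expressions(heading: str) -> list[str]:
    """Return every bracketed spelling in a heading, without annotation marks."""
    return [spelling.translate(MARKS) for spelling in BRACKETED.findall(heading)]


def extract_reading(heading: str) -> str:
    """Return the kana before the first bracket, with the ‐ separators removed."""
    return heading.split("【", 1)[0].replace("‐", "")


sample = "おり‐ゆ【△居り湯】をり‐・【下り湯】おり‐"
print(extract_reading(sample), extract_expressions(sample))
```

Each extracted spelling could then be emitted as its own expression pointing at the same glossary.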
The database format has changed since the early releases, and I have a custom dictionary that no longer loads in the latest release. Please document what each field means (and whether it can be blank) so that I can generate a compatible dictionary with correct values in each field.
I solved my last request (#29) for EPWING conversion using dedicated EPWING software, but when I wrote that I had also been looking into EDICT/JMDICT2 format conversion.
Since you're probably well versed in both formats, I figured this is the place to ask. Could you add functionality for converting files that are already in yomichan format (or perhaps in a more general JSON format) to the EDICT or JMDICT format?
Pretty much every Japanese study tool ever created (if it includes a dictionary) uses EDICT/JMDICT2. A converter from yomichan format would be very handy in plugging into these types of tools.
Currently, glossaries for different readings of the same kanji get merged together, for example 言質:
The glossaries for げんち and げんしつ are merged, resulting in twice as many definitions and an annotation telling the reader that the correct reading is wrong. This does not happen for all words, however; for example, 人 works just fine.
(This assumes that the automatically generated dictionaries at https://foosoft.net/projects/yomichan/ also use yomichan-import.)
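The behaviour being asked for amounts to grouping glossaries by the pair (expression, reading) rather than by expression alone. A minimal sketch of that grouping follows; the tuple layout and the placeholder glossary strings are assumptions for illustration, not data from the real dictionaries.

```python
from collections import defaultdict


def group_by_reading(entries):
    """Merge glossaries only when both the expression and the reading match."""
    grouped = defaultdict(list)
    for expression, reading, glossary in entries:
        grouped[(expression, reading)].extend(glossary)
    return dict(grouped)


# Placeholder glossaries; the real definitions are not reproduced here.
entries = [
    ("言質", "げんち", ["<glossary for げんち>"]),
    ("言質", "げんしつ", ["<glossary for げんしつ>"]),
]

for key, glossary in group_by_reading(entries).items():
    print(key, glossary)
```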
I am on Mac OS X Sierra (10.12.3) trying to import the EPWING version of Kenkyusha 5th edition. Zero-epwing processing seems to go well, but I can never get it to start the server.
On my first try, I got this error:
2017/03/14 07:27:54 converting 'Kenkyusha/5' to '/var/folders/0r/fpk8v_nn6z70bqlrkxw_7q0r0000gn/T/yomichan_tmp_557844224' in 'epwing' format...
2017/03/14 07:27:54 invoking zero-epwing from '/Users/user/Documents/bin/darwin/zero-epwing'...
2017/03/14 07:28:27 completed zero-epwing processing
runtime: stat overflow: val 33838, n 65536
I tried again, replacing the program and bin folder with fresh files, and received this error:
2017/03/14 07:33:06 converting 'Kenkyusha/5' to '/var/folders/0r/fpk8v_nn6z70bqlrkxw_7q0r0000gn/T/yomichan_tmp_098786236' in 'epwing' format...
2017/03/14 07:33:06 invoking zero-epwing from '/Users/user/Documents/bin/darwin/zero-epwing'...
2017/03/14 07:33:25 completed zero-epwing processing
mach error semaphore_signal: 15
fatal error: mach error
runtime stack:
runtime.throw(0x358f48, 0xa)
/usr/lib/go-1.6/src/runtime/panic.go:547 +0x79
runtime.macherror(0xf, 0x36ca20, 0x10)
/usr/lib/go-1.6/src/runtime/os1_darwin.go:209 +0x96
runtime.mach_semrelease.func1()
/usr/lib/go-1.6/src/runtime/os1_darwin.go:459 +0x2b
runtime.systemstack(0x10733ee0)
/usr/lib/go-1.6/src/runtime/asm_386.s:329 +0x77
runtime.mach_semrelease(0x0)
/usr/lib/go-1.6/src/runtime/os1_darwin.go:459 +0x37
runtime.semawakeup(0xfe9d9d0)
/usr/lib/go-1.6/src/runtime/os1_darwin.go:22 +0x15
runtime.unlock(0x4e9c70)
/usr/lib/go-1.6/src/runtime/lock_sema.go:109 +0x134
runtime.incidlelocked(0xffffffff)
/usr/lib/go-1.6/src/runtime/proc.go:3353 +0x4d
runtime.retake(0xb2879b90, 0x14ab91cd, 0x14ab91cd)
/usr/lib/go-1.6/src/runtime/proc.go:3576 +0x164
runtime.sysmon()
/usr/lib/go-1.6/src/runtime/proc.go:3511 +0x1ee
runtime.mstart1()
/usr/lib/go-1.6/src/runtime/proc.go:1098 +0xc6
runtime.mstart()
/usr/lib/go-1.6/src/runtime/proc.go:1068 +0x53
goroutine 1 [runnable]:
syscall.Syscall(0xa, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.6/src/syscall/asm_darwin_386.s:17 +0x5
syscall.Close(0xa, 0x0, 0x0)
/usr/lib/go-1.6/src/syscall/zsyscall_darwin_386.go:404 +0x3f
os.(*file).close(0x150a4020, 0x0, 0x0)
/usr/lib/go-1.6/src/os/file_unix.go:140 +0x4a
os.(*File).Close(0x23dfe798, 0x0, 0x0)
/usr/lib/go-1.6/src/os/file_unix.go:132 +0x45
main.writeDb.func2(0x33cfe0, 0x4, 0x21abc000, 0x3cb07, 0x4baaa, 0x19, 0x0, 0x0)
/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/common.go:158 +0x49c
main.writeDb(0x107122d0, 0x47, 0x107142d0, 0x2a, 0x33d2b0, 0x6, 0x21abc000, 0x3cb07, 0x4baaa, 0x0, ...)
/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/common.go:180 +0x18d
main.epwingExportDb(0xbffffd06, 0xb, 0x107122d0, 0x47, 0x107142d0, 0x2a, 0x2710, 0x0, 0x0, 0x0)
/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/epwing.go:201 +0x1300
main.exportDb(0xbffffd06, 0xb, 0x107122d0, 0x47, 0x33b6b0, 0x6, 0x0, 0x0, 0x2710, 0x0, ...)
/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/main.go:57 +0x423
main.main()
/home/alex/projects/go/src/github.com/FooSoft/yomichan-import/main.go:112 +0x57c
Any ideas on how to fix this?
I just tried downloading the latest version of yomichan-import, and I got the following error:
./yomichan: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./yomichan)
With a little snooping, I found google/cadvisor#3155, where they dealt with this same issue by disabling cgo (I'm not very familiar with Go, so I don't know if I've totally missed the boat here).
I think it would make sense to distribute a static binary rather than a dynamically linked one.
Any chance of a reverse yomichan json -> EPWING tool being made? Things like EBPocket and OCRMangaReader for Android support EPWING but not yomichan, so even if it's your own custom EPWING format, having something supported would be pretty dang useful.
On a related note, IIRC EPWING is undocumented, but if making a yomichan->EPWING converter would take more time than you're willing to put in, do you at least have any detailed documentation about the EPWING and yomichan formats that some other willing soul might be able to use to make a converter?
The Daijirin EPWING dictionary comes with both J-J and J-E definitions. Ideally, Yomichan Import should split these into two separate dictionaries so users can choose to add either only the J-J version or only the J-E version to Yomichan.
Alternatively, if the dictionary can't be converted into two separate versions at once, the user should be given the option to strip one version out during the conversion process, leaving them with either only a J-J version or only a J-E version.
Error log is:
QuotaExceededError: The current transaction exceeded its quota limitations. (3)
Access to IndexedDB appears to be restricted. Firefox seems to require that the history preference is set to "Remember history" before IndexedDB use of any kind is allowed. (6)
ConstraintError: A mutation operation in the transaction failed because a constraint was not satisfied. (439)
Error: Dictionary may not have been imported properly: 448 errors reported.
Firefox was set to "Remember history" when this happened.
Any idea about how to fix it?
If you could add a Chinese dictionary that would be great :)
The pre-built executables don't work on many platforms (e.g. Ubuntu 20.04) due to missing dependencies, so it would be good to have some general instructions on building from source in the README.
At present, the Kenkyusha EPWING dictionary is practically unusable in Yomichan because the huge amount of example sentences clutters the popup beyond all reason. Yomichan Import should strip these out when converting the dictionary for use in Yomichan.
(I can't recall if the various J-J EPWING dictionaries are also affected by this issue, but they probably are, albeit to a lesser extent because they contain fewer example sentences.)
�� or � as the headword
A thing removed
Hey,
As the title states, I'm trying to get Yomichan to work with a German EPWING dictionary. It'd be really cool if you could start expanding EPWING support to non-Japanese dictionaries starting with the German one. I know at least a dozen people who could benefit from this.
German EPWING dictionary: https://mega.nz/file/4RJB2CSa#qT1Tlpkd9zaLkMcX03HVUmdY3cvNGb9bQMFDjz6HK7M
For an example entry like this
[
    "休学",
    "きゅうがく",
    "",
    "",
    0,
    [
        "きゅうがく【休学】\ntemporary absence from school.~する have\u003ca term's\u003eleave of absence from school.\n"
    ],
    54840,
    ""
],
I can gather the first two elements are the word, the sixth is definitions, and the seventh is the id.
But what are the third, fourth, fifth, and eighth for?
If someone could enlighten me on this, I would be grateful.
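As far as I understand Yomichan's version 3 term bank format, the eight positions map onto the fields named in the sketch below: the third and eighth are tag strings, the fourth lists deinflection rules, and the fifth is a score used for ranking. Treat the field names and their meanings as an assumption to check against the schema for the release you use; the class itself is hypothetical.

```python
import json
from typing import NamedTuple


class TermEntry(NamedTuple):
    expression: str       # 1st: the headword
    reading: str          # 2nd: kana reading (may be empty)
    definition_tags: str  # 3rd: space-separated tags for the definition (assumed)
    rules: str            # 4th: space-separated deinflection rule names (assumed)
    score: int            # 5th: score used to rank results (assumed)
    glossary: list        # 6th: list of definition strings
    sequence: int         # 7th: sequence number, the "id" mentioned above
    term_tags: str        # 8th: space-separated tags for the term (assumed)

    @classmethod
    def from_row(cls, row):
        """Build an entry from one eight-element row of a term bank."""
        if len(row) != 8:
            raise ValueError(f"expected 8 fields, got {len(row)}")
        return cls(*row)


# The example entry from the question, parsed as JSON.
raw = r'''["休学", "きゅうがく", "", "", 0,
 ["きゅうがく【休学】\ntemporary absence from school.~する have\u003ca term's\u003eleave of absence from school.\n"],
 54840, ""]'''

entry = TermEntry.from_row(json.loads(raw))
print(entry.expression, entry.reading, entry.sequence)
```

In the example entry the third, fourth, and eighth fields are empty strings, which suggests they may be left blank when no tags or rules apply.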
Would be awesome if possible.
新辞林 is a J-J dictionary from 三省堂 (who made 大辞林) that has very concise definitions and consistent formatting.
Hello,
It's great that you made it possible to import EPWING dictionaries into Yomichan. Daijirin imports fine, but when I try the dictionaries mentioned above I get the error "failed to find compatible extractor for '大辞'" for Daijisen, or "failed to find compatible extractor for '研究社 新和英大辞典 第5版'" for the latter.
How can I resolve this issue?
This might be a lot of work because Eijiro is huge, but it's a very powerful dictionary for translation (which is what I mostly use Yomichan for). I don't have any experience to help you out here, but I do have it in its text format.
The local server port doesn't seem to start in order to import the local dictionary. It shows me a time stamp followed by 'exit status 1' as if I already stopped it on the localhost.
I'm using Windows 10. I have tried changing the port number a few times. Nothing changed. Do you know how I could get the server port running?
It would be great if I could pass in a command line flag to ignore English or alphanumeric entries.
The yomichan extension can get annoying when I type into a text box and hit Shift while my cursor happens to be hovering over an English word contained in an EPWING entry.
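Until such a flag exists, the filtering could be done as a post-processing step on the converted rows. The sketch below assumes each row starts with the headword and treats "alphanumeric" as ASCII letters and digits only; the function names are hypothetical and no such option exists in yomichan-import as far as this request indicates.

```python
import re

# Headwords made up entirely of ASCII letters and digits.
ASCII_ALNUM = re.compile(r"[A-Za-z0-9]+")


def is_alphanumeric_headword(expression: str) -> bool:
    """True when the whole headword is ASCII letters and/or digits."""
    return ASCII_ALNUM.fullmatch(expression) is not None


def drop_alphanumeric(rows):
    """Keep only rows whose headword (first element) is not purely alphanumeric."""
    return [row for row in rows if not is_alphanumeric_headword(row[0])]


# Illustrative rows: [headword, reading].
rows = [
    ["休学", "きゅうがく"],
    ["school", ""],
    ["2018", ""],
    ["Tシャツ", "ティーシャツ"],
]

print([row[0] for row in drop_alphanumeric(rows)])
```

Mixed headwords such as Tシャツ are kept, and full-width letters are not matched; both choices are easy to change in the pattern.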