Comments (9)

xxyzz avatar xxyzz commented on June 10, 2024 1

The documentation is somewhat confusing and needs an update. The code makes a copy of the book, sets the copy's language to English, and sends that copy to the Kindle, because Word Wise is only enabled for English books. If you set the book language to English yourself, the plugin will assume the book is in English and will only look for English words.

Could you upload the Word Wise database file created when the book language is French?

from worddumb.

xxyzz avatar xxyzz commented on June 10, 2024 1

When the "save" button is clicked, the code creates a file for spaCy to use later. The words enabled by default may be too many, which makes this process very slow. You can use SQL to disable a large number of rows with a single query against the db file worddumb-lemmas/fr/wiktionary_fr_fr_v0.db (with the SQLite command-line tool or https://sqlitebrowser.org):

UPDATE senses SET enabled = 0 WHERE difficulty < 3;

Then click the save button; it should run faster. I should enable far fewer words by default, but I haven't found a better data source to convert to difficulty values.

The export feature is for creating Anki cards. Your settings for lemmas are saved to the db file.
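
For anyone who prefers a script over the SQLite command line, the same UPDATE can be run with Python's built-in sqlite3 module. This is only a sketch: the `senses` table and its `enabled` and `difficulty` columns are taken from the query above, while the helper name and its `threshold` parameter are made up for illustration. Closing calibre and copying the db file somewhere safe first is probably wise.

```python
import sqlite3


def disable_senses_below(conn: sqlite3.Connection, threshold: int) -> int:
    """Run the UPDATE from the comment above with a configurable threshold.

    Hypothetical helper, not part of the plugin. Returns the number of
    rows matched by the UPDATE.
    """
    cur = conn.execute(
        "UPDATE senses SET enabled = 0 WHERE difficulty < ?",
        (threshold,),
    )
    conn.commit()
    return cur.rowcount


# Example usage (the path is relative to the calibre plugin folder, which
# is an assumption about where the script is run from):
#
#     conn = sqlite3.connect("worddumb-lemmas/fr/wiktionary_fr_fr_v0.db")
#     print(disable_senses_below(conn, 3), "senses disabled")
#     conn.close()
```

After running it, the save button still has to be clicked so the file for spaCy is rebuilt from the updated rows.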

xxyzz avatar xxyzz commented on June 10, 2024

Since this issue is not related to the solved issue 141, I'll answer your questions here:

  • Gloss length ratio is only used for EPUB books, and the pop-up notes are still there
  • The gloss is shown in the calibre viewer because it's an EPUB book
  • Only the default Chinese Word Wise db file is replaced; the English file is not touched. If you need the default Chinese file, connect to Wi-Fi and the Kindle will redownload it. Or you can keep a copy somewhere.
  • Maybe choose a smaller model if your machine is struggling with the load? You can delete the worddumb-lemmas folder in the calibre plugin folder; all downloaded Word Wise data files are saved there.
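
Keeping a copy of the default db file, as suggested in the list above, can be scripted. The sketch below is a generic file backup using Python's standard library. The thread does not give the file's location on the Kindle, so the function name and both paths are placeholders to fill in yourself.

```python
import shutil
from pathlib import Path


def backup_file(src: Path, backup_dir: Path) -> Path:
    """Copy `src` into `backup_dir` without overwriting an earlier backup.

    Hypothetical helper: `src` would be the default Word Wise db file on
    the mounted Kindle, and `backup_dir` any folder on your computer.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / src.name
    if dest.exists():
        raise FileExistsError(f"backup already exists: {dest}")
    shutil.copy2(src, dest)  # copy2 also preserves timestamps
    return dest
```

Refusing to overwrite an existing backup is deliberate: running the script a second time, after the plugin has already replaced the file, would otherwise clobber the original copy.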

gloverd avatar gloverd commented on June 10, 2024

For some reason I can no longer run "Generate Word Wise" on .mobi books. I've tried clean installs of the plugin and removing the associated folders under %APPDATA%, but it just keeps running, whereas in the past it would at least complete: sometimes in seconds, but most often in a few minutes (as per the screenshots in #141). It does run on EPUB files. I wonder if I corrupted the book somehow as part of this... This is one of the previously generated files I had on my Kindle.

In order to upload, I renamed the .kll to .txt
LanguageLayer.en.BBB2IHO521.txt

In this one, for example, I see the French word "Morale" picked up with the gloss "Moral"; other pairs are (Talons, griffe), (Savants, savant), (Instant, immédiat, instantané), (unique, unique), ...

xxyzz avatar xxyzz commented on June 10, 2024

I think it runs so slowly with French books because the default settings enable too many lemmas.

I also fixed a bug for KFX books in ba6582e, but you're using a MOBI book?

gloverd avatar gloverd commented on June 10, 2024

I've tried KFX, mobi, and epub in the past. I have this running in the background right now; I downloaded a new out-of-copyright book (Les Miserables) as an epub file. I converted it to MOBI, and am running only the "Generate Word Wise" (not the full word dumb button). It has been running for 30 minutes at this point.

You may be onto something about the size, because I can run it for English books quite fast. As for trying with fewer lemmas: if I uncheck the enabled box in the "Customize Kindle Word Wise" pop-up for a whole series of words, will that improve performance, or does the fact that it still has to look up each word before determining whether it is enabled prevent significant improvements?

gloverd avatar gloverd commented on June 10, 2024

I disabled the lemmas under difficulty 5 and 4, and it finally produced the expected result. Some of the lemmas in 5 are probably WAY too common in text (it has words like "it", "not/no", "the (plural)", "a"), and level 4 also has some very common words; so I'm sure that it is bogging it down.
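
To see how many lemmas each difficulty level contributes before deciding which levels to disable, the db file can be queried directly. The following is a rough sketch, assuming the `senses` table with `enabled` and `difficulty` columns mentioned earlier in the thread and assuming enabled rows are stored as 1. The function name is made up for illustration.

```python
import sqlite3


def enabled_counts_by_difficulty(conn: sqlite3.Connection) -> dict[int, int]:
    """Map each difficulty level to its number of enabled senses.

    Hypothetical helper. Levels with no enabled rows are simply absent
    from the result.
    """
    rows = conn.execute(
        "SELECT difficulty, COUNT(*) FROM senses "
        "WHERE enabled = 1 GROUP BY difficulty ORDER BY difficulty"
    )
    return dict(rows.fetchall())
```

Running it before and after disabling a level shows how much that level was adding to the work done when the lemmas are saved.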

It took 5.5 hours to save the updated lemma file. I tried to export it and re-import it, but I don't think that's possible? The exported file doesn't seem to have any information about the level or enabled status, and I'm not sure if I can just rename it to enable its import.

After a computer restart, though, it no longer works. I am going through the process of re-saving the lemmas and will retry.

gloverd avatar gloverd commented on June 10, 2024

That really has helped!
Saving new lemmas down to 43m from 330m, and per-book word-wise generation about 70% faster!

xxyzz avatar xxyzz commented on June 10, 2024

I tested a French book in KFX and AZW3 formats, and both have working Word Wise now. But to get a better-quality set of French words enabled by default, data similar to what is used to choose the default English and Chinese words is needed: https://github.com/xxyzz/Proficiency
