pgaskin / kepubify

564.0 16.0 33.0 9.21 MB

Fast, standalone EPUB to Kobo EPUB conversion tool.

Home Page: https://pgaskin.net/kepubify

License: MIT License

Go 99.90% Shell 0.10%

kobo epub kepub golang go command-line-tool ebooks ebook conversion file-converter

kepubify's Introduction

I am currently studying Software Engineering at Queen's University.
I like working on backends, embedded systems, tool development, reverse engineering, cybersecurity
I know a bit about sysadmin, devops, networking, full-stack web development
I have dabbled with electronics, 3d-printing, pcb design
I am experienced with Go, JavaScript, TypeScript, Linux, C, HTML, CSS, Bash, ARMv7, Android, EPUB, Docker
I am familiar with Windows, C++, Java, Python, SQL, PromQL
I am learning Rust, x64, ARM64, PowerShell, Kotlin

Recently, I've been working on

Patches and extra features for Android apps like the Lithium EPUB Reader and VNC Viewer.
Custom i3wm i3status status bar implementation.
Stuff for Kobo e-Readers including patches, an integrated launcher, firmware update website/api, an ebook converter/web interface, dictionary tools, patching tools, small mods, toolchain builds, and hooking library.
Nicer automatically-generated schedules and calendar integration for the Queen's University ARC and other gyms using Innosoft Fusion.
Small apps for myself like a network throughput indicator, a battery status quick settings tile, and a port of the pixel windy live wallpaper.
Tools for extracting qt resources, chrome bookmarks, dictionaries, and other stuff.
Contributions to cmus, termux packages, pulseaudio, dnscontrol, and other projects.
Misc Go libraries like snappr, xmlwriter, and czlib.
Personal patches and builds for Android and OpenWrt.
Personal reverse engineering, web development, scraping, and tools.
Personal virtual machine and container tooling.
Server/network infrastructure/automation.
Creating and solving CTF challenges.
Security research.

You can find me on

Email
Website pgaskin.net
Discord @pgaskin
MobileRead Forums @geek1011

kepubify's Issues

Question about potential integration with calibre

While kepubify has been a dream regarding performance and accuracy, I miss the ability to send books to my device via the library management interface from calibre. I was wondering if perhaps there was any interest in developing a wrapper plugin around a kepubify install to send kepubs to the device via kepubify.

Option to add custom css to epub

Should support appending css to content files in the epub.

Footnotes give text a different color.

Hello,

I'm having the following problem when converting books, related to (references to) footnotes.

For some of my converted books, each first footnote in a chapter causes all the preceding text in that chapter to have a different color from the default color. After the first footnote, the text has the default color and any subsequent footnotes in that chapter work as intended.

The original epub with a footnote:

The converted epub with all the text preceding the first footnote in the chapter in a different color:

Series metadata updating

Add command to update the series on the Kobo.

Support adding series metadata to books before being imported

See the comments starting from shermp/Kobo-UNCaGED#22 (comment).

Converted file fails to open on Kobo H2O

I'm trying to convert a (rather simple, I think) epub file, but even after many modifications to the input file, the converted kepub fails to open, giving an "Oops! This document couldn't be opened" error on my Kobo Aura H2O. I've verified the input epub file with epubcheck 4.2.2 and it is fully compliant. I've tried playing with kepubify settings (smarten-punctuation, hyphenate) and it didn't help. The original epub transferred to the reader opens just fine.

Here are the files in question: epub+kepub

Support using .kepub extension

This would allow converting Calibre libraries in-place, and has been requested multiple times on MobileRead and in emails.

Have some way of limiting concurrency

When converting a large-ish ebook, I get the following error:

Kepubify v2.3.2: Converting 1 books
Output folder: xxx
[1/1] Converting 'xxx.epub'
  Error: open /var/folders/jp/5lygkj014j9_j3gcc9x1bt2w0000gn/T/kepubify425287431/OEBPS/page-539.xhtml: too many open files

1 total, 1 converted, 0 skipped, 1 errored

Errors:
  'xxx.epub': open /var/folders/jp/5lygkj014j9_j3gcc9x1bt2w0000gn/T/kepubify425287431/OEBPS/page-539.xhtml: too many open files

I'm guessing that it happens because kepubify processes the HTML files in parallel, without a limit. Unfortunately I'm not familiar with Go, so I can't patch it myself.
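One common way to cap the number of files open at once is a counting semaphore built from a buffered channel. The following is a minimal sketch of that pattern, not kepubify's actual code; the function name processAll, the limit parameter, and the file names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll runs fn on every path with at most limit calls in flight at once.
// It waits for all work to finish and returns the first error seen, if any.
// (Hypothetical helper for illustration; not part of kepubify.)
func processAll(paths []string, limit int, fn func(path string) error) error {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, p := range paths {
		sem <- struct{}{} // blocks while limit goroutines are already running
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := fn(p); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return firstErr
}

func main() {
	paths := []string{"page-1.xhtml", "page-2.xhtml", "page-3.xhtml"}
	err := processAll(paths, 2, func(p string) error {
		fmt.Println("processing", p)
		return nil
	})
	fmt.Println("done, err =", err)
}
```

With a limit well below the per-process file descriptor limit, each worker can open its file, transform it, and close it before the next one starts.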

Add xmlns:xlink namespace if missing

Some books are missing the xmlns:xlink namespace even when required. Kepubify should add it as part of the cleanup.

On windows, kepubify never stops

On Windows, when used as a command-line tool, the application never seems to stop.
I have to kill the process from the Task Manager (or send Ctrl+C in a command prompt window).

After examination of the code, it makes sense since I saw

if runtime.GOOS == "windows" {
    time.Sleep(5000 * time.Second)
}

Is this really normal? Because waiting 5000 seconds for the process to finish is quite long. :-)

Elements only containing nbsps are removed

Similar to #14

Support automatically choosing directory conversion output folder

Output to the original directory with the prefix _converted

Cross-compile seriesmeta for windows

seriesmeta should be available for windows (need to figure out how to get sqlite to cross-compile).

TOC entry always jump to first page/cover

hi,

I'm having an issue with the table of contents entries, which always jump to the first page when I press on them. This doesn't happen with .epub files, only with .kepub. Do you have any idea how to resolve the problem?

Convert whole folder

Homebrew formula

Thank you for making this tool. I’ve been looking for something like it for quite some time, and it seems like I finally found it. It seems to work well, and I’m eager to try it further.

Homebrew is a popular package manager on macOS, and I like to install all my tools through it, so I’ve made a formula. Anyone can now install kepubify with brew install vitorgalvao/kepubify/kepubify. I’m following the releases page for this repo so I know when a new version is out.

Making this issue to thank you for the tool and to inform about the formula, in case you want to include it in the README, or if you’d like to be the one to manage the formula yourself, officially (it takes little work and I can help you set everything up, if you’re interested).

Improve HTML parsing and manipulation

I'm rewriting the HTML manipulation code to fix the root cause of quite a few of the recent bugs, including #45, #36, #29, #28, #25, #21, and #2. These issues were caused by goquery (and thus kepubify) internally using golang.org/x/net/html, which is an HTML5 library. The parsing was fine for nearly all books, but it wasn't tolerant of self-closing non-void elements (this is what the spec requires, but it's not really a useful thing to do), which many XML generators emit when an element doesn't contain any children. The generated output was also usually fine, as it was valid XML (it did things that are optional for HTML5 but mandatory for XHTML: putting the closing /> on void elements, having an empty ="" on boolean attributes, using only a few named entities, etc.), but it caused a few issues by not escaping NBSPs and by messing up the XML declarations found in XHTML for EPUB2 books.

In addition, these changes will improve the performance and memory usage of kepubify (there will be far fewer string allocations and much less copying).

The majority of the XHTML/XML/HTML5 fixes have been done in my fork of golang.org/x/net/html: https://github.com/geek1011/net/commits?author=geek1011.

[linux] Still epub

Hi,
I tried kepubify on Ubuntu 17 yesterday.
Everything seems fine and the conversion runs, but when I check, my original epub file is still an epub file.
I'm very interested in a way to convert to kepub on Linux, and so far I have not found one.
Thanks

Fix hacky span adding code

Can be problematic with certain html files.

Add -vt (vertical text) as parameter to Output Japanese Ebooks

As the title indicates

Closes as I open it

Hi! The program closes three seconds after I open it, so I can't do anything. Even running it as admin makes no difference. Does anyone know how to fix it?

Kepubify v3

I'm planning to work on kepubify v3 sometime soon. The goals will be:

More predictable and intuitive output locations and options (this will be the main feature).
Improved CLI and logging.
More unit tests and code cleanup.
Automatically detect and sync with a kobo.
Improved XHTML parsing and generation.
Possible performance improvements.
Possible pre-importing to the kobo.
More comprehensive help text.
Possibly a GUI or installer with explorer integration for Windows.
The Calibre extension may be ready.
Possibly caching for performance improvements.
A better README.
More later.

If anybody has any suggestions/comments, I'd be happy to consider them.

Incorrect page displayed on e-reader

When I click on a link in the ToC of a book, my Kobo e-reader displays the wrong page number. For example, when I click on a link on page 2 that sends me to the middle of the book, my e-reader shows that the current page number is 3 (it should be something like 200).
Is this a known issue or is there a known workaround?

Thanks for making this btw.

covergen fails to find epubs

covergen finishes immediately without doing anything:

PS C:\Software\Kepubify> .\covergen-windows-64bit.exe --regenerate
Finding kobo reader
... Found Kobo Libra H2O at M:
Finding epubs
... Found 0 epubs
Generating covers
0 covers (0 books): 0 updated, 0 errored, 0 skipped, 0 without covers

There are 6.4 GB of epub files in M:\Books.

Wonder if this is related to mattn/go-zglob#27 somehow?

application/xhtml+xml files with .xml extension are not converted

Public domain books from Feedbooks use the .xml extension for "application/xhtml+xml" files. Kepubify does not add "kobospan" classes to HTML files with the .xml extension.

Is it possible to support this type of epub file?

Some of the books I tried:
http://www.feedbooks.com/book/92.epub
http://www.feedbooks.com/book/52.epub
http://www.feedbooks.com/book/81.epub
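One way to handle such books is to pick content documents by the media type declared in the OPF manifest instead of by file extension. The sketch below assumes that approach; it is not kepubify's implementation, and the names opfPackage, contentDocuments, and sampleOPF (with its item entries) are made up for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// opfPackage models only the parts of an OPF package document needed here.
type opfPackage struct {
	Manifest struct {
		Items []struct {
			Href      string `xml:"href,attr"`
			MediaType string `xml:"media-type,attr"`
		} `xml:"item"`
	} `xml:"manifest"`
}

// sampleOPF is a made-up manifest for illustration only.
const sampleOPF = `<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="c1" href="chapter1.xml" media-type="application/xhtml+xml"/>
    <item id="c2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
</package>`

// contentDocuments returns the hrefs of all manifest items declared as XHTML,
// whatever their file extension is.
func contentDocuments(opf []byte) ([]string, error) {
	var pkg opfPackage
	if err := xml.Unmarshal(opf, &pkg); err != nil {
		return nil, err
	}
	var hrefs []string
	for _, it := range pkg.Manifest.Items {
		if it.MediaType == "application/xhtml+xml" {
			hrefs = append(hrefs, it.Href)
		}
	}
	return hrefs, nil
}

func main() {
	hrefs, err := contentDocuments([]byte(sampleOPF))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hrefs)
}
```

Here chapter1.xml is selected alongside chapter2.xhtml because both are declared as application/xhtml+xml in the manifest.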

Whitespace is swallowed after italic

Hi,

kepub files produced with kepubify "swallow" whitespaces after words in italic.

For example, I used it with the French epub file found here:
https://www.ebooksgratuits.com/newsendbook.php?id=2610&format=epub

In the "Préface" (foreword), whenever the italic word Belgica is used, the following word becomes glued right next to it, without whitespace, for example:

...l'exploration de la Belgicaprécède...

instead of:

...l'exploration de la Belgica précède...

I am using version 1.3.5 under Linux. I have also built the current "dev" version from source, but it didn't help.

Thanks for this fantastic tool!

Cheers,
L.

Options to force enable and disable hyphenation

File masks don't work on Windows

Given command from the examples:
> kepubify -o "converted" *.epub
Error: scan input "*.epub": CreateFile *.epub: The filename, directory name, or volume label syntax is incorrect.

Kepubify doesnt start in Ubuntu 18.04

Hey geek1011, I'm having trouble starting your program: it doesn't run. I followed these steps:

Download kepubify
Open a terminal
Type cd ~/Downloads (or whatever location you downloaded kepubify to) and press enter
Type chmod +x kepubify-linux-* and press enter
I'm using the 64-bit version of Ubuntu Budgie 18.04. Did I miss something?

Improve sentence tokenization

Should use a tokenizer library to handle edge cases.

Self-closing title tag makes rest of page blank

I've had many emails about this, even though self-closing title tags are technically invalid. This issue is similar to readium/readium-sdk#81.

I've decided to fix this due to the number of complaints.
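As a rough illustration of one possible workaround, self-closing non-void elements can be expanded into explicit open/close pairs before the document reaches an HTML5 parser. This is only a sketch under that assumption and not necessarily how the fix was implemented; the function name expandSelfClosing and the choice of elements (title, script) are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// selfClosing matches self-closing <title/> and <script .../> tags.
// Group 1 is the element name, group 2 the (optional) attributes.
var selfClosing = regexp.MustCompile(`<(title|script)(\s[^<>]*?)?\s*/>`)

// expandSelfClosing rewrites e.g. <title/> as <title></title> so that an
// HTML5 parser does not treat the rest of the page as the element's content.
// (Hypothetical helper for illustration.)
func expandSelfClosing(xhtml string) string {
	return selfClosing.ReplaceAllString(xhtml, "<${1}${2}></${1}>")
}

func main() {
	fmt.Println(expandSelfClosing(`<head><title/><script src="js/kobo.js" /></head>`))
}
```

A regular expression is a blunt tool for markup; it is used here only to keep the example short.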

Update seriesmeta for firmware changes

A UUID for each series is now required. See https://www.mobileread.com/forums/showthread.php?p=3959731.

cc @davidfor @shermp

Option to fix fullscreen reading bugs without patching firmware

Based on: https://www.mobileread.com/forums/showpost.php?p=3113460&postcount=16

Self-closing script tag is expanded and causes blank pages

I have a book which uses kobo-specific JS for whatever odd reason, and uses a self-closing script tag. When running the book through kepubify, the script tag is expanded and is incorrectly wrapped around the entire page, causing blank pages. Here is a diff of a page provided by calibre's ebook-edit tool:

Seriesmeta documentation

I'm not exactly sure what this does, where and how it pulls the metadata, some info would be nice :)

Conversion issues on Windows

https://www.mobileread.com/forums/showthread.php?t=289040

Better CLI

Improvements for the CLI for the next major version:

Use GNU style flags
Allow converting multiple files from the command line
Allow dragging and dropping multiple files
Clean up output for readability
Add option to skip already converted books
Fix bug in batch conversion
Make it easier to extend the options

Kepubify runs out of memory on gigantic ebooks

I received a PM on MobileRead about a book (William Shakespeare Complete Works published by Modern Library) which was way bigger than anything I've seen before and causes kepubify to run out of memory during conversion.

Mac & Kepubify

For me, your Mac download produces a 64-bit Darwin .dms file. I don't think this is right, but I'm a novice terminal user. Could you please help me with the necessary steps on macOS 10.12.6? I know Calibre will work, but on the surface I think this should be a superior method.

epubcheck: Fatal Error while parsing file 'The entity "nbsp" was referenced, but not declared.'.

Solution: Replace &nbsp; with &#160; (or &#xA0; if you prefer hex) in the following code:
https://github.com/geek1011/kepubify/blob/c5af6bc6cfc350284665091a6ad16e6c29abd12e/kepub/content.go#L308
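XML predefines only five named entities (amp, lt, gt, apos, quot), so nbsp is an error in XHTML unless a DTD declares it, while a numeric character reference is always valid. A minimal sketch of the substitution, using a hypothetical helper name rather than the code at the link above:

```go
package main

import (
	"fmt"
	"strings"
)

// replaceNBSPEntity swaps the HTML-only named entity for its numeric
// character reference, which any XML parser accepts without a DTD.
// (Hypothetical helper for illustration.)
func replaceNBSPEntity(xhtml string) string {
	return strings.ReplaceAll(xhtml, "&nbsp;", "&#160;")
}

func main() {
	fmt.Println(replaceNBSPEntity("<p>a&nbsp;b</p>")) // prints <p>a&#160;b</p>
}
```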

Problem with kepub

Today for the first time I used the kepubify program.
After the conversion, I copied the file to the reader to see if the kepub format really is so great. Unfortunately, it turned out that the file looks worse than a regular epub.

EPUB (two pages)
KEPUB (two pages)

I also attach the source file.
Sample.zip

Use semver

Should start using semantic versioning.

Multi-threaded conversion

Add option for multi-threaded conversion.

Update screenshots

The screenshots are outdated.

New site

It's about time to design a new site. The current one looks cluttered.

No margins on generated kepub

Picture of the original epub
Picture of the converted kepub

Note that the "margin" setting has no influence on the kepub one.

Both were sideloaded to my kobo aura one. I tried with different kepubify parameters but it made no difference.

The book in question is the French translation of War and Peace from Project Gutenberg.
Conversion log:

version: v2.3.1

output: .
output-abs: /mnt/data/media/books/Tolstoy, Leo
help: false
update: false
verbose: true
css:
hyphenate: false
nohyphenate: false
inlinestyles: false

fullscreenfixes: false

file: Guerre et Paix I.epub
  file-result: /mnt/data/media/books/Tolstoy, Leo/Guerre et Paix I.epub -> /mnt/data/media/books/Tolstoy, Leo/Guerre et Paix I.kepub.epub

Kepubify v2.3.1: Converting 1 books
Output folder: /mnt/data/media/books/Tolstoy, Leo
[1/1] Converting '/mnt/data/media/books/Tolstoy, Leo/Guerre et Paix I.epub'
  i: /mnt/data/media/books/Tolstoy, Leo/Guerre et Paix I.epub
  o: /mnt/data/media/books/Tolstoy, Leo/Guerre et Paix I.kepub.epub
  e: false
  de: true
  Unpacking ePub
  Processing 18 content files
  ..................
  Cleaning content.opf
  Cleaning epub files
  Packing ePub

n: 1
converted: 1
skipped: 0
errored: 0
errs: []

1 total, 1 converted, 0 skipped, 0 errored

Also another quick question: is fullscreenfixes still necessary now that removing header and footer has been officially exposed by recent firmwares?

Converted epub sourced from HTML causes KA1 to crash

Hi there. I have a bit of an edge case, so I understand if this isn't something you support or want to look into.

I saved a plain HTML webpage from Firefox (The Org Mode Manual, located at https://orgmode.org/org.html), and then converted that single HTML page to an epub using pandoc. I then used kepubify to convert the epub to a kepub. After loading the book onto my Kobo Aura One, I cannot open the book without the entire device crashing and causing a reboot.

The content/source material is open source and should be fine to upload to this issue, or you can easily reproduce the steps yourself. Let me know if you'd like to pursue this issue and I can provide the files I used for the conversion at each step.

ARM binary 32bit or 64bit?

Hi, I can see you've produced an array of cross platform binaries, but I'm wondering if the arm binary is 32bit or 64bit?

With the advent of the RPi4 and the new 64-bit Raspberry Pi OS, I think there's a case for both a 32-bit ARM binary for the older Raspberry Pi models and a 64-bit ARM binary for the newer architecture, although I understand that would be more work for you. At the least, I think it would be helpful to clarify which ARM architecture the binary is for, as you currently do with the Linux version.

Thanks for reading and for all your work.

Cannot comment out XML start tag for files beginning with UTF-8 BOM (ef bb bf)

Version: v2.3.2 on Windows 10

I found that kepubify cannot comment out the XML start tag for files beginning with a UTF-8 BOM (ef bb bf). It worked well after I removed the UTF-8 BOM from the files.

Another strange thing is that it also works if I change the string "utf-8" in the start tag to uppercase. In other words, kepubify cannot comment out the start tag when the encoding is written as lowercase "utf-8", but it does comment it out when the encoding is written in uppercase, even if the file begins with a UTF-8 BOM.
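A check for the XML declaration that assumes the file starts with "<?xml" will miss files where three BOM bytes come first. The sketch below shows one way to skip a leading BOM before looking for the declaration; the helper names stripBOM and hasXMLDecl are hypothetical and this is not kepubify's code.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the UTF-8 encoding of the byte order mark U+FEFF.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// stripBOM removes a leading UTF-8 byte order mark, if present.
func stripBOM(b []byte) []byte {
	return bytes.TrimPrefix(b, utf8BOM)
}

// hasXMLDecl reports whether the data starts with an XML declaration
// once any leading BOM has been skipped.
func hasXMLDecl(b []byte) bool {
	return bytes.HasPrefix(stripBOM(b), []byte("<?xml"))
}

func main() {
	decl := `<?xml version="1.0" encoding="utf-8"?><html/>`
	withBOM := append(append([]byte{}, utf8BOM...), decl...)
	fmt.Println(hasXMLDecl(withBOM), hasXMLDecl([]byte(decl)))
}
```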

XML start tag commented out

The output has the xml tag wrong: 
