webtoepub's Introduction

WebToEpub

(c) 2015 David Teviotdale

Extension for Firefox and Chrome that converts Web Novels (and other web pages) into an EPUB. Works with many sites, including the following:

  • Baka-Tsuki.org
  • ArchiveOfOurOwn.org
  • blogspot (some)
  • mugglenet.com
  • FanFiction.net
  • gravitytales.com
  • hellping.org
  • krytykal.org
  • moonbunnycafe.com
  • nanodesu (some of the *thetranslation.wordpress.com sites)
  • royalroad.com
  • shikkakutranslations.org
  • sonako.wikia.com
  • wuxiaworld.com
  • rebirth.online
  • and many other sites

Credits

  • Firefox port by Markus Vieth
  • Michael Fox (Belldandu)
  • typhoon71
  • toshiya44
  • dreamer2908
  • Parser for German Project Gutenberg by GallusMax
  • Hogesyx
  • Asif Mahmood
  • snnsnn
  • Sergii Pravdzivyi
  • Aurimas Niekis
  • Tom Goetz
  • Alen Toma (css styling)
  • JimmXinu
  • gamebeaker (additional metadata, Library)
  • Kondeeza
  • Mathnerd314
  • Sickan90
  • Miracutor
  • Kiradien
  • Synteresis
  • Lej77
  • nandakishore2009 (Parsers for madnovel.com, www.panda-novel.com)
  • courli79
  • Dimava
  • alethiophile
  • Yoanhg421
  • Leone Jacob Sunil (ImLJS)
  • xRahul
  • Oleksii Taranenko
  • Naheulf
  • perishableloc
  • praschke
  • ImmortalDreamer
  • ktrin
  • nozwock

How to use with Baka-Tsuki:

  • Browse to a Baka-Tsuki web page that has the full text of a story.
  • Click on the WebToEpub icon on top right of the window.
  • Check story details are correct.
  • Select image to use for cover.
  • Click the "Pack EPUB" button.
  • Wait for progress bar to finish (indicating the images being downloaded) and the generated EPUB to be placed in your downloads directory.

How to use with Archive of Our Own:

  • Browse to first chapter of story you want.
  • Click on the WebToEpub icon on top right of the window.
  • Check story details are correct.
  • Click the "Pack EPUB" button.
  • Wait for progress bar to finish (indicating the additional chapters are being downloaded) and the generated EPUB to be placed in your downloads directory.

How to use with a site that has no specific parser:

See: https://dteviot.github.io/Projects/webToEpub_DefaultParser.html

How to create Parsers for new sites

For details on how to extend, see the following

How to install

from Chrome Web Store

with Firefox

on Android

  • Caution: I have not tested (and do not test) on Android. I've been told the following work, but I can't guarantee them.
  • Get yourself Kiwi browser, Yandex browser, or Firefox Nightly (not the default Firefox branch, which does not support extensions).
  • Install from the Chrome Web Store for Kiwi and Yandex, or from Mozilla addons for Firefox Nightly (links above).

How to install from Source (for people who are not developers)

Firefox

The easiest set of steps is using Firefox.

  1. Download prebuilt Firefox version of extension from https://drive.google.com/drive/folders/1B_X2WcsaI_eg9yA-5bHJb8VeTZGKExl8?usp=sharing.
  2. Open Firefox and type "about:debugging#/runtime/this-firefox" into the URL bar.
  3. Click "Load Temporary Add-on".
  4. Click on the zip file you downloaded in step 1.

Chrome

  1. Download prebuilt Chrome version of extension from https://drive.google.com/drive/folders/1B_X2WcsaI_eg9yA-5bHJb8VeTZGKExl8?usp=sharing.
  2. Unpack zip file
  3. Open Chrome and type "chrome://extensions" into the browser.
  4. Make sure "Developer Mode" at the top of the page is checked.
  5. Press the "Load unpacked extension.." button and browse to the unpacked zip directory from step 2.

How to install from Source (for developers)

  1. Clone this repo
  2. Build extension. See "To run Eslint (and build the plugin)" in "Other notes" below.
  3. Install extension in browser of choice, using instructions above.

License information

Licensed under GPLv3.

WebToEpub uses the following libraries:

Other notes

To run Eslint (and build the plugin)

  • Install Node.js (if not already installed)
  • Run npm install to install dependencies
  • Run npm run lint to build plugin and lint
  • This will produce 3 files in the eslint directory.
    • WebToEpub0.0.0.x.xpi (Firefox version of plug-in.)
    • WebToEpub0.0.0.x.zip (Chrome version of plug-in.)
    • packed.js
  • Lint tests are OK if output ends with Wrote Zip to disk; Done in XXXs.

To run unit tests

  • Install Node.js (if not already installed)
  • Run npm install to install dependencies
  • Run npm test
  • Tests will be launched in your default browser. To run them in a different browser, open the page URL in that browser.

To run unit tests without Node.js

  • If you are not trying to run the unit tests in the /unitTest/ folder, you do not need this.
  • If you can use Node.js, see the previous section instead.
  • If you cannot install Node.js, or http-server does not work when you run npm test, and you have no other way to serve files (such as the Chrome Web Server app), you must allow the browser to load local HTML files in order to run the tests.

To run unit tests (without local server) under Chrome

  • Close all running copies of Chrome
  • Start Chrome with command line argument --allow-file-access-from-files. That is:
    • Open a command prompt
    • Browse to the directory holding Chrome
    • Type in the command chrome.exe --allow-file-access-from-files and press "Enter".
    • If you don't do this, some tests will fail with error messages containing the text Failed to execute 'send' on 'XMLHttpRequest': Failed to load.
  • Load unitTest/Tests.html
  • If you get Failed to read the 'localStorage' property from 'Window': Access is denied for this document errors
    • Type chrome://settings/content into Chrome's search bar
    • Uncheck Block third-party cookies and site data
    • Click Finished
    • Re-run unit tests
  • When finished with unit tests:
    • Restore original value of Block third-party cookies and site data (if you changed it).
    • Close all running copies of Chrome

To run unit tests (without local server) under Firefox

  • Start Firefox
  • Go to about:config
  • Find security.fileuri.strict_origin_policy parameter
  • Set it to false
  • Load unitTest/Tests.html
  • Remember to reset security.fileuri.strict_origin_policy to true when done.

webtoepub's Issues

Check baka-tsuki pages for a specific tag or other

I think it would be a good idea to check for some tag that's always on the volume pages, and if that tag doesn't exist, show the user a message saying "Hey, this is not a book page" instead of throwing meaningless errors about hrefs and other things being missing (they are obviously missing, since the parent element is nowhere on the page).

Add wayback machine archival support.

Since the wayback machine tries to preserve the integrity of each site there should be no real change to any parser stuff except for maybe small parts.
parserFactory.register("web.archive.org/web/*someregexthatallowsnumbersonly*/http://www.baka-tsuki.org", function() { return new BakaTsukiParser() });

This is just a suggestion and it would be neat to have implemented. I emailed you already about this so yeah.
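
A rough sketch of an alternative to registering one pattern per archived site: strip the Wayback Machine prefix first, then look up the parser by the original URL. The helper name stripWaybackPrefix and the regular expression are assumptions made for illustration; they are not part of WebToEpub. The sketch assumes the timestamp segment contains digits only, as suggested above.

```javascript
// Hypothetical helper (not WebToEpub's API): returns the original URL
// wrapped by a Wayback Machine snapshot URL, or the input unchanged.
function stripWaybackPrefix(url) {
    // Snapshot URLs look like https://web.archive.org/web/<digits>/<original URL>
    const match = /^https?:\/\/web\.archive\.org\/web\/\d+\/(.+)$/.exec(url);
    return match ? match[1] : url;
}

const archived = "https://web.archive.org/web/20140803022146/http://www.baka-tsuki.org/project/index.php?title=Toaru_Majutsu_no_Index:Volume1";
console.log(stripWaybackPrefix(archived));
```

With this approach the existing registration for www.baka-tsuki.org could be reused unchanged, at the cost of one extra step before the parser lookup.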

Add website specific code for hosting on a server

requested by amit34521 over at Baka-Tsuki

Can someone add a similar generator for the mobile users or some other option for the mobile users who used the epub generator before???

Simply add an index.html file and call the required files and add more javascript if need be.

I can easily host it on my dedi when this is done :D

Spurious link after images with some readers (Baka-Tsuki)

In sumatrapdf there's a link after every image.
Happens to both the images at the start and those in the middle of chapters.
Doesn't happen in calibre (or sigil, but that's not a reader).
I checked in icecream too: the starting images are replaced by a bunch of links, except the cover, which is fine; the images in the middle of chapters are fine too.

Failure to fetch image stops fetch of rest of images

(Note for me.)
Sometimes the attempt to get the high resolution version of an image fails. (e.g. WayBackMachine did not preserve the file.)
In this case, the HTTP exception that is thrown aborts the "fetch all images" operation.

Need an option to say "fetch rest of images on image failure". In which case, use the thumbnail image on the original page and try fetching the rest of the images.

BakaTsuki - Content list (table of contents)

Small fix needed.
If the table of contents is hidden while viewing a BakaTsuki novel, the plugin will show a page like this.

[screenshot]

Can an option be added to remove it completely if it's hidden?
And maybe another option to automatically "show" it instead?

Suggestion / wild idea

Maybe you already had this idea, but I was thinking: what about getting some well formatted epubs, even hand made, and check them to see how they're actually formatted?
That could help with getting solutions/possibilities on how to build them.
Too wild?

Feature Request: Use URL to specify Cover image

(Request by "Guest")

On the topic of cover issues with the extension, could I make a suggestion, to allow the setting of covers using an image from any URL instead of just those available on the page?

For example, take this page

https://www.baka-tsuki.org/project/index.php?title=The_Zashiki_Warashi_of_Intellectual_Village:Volume9

As visible on that page itself (and therefore available for packing by the extension), the closest one can get to a cover would be

https://www.baka-tsuki.org/project/images/3/31/Zashiki_v09_000.jpg

However, this is too wide as it includes the front cover, the spine, and the back cover as well.
If on the other hand one were to check the main series page,

https://www.baka-tsuki.org/project/index.php?title=The_Zashiki_Warashi_of_Intellectual_Village

there is a much better option available to act as a cover, not present on the volume's full text page.

https://www.baka-tsuki.org/project/images/3/30/Zashiki_Volume_9_Cover.jpg

As-is, the extension does not allow for setting this as the cover, and therefore the epub needs to be manually tweaked after the fact to replace the cover.

Since I'm uncertain if there is any practical easy one-size-fits-all fix to somehow magically detect the presence of a cover image on a page other than the one being viewed, then a solution could be to allow entering an image URL to fetch a specific image to act as cover.

Make UI look nice

The popup on firefox looks a little goofy compared to chrome. Suggest styling it a bit with some css hacks if need be.

Calibre - Images are cut between pages in 2 pages mode

Yeah, I know what you guys are going to say... but I still need to report this one.
Basically I experience this:

[screenshot: glitch]

Calibre bugtracker has this entry "https://bugs.launchpad.net/calibre/+bug/1293102", but it's tagged as "wontfix"; would it be possible to "fix" it on this side?
I don't know if the solution suggested in that post will impact the /div removing thing.
WebToEpub is version 0.0.0.9 on Chrome.

The strange thing is that I remember it working flawlessly, and I didn't update calibre after that, since I had just updated it before.

Icon

I noticed there was something along the lines of changing the icon, so... I suggest this one:

http://www.flaticon.com/free-icon/books_150360

It's the one I use on Firefox; since you already have to unzip, zip, change extension... one can just replace the icon .png. The strange thing is that the default icon is OK-ish on Chrome, but renders horribly on Firefox.

Advanced Options button

(note for self)
Put "Advanced Options" button to right of progress bar.
Button is only visible if Parser advertises that it has advanced options.
If clicked, additional options are put on dialog.
e.g.

  • Fetch Highest Resolution images. (default on)
  • Add List of source URLs to end of ePUB. (default off) No need for this to be visible to most readers. Instead is encoded in <source> elements in contents.opf.
  • URL for cover image
  • Optional stylesheet
  • Include source URL in images (default on)
  • Move "Remove Duplicate Images" to advanced options. (default off)
  • Setting needed for retrying on non-404 error pages (default off) and allowing setting of max retries (default 3). Proved too difficult to implement. Instead, tell the user there was a problem and give the option to retry.

Also note

  • Settings need to be saved to localStorage so they can be remembered between invocations.
  • Also, need to preserve settings when generator updates itself from Chrome store. Seems to do this automatically.
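
A minimal sketch of how such settings might be persisted. The option names, the storage key and the helper functions are all hypothetical; the defaults are taken from the list above. The storage argument is anything with getItem/setItem, so window.localStorage can be passed in the extension and an in-memory stand-in can be used elsewhere.

```javascript
// Hypothetical defaults, mirroring the option list above.
const DEFAULT_OPTIONS = {
    fetchHighestResolutionImages: true,
    addSourceUrlList: false,
    includeSourceUrlInImages: true,
    removeDuplicateImages: false
};

// Reads saved options and fills in defaults for anything missing.
function loadOptions(storage, key = "advancedOptions") {
    const raw = storage.getItem(key);
    let saved = {};
    if (raw !== null) {
        try {
            saved = JSON.parse(raw);
        } catch (e) {
            saved = {}; // corrupt data: fall back to defaults
        }
    }
    return Object.assign({}, DEFAULT_OPTIONS, saved);
}

function saveOptions(storage, options, key = "advancedOptions") {
    storage.setItem(key, JSON.stringify(options));
}

// In-memory stand-in for localStorage, for use outside a browser.
function createMemoryStorage() {
    const data = new Map();
    return {
        getItem: (k) => (data.has(k) ? data.get(k) : null),
        setItem: (k, v) => { data.set(k, String(v)); }
    };
}

const demoStorage = createMemoryStorage();
saveOptions(demoStorage, Object.assign(loadOptions(demoStorage), { removeDuplicateImages: true }));
console.log(loadOptions(demoStorage));
```

Merging saved values over the defaults means an option added in a later version still gets its default for users with older saved settings.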

br tags

In BakaTsukiParser.js it says,
// discard br tags as epubcheck says they are invalid in the places they are at in xhtml
util.removeElements(util.getElements(element, "br"));

br tags are not invalid, they only need to be closed properly in xhtml, like <br/>
In Baka-Tsuki pages br tags show up as <br>, that's why epubcheck didn't like it.
Can you please replace all <br> by <br/> instead of removing it?
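
A small sketch of the requested replacement, done on the markup as a string. The function name closeBrTags is made up for this example, and it only handles br tags without attributes.

```javascript
// Hypothetical helper: rewrites unclosed <br> tags as self-closed <br/>.
// Tags that are already written as <br/> are left untouched.
function closeBrTags(html) {
    return html.replace(/<br\s*>/gi, "<br/>");
}

console.log(closeBrTags("line one<br>line two<BR >line three<br/>end"));
```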

Firefox - WebToEpub breaks if noscript is installed

OK, I investigated and found that WebToEpub breaks if noscript is installed (Firefox 49.0a2).
I tested this with a clean profile, and as soon noscript is installed, WebToEpub gives this error:

[screenshot]

Since noscript doesn't break any other plugin I have I'm posting this issue here.

Improve memory utilization. Stream content directly to EPUB file.

Currently the plugin downloads all content (chapters and images) before packing them into the EPUB. This works OK at the moment, but may cause problems later when processing very large items.
Should re-architect so that each piece of content is packed into the EPUB file on disk as it's downloaded.
Note, this may need to wait until Chrome 52, which implements streams.

Add Wuxiaworld support.

-If it's possible- I'd like you to add Wuxiaworld support, since there are some nice novels there.
The website has project pages like "www.wuxiaworld.com/wmw-index/", it should be possible to work with.
Thanks anyway.

Code restructuring

  • Move image handling code from BakaTsukiParser to BakaTsukiImageCollector.
  • Remove "BakaTsuki" prefix from Image class names.
  • Merge BakaTsukiEpubItemSupplier into EpubItemSupplier.
  • Use "class" and "extends" to do inheritance.
  • Parsers should probably use Promise.resolve() in getChapterUrls()
  • Replace chapters object with "FetchList" and chapter with "WebPage"
  • Replace XmlHttpRequest with fetch().
  • Move the HttpClient fetch simulation from UtestWattpadParser into its own file.
  • Move code for splitting a web page's content into multiple EPUB chapters into its own class (as it's now used by 3 other parsers.)

Remove duplicate images from image gallery at start of Web Page

Many Baka-Tsuki web pages have an image gallery at the start of the web page.
Some of these images also appear in the story text.
Provide an option to have the ePUB generator remove any images in the gallery that also appear in the text.

Suggested implementation notes.

    • BakaTsukiImageCollector.prototype.findImagesUsedInDocument() is used to do an initial scan of the document, collecting information on each image. This scan should be updated to check if image element is nested inside a list element of class gallerybox. If image element is not nested inside gallerybox, add flag to imageInfo that it appears outside gallery.
    • Later on, BakaTsukiParser.prototype.processImages() is used to replace the image elements. This should be updated so that if an image element is nested inside a gallerybox, and the same image also appears outside the gallery, the element is removed.

(Obviously, the above processing needs to be done BEFORE the images in the gallerybox are “flattened” to be outside the box.)
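
The two passes described above can be sketched on plain data, leaving the DOM handling aside. The record shape ({ src, inGallery }) and the function name are assumptions for illustration and do not match the real imageInfo objects.

```javascript
// Each record describes one image occurrence on the page (hypothetical shape).
function removeGalleryDuplicates(images) {
    // First pass: note every image source that appears outside the gallery.
    const usedOutsideGallery = new Set(
        images.filter(img => !img.inGallery).map(img => img.src)
    );
    // Second pass: drop gallery entries whose source also appears in the text.
    return images.filter(img => !(img.inGallery && usedOutsideGallery.has(img.src)));
}

const sampleImages = [
    { src: "cover.jpg", inGallery: true },
    { src: "insert1.jpg", inGallery: true },
    { src: "insert1.jpg", inGallery: false }
];
console.log(removeGalleryDuplicates(sampleImages));
```

Images that appear only in the gallery (cover.jpg here) are kept, so nothing is lost from the EPUB.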

RoyalRoad missing chapter names

I noticed that stuff grabbed from RoyalRoad is missing chapter names.
Check "http://royalroadl.com/fiction/1233".
It happened after the removal of the "Try the beta reader" line (sonako 16/07/2016).

At the end of the chapter(s) there is an incomplete link to the next chapter on RoyalRoad web (arrow thing), maybe that should be removed too.

Errors in opf

The metadata, manifest, spine and guide sections in the opf have xmlns="" inserted in them.

<metadata xmlns="" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<manifest xmlns="">
<spine xmlns="" toc="ncx">
<guide xmlns="">

Again, calibre can still open the book, but I guess it might be an issue for other lightweight apps.
(using Firefox 49.0a2)

BakaTsuki - Empty page after Image gallery

In calibre, after the images at the start of an epub, there's a white page, just before the novel starts.
Can you check if it's caused by the reader or not? It's not supposed to be there.
I can't try with sumatra because there's the link issue still around (so the page isn't empty and I don't know if it's on calibre side).

Firefox Beta channel

I would like to have the betas of webtoepub available on the relevant Mozilla addon channel (beta); since they are not scrutinized by Mozilla as closely as the release ones, it's possible to release them more frequently.
It should be useful for testing and stuff.

Firefox - Keep compatibility with non-WebExtension Browser

If it's possible and not too much work, I suggest keeping compatibility with browsers that are not WebExtension "enabled".
This includes Firefox 47, which is the latest stable release (Firefox 48, which supports WebExtensions, should be out in August), and other forks like Palemoon.
I won't ask for other Chrome derivatives because they should already be working (Iron does work).

BakaTsuki epub formatting

First of all, thanks for this extension, for which I'd like to suggest some improvements:

  1. Download the hi-res images when available from BakaTsuki ("original image"), not the resized ones, let the epub reader scale it [for better quality].
  2. Put one image per page [for cleaner layout and easier printing].
  3. Improve formatting; if you compare the now dead BakaTsuki epub Generator output and the output of this plugin you will notice the output is messy. A good example is the "content" table appearing, or the text spacing being messy [for cleaner layout].

I suppose that if the old code could do it, it should be possible to do it now too.
I hope I don't sound annoying, it's not my intention; I'm asking since the web thing was really handy and had good results quality-wise.

Packing epub fails

Uhm, I got this today, while trying out the 0.0.0.6 release.

[screenshot]

I tried the Zashikiwarashi vol 9 before this one, and it did pack it.

Btw: I tried with 0.0.0.7 (sonako) manually installing it, the same happens both on firefox and chrome.
[google store has 0.0.0.6 avail right now, as for mozilla I can't even find it]

Remove "empty" <div> sections.

"Guest" wrote:

Using the "remove duplicate images" option leaves a bunch of empty

<div>

        </div>
<div>

        </div>

scattered around in the illustrations html.
I mean, I guess it doesn't really affect the user-facing result so it's hardly high priority, but it does seem a wee bit untidy.
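
A small sketch of one way to tidy these up on the serialized markup. The function name is hypothetical, and it only matches div elements without attributes that contain nothing but whitespace.

```javascript
// Hypothetical helper: removes <div> elements that contain only whitespace.
// Repeats until nothing changes, so a div that becomes empty after its
// empty children are removed is also dropped.
function removeEmptyDivs(xhtml) {
    let previous;
    do {
        previous = xhtml;
        xhtml = xhtml.replace(/<div>\s*<\/div>\s*/g, "");
    } while (xhtml !== previous);
    return xhtml;
}

const illustrations = "<p>a</p><div>\n\n        </div>\n<div>\n\n        </div>\n<p>b</p>";
console.log(removeEmptyDivs(illustrations));
```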

Add Calibre Series Metadata

I think it would be nice if the extension could include Series metadata in the generated epub file. It should be quite easy to parse the series name from the URL (at least on Baka Tsuki), and the series metadata is pretty useful for ebook library management in Calibre and other such tools. Some e-book reading devices can even make use of the series metadata to automatically assign books into collections (at least the Kobo Aura H2O when managed with Calibre).

Surprisingly, I have not been able to find information about what the series metadata should look like in the epub spec (well, I haven't really looked very deeply), but this is how the ebook-meta tool provided by Calibre does it:

ebook-meta foo.epub --series bar_series

It can also be used to inspect epub metadata, including series information, by just passing it the name of an epub file. In any case, checking what the epub file looks like after being modified by the ebook-meta tool might provide insight into how series metadata works.

While I can't really help with extension development, I can certainly help with testing of this if needed. :)
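
A sketch of what this could look like. It assumes, without having checked against the spec, that calibre records series information in the OPF as meta elements named calibre:series and calibre:series_index; that should be confirmed by inspecting a file written by ebook-meta as suggested above. The function names and the URL pattern ("title=Series_Name:VolumeN") are also assumptions.

```javascript
// Hypothetical helpers; none of these names come from WebToEpub.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Guesses series name and volume number from a Baka-Tsuki style URL whose
// "title" parameter looks like "Series_Name:Volume9". Returns null otherwise.
function parseSeriesFromUrl(url) {
    const title = new URL(url).searchParams.get("title");
    const match = title && /^(.+):Volume[_ ]?(\d+)$/.exec(title);
    if (!match) {
        return null;
    }
    return { name: match[1].replace(/_/g, " "), index: Number(match[2]) };
}

// Builds the (assumed) calibre-style meta elements for the OPF metadata.
function buildSeriesMeta(series) {
    return [
        `<meta name="calibre:series" content="${escapeXml(series.name)}"/>`,
        `<meta name="calibre:series_index" content="${series.index}"/>`
    ].join("\n");
}

const series = parseSeriesFromUrl("https://www.baka-tsuki.org/project/index.php?title=The_Zashiki_Warashi_of_Intellectual_Village:Volume9");
console.log(buildSeriesMeta(series));
```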

Unit test of ArchiveOfOurOwnParser.getChapterUrls() fails.

Initial investigation shows problem is setting element is not working as expected when the test file is loaded asynchronously from local file.
Fix is not urgent, as value is set correctly when loading from network. (In this case, is only used to try and simulate loading from network for test.)
Note, also works when file is loaded synchronously.

Epubcheck playOrder errors

This is the last remaining error that I have so far been unable to fix

[kami@Index ~]$ epubcheck Toaru_Maju...dexVolume1.epub 
Validating using EPUB version 2.0.1 rules.
ERROR(RSC-005): Toaru_Maju...dexVolume1.epub/OEBPS/toc.ncx(1,884): Error while parsing file 'different playOrder values for navPoint/navTarget/pageTarget that refer to same target'.
ERROR(RSC-005): Toaru_Maju...dexVolume1.epub/OEBPS/toc.ncx(1,1080): Error while parsing file 'different playOrder values for navPoint/navTarget/pageTarget that refer to same target'.
ERROR(RSC-005): Toaru_Maju...dexVolume1.epub/OEBPS/toc.ncx(1,2006): Error while parsing file 'different playOrder values for navPoint/navTarget/pageTarget that refer to same target'.
ERROR(RSC-005): Toaru_Maju...dexVolume1.epub/OEBPS/toc.ncx(1,2191): Error while parsing file 'different playOrder values for navPoint/navTarget/pageTarget that refer to same target'.
ERROR(RSC-005): Toaru_Maju...dexVolume1.epub/OEBPS/toc.ncx(1,2735): Error while parsing file 'different playOrder values for navPoint/navTarget/pageTarget that refer to same target'.
ERROR(RSC-005): Toaru_Maju...dexVolume1.epub/OEBPS/toc.ncx(1,2923): Error while parsing file 'different playOrder values for navPoint/navTarget/pageTarget that refer to same target'.

Check finished with errors

epubcheck completed

This is the toc.ncx

<?xml version='1.0' encoding='utf-8'?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="en">

    <head>
        <meta content="https://web.archive.org/web/20140803022146/http://www.baka-tsuki.org/project/index.php?title=Toaru_Majutsu_no_Index:Volume1" name="dtb:uid" />
        <meta content="2" name="dtb:depth" />
        <meta content="0" name="dtb:totalPageCount" />
        <meta content="0" name="dtb:maxPageNumber" />
    </head>
    <docTitle><text>Toaru Majutsu no Index:Volume1</text></docTitle>
    <navMap>
        <navPoint id="body0001" playOrder="1">
            <navLabel><text>Novel Illustrations</text></navLabel>
            <content src="Text/0000_Novel_Illustrations.xhtml" />
        </navPoint>
        <navPoint id="body0002" playOrder="2">
            <navLabel><text>Prologue: The Tale of the Illusion Killer Boy. The_Imagine-Breaker.</text></navLabel>
            <content src="Text/0001_Prologue_T...ne-Breaker.xhtml" />
        </navPoint>
        <navPoint id="body0003" playOrder="3">
            <navLabel><text>Chapter 1: The Magician Lands on the Tower. FAIR,_Occasionally_GIRL.</text></navLabel>
            <content src="Text/0002_Chapter_1_...nally_GIRL.xhtml" />
            <navPoint id="body0004" playOrder="4">
                <navLabel><text>Part 1</text></navLabel>
                <content src="Text/0002_Chapter_1_...nally_GIRL.xhtml" />
            </navPoint>
            <navPoint id="body0005" playOrder="5">
                <navLabel><text>Part 2</text></navLabel>
                <content src="Text/0003_Part_2.xhtml" />
            </navPoint>
            <navPoint id="body0006" playOrder="6">
                <navLabel><text>Part 3</text></navLabel>
                <content src="Text/0004_Part_3.xhtml" />
            </navPoint>
            <navPoint id="body0007" playOrder="7">
                <navLabel><text>Part 4</text></navLabel>
                <content src="Text/0005_Part_4.xhtml" />
            </navPoint>
            <navPoint id="body0008" playOrder="8">
                <navLabel><text>Part 5</text></navLabel>
                <content src="Text/0006_Part_5.xhtml" />
            </navPoint>
            <navPoint id="body0009" playOrder="9">
                <navLabel><text>Part 6</text></navLabel>
                <content src="Text/0007_Part_6.xhtml" />
            </navPoint>
            <navPoint id="body0010" playOrder="10">
                <navLabel><text>Part 7</text></navLabel>
                <content src="Text/0008_Part_7.xhtml" />
            </navPoint>
        </navPoint>
        <navPoint id="body0011" playOrder="11">
            <navLabel><text>Chapter 2: The Illusionist Bestows Demise. The_7th-Egde.</text></navLabel>
            <content src="Text/0009_Chapter_2_...e_7th-Egde.xhtml" />
            <navPoint id="body0012" playOrder="12">
                <navLabel><text>Part 1</text></navLabel>
                <content src="Text/0009_Chapter_2_...e_7th-Egde.xhtml" />
            </navPoint>
            <navPoint id="body0013" playOrder="13">
                <navLabel><text>Part 2</text></navLabel>
                <content src="Text/0010_Part_2.xhtml" />
            </navPoint>
            <navPoint id="body0014" playOrder="14">
                <navLabel><text>Part 3</text></navLabel>
                <content src="Text/0011_Part_3.xhtml" />
            </navPoint>
            <navPoint id="body0015" playOrder="15">
                <navLabel><text>Part 4</text></navLabel>
                <content src="Text/0012_Part_4.xhtml" />
            </navPoint>
        </navPoint>
        <navPoint id="body0016" playOrder="16">
            <navLabel><text>Chapter 3: The Grimoire Peacefully Smiles. "Forget_me_not."</text></navLabel>
            <content src="Text/0013_Chapter_3_...get_me_not.xhtml" />
            <navPoint id="body0017" playOrder="17">
                <navLabel><text>Part 1</text></navLabel>
                <content src="Text/0013_Chapter_3_...get_me_not.xhtml" />
            </navPoint>
            <navPoint id="body0018" playOrder="18">
                <navLabel><text>Part 2</text></navLabel>
                <content src="Text/0014_Part_2.xhtml" />
            </navPoint>
            <navPoint id="body0019" playOrder="19">
                <navLabel><text>Part 3</text></navLabel>
                <content src="Text/0015_Part_3.xhtml" />
            </navPoint>
            <navPoint id="body0020" playOrder="20">
                <navLabel><text>Part 4</text></navLabel>
                <content src="Text/0016_Part_4.xhtml" />
            </navPoint>
        </navPoint>
        <navPoint id="body0021" playOrder="21">
            <navLabel><text>Chapter 4: The Exorcist Chooses the End. (N)Ever_Say_Good_bye.</text></navLabel>
            <content src="Text/0017_Chapter_4_...y_Good_bye.xhtml" />
        </navPoint>
        <navPoint id="body0022" playOrder="22">
            <navLabel><text>Epilogue: The Conclusion of the Index of Prohibited Books Girl. Index-Librorum-Prohibitorum.</text></navLabel>
            <content src="Text/0018_Epilogue_T...ohibitorum.xhtml" />
        </navPoint>
        <navPoint id="body0023" playOrder="23">
            <navLabel><text>Afterword</text></navLabel>
            <content src="Text/0019_Afterword.xhtml" />
            <navPoint id="body0024" playOrder="24">
                <navLabel><text>Translator's Notes</text></navLabel>
                <content src="Text/0020_Translators_Notes.xhtml" />
            </navPoint>
            <navPoint id="body0025" playOrder="25">
                <navLabel><text>Alternate Translations</text></navLabel>
                <content src="Text/0021_Alternate_...anslations.xhtml" />
            </navPoint>
        </navPoint>
    </navMap>
</ncx>

This is probably the only thing so far I'm stumped by. If anyone wants to enlighten me as to what's wrong here as far as the playOrder is concerned, I'm all ears (and eyes).
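
One possible reading of the error: in the toc.ncx above, some navPoints share a target. For example, body0003 and body0004 both point at Text/0002_Chapter_1_...nally_GIRL.xhtml but carry playOrder 3 and 4. If that is what epubcheck objects to, giving navPoints with the same target the same playOrder should clear it. The sketch below works on a flat, document-ordered list of { label, src } records; that shape and the function name are assumptions, not the generator's real data structures.

```javascript
// Assigns playOrder so that navPoints sharing a target share a value.
// Values start at 1 and increase by one for each new target encountered.
function assignPlayOrder(navPoints) {
    const orderBySrc = new Map();
    return navPoints.map(point => {
        if (!orderBySrc.has(point.src)) {
            orderBySrc.set(point.src, orderBySrc.size + 1);
        }
        return Object.assign({}, point, { playOrder: orderBySrc.get(point.src) });
    });
}

const navPoints = [
    { label: "Chapter 1", src: "Text/0002_Chapter_1.xhtml" },
    { label: "Part 1", src: "Text/0002_Chapter_1.xhtml" },
    { label: "Part 2", src: "Text/0003_Part_2.xhtml" }
];
console.log(assignPlayOrder(navPoints).map(p => p.playOrder));
```

An alternative would be to make the targets distinct, for instance by pointing "Part 1" at a fragment inside the chapter file.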

Upgrade JSZip library to 3.0.0

From Firefox review of plugin.

Note that old versions of JSZip have known issues which might make extract zip files created with it impossible. Before requesting full review, please upgrade to the latest version.

Also note:

  • Update readme to include the version of JSZip library
  • Use a Git Submodule to bring in the JSZip library, don't include the minified source.

Option for whether or not the main file name is concatenated

As the title says, this will NOT affect chapter names and image names; it will affect only the main epub file name.

I'm thinking of adding this to #14 but I was looking for opinions on whether it's a good idea or not.

If implemented, concatenation will be on by default, and users who want it off can turn it off from advanced options.

Change popup for window

First great work with this :)

The popup closes every time we change tabs, which is a pain if we need to change multiple parameters whose values are in another tab.
So is it possible to add a real window/tab instead of a popup? Or maybe both?
Another possibility would be to store already modified parameters in localStorage, to avoid losing them every time the popup closes.

Calibre has problems with three dots in image filenames

Reported by dreamer2908

Ugh, apparently Calibre has problems with three dots in image filenames if I enable --smarten-punctuation option. Maybe I should report it to them when I stop being lazy.

Maybe replace the three dots (which are supposed to indicate an ellipsis) with something else. Not sure what would be best. Some options are:

  • underscore(s)
  • hyphen(s)
  • maybe '-xxx-'
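
A sketch of the replacement with the marker left configurable, so any of the options above can be tried. The function name and the sample file name are made up for illustration.

```javascript
// Hypothetical helper: replaces each run of three or more dots in a
// file name with the given marker. Single dots (e.g. before the
// extension) are left alone.
function replaceEllipsis(fileName, marker = "_") {
    return fileName.replace(/\.{3,}/g, marker);
}

console.log(replaceEllipsis("Zashiki_v0...00.jpg"));
console.log(replaceEllipsis("Zashiki_v0...00.jpg", "-xxx-"));
```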

Missing namespace?

This file has no namespace. Its namespace must be http://www.w3.org/1999/xhtml. Set the namespace by defining the xmlns attribute on the element, like this

This is the error calibre's editor tells me. It asks me to put <html xmlns="http://www.w3.org/1999/xhtml"> instead of writing just <html> in the xhtml or html files in the book.

Also, every single paragraph and heading tag has xmlns="http://www.w3.org/1999/xhtml" written inside it. Sample,
<p xmlns="http://www.w3.org/1999/xhtml">
<h2 xmlns="http://www.w3.org/1999/xhtml">
<h3 xmlns="http://www.w3.org/1999/xhtml">

Calibre can still open the book, but I guess it might be an issue for other lightweight apps.
(using Firefox 49.0a2)

Invalid id in ncx

IDs must start with a letter but in the toc only 4 digit numbers are used.
Please replace them with something like toc_[4digitnumberhere] for maximum compatibility.
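
A one-function sketch of the suggested id format. The function name makeNcxId is hypothetical.

```javascript
// Hypothetical helper: builds an id that starts with a letter,
// e.g. 1 -> "toc_0001".
function makeNcxId(index) {
    return "toc_" + String(index).padStart(4, "0");
}

console.log(makeNcxId(1), makeNcxId(25));
```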

Metadata improvements

Please add field to enter translator's name.
Split author's name input field into lastname-firstname parts and update the way it's recorded in the opf.
<dc:creator opf:file-as="last, first" opf:role="aut">first last</dc:creator>
As for the translator's name:
<dc:contributor opf:file-as="name" opf:role="trl">name</dc:contributor>

Edit: Since there's no response from dteviot yet, I'll add one more metadata issue.
The meta cover tag has content before name.
Current: <meta content="image0000" name="cover"/>
Should be: <meta name="cover" content="image0000"/>

Baka-Tsuki - epubcheck errors

  1. Images can be embedded in B-T stories as inline images instead of thumbnails. When WebToEpub encounters this type of image, the resulting xhtml is (slightly) invalid: a div tag ends up inside a p tag.
    Example: All non-gallery images here: Utsuro no Hako:Volume 1
    Result xhtml code for the first image: (code sample not preserved)
    Epubcheck error message:
    ERROR: /home/yumi/Downloads/Utsuro_no_...koVolume_1.epub/OEBPS/Text/0000_Novel_Illustrations.xhtml(2,34): element "div" not allowed here; expected the element end-tag, text or element "a", "abbr", "acronym", "applet", "b", "bdo", "big", "br", "cite", "code", "del", "dfn", "em", "i", "iframe", "img", "ins", "kbd", "map", "noscript", "ns:svg", "object", "q", "samp", "script", "small", "span", "strong", "sub", "sup", "tt" or "var" (with xmlns:ns="http://www.w3.org/2000/svg")
  2. WebToEpub doesn't convert the deprecated u tag (underline) into a form suitable for epub.
    <p>Normal <u>underline</u></p> should become <p>Normal <span style="text-decoration: underline;">underline</span></p>
    Sample: same as above.
    Epubcheck error message:
    ERROR: /home/yumi/Downloads/Utsuro_no_...koVolume_1.epub/OEBPS/Text/0001_Prologue.xhtml(4,85): element "u" not allowed anywhere; expected the element end-tag, text or element "a", "abbr", "acronym", "applet", "b", "bdo", "big", "br", "cite", "code", "del", "dfn", "em", "i", "iframe", "img", "ins", "kbd", "map", "noscript", "ns:svg", "object", "q", "samp", "script", "small", "span", "strong", "sub", "sup", "tt" or "var" (with xmlns:ns="http://www.w3.org/2000/svg")
  3. Invalid ids in span tags inside h* tags are not fixed, e.g. <h3><span class="mw-headline" id="1st_time">1<sup>st</sup> time</span></h3>
    Epubcheck error message:
    ERROR: /home/yumi/Downloads/Utsuro_no_...koVolume_1.epub/OEBPS/Text/0002_1st_time.xhtml(1,497): value of attribute "id" is invalid; must be an XML name without colons
    Side note: BTE-GEN converts it into <h3 id="1st_time">, but it's still not fixed, and not useful here.
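
Items 2 and 3 could be handled along these lines. This is a string-level sketch with hypothetical function names. It assumes plain <u> tags without attributes and treats only ASCII characters as valid in ids. Any internal links pointing at a renamed id would need the same rewrite.

```javascript
// Item 2: replace <u>...</u> with a styled span.
// Assumption: the <u> tags carry no attributes.
function convertUnderlineTags(xhtml) {
    return xhtml
        .replace(/<u>/g, '<span style="text-decoration: underline;">')
        .replace(/<\/u>/g, "</span>");
}

// Item 3: turn an arbitrary id into an ASCII XML name without colons.
function toValidXmlId(id) {
    // Replace anything outside letters, digits, dot, hyphen, underscore.
    let cleaned = id.replace(/[^A-Za-z0-9._-]/g, "_");
    // An XML name must not start with a digit, dot or hyphen.
    if (!/^[A-Za-z_]/.test(cleaned)) {
        cleaned = "id_" + cleaned;
    }
    return cleaned;
}
```

With this sketch, toValidXmlId("1st_time") gives "id_1st_time".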

Well, some more, but I lost the samples.

  • center tag isn't allowed in epub either. <center>text</center> should become <p style="text-align: center;">text</p>
  • align attribute in p/span/div should be converted into the css text-align property, e.g. align="center" becomes style="text-align: center;"
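
A string-level sketch of both conversions, with hypothetical function names. It assumes <center> wraps inline content only (a <center> around block elements would need a div instead of a p), and it leaves alone any tag that already has a style attribute rather than trying to merge the two.

```javascript
// <center>text</center> -> <p style="text-align: center;">text</p>
// Assumption: <center> contains inline content only.
function convertCenterTags(xhtml) {
    return xhtml
        .replace(/<center>/g, '<p style="text-align: center;">')
        .replace(/<\/center>/g, "</p>");
}

// align="..." on p/span/div -> style="text-align: ...;"
function convertAlignAttributes(xhtml) {
    return xhtml.replace(
        /<(p|span|div)\b([^>]*?)\salign="([^"]*)"([^>]*)>/g,
        (match, tag, before, value, after) => {
            // Skip tags that already carry a style attribute.
            if (/\sstyle=/.test(before + after)) {
                return match;
            }
            const style = ` style="text-align: ${value.toLowerCase()};"`;
            return `<${tag}${before}${style}${after}>`;
        }
    );
}
```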

BTE-GEN promotes headings when higher levels are missing, i.e. h2 becomes h1 and h3 becomes h2 if there's no h1. Can this be considered?
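
The behaviour described could look roughly like the sketch below. It is not BTE-GEN's code. The function name is hypothetical, and it only applies a uniform shift so that the highest heading level present becomes h1. Gaps in the middle (h1 followed directly by h3) are left as they are.

```javascript
// Shift all headings up so the highest level present becomes h1.
function promoteHeadings(xhtml) {
    const levels = [...xhtml.matchAll(/<h([1-6])\b/g)].map((m) => Number(m[1]));
    if (levels.length === 0) {
        return xhtml;
    }
    const shift = Math.min(...levels) - 1;
    if (shift === 0) {
        return xhtml;
    }
    // Rewrite both start tags (<h2 ...>) and end tags (</h2>).
    return xhtml.replace(
        /<(\/?)h([1-6])\b/g,
        (match, slash, level) => `<${slash}h${Number(level) - shift}`
    );
}
```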

In the list of references (translator's notes) on the B-T web page, the link that jumps back up to where the reference is cited consists of a single symbol. The same is true in BTE-GEN's output. In WebToEpub's output, it becomes "Jump up ↑". If you remove the elements with the cite-accessibility-label class, the "Jump up" text will stop popping up out of nowhere.
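
A string-level sketch of that removal, with a hypothetical function name. It assumes the label is a span whose only class is cite-accessibility-label and which contains plain text. When working on a DOM instead, selecting the elements with querySelectorAll(".cite-accessibility-label") and removing each one would be the more robust route.

```javascript
// Remove the hidden "Jump up" label spans, keeping the arrow link.
// Assumption: the span has exactly this class attribute and only text inside.
function removeJumpUpLabels(xhtml) {
    return xhtml.replace(
        /<span class="cite-accessibility-label">[^<]*<\/span>/g,
        ""
    );
}
```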

Full disclosure: I'm developing my own (not easy-to-use) Baka-Tsuki to epub converter, which is for freaks like me, and not for normal users at all.
