ebooklib issues from aerkalov

Make it work on Python 2.7 / Python 3.3

A couple of API calls will need to change. This will make the library work with Python 2.7 as the minimum version.

Load navigation document

When reading an EPUB file, if the NCX file is not present the TOC structure should be obtained by parsing the NAV document instead.

Creating EPUB files does not work in Python 3.3

There is an issue with strings passed to the lxml parse function and with the dictionary iteritems method.

Implement new API for Plugins

API will change a lot, but for now we just need something to start working.

def before_write(self, book):
    "Processing before save"

def after_write(self, book):
    "Processing after save"

def before_read(self, book):
    "Processing before read"

def after_read(self, book):
    "Processing after read"

def item_after_read(self, book, item):
    "Process general item after read."

def item_before_write(self, book, item):
    "Process general item before write."

def html_after_read(self, book, chapter):
    "Processing HTML after read."

def html_before_write(self, book, chapter):
    "Processing HTML before write"

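As a rough illustration of how such hooks could be wired together, here is a minimal sketch. BasePlugin, StripWhitespacePlugin, FakeChapter and run_html_before_write are hypothetical names made up for this example; they are not the library's API, and only a few of the hooks above are shown.

```python
class BasePlugin(object):
    """Hypothetical base class: every hook is a no-op by default."""

    def before_write(self, book):
        "Processing before save"

    def after_write(self, book):
        "Processing after save"

    def html_before_write(self, book, chapter):
        "Processing HTML before write"


class StripWhitespacePlugin(BasePlugin):
    """Example plugin: trims surrounding whitespace from chapter content."""

    def html_before_write(self, book, chapter):
        chapter.content = chapter.content.strip()


class FakeChapter(object):
    """Stand-in for a chapter object; it only has a content attribute."""

    def __init__(self, content):
        self.content = content


def run_html_before_write(plugins, book, chapter):
    # Call the hook on each plugin, in the order they were registered.
    for plugin in plugins:
        plugin.html_before_write(book, chapter)


chapter = FakeChapter("  <h1>Intro</h1>\n")
run_html_before_write([BasePlugin(), StripWhitespacePlugin()], None, chapter)
print(chapter.content)
```

Giving every hook a no-op default means a plugin only has to override the hooks it cares about.
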
Navigation xhtml file should behave like other xhtml files

The navigation xhtml file should extend the standard xhtml chapter file class. We should also be able to use nav.add_item() to add CSS style definitions. For now, these are just hard-coded.

Wrong copyright info

I copy-pasted the copyright from Booktype. We should remove the references to Booktype.

Remove dependency of itertools module

No need for this. Just use a normal generator expression.
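
The issue does not say which itertools function was in use, so the following is only a sketch under an assumption: that it was something like itertools.chain.from_iterable, used to flatten nested lists. The data is invented for the example.

```python
# Hypothetical data: file names grouped by section.
sections = [["cover.xhtml"], ["ch1.xhtml", "ch2.xhtml"], []]

# Instead of itertools.chain.from_iterable(sections), a nested
# generator expression yields the same flat sequence lazily.
flat = (name for section in sections for name in section)

flat_list = list(flat)
print(flat_list)
```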

Cover image item

When an EPUB3 manifest is loaded, the item with cover-image property assigned is not recognized as the cover image.
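
A minimal sketch of how the cover item could be detected, using only the standard library. The manifest fragment and the find_cover_image helper are made up for illustration and are not taken from the library.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical manifest fragment, for illustration only.
MANIFEST = """<manifest xmlns="http://www.idpf.org/2007/opf">
  <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  <item id="img1" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"/>
  <item id="ch1" href="ch1.xhtml" media-type="application/xhtml+xml"/>
</manifest>"""


def find_cover_image(manifest_xml):
    """Return (id, href) of the first item flagged cover-image, or None."""
    root = ET.fromstring(manifest_xml)
    for item in root.findall("{%s}item" % OPF_NS):
        # The properties attribute is a space-separated list of values.
        props = item.get("properties", "").split()
        if "cover-image" in props:
            return item.get("id"), item.get("href")
    return None


cover = find_cover_image(MANIFEST)
print(cover)
```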

Check type of Item in epub

We should be able to check type of items in EPUB file. Return some kind of ID for different items (image, html, css, ...)
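
One possible shape for this, sketched with invented constants. The ITEM_* values, the extension table and get_item_type are all assumptions for the example, not the library's actual identifiers.

```python
import os

# Hypothetical type IDs, chosen only for this sketch.
ITEM_UNKNOWN = 0
ITEM_IMAGE = 1
ITEM_STYLE = 2
ITEM_DOCUMENT = 3

# Map lowercase file extensions to a type ID.
EXTENSION_TO_TYPE = {
    ".jpg": ITEM_IMAGE,
    ".jpeg": ITEM_IMAGE,
    ".png": ITEM_IMAGE,
    ".css": ITEM_STYLE,
    ".xhtml": ITEM_DOCUMENT,
    ".html": ITEM_DOCUMENT,
}


def get_item_type(file_name):
    """Return a type ID based on the file extension."""
    extension = os.path.splitext(file_name)[1].lower()
    return EXTENSION_TO_TYPE.get(extension, ITEM_UNKNOWN)


print(get_item_type("images/Cover.JPG") == ITEM_IMAGE)
```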

Parse EPUB2 guide

The EPUB2 guide element of the OPF file is not parsed when an EPUB file is loaded.
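
Parsing the guide could look roughly like the sketch below. It uses the standard library rather than lxml, and the sample fragment and the parse_guide helper are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical guide fragment, for illustration only.
GUIDE = """<guide xmlns="http://www.idpf.org/2007/opf">
  <reference type="cover" title="Cover" href="cover.xhtml"/>
  <reference type="toc" title="Table of Contents" href="toc.xhtml"/>
</guide>"""


def parse_guide(guide_xml):
    """Return one dict per reference element, in document order."""
    root = ET.fromstring(guide_xml)
    return [
        {
            "type": ref.get("type"),
            "title": ref.get("title"),
            "href": ref.get("href"),
        }
        for ref in root.findall("{%s}reference" % OPF_NS)
    ]


guide_entries = parse_guide(GUIDE)
print(guide_entries)
```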

Faulty navigation points in the NCX

Navigation points in the NCX documents that correspond to book sections have an empty URL assigned for content. This is reported as an error by the epubcheck program.

Remove print statements from source code

Implement document type

It would be handy to have Document type also. We should be able to know "this is html document" but we should also know if it is cover.xhtml, nav.xhtml or just another chapter.

Do not create new title tag in chapter if it already exists

When creating chapter content we do two things: we copy the old tags from the original document, and we also add a new title tag. We should not add a new title tag if one already exists. Likewise, we should not set an empty title (when no title is defined) if a title tag already exists.

Basic plugin for filtering non HTML5 content

We need a basic plugin which will be able to filter out most non-HTML5 tags, attributes and similar content.

Later we will also need to replace unsupported tags with the new syntax, for instance replacing a tag with an element plus CSS, and so on.

Implement different methods for fetching different items from a book

Implement a call like get_link_of_href to fetch an item from a book. The question is whether it should return just one item or more than one. I guess returning just one item would be more useful.

Item in spine could have flag linear

We need to support linear flag in spine. The best would be to have option in Item and to be able to mark it somehow when defining spine.
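
One way this could be expressed when defining the spine is to accept either a plain item id or an (id, linear) pair. The build_spine helper below is a hypothetical sketch, not the library's implementation.

```python
import xml.etree.ElementTree as ET


def build_spine(entries):
    """Build a spine element from item ids or (item_id, linear) tuples."""
    spine = ET.Element("spine")
    for entry in entries:
        if isinstance(entry, tuple):
            item_id, linear = entry
        else:
            item_id, linear = entry, True
        attrs = {"idref": item_id}
        # Items are linear unless stated otherwise, so only
        # the non-linear case needs an explicit attribute.
        if not linear:
            attrs["linear"] = "no"
        ET.SubElement(spine, "itemref", attrs)
    return spine


spine = build_spine(["nav", "ch1", ("notes", False)])
spine_refs = [(ref.get("idref"), ref.get("linear")) for ref in spine]
print(spine_refs)
```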

Move common functions to ebooklib.utils

There is some common functionality which should really be in ebooklib.utils. Things like debug, parse, ....

Mime type is not correctly guessed when adding items

As the title says, the MIME type is not guessed correctly. mimetypes.guess_type can return a string OR a tuple. We only handle the case where it returns a tuple. The end result is the value None for our MIME type.
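
A defensive sketch that accepts either shape of result and never returns None. In current CPython, mimetypes.guess_type returns a (type, encoding) tuple. The guess_media_type helper and its fallback value are assumptions made for this example.

```python
import mimetypes


def guess_media_type(file_name, default="application/octet-stream"):
    """Guess a media type from the file name, falling back to a default."""
    guessed = mimetypes.guess_type(file_name)
    # Accept either a (type, encoding) tuple or a plain string.
    if isinstance(guessed, tuple):
        guessed = guessed[0]
    # guess_type yields None for unknown extensions.
    return guessed or default


print(guess_media_type("style/default.css"))
```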

Handle properties in manifest file when writing to epub

Handle properties tag when creating epub file.

Head and body elements missing in some cases

If the original document has empty body with no children, body and head elements will be missing from the generated content.

Implement API to add files which are not present in the manifest

There is a need to add files which are not present in the manifest (for instance - iTunesMetadata.plist, META-INF/com.apple.ibooks.display-options.xml).

Probably have it as it is right now but have argument .add_item(item, manifest=False).
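
A toy sketch of what the manifest argument could mean. The Book and Item classes here are simplified stand-ins invented for the example, not the library's classes.

```python
class Item(object):
    """Stand-in for an EPUB item; holds a file name and a manifest flag."""

    def __init__(self, file_name, content=b""):
        self.file_name = file_name
        self.content = content
        self.manifest = True


class Book(object):
    """Stand-in for a book; keeps every item but can filter for the manifest."""

    def __init__(self):
        self.items = []

    def add_item(self, item, manifest=True):
        # The file is always stored; the flag only controls
        # whether it is listed in the manifest.
        item.manifest = manifest
        self.items.append(item)

    def manifest_items(self):
        return [item for item in self.items if item.manifest]


book = Book()
book.add_item(Item("about.xhtml"))
book.add_item(Item("iTunesMetadata.plist"), manifest=False)
listed = [item.file_name for item in book.manifest_items()]
print(listed)
```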

Extend API with methods for filtering data

Implement methods for fetching and filtering data in book or chapter.

A typo in the nav item string representation.

Add copyright, author info and setup.py

Add license info, author info and setup.py file.

Add additional item types for audio and video files

Add different item types like ITEM_AUDIO and ITEM_VIDEO.

Decode filenames when reading them from zip file

We should unquote filenames when reading them from zip file.
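
A small sketch of the idea, assuming "unquote" means decoding percent-escapes in member names. The decoded_names helper and the sample archive are made up for the example, and the import shown is the Python 3 location of unquote.

```python
import io
import zipfile
from urllib.parse import unquote  # Python 3; Python 2 has urllib.unquote


def decoded_names(zip_file):
    """Return archive member names with percent-escapes decoded."""
    return [unquote(name) for name in zip_file.namelist()]


# Build a tiny in-memory archive with a percent-encoded member name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("OEBPS/chapter%201.xhtml", "<html/>")

with zipfile.ZipFile(buffer) as zf:
    names = decoded_names(zf)

print(names)
```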

Spurious metadata entry while parsing the OPF file.

When metadata found inside the OPF file is parsed, a bogus entry is read and placed in the book's metadata container.

Increase version to 0.15

Have EpubItem for remote resources

EPUB3 supports the remote-resources property for video and audio elements, meaning they can be stored somewhere remotely. This applies only to audio and video tags. These items must also be placed in the list of resources. Our EpubItem should be aware of this and not create a local file in the EPUB3 in this case.

Plugin which cleans content with Tidy HTML

We need a standard plugin which will use tidy to clean chapter content before it is saved in the EPUB.

standard tidy
https://github.com/w3c/tidy-html5

Put parsing function in the utils module

We are using HTML5 parser way too many times. Just put it in the utils module.

Cover file should also extend EpubHtml class

It would be best if Cover file also extends EpubHtml. It would be possible to add dynamically other CSS files or JavaScript files with API.

Make some function names more pythonic + update docs + update examples

writeEPUB and readEPUB should really be write_epub and read_epub.

Use six package for Python2/Python3 compatibility

Just use six package to make it work better on Python2/Python3.

Epubcheck fails for some tag attributes

For instance, P dir="RTL" will complain because RTL is in uppercase. Epubcheck expects them to be in lowercase.
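
A minimal sketch of normalizing the attribute value, using the standard library parser. The lowercase_dir_attributes helper is hypothetical, and only the dir attribute from the example above is handled.

```python
import xml.etree.ElementTree as ET


def lowercase_dir_attributes(root):
    """Lowercase the value of every dir attribute in the tree, in place."""
    for element in root.iter():
        value = element.get("dir")
        if value is not None:
            element.set("dir", value.lower())
    return root


body = ET.fromstring('<body><p dir="RTL">text</p><p dir="ltr">more</p></body>')
lowercase_dir_attributes(body)
dirs = [p.get("dir") for p in body.findall("p")]
print(dirs)
```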

Fix typo in setup.py

We must not use .wait() for waiting Popen to end

We must use communicate. Here is a little tip... Read documentation and look at the big red boxes in the documentation.
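
The subprocess documentation warns that wait() can deadlock when the child writes enough output to fill a pipe buffer, while communicate() drains the pipes as it waits. A minimal sketch, with run_and_capture as a hypothetical helper name:

```python
import subprocess
import sys


def run_and_capture(args, input_bytes=None):
    """Run a command, feed it optional input, and collect its output."""
    proc = subprocess.Popen(
        args,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() reads stdout and stderr until EOF and then waits,
    # so a chatty child cannot block on a full pipe.
    out, err = proc.communicate(input_bytes)
    return proc.returncode, out, err


# The child prints far more than a typical pipe buffer holds.
code, out, err = run_and_capture([sys.executable, "-c", "print('x' * 100000)"])
print(code, len(out.strip()))
```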

EpubCoverHtml should extend EpubHtml and use it for processing HTML

We have separate cover template and we used duplicated methods for the same thing. Just extend EpubHtml and use its methods for processing HTML.

Do not use ZIP_STORED for every item in zip file

For unknown reasons we are using ZIP_STORED flag for every single item in zip file. We should use it only for mimetype file.
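
A sketch of the intended behaviour with the standard zipfile module: the mimetype entry is written first and stored uncompressed, and everything else is deflated. The write_epub_zip helper and the sample file are made up for the example.

```python
import io
import zipfile


def write_epub_zip(fileobj, files):
    """Write (name, data) pairs into an EPUB-style zip archive."""
    with zipfile.ZipFile(fileobj, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, data in files:
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


buffer = io.BytesIO()
write_epub_zip(buffer, [("EPUB/ch1.xhtml", "<html>" + "a" * 1000 + "</html>")])

with zipfile.ZipFile(buffer) as zf:
    infos = zf.infolist()

compression = {info.filename: info.compress_type for info in infos}
print(compression)
```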

The EPUB folder name is not configurable.

The default folder name is hard-coded in the container's XML template and will not reflect the name assigned for the book.
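
One way to make the folder name configurable is to turn it into a template parameter. The template text, the content.opf file name and the render_container helper below are assumptions for illustration, not the library's actual template.

```python
# Hypothetical container template with the folder name as a parameter.
CONTAINER_TEMPLATE = '''<?xml version="1.0" encoding="utf-8"?>
<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0">
  <rootfiles>
    <rootfile media-type="application/oebps-package+xml" full-path="%(folder_name)s/content.opf"/>
  </rootfiles>
</container>
'''


def render_container(folder_name="EPUB"):
    """Fill the folder name into the container XML template."""
    return CONTAINER_TEMPLATE % {"folder_name": folder_name}


container_xml = render_container("OEBPS")
print(container_xml)
```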

Delete temporary directory

Somehow temporary directory with unextracted epub ended up in the repository. I wonder who has put it inside.

Support for guide

Implement API and create guide element in the manifest. Guide is deprecated feature, but we should be able to support it.

We should also be able to support landmark feature:
http://www.idpf.org/epub/30/spec/epub30-contentdocs.html#sec-xhtml-nav-def-types-landmarks

Guide

Example of deprecated guide:

The guide supports different types:

cover
title-page
toc
index
glossary
acknowledgements
bibliography
colophon
copyright-page
dedication
epigraph
foreword
loi
lot
notes
preface
text
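
As a sketch of how a guide element could be generated from such entries, the snippet below builds one with the standard library and validates each type against the list above. The build_guide helper and the sample references are hypothetical, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# The reference types listed above.
GUIDE_TYPES = {
    "cover", "title-page", "toc", "index", "glossary", "acknowledgements",
    "bibliography", "colophon", "copyright-page", "dedication", "epigraph",
    "foreword", "loi", "lot", "notes", "preface", "text",
}


def build_guide(references):
    """Build a guide element from dicts with type, title and href keys."""
    guide = ET.Element("guide")
    for ref in references:
        if ref["type"] not in GUIDE_TYPES:
            raise ValueError("Unknown guide type: %s" % ref["type"])
        ET.SubElement(guide, "reference", {
            "type": ref["type"],
            "title": ref["title"],
            "href": ref["href"],
        })
    return guide


guide = build_guide([
    {"type": "cover", "title": "Cover", "href": "cover.xhtml"},
    {"type": "toc", "title": "Table of Contents", "href": "nav.xhtml"},
])
print(ET.tostring(guide, encoding="unicode"))
```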

style = '''BODY { text-align: justify;}'''

default_css = epub.EpubItem(uid="style_default", file_name="style/default.css", media_type="text/css", content=style)
book.add_item(default_css)

c2 = epub.EpubHtml(title='About this book', file_name='about.xhtml')
c2.content='<h1>About this book</h1><p>Helou, this is my book! There are many books, but this one is mine.</p>'
c2.add_item(default_css)

Add sample files

Add some basic sample files. Something to show how to use EbookLib library.

Preserve XML declaration when creating XML files

We do not preserve XML declarations when creating XML files. What we should do is use the xml_declaration option when calling the etree.tostring function.

Example:
tree_str = etree.tostring(tree, pretty_print=True, encoding='utf-8', xml_declaration=True)
