Manga-Downloader

Manga-Downloader is a cross-platform (Windows/Mac/Linux) Python script that runs on Python 2.6+ and 3.x.
It can be automated via an external .xml file, and can convert images for viewing on the Kindle.
Currently supports mangafox.com, mangareader.net, and otakuworks.com, with a total of over 5000 mangas.
Downloads into .cbz format by default, and can optionally download into .zip instead.
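
A .cbz file is an ordinary zip archive of page images with a different extension, so the two output formats differ only in the file name. The following is a minimal sketch of that packing step, not the script's actual code; the function name `pack_chapter` and its parameters are hypothetical.

```python
import os
import zipfile


def pack_chapter(image_dir, archive_base, use_zip=False):
    """Pack every file in image_dir into archive_base + '.cbz' (or '.zip').

    Hypothetical helper for illustration; manga.py may do this differently.
    """
    extension = ".zip" if use_zip else ".cbz"
    archive_path = archive_base + extension
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sort the file names so pages are stored in reading order.
        for name in sorted(os.listdir(image_dir)):
            archive.write(os.path.join(image_dir, name), arcname=name)
    return archive_path
```

Because only the extension changes, a comic reader that expects .cbz and a general archive tool that expects .zip can both open the result.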

Dependencies

Python 2.6+, including 3.x

PIL if using Kindle conversion.

How to backport to:
2.5 – change the exception-handling code and use StringIO instead of the io module
2.4 – remove the parentheses after class declarations
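
The 2.5 backport notes above can be sketched as follows. This is an illustrative pattern, not code taken from manga.py; the function `read_buffer` is hypothetical. It falls back to StringIO when the io module is missing and fetches the active exception through `sys.exc_info()`, which avoids both version-specific `except` spellings.

```python
import sys

try:
    # Python 2.6+ and 3.x ship the io module.
    from io import BytesIO
except ImportError:
    # Python 2.5 and earlier have no io module.
    from StringIO import StringIO as BytesIO


def read_buffer(data):
    """Return the bytes held in an in-memory buffer, or None on failure.

    Hypothetical helper for illustration only.
    """
    try:
        return BytesIO(data).read()
    except Exception:
        # 'except Exception as error' is a syntax error before Python 2.6,
        # and 'except Exception, error' is a syntax error in Python 3.
        # sys.exc_info() is available on every version.
        error = sys.exc_info()[1]
        sys.stderr.write("buffer error: %s\n" % error)
        return None
```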

Usage

manga.py [options] <manga name> <manga name> <etc.>

Options:
--version
show program’s version number and exit

-h, --help
show this help message and exit

--all
Download all available chapters.

-d <download path>, --directory=<download path>
The destination download directory. Defaults to a directory named after the manga.

--overwrite
Overwrites previous copies of downloaded chapters.

-t <number>, --threads=<number>
Limits the number of chapter threads to the user specified value. Default value is 3.

--verbose
Verbose output.

-x <xmlfile path>, --xml=<xmlfile path>
Parses the .xml file and downloads all chapters newer than the last chapter downloaded for the listed mangas.

-z, --zip
Downloads using .zip compression. Omitting this option defaults to .cbz.

-c, --convertFiles
Converts the downloaded files to a format and size suited to the device specified by the device
parameter. The converted images are saved in the directory specified by the outputDirectory parameter.

--device
Specifies the target device for the image conversion.

--convertDirectory
Converts the image files stored in the directory specified by the inputDirectory parameter. Stores the converted images
in the directory specified by the outputDirectory parameter.

--inputDirectory
The directory containing the images to convert when convertDirectory is specified.

--outputDirectory
The directory in which to store the converted images. Omitting this option defaults to DOWNLOAD_PATH/Pictures.

The script will offer a choice of 3 manga sites; pressing ‘enter’ selects the first by default.
After selecting a site, the script will output a list of all chapters of the series it has found on the site you selected.
When it prompts “Download which chapters?”, type in the ones you want delimited by ‘-’ and ‘,’. You can also type ‘all’ if you did not specify --all before.

See Notes for examples of acceptable inputs.
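
The chapter selection syntax described above (values separated by ‘,’, ranges written with ‘-’, or the word ‘all’) can be parsed along these lines. This is a minimal sketch rather than the script's own parser: the function name `parse_chapter_selection` is hypothetical, and it assumes the numbers refer to 1-based positions in the chapter list the script prints.

```python
def parse_chapter_selection(selection, chapter_count):
    """Turn input such as '1,2,9-12' or 'all' into sorted chapter positions.

    Assumption: numbers are 1-based positions in the printed chapter list.
    """
    selection = selection.strip().lower()
    if selection == "all":
        return list(range(1, chapter_count + 1))

    chosen = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > chapter_count or start > end:
            raise ValueError("invalid chapter range: %r" % part)
        chosen.update(range(start, end + 1))
    return sorted(chosen)
```

For a series with 20 listed chapters, `parse_chapter_selection("1,2,9-12", 20)` returns `[1, 2, 9, 10, 11, 12]`.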

Notes

Usage examples:

manga.py -d "C:\Documents and Settings\admin\Documents\Manga\" -z Bleach
On a Windows machine, downloads ‘Bleach’ to C:\Documents and Settings\admin\Documents\Manga\, using .zip compression.

./manga.py --overwrite Bleach
On a Linux/Unix machine, downloads ‘Bleach’ to ./Bleach, using .cbz compression and overwriting previously downloaded chapters.

1,2,9-12
Downloads chapters 1, 2, and 9 through 12

all
Downloads all chapters

./manga.py -x example.xml
Parses example.xml to run the script.

Bugs

Input between the two prompts “Which site?” and “Download which chapters?” is queued up for chapter selection.

Issues

Careful! MangaFox sometimes leaves up invalid chapters that it has marked for deletion. Be suspicious of out-of-order chapters if they appear.
The script can be a bandwidth hog when downloading a lot of manga chapters and --threads is set to a large number.
Unicode characters have caused the script issues in the past. If you find a Unicode issue, create a new issue under the Issue tab and
provide a detailed description of the problem.
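
The bandwidth concern above comes down to how many chapter downloads run at once. One common way to cap that number is a semaphore, sketched below. This is not necessarily how manga.py implements --threads; `download_all` and `download_one` are hypothetical names, and the default of 3 mirrors the documented default.

```python
import threading


def download_all(chapters, download_one, max_threads=3):
    """Run download_one(chapter) for each chapter, at most max_threads at a time.

    Hypothetical sketch; download_one stands in for the real per-chapter work.
    """
    slots = threading.BoundedSemaphore(max_threads)

    def worker(chapter):
        try:
            download_one(chapter)
        finally:
            slots.release()  # free the slot even if the download fails

    threads = []
    for chapter in chapters:
        slots.acquire()  # blocks while max_threads downloads are running
        thread = threading.Thread(target=worker, args=(chapter,))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
```

A smaller `max_threads` value trades total download time for lower peak bandwidth use.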

History

v0.9.0 (06/13/11)
+ Support for Python 3.x

v0.8.8a (06/02/11)
+ Code cleanup for clarity, adherence to naming conventions

v0.8.8 (02/11/11)
+ Added Support for HTTP GZIP Encoding. Reduces the bandwidth used by the program. Required for downloading images hosted by MangaFox.

v0.8.7 (02/05/11)
+ Fixed a Unicode issue. One of the manga found contained a non-ASCII character, which caused the script to crash. The fix is local to that area;
similar issues probably remain elsewhere in the code, and I have not yet had time to hunt for them.
+ The MangaFox search has changed and its matches are no longer as exact. If the matches span multiple pages, the script fails to find any manga after
page one. I have modified the search in the script to match on the beginning of the name. If no manga are found, it reverts to the old search.

v0.8.6 (01/27/11)
+ Changed XML Update to occur only when all chapters are successfully downloaded for the selected manga

v0.8.5 (01/27/11)
+ Fixed double prepend of manga name when using MangaReader

v0.8.4 (12/31/10)
+ Fixed deadlock situation when skipping already-downloaded chapter

v0.8.3 (12/31/10)
+ Implemented --numChapterThreads to limit the number of chapter threads used by the script

v0.8.2 (12/30/10)
+ Implemented downloading chapters in parallel.
+ Implemented progress bars for each chapter downloaded

v0.8.1 (12/30/10)
+ Implemented multiple manga download from the command-line.
+ Each manga is downloaded in parallel.

v0.8.0 (12/30/10)
+ Implemented parallel processing of multiple mangas when using the XML option.
+ Added --verbose option

v0.7.9 (12/30/10)
+ Replaced mangadl_temp folder with tempfile.mkdtemp which uses the OS Temporary File mechanics.

v0.7.8 (12/30/10)
+ Changed the default download directory to a directory using the name of the manga.
+ Removed the obsolete -s, --subdirectory parameter
+ Fixed 3-digit zero padding to be applied to chapters only

v0.7.7 (12/30/10)
+ Displays error if expected image not found on site
+ Fixed double cleaning of temporary directory
+ Changed default behavior of overwrite and chapter skipping to be relative to the download format specified
+ Applied 3-digit zero-padding to all sites, this restores chapter skipping for MangaReader.
+ Delayed chapter removal when overwriting until just before copy
+ Removed obsolete ‘Upcoming features’ in readme

v0.7.5 (12/29/10)
+ Add code to zero pad the chapter string from Manga Reader. Code zero pads any number in the string to 3 digits.

v0.7.4 (12/29/10)
+ Add timeStamp Node to XML. It records the date/time the last chapter was downloaded.

v0.7.3 (12/29/10)
+ Fixed script LastChapterDownloaded Empty Node Bug

v0.7.2 (12/29/10)
+ Fixed script where it created a download directory even if the manga was not found.
+ Modified Custom Exceptions in SiteParser to accept custom error messages.

v0.7.1 (12/26/10)
+ Fixed broken functionality, minor code cleanup

v0.7.0 (12/26/10)
+ Thorough code commenting

v0.6.9 (12/26/10)
+ Heavy code cleanup (functionality remains the same, with most discovered bugs fixed)

v0.6.3a
+ Added check for the PIL Library. The script now works (w/o conversion functionality) without PIL
+ Added Help (-h) command-line parameter. Prints out the command-line switches.
+ Added -v command-line parameter to retrieve the current version of the script/executable.
+ Added Mac OS X executable. The executable is compiled with the correct libraries (PIL, etc.) and is a
self-contained executable.
+ Added Icon files to use when compiling the Windows executable.

v0.6.2a
+ Fix All Flag (Broken by OOP Changes)

v0.6.1a
+ Fixed an encoding issue in the download Image function. If the retrieved Image URL contained special characters such as spaces,
Python (at least Python 2.5) failed to load the URL.

v0.6.1 (12/12/10)
+ Fixed the download all flag (Broken by OOP changes)

v0.6.0a
+ Initial Check-in with Kindle Conversion code.

v0.6.0 (12/11/10)
+ Begin modularization of the code base
+ Added SiteParser.py – Uses the factory pattern to create different instances of SiteParser classes
+ Added MangaXMLParser.py – Contains the code for the XML Feature
+ Modified manga.py to use the new classes

v0.5.5 (12/1/10)
+ Implemented own search for MangaReader, fixes non-alphabetical characters

v0.5.4 (11/30/10)
+ Fixed MangaReader (new site format)
+ Removed trailing spaces in saved .cbz + .zip, additional file/folder name compatibility

v0.5.2 (11/30/10)
+ Minor code cleanup and bugfixes for special characters

v0.5.1 (11/3/10)
+ Modified the MangaReader parse function to find all entries returned by the search query instead of just the first one. The other two parse functions already work this way.
- Removed absolute paths that were added in v0.5.0 due to increased complexity especially considering the multiple OS platforms. Relative paths enabled via os.chdir.

v0.5.0 (11/1/10)
+ Added XML File Feature
+ New Command-line Flag added (-x)
+ Added internal flag to automate the parse functions
+ Modified the OtakuWorks parse function to find all entries returned by the search query instead of just the first one.
+ Modified script to use absolute paths to remove the built in limitation that the initial working directory is the directory the script is located.

v0.4.1 (10/17/10)
+ fixed bug with removed series crash in MangaFox

v0.4.0 (10/17/10)
+ fixed bug with decimal chapters not being downloaded or being ordered improperly
+ fixed bug in MangaFox chapter parsing with missing titles
+ restored OtakuWorks functionality
+ implemented more code reuse for easier adding of future sites

v0.3.7 (10/05/10)
+ added mangafox.com as primary download engine (extremely fast servers)
+ fixed minor bug in output when download already exists
+ fixed minor bug where folder got created for -f even if download failed or never happened
- OtakuWorks functionality broken right now due to site maintenance

v0.3.4 (09/30/10)
+ fixed critical bug that occurs when a series only occurs on otakuworks as a manga
+ fixed critical bug when MangaReader tries to block fast requests for images

v0.3.3 (09/29/10)
+ added mangareader.net as a source
+ fixed critical bug that occurs when trying to download an image the server no longer has
- removed ‘best site’ rankings, user now selects site

v0.3.0 (09/29/10)
+ redid manga search mechanism on OtakuWorks
+ redid manga chapter selection on OtakuWorks
+ overhauled a lot of code
+ fixed image creation bug if not type jpeg or png
+ fixed most known bugs
- removed site structure search via Google due to unpredictable results
- permanently retired cloudmanga.com as a source
- permanently retired mangavolume.com as a source

v0.2.5 (08/12/10)
+ added otakuworks.com as primary download engine – boasts over 3500 mangas!
+ changed compression so that .png automatically converts to .jpg
+ improved image searches
+ program will continue to run even if website tries to disconnect
+ improved bandwidth consumption by 2x on CloudManga

v0.2.1 (08/05/10)
+ added new download option -f
+ program will no longer crash with flaky connections, but will continue trying to download

v0.2.0 (08/04/10)
+ successfully spoofed downloader as Chrome 6/Ubuntu Lucid Lynx
+ able to navigate site structures via Google now – this should make adding future sites much easier if needed and helps spell check (some overhead bandwidth might be in order though)
+ implemented auto-spell correction for cloudmanga + mangavolume + animea via Google
+ added cloudmanga.com as a source

v0.1.8 (08/03/10)
+ tested to work in windows (fixed zipfile.close() requirement in Windows)

v0.1.7 (08/03/10)
+ Should work flawlessly on Windows now as well (possibly broken before due to wrong direction of slashes)
+ Spoofed User-Agent as Firefox 2
- Removed manga.animea.net as a source

v0.1.6 (08/03/10)
+ Added manga.animea.net as another source

v0.1.5 (08/03/10)
+ now checks for valid starting/ending chapters
+ images now no longer compressed inside a temp directory
+ optimized code; no longer wastes bandwidth or cycles redownloading pages for source code / checking manga validity
+ tweaked code so it’s more object-oriented, support for other sites coming in next release
+ added option (-a) to download all chapters in a series
+ improved code readability
+ improved readme readability

v0.1.1 (08/02/10)
+ fixed archive not getting moved out of temp directory and erased

v0.1.0 (08/02/10)
+ fixed .cbz creation bug
+ fixed .jpeg deletion bug
+ fixed chapter redownloading if already complete from previous use
+ fixed program crash when manga doesn’t exist or is misspelled
+ added Windows support
+ added .zip support
+ tweaked code to prepare for addition of other sites and improve loop structure

v0.0.1 (08/02/10)
+ Initial Commit

Future Features

Add manga.animea.net support.
