xlcnd / isbntools

python app/framework for 'all things ISBN' including metadata, descriptions, covers...

License: Other

Python 100.00%
python isbn metadata biblatex ean13 microsoft-word doi endnote bibliographic-references google-books open-library file-rename wikipedia

isbntools's Introduction

Info

isbntools provides several useful methods and functions to validate, clean, transform, hyphenate and get metadata for ISBN strings.

For the end user several scripts are provided to use from the command line:

$ to_isbn10 ISBN13

transforms an ISBN13 number to ISBN10.

$ to_isbn13 ISBN10

transforms an ISBN10 number to ISBN13.
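The arithmetic behind these two conversions can be sketched in plain Python. This is only an illustration of the standard check-digit rules, not the code that the scripts run; the function names are chosen to mirror the script names and are otherwise an assumption.

```python
def isbn10_check_digit(first9: str) -> str:
    """Check digit for the first nine digits of an ISBN-10 (weights 10..2, modulo 11)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first twelve digits of an ISBN-13 (weights 1,3,1,3,..., modulo 10)."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def to_isbn13(isbn10: str) -> str:
    """Prefix the nine data digits with 978 and recompute the check digit."""
    body = "978" + isbn10[:9]
    return body + isbn13_check_digit(body)


def to_isbn10(isbn13: str) -> str:
    """Drop the 978 prefix and recompute the check digit."""
    if not isbn13.startswith("978"):
        raise ValueError("only ISBN-13s with prefix 978 have an ISBN-10 form")
    body = isbn13[3:12]
    return body + isbn10_check_digit(body)


print(to_isbn10("9780156001311"))
print(to_isbn13("0156001314"))
```

Note that the conversion is not a simple truncation: the check digit uses a different rule in each format, so it has to be recomputed.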

$ isbn_info ISBN

gives you the group identifier of the ISBN.

$ isbn_mask ISBN

masks (hyphenate) an ISBN (split it by identifiers).

$ isbn_meta ISBN [goob|openl|merge] [bibtex|csl|msword|endnote|refworks|opf|json] [YOUR_APIKEY_TO_SERVICE]

gives you the main metadata associated with the ISBN. goob uses the Google Books service (no key is needed) and is the default option, so you only have to enter, e.g., isbn_meta 9780321534965. openl uses the OpenLibrary.org API (no key is needed) and wiki uses the Wikipedia API (no key is needed). You can enter API keys and set preferences in the file isbntools.conf in your $HOME/.isbntools directory (UNIX). For Windows, you should look at %APPDATA%/isbntools/isbntools.conf. The output can be formatted as bibtex, csl (CSL-JSON), msword, endnote, refworks, opf or json (BibJSON) bibliographic formats.

NOTE You can apply this command to many ISBNs by using posix pipes (e.g. type FILE_WITH_ISBNs.txt | isbn_meta [SERVICE] [FORMAT] [APIKEY] in Windows)

You can add more sources for metadata by installing isbnlib plugins: isbnlib-bnf, isbnlib-porbase, isbnlib-loc, isbnlib-mcues, isbnlib-dnb, isbnlib-sbn, isbnlib-kb, ... (check pypi for available plugins).

$ isbn_editions ISBN

gives the collection of ISBNs that represent a given book (uses Wikipedia and LibraryThing).

$ isbn_validate ISBN

validates ISBN10 and ISBN13.

$ ... | isbn_validate

to use with posix pipes (e.g. cat FILE_WITH_ISBNs | isbn_validate in OSX or Linux).
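As a rough illustration of what validating a stream of ISBNs involves, the sketch below applies the standard check-digit tests to each line of input. It is not the implementation of isbn_validate; the helper names (is_isbn10, is_isbn13, valid_isbns) are assumptions made for this example.

```python
import re


def is_isbn10(s: str) -> bool:
    """True if s is ten characters with a valid modulo-11 checksum (X allowed last)."""
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
    return total % 11 == 0


def is_isbn13(s: str) -> bool:
    """True if s is thirteen digits starting with 978/979 with a valid modulo-10 checksum."""
    if not re.fullmatch(r"97[89]\d{10}", s):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s))
    return total % 10 == 0


def valid_isbns(lines):
    """Yield the canonical form of every line that holds a valid ISBN."""
    for line in lines:
        candidate = re.sub(r"[\s-]", "", line).upper()  # drop hyphens and whitespace
        if is_isbn10(candidate) or is_isbn13(candidate):
            yield candidate


sample = ["978-0-15-600131-1\n", "0156001314\n", "9780156001312\n", "not an isbn\n"]
print(list(valid_isbns(sample)))
```

To read from a pipe instead of a list, pass sys.stdin to valid_isbns.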

TIP Suppose you want to extract the ISBN of a pdf ebook (MYEBOOK.pdf). Install pdfminer and then enter in a command line:

$ pdf2txt.py -m 5 MYEBOOK.pdf | isbn_validate

$ isbn_from_words "words from title and author name"

a fuzzy script that returns the most probable ISBN from a set of words! (You can verify the result with isbn_meta)!

$ isbn_goom "words from title and author name" [bibtex|csl|msword|endnote|refworks|json]

a script that returns from Google Books multiple references.

$ isbn_doi ISBN

returns the ISBN-A code (a DOI) of a given ISBN.
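An ISBN-A is built from the hyphenated parts of an ISBN-13. The sketch below shows the layout, assuming the input is already fully hyphenated; the function name isbn_a is hypothetical and this is not necessarily how isbn_doi is implemented.

```python
def isbn_a(hyphenated_isbn13: str) -> str:
    """Build an ISBN-A from a fully hyphenated ISBN-13.

    Layout: 10.<prefix>.<group><registrant>/<publication><check digit>
    """
    parts = hyphenated_isbn13.split("-")
    if len(parts) != 5:
        raise ValueError("expected five hyphen-separated parts, e.g. 978-0-15-600131-1")
    prefix, group, registrant, publication, check = parts
    return f"10.{prefix}.{group}{registrant}/{publication}{check}"


print(isbn_a("978-0-15-600131-1"))
```

Because the slash falls between the registrant and the publication element, the ISBN has to be hyphenated first, which in turn depends on the ranges database.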

$ isbn_ean13 ISBN

returns the EAN13 code of a given ISBN.

$ isbn_classify ISBN

returns the OCLC classifiers of a given ISBN.

$ isbn_ren FILENAME

renames (using metadata) files in the current directory that have ISBNs in their filename (e.g. isbn_ren 1783559284_book.epub, isbn_ren "*.pdf").

Enter isbn_ren to see many other options.

$ isbntools

writes version and copyright notice and checks if there are updates.

With

$ isbn_repl

you will get a REPL with history, autocompletion, fuzzy options, redirection and access to the shell.

Following is a typical session:

$ isbn_repl

    Welcome to the isbntools 4.3.30 REPL.
    ** For help type 'help' or '?'
    ** To exit type 'exit' :)
    ** To run a shell command, type '!<shellcmnd>'
    ** Use '#' in place of the last ISBN

$ isbn> ?

Commands available (type ?<command> to get help):
-------------------------------------------------
BIBFORMATS  classify  desc     ean13     from_words  info  to_isbn10
PROVIDERS   conf      doi      editions  goom        mask  to_isbn13
audit       cover     doi2tex  exit      help        meta  validate


$ isbn> meta 9780156001311 tex
@book{9780156001311,
     title = {The Name Of The Rose},
    author = {Umberto Eco},
      isbn = {9780156001311},
      year = {1994},
 publisher = {Harcourt Brace}
}
$ isbn> meta 9780156001311 tex >>myreferences.bib
$ isbn> !ls
myreferences.bib
$ isbn> desc #
It is the year 1327. Franciscans in an Italian abbey are suspected of
heresy, but Brother William of Baskerville's investigation is suddenly
overshadowed by seven bizarre deaths. Translated by William Weaver. A Helen
and Kurt Wolff Book
$ isbn> cover #
     thumbnail:  http://books.google.com/books/content?id=PVVyuD1UY1wC&printsec=frontcover&img=1&zoom=1
smallThumbnail:  http://books.google.com/books/content?id=PVVyuD1UY1wC&printsec=frontcover&img=1&zoom=5
$ isbn> PROVIDERS
bnf  dnb  goob  kb  loc  mcues  openl  porbase  wiki
$ isbn> exit
bye

Within the REPL, many of the operations are faster.

Install

From the command line enter (in some cases you have to precede the command with sudo):

$ pip install isbntools

If you use a Linux system, you can install using your distribution's package manager (packages python-isbntools and python3-isbntools). However, these packages are usually very old and no longer work well!

For Devs

If you would like to contribute to the project please read the guidelines.

Conf File

You can enter API keys and set preferences in the file isbntools.conf in your $HOME/.isbntools directory (UNIX). For Windows, you should look at %APPDATA%/isbntools/isbntools.conf (create the directory and the file if they don't exist [Now just enter isbn_conf make!]). The file should look like:

...

[MISC]
REN_FORMAT={firstAuthorLastName}{year}_{title}_{isbn}
DEBUG=False

[SYS]
URLOPEN_TIMEOUT=10
THREADS_TIMEOUT=12
LOAD_METADATA_PLUGINS=True
LOAD_FORMATTER_PLUGINS=True

[SERVICES]
DEFAULT_SERVICE=goob
VIAS_MERGE=parallel

...

The values are self-explanatory!
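Since the file uses the INI layout, it can be inspected with Python's standard configparser. The sketch below parses a sample that copies a few of the entries shown above; how isbntools itself reads the file is not shown here, and applying REN_FORMAT with str.format is an assumption made for illustration.

```python
import configparser

# Sample content copied from the fragment above (the elided parts are left out).
SAMPLE_CONF = """\
[MISC]
REN_FORMAT={firstAuthorLastName}{year}_{title}_{isbn}
DEBUG=False

[SYS]
URLOPEN_TIMEOUT=10
THREADS_TIMEOUT=12

[SERVICES]
DEFAULT_SERVICE=goob
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONF)

# Typed getters convert the raw strings.
timeout = parser.getint("SYS", "URLOPEN_TIMEOUT")
debug = parser.getboolean("MISC", "DEBUG")
default_service = parser.get("SERVICES", "DEFAULT_SERVICE")

# ASSUMPTION: the placeholders in REN_FORMAT are filled in like str.format fields.
ren_format = parser.get("MISC", "REN_FORMAT")
new_name = ren_format.format(
    firstAuthorLastName="Eco",
    year="1994",
    title="The Name Of The Rose",
    isbn="9780156001311",
)

print(timeout, debug, default_service)
print(new_name)
```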

NOTE If you are running isbntools inside a virtual environment, the isbntools.conf file will be inside folder isbntools at the root of the environment.

The easiest way to manipulate this file is by using the script isbn_conf. At a terminal enter:

$ isbn_conf show

to see the current conf file.

This script has many options that allow a controlled editing of the conf file. Just enter isbn_conf for help.

Known Issues

  1. The meta method and the isbn_meta script sometimes give a wrong result (this is due to errors on the chosen service). As an alternative, try one of the other services.
  2. isbntools works internally with unicode; however, this doesn't solve errors of lost information due to bad encode/decode at the origin!
  3. Periodically, agencies issue new blocks of ISBNs. The range of these blocks is in a database that mask uses. So if your version of isbntools is too old, mask may not work for valid (recently issued) ISBNs. The solution? Update isbntools often!
  4. Calls to metadata services are cached by default. If you don't want this feature, just enter isbn_conf setopt cache no. If for any reason you need to clear the cache, just enter isbn_conf delcache.

Please report any issue at github or at stackoverflow with tag isbntools.

isbntools's People

Contributors

cclauss, dependabot[bot], xlcnd

isbntools's Issues

Conf file

The only way for the user to input preferences is by the command line.

Starting with the next version (2.1.2), the end user will be able to write a .conf file to express some options (isbndb api key, default metadata provider,...).

Service doesn't find the data or gives a wrong result

The meta method and the isbn_meta script sometimes give a wrong result (this is due to errors on the provider service) or don't find the data.

Solution 1? Try another service (wcat, goob, openl, isbndb, ...) or (better) use the provider merge (it merges wcat with goob in order to maximize the quality of data).

Solution 2? Calls to metadata services are cached by default. Sometimes this cache could get corrupted. To fix that, just enter isbn_conf delcache at the command line. If for any reason you don't want the cache, just enter isbn_conf setopt cache no.

test_goom doesn't pass for py26 and pypy!

It seems that this happens because there are cases (remember the call to google is NOT deterministic) where industryIdentifiers is missing! And the _mapper in goom was assuming that it exists.
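A defensive mapper for this situation might look like the sketch below. The record layout (a list of dicts with type and identifier keys under industryIdentifiers) follows the Google Books volume format; the function name isbn13_from_volume is hypothetical and this is not the actual _mapper from goom.

```python
from typing import Optional


def isbn13_from_volume(volume_info: dict) -> Optional[str]:
    """Return the ISBN-13 of a Google Books volume record, or None if absent.

    Uses dict.get with a default so that a record without
    'industryIdentifiers' is skipped instead of raising KeyError.
    """
    for ident in volume_info.get("industryIdentifiers", []):
        if ident.get("type") == "ISBN_13":
            return ident.get("identifier")
    return None


complete = {
    "title": "The Name Of The Rose",
    "industryIdentifiers": [
        {"type": "ISBN_10", "identifier": "0156001314"},
        {"type": "ISBN_13", "identifier": "9780156001311"},
    ],
}
incomplete = {"title": "A record without identifiers"}

print(isbn13_from_volume(complete))
print(isbn13_from_volume(incomplete))
```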

Create the option to cache all metadata

In addition, we can create:

  • a new metadata provider that pulls metadata from that cache.
  • an option to only go to the web if the data is not available in the cache.
  • an option to go local if internet connection is not available.
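The three options above can be sketched as a small cache-first wrapper. Everything here is hypothetical (the class name, the fetch callable, the plain dict used as the cache); it only illustrates the lookup order being proposed, not the cache that isbntools ships.

```python
class CacheFirstProvider:
    """Serve metadata from a cache, going to the web only on a miss."""

    def __init__(self, fetch, cache=None):
        # fetch: hypothetical callable (isbn) -> dict that may raise OSError offline
        self._fetch = fetch
        self._cache = {} if cache is None else cache

    def meta(self, isbn):
        if isbn in self._cache:          # cached: never touch the network
            return self._cache[isbn]
        try:
            record = self._fetch(isbn)   # cache miss: ask the web service
        except OSError:
            return None                  # offline and nothing cached locally
        self._cache[isbn] = record
        return record


calls = []


def fake_fetch(isbn):
    """Stand-in for a real metadata service; records each call."""
    calls.append(isbn)
    return {"isbn": isbn, "title": "The Name Of The Rose"}


provider = CacheFirstProvider(fake_fetch)
first = provider.meta("9780156001311")
second = provider.meta("9780156001311")  # answered from the cache
print(first == second, len(calls))
```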

New features?

This is a discussion to collect and evaluate new features.

Start by reading the issues with label enhancement.

File renaming by ISBN

rename (an ebook file) using metadata from the ISBN of the book (more difficult than it seems!)

A more robust addin framework

The present one works fine for providers and things like that... but it would be nice to split the lib between core functionality (that shouldn't have external dependencies) and addins that could use other libraries.

Create new module to help with file renaming

There are some 'concerns' that should be taken into account:

  • There will be many situations where the proposed new filename is equal to that of an existing file (this could lead to data loss). DON'T ALLOW RENAMING IN THIS CASE.
  • If we rename files to another file system, we will get in trouble (even in UNIX systems), so keep renaming inside the SAME directory.
  • THERE IS NO SAFE WAY to deal with unicode in filenames. "The solution" of restricting to ASCII characters is NOT applicable here, so allow unicode encoded in UTF-8 but filter some characters.
  • Limit the length of the filename to avoid serious problems with Windows.
  • Leave other things (permissions, ...) to the operating system and catch them with OSError.
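A minimal sketch of a rename helper that follows the concerns listed above is given here. The function name safe_rename, the filtered character set and the length limit of 98 are all assumptions chosen for illustration, not the behaviour of isbn_ren.

```python
import os
import re

MAX_LEN = 98  # ASSUMPTION: an arbitrary conservative limit on the filename length


def safe_rename(path, new_stem):
    """Rename path to new_stem (keeping the extension) inside the same directory.

    Returns the new path, or None if the rename was refused or failed.
    """
    directory, old_name = os.path.split(path)
    ext = os.path.splitext(old_name)[1]
    # Keep unicode, but filter characters that are unsafe on common file systems.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "", new_stem).strip()
    cleaned = cleaned[: MAX_LEN - len(ext)]
    if not cleaned:
        return None
    target = os.path.join(directory, cleaned + ext)  # same directory as the source
    if os.path.exists(target):
        return None  # never overwrite an existing file
    try:
        os.rename(path, target)
    except OSError:
        return None  # permissions and similar problems are left to the OS
    return target
```

The existence check and the rename are two separate steps, so another process could still create the target in between; a stricter version would need an atomic primitive.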

Fine tune coverage (`no cover` annotations)

It is OK to annotate code with no cover in the following cases:

  1. pure defensive code
  2. code only for py2 or only for py3
  3. code covered by tests that are only executed locally and not on Travis

Scripts don't work on Windows

Although you can access and use isbntools as a library, in some Windows installations the packaged scripts will not run! At least in the anaconda distribution of python for Windows ...

Solution?

See some suggestions below, but can you help?

Unicode errors

The isbntools works internally with unicode, however this doesn't solve errors of lost information due to bad encode/decode at the origin!

Timeouts

If you are on a slow internet connection, you could receive timeout errors.

Solution? Starting in version 2.1.2 (and if you are on a UNIX machine) you will be able to change some parameters to fix that.

However, you could do it now by increasing the values of SOCKETS_TIMEOUT
and THREADS_TIMEOUT in setup.py.

`isbntools.conf` is not updated

isbntools.conf should not be overwritten but must be updated.

In researching this bug, it was discovered that there is no guarantee that isbntools.conf would be installed in ~/.isbntools!

Ranges database

Periodically new blocks of ISBNs are issued to agencies. The range of these blocks is in a database that mask uses. So if your version of isbntools is too old, mask may not work for valid (recently issued) ISBNs.

The solution? Update isbntools often!
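To see why mask depends on a ranges database, consider the sketch below. It hyphenates ISBNs of a single group (978-0) from a small hand-copied table of registrant ranges. The table is an assumption for illustration and may itself be outdated, which is exactly the failure mode described above; the function name mask_978_0 is hypothetical.

```python
# ASSUMPTION: illustrative registrant ranges for the 978-0 group only.
# The real database covers every group and changes over time.
RANGES_978_0 = [
    ("00", "19"),
    ("200", "699"),
    ("7000", "8499"),
    ("85000", "89999"),
    ("900000", "949999"),
    ("9500000", "9999999"),
]


def mask_978_0(isbn13):
    """Hyphenate an ISBN-13 of group 978-0, or return None if no range is known."""
    if not (len(isbn13) == 13 and isbn13.isdigit() and isbn13.startswith("9780")):
        return None  # a group that this tiny table does not know about
    body = isbn13[4:12]  # registrant + publication digits
    for low, high in RANGES_978_0:
        n = len(low)
        if low <= body[:n] <= high:  # same-length digit strings compare numerically
            return "-".join(["978", "0", body[:n], body[n:], isbn13[12]])
    return None


print(mask_978_0("9780156001311"))
print(mask_978_0("9780321534965"))
print(mask_978_0("9791090636071"))  # valid ISBN, but outside the table
```

The last call returns None even though the ISBN is valid, because its group is missing from the table.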

Rename private modules according to PEP 8

People are importing private modules... and are getting strange errors! So rename private modules according to PEP 8 to signal that those modules MUST NOT be imported directly. Their functionality is exposed in another way, namely in the namespaces:

  • isbntools
  • isbntools.conf
  • isbntools.dev
  • isbntools.dev.lab
