josfranmc / jgutenbergcatalog


This software allows you to create a database with information about the books in the Project Gutenberg catalog.

License: GNU General Public License v3.0


JGutenbergCatalog

This software allows you to query and build the Project Gutenberg book catalog.
You can download the catalog from this link: https://www.gutenberg.org/cache/epub/feeds/rdf-files.tar.zip
This zip file is updated nightly and contains a set of files in RDF format. There is one RDF file per book, holding information about that book, and each RDF file is in its own folder.
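
As a rough illustration of the one-file-per-folder layout described above, the following sketch walks an unpacked catalog directory and collects the RDF files it finds. It uses only the Java standard library and is not part of JGutenbergCatalog; the class name RdfFileFinder, the ".rdf" file extension and the default path are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the library.
final class RdfFileFinder {

    // Walks the directory tree under root and returns every regular file
    // whose name ends in ".rdf" (assumed extension), sorted by path.
    static List<Path> findRdfFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".rdf"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the unpacked archive; pass another path as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "RdfFiles/cache/epub");
        List<Path> files = findRdfFiles(root);
        System.out.println(files.size() + " RDF files found under " + root);
    }
}
```

Counting the files this way can be a quick sanity check that the archive was unpacked completely before reading it with the library.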

With this software you can query the set of RDF files and retrieve the title, author and language of each book. This data can then be loaded into a database. By default, an HSQL database is used. Its name is gutenberg and it is stored in a folder called catalog, which is located in the application's execution directory.
You may also use either a PostgreSQL or a MySQL database by supplying the appropriate settings file.
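
To give an idea of what retrieving the title, author and language from an RDF file involves, here is a minimal sketch that uses the JDK's DOM parser. It is not the library's own implementation. The element names (dcterms:title, dcterms:creator, pgterms:name, dcterms:language) and the namespace URIs reflect the layout commonly seen in the catalog files and should be treated as assumptions; the sample record is made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the library.
final class RdfFieldsSketch {

    // Namespace URIs assumed to match those declared in the catalog's RDF files.
    static final String DCTERMS = "http://purl.org/dc/terms/";
    static final String PGTERMS = "http://www.gutenberg.org/2009/pgterms/";

    // Made-up record for illustration; not a real catalog entry.
    static final String SAMPLE = String.join("\n",
        "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"",
        "         xmlns:dcterms=\"" + DCTERMS + "\"",
        "         xmlns:pgterms=\"" + PGTERMS + "\">",
        "  <pgterms:ebook rdf:about=\"ebooks/0\">",
        "    <dcterms:title>Sample Title</dcterms:title>",
        "    <dcterms:creator><pgterms:agent>",
        "      <pgterms:name>Doe, Jane</pgterms:name>",
        "    </pgterms:agent></dcterms:creator>",
        "    <dcterms:language><rdf:Description>",
        "      <rdf:value>en</rdf:value>",
        "    </rdf:Description></dcterms:language>",
        "  </pgterms:ebook>",
        "</rdf:RDF>");

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the namespace-based lookups below
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Trimmed text of the first matching descendant of scope, or null if there is none.
    static String firstText(Element scope, String ns, String localName) {
        NodeList nodes = scope.getElementsByTagNameNS(ns, localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    static String title(Document doc) {
        return firstText(doc.getDocumentElement(), DCTERMS, "title");
    }

    // Looks for the name only inside the first dcterms:creator element.
    static String author(Document doc) {
        NodeList creators = doc.getElementsByTagNameNS(DCTERMS, "creator");
        if (creators.getLength() == 0) {
            return null;
        }
        return firstText((Element) creators.item(0), PGTERMS, "name");
    }

    // The language code sits in a nested rdf:value; getTextContent flattens it.
    static String language(Document doc) {
        return firstText(doc.getDocumentElement(), DCTERMS, "language");
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(title(doc) + " | " + author(doc) + " | " + language(doc));
    }
}
```

In practice the library handles this step for you; the sketch is only meant to show which parts of a record the three fields come from.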

Getting Started

The project is a Maven project, so you can import it into your favorite IDE like any other Maven project. Running

mvn install

installs the artifact in your local repository, ready to be used as a dependency in any project:

<dependency>
  <groupId>org.josfranmc.gutenberg</groupId>
  <artifactId>JGutenbergCatalog</artifactId>
  <version>2.4</version>
</dependency>

When you build the project with Maven you get two jars in the target directory: JGutenbergCatalog-2.4.jar and JGutenbergCatalog-2.4-shaded.jar. The first one is the standard jar of the project. The second one is an uber jar that bundles all the necessary dependencies, which makes it suitable for use from the command line.

Alternatively, you can download the latest uber jar from the project's Releases page.

Usage

The main class to use is JGutenbergCatalog, which offers methods to query and build the Project Gutenberg book catalog.

If you unzip the catalog into a folder called RdfFiles in the application's execution directory, the following code will read all RDF files and load them into memory:

JGutenbergCatalog jcatalog = new JGutenbergCatalog("RdfFiles/cache/epub");
jcatalog.readRdfFiles();

Then, you can get the catalog as a Map collection where the key is the book identifier. You can retrieve the books from this collection using their identifier:

Map<String, RdfFile> catalog = jcatalog.getRdfCatalog();
Book book = catalog.get("10607").getBook();

Or you can retrieve a book directly:

Book book = jcatalog.getBook("10607");

After reading the files and loading their data into memory, you can load that data into a database with the following method:

jcatalog.loadDb();

By default, an HSQL database is used. This database is created inside a folder called catalog in the application's execution directory.

If you want to use either a PostgreSQL or a MySQL database, you can specify the connection settings in a properties file (the templates in the repository show which connection data to provide):

jcatalog.setDatabase("postgresql-connection.properties");
jcatalog.loadDb();
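
For readers unfamiliar with Java properties files, the sketch below shows how such a settings file can be read and checked with the standard library. The key names url, user and password are hypothetical and chosen only for illustration; the templates in the repository define the real keys, which may differ. The class DbSettingsSketch is likewise not part of the library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of the library.
final class DbSettingsSketch {

    // Hypothetical key names; the repository templates may use different ones.
    static final String[] REQUIRED = {"url", "user", "password"};

    // Loads the settings and fails early if a required key is missing.
    static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        for (String key : REQUIRED) {
            if (props.getProperty(key) == null) {
                throw new IllegalArgumentException("Missing setting: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample with placeholder values, standing in for a file on disk.
        String sample = String.join("\n",
            "url=jdbc:postgresql://localhost:5432/gutenberg",
            "user=demo",
            "password=secret");
        Properties props = load(new StringReader(sample));
        System.out.println("Connecting to " + props.getProperty("url"));
    }
}
```

Validating the file up front gives a clearer error than a failed connection attempt later on.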

Finally, you can run JGutenbergCatalog's main method, passing the options as arguments. The following code reads the RDF files from the RdfFiles/cache/epub folder and loads the data into a database, deleting any previous data (-d argument):

String[] args = {"-r", "RdfFiles/cache/epub", "-d"};
JGutenbergCatalog.main(args);

These are the options you can use as arguments:

-r xxx (xxx: path to the RDF files folder)
-b xxx (xxx: path to the database settings file)
-d     (delete previous data)
-h     (show the options list; use it on its own)
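
The option list above can be pictured as a small parser. The following sketch is not the library's own argument handling; it is an illustration of how the four flags might be interpreted, assuming that -r is required unless -h is given and that unknown options are rejected. The class CliOptionsSketch and its field names are hypothetical.

```java
// Hypothetical helper, not part of the library.
final class CliOptionsSketch {
    final String rdfPath;          // value of -r
    final String dbSettingsPath;   // value of -b, or null when absent
    final boolean deletePrevious;  // true when -d is present
    final boolean help;            // true when -h is present

    private CliOptionsSketch(String rdfPath, String dbSettingsPath,
                             boolean deletePrevious, boolean help) {
        this.rdfPath = rdfPath;
        this.dbSettingsPath = dbSettingsPath;
        this.deletePrevious = deletePrevious;
        this.help = help;
    }

    static CliOptionsSketch parse(String[] args) {
        String rdf = null;
        String db = null;
        boolean delete = false;
        boolean help = false;
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-r": rdf = valueAt(args, ++i, "-r"); break;
                case "-b": db = valueAt(args, ++i, "-b"); break;
                case "-d": delete = true; break;
                case "-h": help = true; break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        // Assumption: -r is required unless only the help text is requested.
        if (!help && rdf == null) {
            throw new IllegalArgumentException("-r is mandatory");
        }
        return new CliOptionsSketch(rdf, db, delete, help);
    }

    // Returns the value that follows an option, or fails if it is missing.
    private static String valueAt(String[] args, int index, String option) {
        if (index >= args.length) {
            throw new IllegalArgumentException(option + " needs a value");
        }
        return args[index];
    }

    public static void main(String[] args) {
        CliOptionsSketch opts = parse(new String[] {"-r", "RdfFiles/cache/epub", "-d"});
        System.out.println(opts.rdfPath + " delete=" + opts.deletePrevious);
    }
}
```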

You can also run the program from the command line. For this purpose, use the JGutenbergCatalog-2.4-shaded.jar package as follows (the -r parameter is mandatory):

java -jar JGutenbergCatalog-2.4-shaded.jar -r "path/to/catalog/rdf" [-b "path/to/database/setting/file"] [-d]

License

GPLv3 or later, see LICENSE for more details.
