atomgraph / json2rdf

Streaming generic JSON to RDF converter

Home Page: https://hub.docker.com/r/atomgraph/json2rdf

License: Apache License 2.0

Dockerfile 1.46% Java 98.14% Shell 0.40%
docker-image json json-converter json-ld json2rdf knowledge-graph linked-data rdf semantic-web sparql streaming transformer

json2rdf's Issues

Adding an Apache Spark UDF?

Hi! Would it make sense to have a small addition that makes the library usable in Apache Spark? Something along the lines of

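package com.atomgraph.etl.json;

import java.io.ByteArrayOutputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.lang.CollectorStreamRDF;
import org.apache.spark.sql.api.java.UDF1;

public class Json2rdfUDF implements UDF1<String, String> {
    private static final long serialVersionUID = 1L;
    // Assumption: the base URI is supplied by the caller; the snippet as posted left baseURI undefined.
    private final String baseURI;

    public Json2rdfUDF(String baseURI) {
        this.baseURI = baseURI;
    }

    @Override
    public String call(String jsonString) throws Exception {
        // Collect the triples in memory, then serialize them as N-Triples so the UDF returns a String.
        CollectorStreamRDF rdfStream = new CollectorStreamRDF();
        try (Reader reader = new StringReader(jsonString)) {
            new JsonStreamRDFWriter(reader, rdfStream, baseURI).convert();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        RDFDataMgr.writeTriples(out, rdfStream.getTriples().iterator());
        return out.toString(StandardCharsets.UTF_8.name());
    }
}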

# for namespace properties should be optional

From the README: "Property namespace is constructed by adding # to the base URI."

If the base URI ends in a forward slash (/), a pound/number sign (#) should not be added to the base URI.
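The requested behaviour could look roughly like the sketch below. It is not taken from the json2rdf source: `PropertyNamespace` and its `of` method are hypothetical names, and the only rule implemented is the one proposed in this issue (skip the `#` when the base URI already ends in `/`).

```java
final class PropertyNamespace {
    // Hypothetical helper, not part of json2rdf: derives the property namespace from the base URI.
    static String of(String baseURI) {
        // Proposed rule: a base URI ending in "/" is used as the namespace unchanged.
        if (baseURI.endsWith("/")) {
            return baseURI;
        }
        // Otherwise keep the behaviour described in the README: append "#".
        return baseURI + "#";
    }

    public static void main(String[] args) {
        System.out.println(of("http://tmp.com/") + "lastname");
        System.out.println(of("http://tmp.com") + "lastname");
    }
}
```

With this rule, `http://tmp.com/` would yield properties such as `http://tmp.com/lastname` rather than `http://tmp.com/#lastname`.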

Incorrect version number used in Docker container

When running JSON2RDF in the Docker container as shown in the README, I get this error message:

xxxx [~/dev/workspaces/graphstuff/JSON2RDF]> echo "{'label':'Hello'}" | head -100 | docker run -i -a stdin -a stdout -a stderr atomgraph/json2rdf urn:test:
Error: Unable to access jarfile target/json2rdf-1.0.0-SNAPSHOT-jar-with-dependencies.jar

The version number in the Dockerfile wasn't adjusted when bumping the version to 1.0.1.

whitespace in json keys

characters.json:

{
  "characters": [
    {
      "first name": "Ash",
      "lastname": "Ketchum"
    }
  ]
}


is valid JSON, but:

cat characters.json | docker run --rm -i atomgraph/json2rdf 'http://tmp.com/'
Exception in thread "main" org.apache.jena.riot.RiotException: <http://tmp.com/#first name> Code: 17/WHITESPACE in FRAGMENT: A single whitespace character. These match no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
        at org.apache.jena.riot.system.IRIResolver.exceptions(IRIResolver.java:384)
        at org.apache.jena.riot.system.IRIResolver.resolve(IRIResolver.java:341)
        at org.apache.jena.riot.system.IRIResolver.resolveToString(IRIResolver.java:358)
        at com.atomgraph.etl.json.JsonStreamRDFWriter.write(JsonStreamRDFWriter.java:107)
        at com.atomgraph.etl.json.JsonStreamRDFWriter.convert(JsonStreamRDFWriter.java:66)
        at com.atomgraph.etl.json.JSON2RDF.convert(JSON2RDF.java:82)
        at com.atomgraph.etl.json.JSON2RDF.main(JSON2RDF.java:60)

Should json2rdf simply delete characters that are not URI-safe, or percent-encode them instead?
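For illustration, the percent-encoding option might be sketched as follows. This is an assumption about one possible fix, not json2rdf's actual behaviour: `KeyEncoder` and `percentEncode` are hypothetical names, and the encoder is deliberately conservative, escaping every byte outside the RFC 3986 unreserved set (so non-ASCII characters, which IRIs would allow, are escaped as well).

```java
import java.nio.charset.StandardCharsets;

final class KeyEncoder {
    // Hypothetical helper: percent-encodes a JSON key so it can be appended to the property namespace.
    static String percentEncode(String key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            // RFC 3986 unreserved characters are copied through unchanged.
            boolean unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                sb.append((char) c);
            } else {
                // Every other byte becomes %XX with uppercase hex digits.
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints http://tmp.com/#first%20name
        System.out.println("http://tmp.com/#" + percentEncode("first name"));
    }
}
```

Encoding rather than deleting keeps distinct keys distinct: `first name` and `firstname` would otherwise collapse into the same property URI.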
