ruby-rdf / rdf-cassandra

This project forked from bhuga/rdf-repository-skeleton

[OBSOLETE] RDF.rb storage adapter for the Apache Cassandra distributed database management system.

Home Page: http://rubygems.org/gems/rdf-cassandra

License: The Unlicense

Apache Cassandra Storage Adapter for RDF.rb

This is an RDF.rb plugin that adds support for storing RDF data in the Apache Cassandra distributed database management system.

Features

  • Stores RDF statements in a resource-centric manner using one Cassandra supercolumn family per RDF repository.
  • Inherits Cassandra's characteristics of high availability, eventual consistency, and horizontal scalability.
  • Optimized for write-heavy workloads with no need to perform a read before inserting or deleting an RDF statement.
  • Optimized for resource-oriented access patterns to RDF statements about a particular subject.
  • Partitions RDF data across the Cassandra cluster based on subject URIs, improving data locality when accessing statements about a particular subject.
  • Includes a set of Rake tasks that make it easy to download and set up a local development instance of Cassandra.

Limitations

  • Does not support named graphs at present.

Examples

require 'rdf/cassandra'

Connecting to a Cassandra server running on localhost

repository = RDF::Cassandra::Repository.new

Connecting to specific Cassandra servers

repository = RDF::Cassandra::Repository.new(:servers => "127.0.0.1:9160")

Configuring the Cassandra keyspace and column family

repository = RDF::Cassandra::Repository.new({
  :keyspace      => "MyApplication",  # defaults to "RDF"
  :column_family => "MyRepository",   # defaults to "Resources"
})

Configuration

As of Cassandra 0.6, all keyspaces and column families must be predeclared in storage-conf.xml. Each Cassandra supercolumn family you use is equivalent to one RDF repository, so declare as many as you are likely to need.

The following configuration snippet matches the default options for constructing an RDF::Cassandra::Repository instance:

<Keyspaces>
  <Keyspace Name="RDF">
    <ColumnFamily Name="Resources"
                  ColumnType="Super"
                  CompareWith="UTF8Type"
                  CompareSubcolumnsWith="BytesType"
                  Comment="RDF data."/>
  </Keyspace>
</Keyspaces>

See etc/storage-conf.xml for a full configuration file example compatible with Cassandra 0.6.
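
If you need several repositories, the declarations can be generated rather than written by hand. The following is a minimal sketch, not part of RDF::Cassandra: the helper name keyspace_xml and the repository names are hypothetical, and the attribute values are simply copied from the snippet above.

```ruby
# Hypothetical helper (not shipped with rdf-cassandra): renders a <Keyspaces>
# fragment with one supercolumn family per repository name, reusing the
# attribute values from the default configuration shown in this README.
def keyspace_xml(keyspace, column_families)
  ([keyspace] + column_families).each do |name|
    # Assumption: names are plain identifiers, so no XML escaping is needed.
    raise ArgumentError, "invalid name: #{name.inspect}" unless name =~ /\A\w+\z/
  end

  families = column_families.map do |name|
    <<-XML
    <ColumnFamily Name="#{name}"
                  ColumnType="Super"
                  CompareWith="UTF8Type"
                  CompareSubcolumnsWith="BytesType"
                  Comment="RDF data."/>
    XML
  end

  "<Keyspaces>\n  <Keyspace Name=\"#{keyspace}\">\n#{families.join}  </Keyspace>\n</Keyspaces>\n"
end

# Example repository names; only "Resources" comes from this README.
puts keyspace_xml("RDF", %w[Resources Staging Archive])
```

The generated fragment still has to be merged into storage-conf.xml by hand, and Cassandra has to be restarted to pick it up.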

Data Model

This storage adapter stores RDF data in a resource-centric manner by mapping RDF subject terms to Cassandra row keys, RDF predicates to Cassandra supercolumns, and RDF object terms to Cassandra columns as follows:

{key     => {supercolumn => {column    => value }}}   # Cassandra terminology
{subject => {predicate   => {object_id => object}}}   # RDF terminology

RDF object terms are stored using their canonical N-Triples serialization and are uniquely identified by the binary SHA-1 fingerprint of that representation.
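
As a rough illustration of the fingerprinting step, the sketch below hashes an N-Triples term with Ruby's standard Digest library. The helper names are hypothetical and this is not the adapter's actual code; it only shows that the binary form is 20 bytes long and that the hexadecimal form (as printed in the example further down) is 40 characters long.

```ruby
require 'digest/sha1'

# Hypothetical helper (not part of RDF::Cassandra's documented API):
# returns the 20-byte binary SHA-1 digest of an N-Triples term string.
def object_id_for(ntriples_term)
  Digest::SHA1.digest(ntriples_term)
end

# Hexadecimal rendering of the same fingerprint, convenient for display.
def object_id_hex(ntriples_term)
  Digest::SHA1.hexdigest(ntriples_term)
end

uri_term     = '<http://xmlns.com/foaf/0.1/Person>'
literal_term = '"Arto Bendiken"'

puts object_id_for(uri_term).bytesize    # 20 (bytes)
puts object_id_hex(literal_term).length  # 40 (hex characters)
```

Because the identifier depends only on the term's serialization, the same object term always maps to the same column name.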

For example, here's how some of RDF.rb's DOAP data would be stored using the RDF::Cassandra data model:

{
  "http://rdf.rubyforge.org/" => {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" => {
      "c0b66f5e31ec616497404f044ff0eaa210f21232" => "<http://usefulinc.com/ns/doap#Project>",
    },
    "http://usefulinc.com/ns/doap#developer" => {
      "9d178ddaa88acfec63f812aa270b42291381b4ff" => "<http://ar.to/#self>",
      "908b42dd9d1a3f5ac5ecf9540e1f9a753f444204" => "<http://bhuga.net/#ben>",
      ...
    },
    ...
  },
  "http://ar.to/#self" => {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" => {
      "74a5c03994aacac0a36003afb61aaf7befc438fd" => "<http://xmlns.com/foaf/0.1/Person>",
    },
    "http://xmlns.com/foaf/0.1/name" => {
      "f369f748e964ef2b82160d6389b63fb55949b464" => '"Arto Bendiken"',
    },
    ...
  },
  "http://bhuga.net/#ben" => {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" => {
      "74a5c03994aacac0a36003afb61aaf7befc438fd" => "<http://xmlns.com/foaf/0.1/Person>",
    },
    "http://xmlns.com/foaf/0.1/name" => {
      "97325e589ac0194e74848090181b66b0db310750" => '"Ben Lavender"',
    },
    ...
  },
}
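
The nested layout above can be imitated with plain Ruby hashes. The following toy sketch is an in-memory stand-in, not the adapter's implementation; the class name ToyResourceStore and its methods are hypothetical. It illustrates why inserts and deletes need no prior read: the column name is computed from the object term itself.

```ruby
require 'digest/sha1'

# In-memory stand-in for one supercolumn family (illustration only).
# Shape: {subject => {predicate => {object_id => object}}}
class ToyResourceStore
  attr_reader :rows

  def initialize
    @rows = Hash.new { |h, subject| h[subject] = Hash.new { |r, predicate| r[predicate] = {} } }
  end

  # Blind write: inserting the same statement twice overwrites the same
  # column, so no existence check is needed beforehand.
  def insert(subject, predicate, object)
    @rows[subject][predicate][fingerprint(object)] = object
    self
  end

  # Blind delete: the column to remove is addressed by its computed name.
  def delete(subject, predicate, object)
    @rows[subject][predicate].delete(fingerprint(object))
    self
  end

  # All statements about one subject live under a single row key.
  def statements_about(subject)
    return [] unless @rows.key?(subject)
    @rows[subject].flat_map do |predicate, columns|
      columns.values.map { |object| [subject, predicate, object] }
    end
  end

  private

  # Hex SHA-1 of the N-Triples form of the object term.
  def fingerprint(object)
    Digest::SHA1.hexdigest(object)
  end
end

store = ToyResourceStore.new
store.insert('http://ar.to/#self', 'http://xmlns.com/foaf/0.1/name', '"Arto Bendiken"')
store.insert('http://ar.to/#self', 'http://xmlns.com/foaf/0.1/name', '"Arto Bendiken"') # duplicate
puts store.statements_about('http://ar.to/#self').size # 1
```

Looking up everything about a subject touches a single row, which mirrors the resource-oriented access pattern listed under Features.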

To learn more about Cassandra's data model, read the article "WTF is a SuperColumn?".

Documentation

http://rdf.rubyforge.org/cassandra/

  • {RDF::Cassandra}
    • {RDF::Cassandra::Repository}

Dependencies

Installation

The recommended installation method is via RubyGems. To install the latest official release of the RDF::Cassandra gem, do:

% [sudo] gem install rdf-cassandra

Download

To get a local working copy of the development repository, do:

% git clone git://github.com/bendiken/rdf-cassandra.git

Alternatively, you can download the latest development version as a tarball as follows:

% wget http://github.com/bendiken/rdf-cassandra/tarball/master

Authors

License

RDF::Cassandra is free and unencumbered public domain software. For more information, see http://unlicense.org/ or the accompanying UNLICENSE file.

Contributors

artob, bhuga

rdf-cassandra's Issues

Doesn't connect to remote repository

When I'm connecting to a remote repository:

repository = RDF::Cassandra::Repository.new(:servers => "xxx.xxx.xxx.xxx:9160")

where xxx.xxx.xxx.xxx is the IP address of a remote Cassandra server that is up and running and that I can reach via telnet, the following error occurs:

ThriftClient::NoServersAvailable: No live servers in ["127.0.0.1:9160"] since Thu Apr 29 15:16:56 +0200 2010.

Rdf query on cassandra (ruby)

I am thinking of setting up a Cassandra database and using it with the rdf-cassandra adapter to store RDF triples. From the documentation I found, I cannot see how to query Cassandra. I think I will be able to do basic triple searches but not SPARQL queries. Is there a way to get a SPARQL endpoint? (I think I read that someone was using Sesame to query Cassandra.) Any suggestions?
Is there a way to use sparql-client with a Cassandra database?
