cloudxtreme / spark-riak-connector

This project forked from basho/spark-riak-connector

The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV

Home Page: http://basho.com/products/spark/

License: Apache License 2.0

Scala 82.89% Java 17.11%

Spark-Riak Connector

The Spark-Riak connector lets you connect Spark applications to Riak KV and Riak TS through the Spark RDD and Spark DataFrames APIs. You can write your app in Scala, Python, or Java. The connector makes it easy to partition the data you read from Riak so that multiple Spark workers can process it in parallel, and it supports failover if a Riak node goes down while your Spark job is running.
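To make the idea of partitioning concrete, here is a minimal Python sketch of one general approach: splitting a numeric key or index range into contiguous sub-ranges, one per worker. It is not the connector's actual partitioning logic, which this README does not describe. The function name `split_range` and its parameters are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def split_range(start: int, end: int, partitions: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [start, end] into contiguous inclusive sub-ranges.

    Hypothetical helper for illustration only; not part of the connector.
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    if end < start:
        raise ValueError("end must not be smaller than start")

    total = end - start + 1
    # Never create more sub-ranges than there are values to cover.
    count = min(partitions, total)
    base, extra = divmod(total, count)

    ranges = []
    lo = start
    for i in range(count):
        # The first `extra` sub-ranges take one additional value each.
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges
```

Each resulting sub-range could then be queried independently, so no two workers read the same keys.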

Features

  • Construct a Spark RDD from a Riak KV bucket with a set of keys
  • Construct a Spark RDD from a Riak KV bucket by using a 2i string index or a set of indexes
  • Construct a Spark RDD from a Riak KV bucket by using a 2i range query or a set of ranges
  • Map JSON formatted data from Riak KV to user defined types
  • Save a Spark RDD into a Riak KV bucket and apply 2i indexes to the contents
  • Construct a Spark DataFrame from a Riak TS table using range queries and schema discovery
  • Save a Spark DataFrame into a Riak TS table
  • Construct a Spark RDD using Riak KV bucket's enhanced 2i query (a.k.a. full bucket read)
  • Perform parallel full bucket reads from a Riak KV bucket into multiple partitions
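As a rough illustration of what mapping JSON-formatted data to a user-defined type means, the following Python sketch decodes one JSON document into a typed record using only the standard library. It does not use the connector's API. The `UserEvent` type, its fields, and the `parse_event` function are all hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class UserEvent:
    # Hypothetical user-defined type; the field names are illustrative only.
    user_id: str
    action: str
    timestamp: int


def parse_event(raw: str) -> UserEvent:
    """Decode one JSON document into a UserEvent, ignoring unknown fields."""
    doc = json.loads(raw)
    return UserEvent(
        user_id=doc["user_id"],
        action=doc["action"],
        timestamp=int(doc["timestamp"]),
    )
```

In a Spark job, a function of this shape could be applied to an RDD of raw JSON strings with `rdd.map(parse_event)`. The sketch raises `KeyError` on a missing field rather than guessing a default.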

Compatibility

  • Riak TS 1.2+
  • Apache Spark 1.6+
  • Scala 2.10
  • Java 8

Coming Soon

  • Support for Riak KV 2.2 and later

Prerequisites

To use the Spark-Riak connector, you must have Java, Apache Spark, and Riak installed, at the versions listed under Compatibility.

Spark-Riak Connector

Mailing List

The Riak Users Mailing List is highly trafficked and a great resource for technical discussions, Riak issues and questions, and community events and announcements.

We pride ourselves on answering every email that comes over the Riak Users Mailing List. Sign up and send away. If you prefer points for your questions, you can always tag Riak on StackOverflow.

IRC

The #riak IRC room on irc.freenode.net is a great place for real-time help with your Riak issues and questions.

Reporting Bugs

To report a bug or issue, please open a new issue against this repository.

You can read the full guidelines for bug reporting on the Riak Docs.

Contributing

Basho encourages contributions to the Spark-Riak Connector from the community. Here’s how to get started.

  • Fork the appropriate project that is affected by your change.
  • Make your changes and run the test suite.
  • Commit your changes and push them to your fork.
  • Open pull requests for the appropriate projects.
  • Basho engineers will review your pull request, offer feedback or suggest changes, and merge it when it’s ready.

License

Copyright © 2016 Basho Technologies

Licensed under the Apache License, Version 2.0

Contributors

aleksandrpavlenko, jbrisbin, korry8911, mdigan, nehaev, nikolaypavlov, oleksii-suprun, ooshlablu, orocklin, paegun, ph07, srgg, tmatvienko, vikua, zkhadikova
