EclairJS Node

EclairJS Node provides Node.js language bindings for Apache Spark.

Learn more about the larger EclairJS project.

Installation

$ npm install eclairjs

EclairJS Node requires Node 0.12 or higher and also requires a running instance of EclairJS Nashorn.
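The Node requirement can be checked up front before loading the module. The following is a minimal sketch, not part of EclairJS; `meetsMinimumNode` is a hypothetical helper name, and it only compares the major and minor parts of the version string.

```javascript
// Hypothetical helper (not part of EclairJS): returns true when a Node
// version string such as "0.12.7" or "v4.2.1" is at least 0.12.
function meetsMinimumNode(version) {
  var parts = version.replace(/^v/, '').split('.');
  var major = parseInt(parts[0], 10);
  var minor = parseInt(parts[1], 10);
  // Any 1.x or later release passes; 0.x releases need minor >= 12.
  return major > 0 || minor >= 12;
}

if (!meetsMinimumNode(process.versions.node)) {
  console.error('EclairJS Node requires Node 0.12 or higher; found ' +
    process.versions.node);
}
```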

Supported Spark versions can be found in the Versions section below.

Example

The EclairJS Node API mirrors the Spark API. Here is the classic word count example:

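var eclairjs = require('eclairjs');

// Connect to a local Spark master using all available cores.
var sc = new eclairjs.SparkContext("local[*]", "Simple Word Count");

// Load the input file as an RDD of lines.
var textFile = sc.textFile('foo.txt');

// Split each line into words.
var words = textFile.flatMap(function(sentence) {
  return sentence.split(" ");
});

// Pair each word with a count of 1. Tuple2 is listed in the array after
// the function and arrives as the function's extra parameter.
var wordsWithCount = words.mapToPair(function(word, Tuple2) {
  return new Tuple2(word, 1);
}, [eclairjs.Tuple2]);

// Sum the counts for each distinct word.
var reducedWordsWithCount = wordsWithCount.reduceByKey(function(value1, value2) {
  return value1 + value2;
});

// collect() is asynchronous; print the results, then stop the context.
reducedWordsWithCount.collect().then(function(results) {
  console.log('Word Count:', results);
  sc.stop();
});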
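To make the data flow of the example above easier to follow, here is a rough plain-JavaScript sketch of the same three steps (split, pair with 1, sum by key) run locally on an array of lines. It does not use Spark or EclairJS, and `wordCount` is a hypothetical helper name chosen for illustration.

```javascript
// Plain JavaScript sketch of the word count pipeline, with no Spark involved.
// wordCount is a hypothetical helper, not an EclairJS function.
function wordCount(lines) {
  // flatMap step: split every line into words and flatten the result.
  var words = [];
  lines.forEach(function(sentence) {
    sentence.split(' ').forEach(function(word) {
      words.push(word);
    });
  });

  // mapToPair + reduceByKey steps: treat each word as (word, 1) and add
  // the 1s together per word. Object.create(null) avoids clashes with
  // inherited property names such as "constructor".
  var counts = Object.create(null);
  words.forEach(function(word) {
    counts[word] = (counts[word] || 0) + 1;
  });
  return counts;
}

console.log(wordCount(['to be or', 'not to be']));
```

In the Spark version each step is described up front and only evaluated when `collect()` is called, whereas this local sketch runs eagerly.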

Try It

EclairJS Node provides a Docker image on Docker Hub that contains all of its dependencies.

The Docker image supports the latest released version of EclairJS Node and may not work with master. To use it, check out the matching release branch (for example, git checkout branch-0.5).

docker pull eclairjs/minimal-gateway
docker run -p 8888:8888 eclairjs/minimal-gateway

After retrieving Docker's IP address (docker-machine ip), you will need to set two environment variables:

export JUPYTER_HOST=??.??.??.?? (your docker ip)
export JUPYTER_PORT=8888
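Before running an example, it can help to confirm that both variables are visible to Node. The sketch below is an assumption-laden illustration: `gatewayAddress` is a hypothetical helper, and how EclairJS itself reads these variables is not shown here.

```javascript
// Hypothetical helper (not part of EclairJS): builds a "host:port" string
// from an environment object and fails loudly when a variable is missing.
function gatewayAddress(env) {
  var host = env.JUPYTER_HOST;
  var port = parseInt(env.JUPYTER_PORT, 10);
  if (!host) {
    throw new Error('JUPYTER_HOST is not set');
  }
  if (isNaN(port)) {
    throw new Error('JUPYTER_PORT is not set or is not a number');
  }
  return host + ':' + port;
}

try {
  console.log('Gateway expected at', gatewayAddress(process.env));
} catch (err) {
  console.error(err.message);
}
```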

Now you can run the word count example:

node --harmony examples/rddtop10.js ./dream.txt

You can learn more about the Docker container here. You can also try out EclairJS in Jupyter notebooks running under the IBM Bluemix Cloud.

Documentation

Community

Deploy

You can either deploy using Docker (Using the Docker Container) or manually build and set up your own environment (Build and Package).

Progress

| Spark Feature | EclairJS Node Status |
| --- | --- |
| RDD | Partial Support |
| SQL/DataFrames | Partial Support |
| Streaming | Partial Support |
| ml | Partial Support |
| mllib | Partial Support |
| GraphX | Unsupported |

Refer to the API Documentation for a list of what is currently implemented. New APIs are added to the master branch as they are implemented for EclairJS Node.

Contributions are always welcome via pull requests.

Versions

Our goal is to keep the EclairJS master branch up to date with the latest version of Spark. When new versions of Spark require code changes, we create a separate branch. The table below shows what is available now.

| EclairJS Version/Tag | Apache Spark Version |
| --- | --- |
| 0.1 | 1.5.1 |
| 0.2 - 0.6 | 1.6.0 |
| 0.7 (master) | 1.6.0 |
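The version table can also be expressed as a small lookup, which may be handy in setup scripts. This is only a sketch of the table as listed in this README: `sparkVersionFor` is a hypothetical helper, and versions outside the table return `undefined` rather than a guess.

```javascript
// Hypothetical helper (not part of EclairJS): maps an EclairJS version or
// tag such as "0.5" to the Apache Spark version listed in the table above.
function sparkVersionFor(eclairjsVersion) {
  var parts = eclairjsVersion.split('.');
  var major = parseInt(parts[0], 10);
  var minor = parseInt(parts[1], 10);
  if (major !== 0) {
    return undefined; // only 0.x versions appear in the table
  }
  if (minor === 1) {
    return '1.5.1';
  }
  if (minor >= 2 && minor <= 7) {
    return '1.6.0'; // 0.2 through 0.6, plus 0.7 (master)
  }
  return undefined;
}

console.log(sparkVersionFor('0.5'));
```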

Contributors

pberkland, billreed63, david-f, brian-burns-bose
