tspannhw / sparkdemo1

This project forked from vinayshukla/sparkdemo1

A sample Spark app that is built against Spark from Hortonworks Maven Repo

License: Apache License 2.0

Java 50.83% Scala 49.17%

sparkdemo1's Introduction

SparkDemo

This Maven project builds a simple Apache Spark application that can be run against a Hadoop cluster. The project uses the Hortonworks Maven repository to resolve Spark dependencies.

Prerequisites:

Ensure you have downloaded the Spark 1.1.0 tarball and copied it to your HDP 2.1 cluster. You can follow the Hortonworks Tech Preview of Spark 1.1.0 for instructions on how to set up Spark on your Hadoop cluster.

wget http://path/to/spark-1.1.0.2.1.5.0-695-bin-2.4.0.2.1.5.0-695.tgz

scp -P 2222 spark-1.1.0.2.1.5.0-695-bin-2.4.0.2.1.5.0-695.tgz [email protected]:/root

Note: The password for the HDP 2.1 Sandbox is hadoop.

On the HDP 2.1 cluster, unpack the tarball and point Spark at the YARN configuration:

tar xvfz spark-1.1.0.2.1.5.0-695-bin-2.4.0.2.1.5.0-695.tgz

export YARN_CONF_DIR=/etc/hadoop/conf

Build the Application:

In your development environment, build and package the application:

mvn clean package

One of the artifacts this produces is ../target/SparkDemo-1.1.0.jar.

Copy the Application to Hadoop Cluster

Copy the jar to your Hadoop cluster. In this example, the SparkDemo jar is pushed to an HDP 2.1 Sandbox:

scp -P 2222 target/SparkDemo-1.1.0.jar [email protected]:/root

Run the Application on Hadoop Cluster

On your Hadoop cluster, make sure YARN_CONF_DIR is set, cd to your Spark home directory, and run the commands below. Make sure the path to SparkDemo-1.1.0.jar matches the location where you copied the jar on the cluster.

export YARN_CONF_DIR=/etc/hadoop/conf

cd spark-1.1.0.2.1.5.0-695-bin-2.4.0.2.1.5.0-695

./bin/spark-submit --class com.whiteware.sparkdemo.SimpleApp --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 ../SparkDemo-1.1.0.jar
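The two conditions mentioned above (YARN_CONF_DIR is set, and the jar path is correct) can be checked before submitting. The following is a minimal sketch, not part of the project: the function name preflight is hypothetical, and it tests only those two conditions.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of SparkDemo or Spark).
# Usage: preflight /path/to/SparkDemo-1.1.0.jar
preflight() {
  # YARN_CONF_DIR must be set and must point at an existing directory.
  if [ -z "${YARN_CONF_DIR:-}" ]; then
    echo "YARN_CONF_DIR is not set" >&2
    return 1
  fi
  if [ ! -d "$YARN_CONF_DIR" ]; then
    echo "YARN_CONF_DIR is not a directory: $YARN_CONF_DIR" >&2
    return 1
  fi

  # The application jar must exist at the path that will be given to spark-submit.
  if [ ! -f "$1" ]; then
    echo "Application jar not found: $1" >&2
    return 1
  fi

  return 0
}
```

From the Spark home directory, preflight ../SparkDemo-1.1.0.jar could be run first, and spark-submit only if it succeeds.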

Running spark-submit produces output similar to:

14/09/12 14:47:39 INFO yarn.Client: Application report from ResourceManager:
    application identifier: application_1410558108229_0001
    appId: 1
    clientToAMToken: null
    appDiagnostics:
    appMasterHost: sandbox.hortonworks.com
    appQueue: default
    appMasterRpcPort: 0
    appStartTime: 1410558409206
    yarnAppState: FINISHED
    distributedFinalState: SUCCEEDED
    appTrackingUrl: http://sandbox.hortonworks.com:8088/proxy/application_1410558108229_0001/A
    appUser: root
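The fields of interest in this report are yarnAppState, distributedFinalState and appTrackingUrl. As a rough sketch, assuming the spark-submit output has been saved to a file, a small helper can pull a single field out of it. The name report_field is hypothetical, and the helper relies only on the "name: value" layout shown in the sample above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SparkDemo or Spark).
# Reads yarn.Client output on stdin and prints the value of one report field.
# If the field appears several times, the last occurrence wins, so the
# value comes from the most recent report in the output.
# Limitation: the value must be a single token without whitespace; an empty
# field followed by another field on the same line is not handled.
# Usage: report_field FIELD_NAME < saved-output
report_field() {
  grep -o "$1: *[^[:space:]]*" | tail -n 1 | sed "s/^$1: *//"
}
```

For the report shown above, saved to a file (say submit.log, a placeholder name), report_field distributedFinalState < submit.log prints SUCCEEDED, and report_field appTrackingUrl < submit.log prints the tracking URL.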

Go to http://sandbox.hortonworks.com:8088/proxy/application_1410558108229_0001/A and click on the Logs dir link.

The logs will show something like:

14/09/12 14:47:40 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
14/09/12 14:47:40 INFO spark.SparkContext: Successfully stopped SparkContext

Log Type: stdout
Log Length: 43
Lines with add: 4, lines with security: 10
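The stdout line suggests that the application counts the input lines containing "add" and "security". Assuming that reading is correct, and assuming you have a local copy of the file the application reads (this README does not name it), the counts can be cross-checked with grep. This is only a sketch: count_lines_with is a hypothetical name, and the substring-match behavior is an assumption about the application, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical cross-check (not part of SparkDemo).
# Assumption: the application counts lines that contain a given substring.
# Usage: count_lines_with WORD FILE
count_lines_with() {
  # -F treats WORD as a fixed string; -c counts matching lines, not matches.
  # Note: grep exits non-zero when the count is 0, but still prints 0.
  grep -c -F -- "$1" "$2"
}
```

If the assumption holds, count_lines_with add FILE and count_lines_with security FILE on the same input should match the two numbers in the stdout log.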

sparkdemo1's People

Contributors

vinayshukla
