surajwaghulde / storm-example-projects

Storm is a real-time computing system that supports fault tolerance, horizontal scalability, and guaranteed message processing with high performance. This repository is a library of sample projects that expose reusable bolts for real-time computation.

storm-example-projects's Introduction

Technical Details:

This is a fun project I created to use the newest real-time computing platforms to process data generated by sensor devices. The project is a library that continuously listens to and analyzes streams of data generated by various sensor devices. It is built on Storm, a distributed stream computing platform (http://www.slideshare.net/nathanmarz/storm-distributed-and-faulttolerant-realtime-computation). The platform continuously processes streaming data and emits another stream of results, so you can keep chaining such stream-processing stages into a pipeline of any length.
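The idea of chaining stream-processing stages can be illustrated with a minimal sketch in plain Java. It does not use the Storm API; the names `PipelineSketch`, `Stage`, `chain`, `parser`, and `above` are hypothetical and exist only for this illustration. Each stage turns one input element into zero or more output elements, and chaining two stages yields a new stage.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plain Java, not the Storm spout/bolt API.
final class PipelineSketch {

    /** A stage turns each input element into zero or more output elements. */
    interface Stage<I, O> {
        List<O> process(I input);
    }

    /** Chains two stages: everything the first emits is fed to the second. */
    static <A, B, C> Stage<A, C> chain(final Stage<A, B> first, final Stage<B, C> second) {
        return new Stage<A, C>() {
            public List<C> process(A input) {
                List<C> out = new ArrayList<C>();
                for (B mid : first.process(input)) {
                    out.addAll(second.process(mid));
                }
                return out;
            }
        };
    }

    /** Hypothetical stage: parses a text line into an integer; malformed lines emit nothing. */
    static Stage<String, Integer> parser() {
        return new Stage<String, Integer>() {
            public List<Integer> process(String line) {
                List<Integer> out = new ArrayList<Integer>();
                try {
                    out.add(Integer.valueOf(line.trim()));
                } catch (NumberFormatException ignored) {
                    // Drop lines that are not numbers.
                }
                return out;
            }
        };
    }

    /** Hypothetical stage: passes through only readings strictly above the limit. */
    static Stage<Integer, Integer> above(final int limit) {
        return new Stage<Integer, Integer>() {
            public List<Integer> process(Integer reading) {
                List<Integer> out = new ArrayList<Integer>();
                if (reading.intValue() > limit) {
                    out.add(reading);
                }
                return out;
            }
        };
    }

    public static void main(String[] args) {
        Stage<String, Integer> pipeline = chain(parser(), above(500));
        String[] lines = {"120", " 640 ", "oops", "512"};
        for (String line : lines) {
            System.out.println("'" + line + "' -> " + pipeline.process(line));
        }
    }
}
```

Because `chain` itself returns a `Stage`, its result can be chained again, which mirrors how the output stream of one processing step becomes the input of the next.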

I have written a library that computes a moving average and detects spikes in a continuous stream of data; it can be applied to finance or any other stream. The goal is to build a library of bolts that each perform a specific operation, so that these bolts can be reused rather than written from scratch.
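As a rough illustration of the two computations, here is a minimal sketch in plain Java, without Storm dependencies. It is not the project's actual bolt code: the class name `MovingAverageSpikeSketch`, the window size, and the relative-deviation threshold are all assumptions made for this example. A reading counts as a spike when it deviates from the current moving average by more than a chosen fraction of that average.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a sliding-window average plus a simple relative-deviation test.
final class MovingAverageSpikeSketch {
    private final int windowSize;         // assumed parameter: number of readings to average
    private final double spikeThreshold;  // assumed parameter: 0.5 means 50% deviation
    private final Deque<Double> window = new ArrayDeque<Double>();
    private double sum = 0.0;

    MovingAverageSpikeSketch(int windowSize, double spikeThreshold) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.spikeThreshold = spikeThreshold;
    }

    /** Adds a reading and returns the average of the most recent windowSize readings. */
    double update(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst(); // evict the oldest reading
        }
        return sum / window.size();
    }

    /** True when the reading deviates from the average by more than the threshold fraction. */
    boolean isSpike(double value, double average) {
        return Math.abs(value - average) > spikeThreshold * Math.abs(average);
    }

    public static void main(String[] args) {
        MovingAverageSpikeSketch detector = new MovingAverageSpikeSketch(4, 0.5);
        double[] readings = {100, 102, 98, 101, 300, 99};
        for (double reading : readings) {
            double average = detector.update(reading);
            System.out.println(reading + " avg=" + average
                    + " spike=" + detector.isSpike(reading, average));
        }
    }
}
```

Keeping a running sum makes each update constant-time regardless of window size. In a Storm topology, the averaging and the spike test would naturally live in two separate bolts so that each can be reused on its own.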

Let us start by deploying the project on a single node, which is very simple. I benchmarked this project at 96,000 sensor values per second on a cluster of 3 machines. It detects a spike within a few milliseconds.

Hardware Requirements: (Optional, since you can generate sensor-like input using inputStreamSpout)

1.	An Arduino kit with a photoresistor circuit.
2.	Connect the kit to your laptop through a serial port and run the light program I submitted to generate light intensity events.
3.	It will list the serial port being used on the machine.
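If no Arduino hardware is available, sensor-like input can be generated in software. The sketch below is one hypothetical way to do that in plain Java; it is not the project's inputStreamSpout. The class name `SyntheticLightSensor` and its parameters (baseline, noise, spike interval) are assumptions for illustration. Values are clamped to 0..1023, the range of a 10-bit analog reading on classic Arduino boards.

```java
import java.util.Random;

// Illustrative only: produces noisy light-intensity readings with periodic spikes.
final class SyntheticLightSensor {
    private final Random random;
    private final int baseline;    // assumed steady light level
    private final int noise;       // assumed maximum jitter around the baseline
    private final int spikeEvery;  // assumed: every N-th reading is tripled; 0 disables spikes
    private long count = 0;

    SyntheticLightSensor(long seed, int baseline, int noise, int spikeEvery) {
        if (noise < 0) {
            throw new IllegalArgumentException("noise must not be negative");
        }
        this.random = new Random(seed); // fixed seed gives a repeatable sequence
        this.baseline = baseline;
        this.noise = noise;
        this.spikeEvery = spikeEvery;
    }

    /** Returns the next reading, clamped to the 10-bit range 0..1023. */
    int next() {
        count++;
        int value = baseline + random.nextInt(2 * noise + 1) - noise;
        if (spikeEvery > 0 && count % spikeEvery == 0) {
            value *= 3; // inject an artificial spike
        }
        return Math.max(0, Math.min(1023, value));
    }

    public static void main(String[] args) {
        SyntheticLightSensor sensor = new SyntheticLightSensor(42L, 200, 10, 5);
        for (int i = 0; i < 10; i++) {
            System.out.println(sensor.next());
        }
    }
}
```

A generator like this could be wrapped in a spout to feed a topology with test data that contains known spikes.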

Software Requirements:

1.	Download Storm version 0.6.2 from https://github.com/nathanmarz/storm/downloads
2.	Install Maven. I am assuming Java 1.6 is already installed.


Running my project:
1.	Download the storm-starter project from https://github.com/nathanmarz/storm-starter/downloads, unzip it, and rename m2-pom.xml to pom.xml.
2.	Copy my project archive movingAverageWithSpikeDetection.tar.gz to the storm-starter/src/jvm directory and extract it with “tar -zxf movingAverageWithSpikeDetection.tar.gz”.
3.	Build the project with Maven by running “mvn clean install” in the storm-starter folder (this installs the serialization libraries and other dependencies needed for the distributed system).
4.	Run “mvn eclipse:eclipse” to generate the Eclipse .project file.
5.	Open Eclipse and import the movingAverageWithSpikeDetection project.
6.	Open LightEventSpout.java and change the PORT_NAMES[] entry to match the serial port that the Arduino kit uses on your machine. The baud rate is set to 9600 in LightEventSpout.java; if you change it as an experiment, make sure the baud rate on the device and the one in the program match.
7.	Upload the light program to the Arduino kit.
8.	Run SpikeDetectionTopology.java. If you get a PortInUse exception, create the folder /var/lock/ and give it 775 permissions; this is a problem with the Arduino-to-Java interface. (Running the topology automatically starts ZooKeeper distributed cluster management and runs the program on it with one node.)


Creating a cluster of machines and running my project on distributed cluster:

1.	Download ZooKeeper from http://download.filehat.com/apache/zookeeper/zookeeper-3.3.3/
2.	Unzip ZooKeeper and rename zoo_sample.cfg in the conf folder to zoo.cfg.
3.	Go to the bin folder in ZooKeeper and start a ZooKeeper instance with “./zkServer.sh start”.
4.	Unzip the storm-0.6.2 folder.
5.	Copy storm.yaml.example to storm.yaml and add all the machine names (or IP addresses) to storm.zookeeper.servers. This setting lists the machines running ZooKeeper, which here is every machine, for coordination in the distributed system.
6.	Set nimbus.host to the name of the master machine, which is the one connected to the Arduino.
(Remember that all these steps need to be done on every machine.)


Now your cluster is set up. Do the following to run the above project on the cluster of machines:

1.	On every machine, go to the storm-0.6.2 folder, enter its bin directory, and run “./storm supervisor”.
2.	On the master machine, run “./storm nimbus” to start the master process, called nimbus, which dynamically distributes runnable programs over the cluster.
3.	Now run the project from the master node with the command “./storm jar movingAverageSpikeDetection.jar movingAverage.SpikeDetectionTopology”. If you get a PortInUse exception, create the folder /var/lock/ and give it 775 permissions; this is a problem with the Arduino-to-Java interface. (This submits the topology to nimbus, which runs it across the supervisor machines with ZooKeeper handling the coordination.)
