
hadoop-spark-cluster's Introduction

Run Hadoop and Spark Cluster within Docker Containers

5-Node Hadoop & Spark Cluster

1. pull docker image
sudo docker pull silencebingo/hadoop-spark-cluster
2. clone github repository
git clone https://github.com/silencebingo/hadoop-spark-cluster
3. create hadoop network
sudo docker network create --driver=bridge hadoop
4. start container
cd hadoop-spark-cluster
sudo ./start-container.sh
5. start hadoop
./start-hadoop-spark.sh
6. test hadoop
./run-wordcount.sh

output

input file1.txt:
Hello Hadoop

input file2.txt:
Hello Docker

wordcount output:
Docker    1
Hadoop    1
Hello    2
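
As a quick local cross-check of the expected result, the same counts can be reproduced without Hadoop using standard Unix tools. This is only a sketch and not what run-wordcount.sh does internally; the function name `local_wordcount` and the temporary file paths are made up for illustration.

```shell
#!/bin/sh
# Count whitespace-separated words across the given files and print
# "word<TAB>count", sorted by word in byte order (LC_ALL=C).
local_wordcount() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Recreate the two sample inputs shown above in a temporary directory.
tmpdir=$(mktemp -d)
printf 'Hello Hadoop\n' > "$tmpdir/file1.txt"
printf 'Hello Docker\n' > "$tmpdir/file2.txt"

# Prints one line per distinct word: Docker 1, Hadoop 1, Hello 2.
local_wordcount "$tmpdir/file1.txt" "$tmpdir/file2.txt"

rm -rf "$tmpdir"
```

If the cluster's wordcount output differs from this local result for the same inputs, the job or the HDFS upload is the first place to look.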
7. test spark

output (process listing in the format printed by jps)

Master

4498 NameNode
4851 ResourceManager
4695 SecondaryNameNode
5211 Jps
4957 Master

Slave

1553 Worker
1362 DataNode
1682 Jps
1476 NodeManager
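
Comparing these listings by eye gets tedious on larger clusters. The sketch below reads a `jps`-style listing on standard input and reports any expected daemon that is absent. The function name `missing_daemons` is hypothetical, and the daemon names are taken from the listings above.

```shell
#!/bin/sh
# Read a "<pid> <name>" listing on stdin and print "missing: <name>" for
# every expected daemon (given as arguments) that does not appear.
# Returns 0 when all daemons are present, 1 otherwise.
missing_daemons() {
    listing=$(cat)
    status=0
    for name in "$@"; do
        # Match the daemon name exactly against the second field, so that
        # "NameNode" does not match "SecondaryNameNode".
        if ! printf '%s\n' "$listing" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "missing: $name"
            status=1
        fi
    done
    return $status
}

# On the master node:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager Master
# On a slave node:
#   jps | missing_daemons DataNode NodeManager Worker
```

An empty result with exit status 0 means every expected process was found.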
8. web management page

Hadoop Cluster http://masterip:8088/cluster

Hadoop Overview http://masterip:50070/

Spark Cluster http://masterip:9090/
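
In the URLs above, `masterip` is a placeholder for the address of the master node. The following sketch prints the three URLs for a given address and, optionally, probes them with `curl`. The function names are hypothetical; the ports and paths are the ones listed above, and the probe assumes `curl` is installed on the host.

```shell
#!/bin/sh
# Print the three management URLs for the given master address.
management_urls() {
    master=$1
    printf 'http://%s:8088/cluster\n' "$master"   # Hadoop Cluster
    printf 'http://%s:50070/\n' "$master"         # Hadoop Overview
    printf 'http://%s:9090/\n' "$master"          # Spark Cluster
}

# Probe each URL and report whether it answered within 5 seconds.
check_management_pages() {
    management_urls "$1" | while read -r url; do
        if curl -fsS -o /dev/null --max-time 5 "$url"; then
            echo "ok   $url"
        else
            echo "FAIL $url"
        fi
    done
}

# Example (the address is illustrative):
#   check_management_pages 172.18.0.2
```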

Arbitrary size Hadoop cluster

1. pull docker images and clone github repository

Follow steps 1 to 3 of the 5-node cluster section above.

2. rebuild docker image
sudo ./resize-cluster.sh 6
  • specify a parameter greater than 1: 2, 3, ...
  • this script only rebuilds the hadoop image with a different slaves file, which specifies the names of all slave nodes
3. start container
sudo ./start-container.sh 6
  • use the same parameter as in step 2
4. run hadoop cluster

Follow steps 5 to 6 of the 5-node cluster section above.
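
To illustrate what a slaves file for a resized cluster might contain, the sketch below prints one slave hostname per line. It is not the actual resize-cluster.sh. It assumes the parameter is the total node count (one master plus N-1 slaves), and the `hadoop-slave` hostname prefix and the function name `print_slaves` are hypothetical.

```shell
#!/bin/sh
# Print one slave hostname per line for a cluster of N nodes,
# assuming 1 master and N-1 slaves (an assumption, see above).
print_slaves() {
    n=$1
    # Reject empty or non-numeric input.
    case $n in
        ''|*[!0-9]*)
            echo "node count must be a positive integer" >&2
            return 1
            ;;
    esac
    # The parameter must be greater than 1, as noted in step 2.
    if [ "$n" -le 1 ]; then
        echo "node count must be greater than 1" >&2
        return 1
    fi
    i=1
    while [ "$i" -lt "$n" ]; do
        echo "hadoop-slave$i"   # hypothetical hostname pattern
        i=$((i + 1))
    done
}

# Example: print_slaves 6 prints hadoop-slave1 through hadoop-slave5.
```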

hadoop-spark-cluster's People

Contributors

silencebingo
