Giter Site home page Giter Site logo

virtual-hadoop-cluster's Introduction

A working virtual Hadoop cluster

With these files you can setup and provision a locally running, virtual Hadoop cluster in real distributed fashion for trying out Hadoop and related technologies. It runs the latest Cloudera Hadoop distribution: CDH5. It also allows you to practise the use of Cloudera Manager for installing the Hadoop stack. If you're looking for a fully automated install, without user intervention, look elsewhere. I specifically made this with the goal of creating an environment ideally suited for Cloudera Manager to do its job. This gives you the freedom to actually install the services you want, and change the configuration how you see fit.

This README describes how to get the cluster with Cloudera Manager up and running. For more detailed instructions on how to install the whole Hadoop stack on that, you can use this guide.

Specs

The cluster conists of 4 nodes:

  • Master node with 4GB of RAM (Running the NameNode, Hue, ResourceManager etc. after installing the Hadoop services)
  • 3 slaves with 2GB of RAM each (Running DataNodes)

As you can see, you'll need at least 10GB of free RAM to run this. If you have less, you can try to remove one machine from the Vagrantfile. This will lead to worse performance though!

Usage

Depending on the hardware of your computer, installation will probably take between 15 and 25 minutes.

First install VirtualBox and Vagrant.

Install the Vagrant Hostmanager plugin

$ vagrant plugin install vagrant-hostmanager

Clone this repository.

$ git clone https://github.com/DandyDev/virtual-hadoop-cluster.git

Provision the bare cluster. It will ask you to enter your password, so it can modify your /etc/hosts file for easy access in your browser. It uses the Vagrant Hostmanager plugin to do this.

$ cd virtual-hadoop-cluster
$ vagrant up

Go to the Cloudera Manager web console and follow the installation instructions. For more detailed instructions on how to do that, you can use this guide.

Done! Have fun with your Hadoop cluster.

virtual-hadoop-cluster's People

Contributors

dondebonair avatar

Watchers

James Cloos avatar iasinDev avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.