hyonaldo / hadoop-multiple-streaming

hadoop-multiple-streaming is an addition to Hadoop-Streaming, a utility that comes with the Hadoop distribution. It lets you not only run ordinary Hadoop-Streaming jobs, but also create and run 'multiple' Map/Reduce jobs for 'one' input, with any executable or script as the mapper and/or the reducer. hadoop-multiple-streaming includes Hadoop-Streaming.

hadoop-multiple-streaming

hadoop-multiple-streaming extends Hadoop-Streaming, a utility that comes with the Hadoop distribution.
In addition to ordinary Hadoop-Streaming jobs, it lets you create and run 'multiple' Map/Reduce jobs for 'one' input, using any executables or scripts as mappers and reducers. Each -multiple option names an output directory, a mapper, and a reducer, separated by '|'. For example:

hadoop jar hadoop-multiple-streaming.jar \
  -input    myInputDirs \
  -multiple "outputDir1|mypackage.Mapper1|mypackage.Reducer1" \
  -multiple "outputDir2|mapper2.sh|reducer2.sh" \
  -multiple "outputDir3|mapper3.py|reducer3.py" \
  -multiple "outputDir4|/bin/cat|/bin/wc" \
  -libjars  "libDir/mypackage.jar" \
  -file     "libDir/mapper2.sh" \
  -file     "libDir/mapper3.py" \
  -file     "libDir/reducer2.sh" \
  -file     "libDir/reducer3.py"
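
The README does not show what the scripts named in the command above contain. As a rough illustration only, here is a minimal word-count sketch of what a Python mapper and reducer pair such as mapper3.py and reducer3.py might look like. It assumes the usual Hadoop-Streaming conventions: each script reads lines from stdin, writes tab-separated key/value lines to stdout, and the reducer receives its input sorted by key. The function names, the single-file layout, and the "map"/"reduce" argument are hypothetical and chosen for brevity; in practice the two functions would live in separate script files.

```python
#!/usr/bin/env python3
"""Word-count sketch for Hadoop-Streaming style scripts (hypothetical contents)."""
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per key; input must arrive sorted by key."""
    # Skip lines without a tab so malformed or empty input is ignored.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical dispatch: run as "wordcount.py map" or "wordcount.py reduce".
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out in step(sys.stdin):
        print(out)
```

Saved under a hypothetical name such as wordcount.py, the pair can be checked locally without a cluster by piping a text file through the mapper, then sort, then the reducer, which imitates the sort that happens between the map and reduce phases.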

This is a Maven project, so you can build the hadoop-multiple-streaming.jar file with a standard Maven build. Specifically, the 'mvn clean package' command compiles the source code and packages the jar into the ${project_home}/target folder.
