spring-xd-sqoop-job

This is a fork of https://github.com/tzolov/spring-xd-sqoop-job that cleans up the Hadoop configuration mechanism.

Simple Sqoop Job module for Spring-XD. Works with Hadoop 2 based Hadoop distributions.

Sqoop export usage: xd:>job create --name --definition "sqoop --command='export --connect jdbc:mysql:/// --table --username --password --export-dir /hdfs/source/folder/*.csv'"

Example list table usage: job create testsqoopjob --definition "sqoop --command='list-tables --connect=jdbc:postgresql://127.0.0.1:5432/tch --driver=org.postgresql.Driver --username=tch'"

Build and Installation

Build the job jar:

mvn clean package

Copy the resulting xd-sqoop-module-0.0.1-SNAPSHOT-job.jar into ${XD_HOME}/modules/job/sqoop/lib:

cp target/xd-sqoop-module-0.0.1-SNAPSHOT-job.jar ${XD_HOME}/modules/job/sqoop/lib

Copy the sqoop.xml module definition into ${XD_HOME}/modules/job/sqoop/config

cp src/main/resources/sqoop.xml ${XD_HOME}/modules/job/sqoop/config
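
The two copy steps above only work if the target directories already exist. The following is a minimal sketch that creates them first, assuming the jar name and paths shown above. `install_sqoop_module` is a hypothetical helper name used for illustration, not something shipped with this project.

```shell
#!/bin/sh
# Hypothetical helper: installs the built module into a Spring XD home directory.
# Usage: install_sqoop_module <project-dir> <xd-home>
install_sqoop_module() {
  project_dir=$1
  xd_home=$2
  module_dir="$xd_home/modules/job/sqoop"
  jar="$project_dir/target/xd-sqoop-module-0.0.1-SNAPSHOT-job.jar"
  config="$project_dir/src/main/resources/sqoop.xml"

  # Fail early if the build output or the module definition is missing.
  [ -f "$jar" ] || { echo "missing $jar (run 'mvn clean package' first)" >&2; return 1; }
  [ -f "$config" ] || { echo "missing $config" >&2; return 1; }

  # Create the module directories in case they do not exist yet.
  mkdir -p "$module_dir/lib" "$module_dir/config" || return 1

  # Copy the jar and the module definition into place.
  cp "$jar" "$module_dir/lib/" && cp "$config" "$module_dir/config/"
}
```

From the project root, after `mvn clean package`, this would be called as `install_sqoop_module . "$XD_HOME"`.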

Make sure that the Hadoop file system URI is configured correctly in the Spring XD servers.yml configuration file:

# Hadoop properties
spring:
  hadoop:
    fsUri: hdfs://localhost:8020
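
To check the configured value before starting the server, a small sketch like the following may help. It assumes servers.yml lives under `${XD_HOME}/config` and that the `hdfs` client is on the PATH. `read_fs_uri` is a hypothetical helper, and the `sed` match is a rough text scan rather than a real YAML parser.

```shell
#!/bin/sh
# Hypothetical helper: prints the first uncommented fsUri value found in a servers.yml file.
# Usage: read_fs_uri <path-to-servers.yml>
read_fs_uri() {
  # Lines starting with '#' do not match, so commented-out defaults are skipped.
  sed -n 's/^[[:space:]]*fsUri:[[:space:]]*//p' "$1" | head -n 1
}

# Example (assumed file location): list the HDFS root using the configured URI.
# hdfs dfs -ls "$(read_fs_uri "${XD_HOME}/config/servers.yml")/"
```

If the listing fails, the URI in servers.yml most likely does not match the NameNode address of the cluster.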

Usage

Start Spring-XD

Start the single-node server (admin and container in one process):

${XD_HOME}/bin/xd-singlenode --hadoopDistro cdh5

Start xd-shell (in a separate terminal):

${XD_HOME}/../shell/bin/xd-shell

The export tool exports a set of files from HDFS back to an RDBMS. The target table must already exist in the database. The input files are read and parsed into a set of records according to the user-specified delimiters.

The default operation is to transform these into a set of INSERT statements that inject the records into the database. In "update mode," Sqoop will generate UPDATE statements that replace existing records in the database.
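
Sqoop switches to update mode when the export command carries `--update-key` with the column used to match existing rows. The sketch below builds a job definition string for that case, assuming the module passes the whole `--command` string through to Sqoop unchanged. `update_export_definition` and the `id` column in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints a sqoop job definition for an update-mode export.
# Usage: update_export_definition <jdbc-url> <table> <update-key-column> <export-dir>
update_export_definition() {
  # --update-key tells Sqoop to generate UPDATE statements matched on this column.
  printf "sqoop --command='export --connect %s --table %s --update-key %s --export-dir %s'\n" \
    "$1" "$2" "$3" "$4"
}
```

Calling `update_export_definition jdbc:mysql://your-db-hostname/target-db-name target-table-name id /hdfs/source/folder` prints a string that can be passed to `job create` as the `--definition` value. Credentials are left out of this sketch and would be added the same way as in the insert-mode example.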

Sample data export from HDFS to a remote MySQL database:

xd:>job create --name hdfsToDbExport --definition "sqoop --command='export 
  --connect jdbc:mysql://your-db-hostname/target-db-name
  --username db-username --password db-password 
  --table target-table-name 
  --export-dir /hdfs/source/folder/*.csv'"

xd:>stream create --name exportTrigger --definition "trigger > job:hdfsToDbExport"

