embulk-parser-xml's Introduction

XML parser plugin for Embulk

Reads input data as XML and emits each matching entry as an output record.

Overview

  • Plugin type: parser
  • Load all or nothing: yes
  • Resume supported: no

Types

  • xml: Finds rows with SAX.
  • xpath: Finds rows with XPath, so you can select rows using more complex conditions than the xml type allows.

Configuration

XML

parser:
  type: xml
  root: data/students/student
  schema:
    - {name: name, type: string}
    - {name: age, type: long}
  • type: specify this plugin as xml.
  • root: root path from which to start fetching entries, specified in path/to/node style; required.
  • schema: specify the column names and data types of the output table; required.
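
To illustrate how a path/to/node root can select rows during a streaming (SAX) pass, here is a minimal Python sketch. It is not the plugin's implementation: the names `RowHandler` and `parse_rows` are hypothetical, and it assumes each column is a direct child element of the row element, named after the column.

```python
import xml.sax


class RowHandler(xml.sax.ContentHandler):
    """Collect one row per element whose path from the document root matches `root`."""

    def __init__(self, root, columns):
        super().__init__()
        self.root = root.split("/")   # e.g. ["data", "students", "student"]
        self.columns = columns        # column name -> converter, e.g. {"age": int}
        self.stack = []               # names of the currently open elements
        self.row = None               # row being built, or None outside a row
        self.text = []                # text chunks of the current element
        self.rows = []

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.text = []
        if self.stack == self.root:
            self.row = {}

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        in_row = self.row is not None
        is_direct_child = len(self.stack) == len(self.root) + 1
        if in_row and is_direct_child and name in self.columns:
            # Assumption: a column maps to a direct child of the row element
            value = "".join(self.text).strip()
            self.row[name] = self.columns[name](value)
        elif self.stack == self.root:
            self.rows.append(self.row)
            self.row = None
        self.stack.pop()


def parse_rows(xml_text, root, columns):
    """Return the rows found under `root` as a list of dicts."""
    handler = RowHandler(root, columns)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.rows
```

Calling `parse_rows(doc, "data/students/student", {"name": str, "age": int})` yields one dict per student element. Elements outside the root path, and nested elements such as hobbies, are ignored in this sketch.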

To parse a column as a timestamp, the schema entry supports two additional parameters:

schema:
  - {name: timestamp_column, type: timestamp, format: "%Y-%m-%d", timezone: "+0000"}
  • format: timestamp format used for parsing; required.
  • timezone: time zone in which the timestamp is parsed; "+0900" is used by default.
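
As a rough illustration of how format and timezone interact, the following Python sketch parses a value with a strptime-style format and attaches a fixed UTC offset. The function `parse_timestamp` is hypothetical, and it is an assumption that Python's `strptime` directives behave the same as the plugin's for a simple format like "%Y-%m-%d".

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value, fmt, tz="+0900"):
    """Parse `value` with `fmt` and interpret it in the fixed UTC offset `tz`."""
    # "%z" turns an offset string such as "+0900" into a tzinfo object
    tzinfo = datetime.strptime(tz, "%z").tzinfo
    naive = datetime.strptime(value, fmt)
    return naive.replace(tzinfo=tzinfo)
```

With the default offset, "2015-03-01" is read as midnight at +09:00, which is 15:00 on the previous day in UTC. Passing `tz="+0000"` reads the same string as midnight UTC, so the choice of timezone shifts the resulting instant.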

XPath

parser:
  type: xpath
  root: //data/students/student
  schema:
    - {path: name, type: string, name: name}
    - {path: age, type: long, name: age}
    - {path: hobbies/hobby, type: json, name: hobbies}
  • type: specify this plugin as xpath.
  • root: root node from which to start fetching entries, specified as an XPath expression; '/' is used by default.
  • schema: specify the column names, paths, and data types of the output table; required.
  • namespaces: XML namespaces.

To parse a column as a timestamp, the schema entry supports two additional parameters:

schema:
  - {name: timestamp_column, type: timestamp, format: "%Y-%m-%d", timezone: "+0000"}
  • format: timestamp format used for parsing; required.
  • timezone: time zone in which the timestamp is parsed; "+0900" is used by default.

Here is an example XML document:

<data>
  <result>true</result>
  <students>
    <student>
      <name>John</name>
      <age>10</age>
      <hobbies>
        <hobby>music</hobby>
        <hobby>movie</hobby>
      </hobbies>
    </student>
    <student>
      <name>Paul</name>
      <age>16</age>
      <hobbies>
        <hobby>game</hobby>
      </hobbies>
    </student>
    <student>
      <name>George</name>
      <age>17</age>
    </student>
    <student>
      <name>Ringo</name>
      <age>18</age>
    </student>
  </students>
</data>
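
To make the mapping from this document to rows concrete, here is a small Python sketch that applies an xpath-style schema using the standard library. It is only an approximation of the plugin. `xpath_rows` is a hypothetical helper. `xml.etree.ElementTree` supports a subset of XPath, so the root is written relative to the document element as `./students/student`. Representing a json column as a list of element texts, with an empty list when no node matches, is an assumption.

```python
import xml.etree.ElementTree as ET


def xpath_rows(xml_text, root_path, schema):
    """Build one dict per node matched by `root_path`, one key per schema column."""
    doc = ET.fromstring(xml_text)
    rows = []
    for node in doc.findall(root_path):
        row = {}
        for col in schema:
            # Each column path is evaluated relative to the row node
            texts = [m.text for m in node.findall(col["path"])]
            if col["type"] == "json":
                row[col["name"]] = texts          # assumption: list of texts
            elif not texts:
                row[col["name"]] = None           # assumption: missing -> None
            elif col["type"] == "long":
                row[col["name"]] = int(texts[0])
            else:
                row[col["name"]] = texts[0]
        rows.append(row)
    return rows
```

Applied to the document above with columns name, age, and hobbies (path `hobbies/hobby`), this sketch produces four rows. John's hobbies column holds both music and movie, while George and Ringo get an empty list.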

Contributors

  • takumakanari
  • hiroyuki-sato
