Singer tap built using the Meltano SDK to extract data from IBM DB2 relational databases.

Home Page: https://hub.meltano.com/extractors/tap-db2--danielptv/

License: MIT License

Python 100.00%
db2 etl meltano meltano-sdk singer singer-tap

tap-db2's Introduction

Tap-DB2

Tap-DB2 is a Singer tap for IBM DB2 data sources. Built with the Meltano Tap SDK for Singer Taps.

Installation

Install from PyPI:

pipx install tap-ibm-db2

Install from GitHub:

pipx install git+https://github.com/danielptv/tap-db2.git@main

Configuration

| Setting | Required | Default | Description |
|---|---|---|---|
| host | True | localhost | The DB2 hostname. |
| port | True | 50000 | The DB2 port. |
| database | True | None | The DB2 database. |
| schema | False | None | The DB2 schema. |
| user | True | None | The DB2 username. |
| password | True | None | The DB2 password. |
| encryption | False | None | Encryption settings for the DB2 connection. Disabled if omitted. |
| connection_parameters | False | None | Additional parameters to be appended to the connection string. This is an object containing key-value pairs. |
| sqlalchemy_execution_options | False | None | Additional execution options to be passed to SQLAlchemy. This is an object containing key-value pairs. |
| query_partitioning | False | None | Partition queries into smaller subsets. |
| filter | False | None | Apply a custom WHERE condition per stream. Unlike the filter available in stream_maps, this is evaluated BEFORE the data is extracted. |
| ignore_supplied_tables | False | True | Ignore DB2-supplied user tables. For more info check out Db2-supplied user tables. |
| ignore_views | False | False | Ignore views. |
| stream_maps | False | None | Config object for the stream maps capability. For more information check out Stream Maps. |
| stream_map_config | False | None | User-defined config values to be used within map expressions. |

A full list of supported settings and capabilities for this tap is available by running:

tap-db2 --about --format json

Configure using environment variables

When --config=ENV is passed, this Singer tap reads its configuration from environment variables and also loads any .env file in the working directory. A config value is picked up if a matching environment variable is set either in the terminal context or in the .env file.
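As a rough illustration of how settings could map to environment variables, the sketch below collects config values from a mapping of variables. The `TAP_DB2_` prefix and the upper-cased setting names are assumptions based on the usual Meltano SDK naming pattern; they are not stated in this README, and `config_from_env` is a hypothetical helper, not part of the tap.

```python
# Assumption: variables follow a <PLUGIN_NAME>_<SETTING> pattern, e.g. TAP_DB2_HOST.
ENV_PREFIX = "TAP_DB2_"


def config_from_env(environ, settings):
    """Return the settings whose prefixed, upper-cased name is present in environ."""
    config = {}
    for name in settings:
        key = ENV_PREFIX + name.upper()
        if key in environ:
            config[name] = environ[key]
    return config


# Example with an explicit mapping; pass os.environ to read the real environment.
example_env = {"TAP_DB2_HOST": "db.example.com", "TAP_DB2_PORT": "50000"}
example_config = config_from_env(example_env, ["host", "port", "database"])
```

Settings without a matching variable are simply left out, so required values such as `database` would still need to be supplied some other way.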

Configure encryption settings

This Singer tap supports encrypted connection settings to DB2 according to the python-ibmdb driver.

SSL without additional options:

...
plugins:
  extractors:
  - name: tap-db2
    variant: danielptv
    pip_url: tap-ibm-db2
    config:
      ...
      encryption: {}

This will append SECURITY=SSL; to the connection string.

SSL using SSLServerCertificate keyword:

...
plugins:
  extractors:
  - name: tap-db2
    variant: danielptv
    pip_url: tap-ibm-db2
    config:
      ...
      encryption:
        ssl_server_certificate: <Full path to the server certificate>

This will append SECURITY=SSL;SSLServerCertificate=<Full path to the server certificate>; to the connection string.

SSL using SSLClientKeyStoreDB and SSLClientKeyStoreDBPassword keywords:

...
plugins:
  extractors:
  - name: tap-db2
    variant: danielptv
    pip_url: tap-ibm-db2
    config:
      ...
      encryption:
        ssl_client_key_store_db:
          database: <Full path to the client keystore database>
          password: <Keystore password>

This will append SECURITY=SSL;SSLClientKeyStoreDB=<Full path to the client keystore database>;SSLClientKeyStoreDBPassword=<Keystore password>; to the connection string.

SSL using SSLClientKeyStoreDB and SSLClientKeyStash keywords:

...
plugins:
  extractors:
  - name: tap-db2
    variant: danielptv
    pip_url: tap-ibm-db2
    config:
      ...
      encryption:
        ssl_client_key_store_db:
          database: <Full path to the client keystore database>
          key_stash: <Full path to the client keystore stash>

This will append SECURITY=SSL;SSLClientKeyStoreDB=<Full path to the client keystore database>;SSLClientKeyStash=<Full path to the client keystore stash>; to the connection string.
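The four cases above can be summarized in a short sketch that maps an `encryption` config value to the fragment appended to the connection string. This is not the tap's actual implementation; `encryption_suffix` is a hypothetical helper whose behavior is inferred only from the examples in this section.

```python
def encryption_suffix(encryption):
    """Return the connection-string fragment for an `encryption` config value.

    Hypothetical helper that mirrors the documented examples only.
    """
    if encryption is None:
        return ""  # encryption is disabled when the setting is omitted

    parts = ["SECURITY=SSL"]

    cert = encryption.get("ssl_server_certificate")
    if cert:
        parts.append(f"SSLServerCertificate={cert}")

    key_store = encryption.get("ssl_client_key_store_db") or {}
    if key_store.get("database"):
        parts.append(f"SSLClientKeyStoreDB={key_store['database']}")
    if key_store.get("password"):
        parts.append(f"SSLClientKeyStoreDBPassword={key_store['password']}")
    if key_store.get("key_stash"):
        parts.append(f"SSLClientKeyStash={key_store['key_stash']}")

    # Every keyword, including the last one, is terminated with a semicolon.
    return "".join(part + ";" for part in parts)
```

An empty object (`encryption: {}`) yields only `SECURITY=SSL;`, which matches the first example.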

Configure query partitioning

This Singer tap supports partitioning SQL queries into smaller sub-queries to reduce the CPU load on the database. This is particularly useful when working with large amounts of data and a DB2 instance that enforces strict per-query resource limits. Note: This only works for streams with a numeric primary key.

The configuration for query partitioning should look as follows:

...
plugins:
  extractors:
  - name: tap-db2
    variant: danielptv
    pip_url: tap-ibm-db2
    config:
      ...
      query_partitioning:
        <stream>:
          primary_key: <primary key>
          partition_size: 1000

Replace <stream> with the stream name and <primary key> with the stream's primary key. Use * to apply a query partitioning setting to all streams not explicitly declared.
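To illustrate the idea behind partitioning on a numeric primary key, the sketch below splits a key range into fixed-size windows and builds one query per window. It shows the general technique only and makes no claim about how the tap generates its SQL; the function names, the `BETWEEN` form, and the table and column names are assumptions for illustration.

```python
def partition_bounds(min_key, max_key, partition_size):
    """Yield inclusive (lower, upper) integer key ranges covering min_key..max_key."""
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    lower = min_key
    while lower <= max_key:
        upper = min(lower + partition_size - 1, max_key)
        yield lower, upper
        lower = upper + 1


def partition_queries(table, primary_key, min_key, max_key, partition_size):
    """Build one SELECT per key range (illustrative SQL, not the tap's own)."""
    return [
        f"SELECT * FROM {table} WHERE {primary_key} BETWEEN {lo} AND {hi}"
        for lo, hi in partition_bounds(min_key, max_key, partition_size)
    ]


# Keys 1..2500 with partition_size 1000 give three sub-queries.
queries = partition_queries("ORDERS", "ID", 1, 2500, 1000)
```

Each sub-query touches at most `partition_size` key values, which is why the approach needs a numeric key that can be ranged over.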

Usage

You can easily run tap-db2 by itself or in a pipeline using Meltano.

Executing the Tap Directly

tap-db2 --version
tap-db2 --help
tap-db2 --config CONFIG --discover > ./catalog.json
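Once discovery has written ./catalog.json, a few lines of Python can list the streams that were found. This is a minimal sketch that assumes the file follows the standard Singer catalog layout (a top-level `streams` list whose entries carry `tap_stream_id` or `stream`); `stream_names` and `load_catalog` are hypothetical helpers, not part of the tap.

```python
import json


def stream_names(catalog):
    """Return the identifier of every stream in a Singer catalog dict."""
    return [
        entry.get("tap_stream_id", entry.get("stream"))
        for entry in catalog.get("streams", [])
    ]


def load_catalog(path):
    """Read a catalog file such as ./catalog.json produced by --discover."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Example with an in-memory catalog; the stream names are made up.
sample = {"streams": [{"tap_stream_id": "MYSCHEMA-ORDERS"}, {"stream": "customers"}]}
names = stream_names(sample)
```

To inspect a real discovery result, call `stream_names(load_catalog("./catalog.json"))`.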

Developer Resources

Follow these instructions to contribute to this project.

Initialize your Development Environment

pipx install poetry
poetry install

Create and Run Tests

Create tests within the tests subfolder and then run:

poetry run pytest

You can also test the tap-db2 CLI directly using poetry run:

poetry run tap-db2 --help

Testing with Meltano

Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.

Next, install Meltano (if you haven't already) and any needed plugins:

# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-db2
meltano install

Now you can test and orchestrate using Meltano:

# Test invocation:
meltano invoke tap-db2 --version
# OR run a test `elt` pipeline:
meltano elt tap-db2 target-jsonl

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.

tap-db2's People

Contributors

danielptv, dependabot[bot], pre-commit-ci[bot]

tap-db2's Issues

bug: not advertising some capabilities that are supported

I noticed that the default capabilities are being overridden here:

def capabilities(self) -> list[CapabilitiesEnum]:

Is there a reason for that? I think the tap supports capabilities that it's not advertising: about, catalog, discover. Tools like Meltano use these capabilities to decide how to run the tap (e.g. running discover before syncing).

Does this also support state?
