Soda

Soda is a data API for building typesafe and composable processing pipelines. Soda currently supports:

  • Read/Write common physical file formats (csv, tsv, json, zipped)
  • Read/Write databases: mysql, h2, redis, postgres, mongo
  • Event-driven directory watch to trigger a pipeline
  • AWS S3 as part of a pipeline
  • Serialisation and compression
  • wget as part of a pipeline
  • Sequence pipelines
  • Nested pipelines
  • Branched pipelines

Build & Run

soda-etl

The main data workflow library. Most tests run without external dependencies, except the following:

DB unit tests

Before running the unit tests, start Docker Compose and set up the dependencies:

docker-compose -f docker-compose-testsuite.yaml up -d --no-recreate

./init-test-dependencies.sh

To inspect the initial data inside the instances, use your CLI of choice, e.g.

mysql -h localhost --protocol=TCP -uroot -p
# enter the root password set in the docker-compose file

docker exec -it redis-soda-test redis-cli
# then AUTH with the password set in the docker-compose file

After the tests, tear down all dependencies with:

./stop-test-dependencies.sh
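
The start, test, and teardown steps above can be chained so that teardown always runs, even when setup or a test fails. The following is a minimal sketch, not part of the repository: the `run` and `db_test_cycle` helpers and the `DRY_RUN` switch are hypothetical, and `sbt test` is an assumed test command.

```shell
# Sketch only: `run`, `db_test_cycle` and DRY_RUN are illustrative names.
# Assumption: `sbt test` runs the unit tests (this README does not name the command).

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start dependencies, initialise them, run the tests, then always tear down.
db_test_cycle() {
  cycle_status=0
  run docker-compose -f docker-compose-testsuite.yaml up -d --no-recreate \
    && run ./init-test-dependencies.sh \
    && run sbt test \
    || cycle_status=$?
  # Teardown runs regardless of the outcome above.
  run ./stop-test-dependencies.sh
  return "$cycle_status"
}
```

Calling `DRY_RUN=1 db_test_cycle` prints the four commands in order without executing them, which is a quick way to check the sequence before running it for real.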

soda-cli

A collection of sample runnable workflows lives here (see soda-cli/main/scala/de/tao/soda/runnable).

PublishLocal

Build and publish the JAR to the local repository with:

sbt publishLocal
# publishes the ivy descriptor to ~/.ivy2/local/de.tao/soda-etl_2.13/0.0.1/ivys/ivy.xml
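
To confirm that the publish step worked, you can check for the ivy descriptor on disk. This is a rough sketch with hypothetical helper names (`ivy_xml_path`, `check_published`). It assumes sbt's default local Ivy repository under `$HOME/.ivy2/local` and reuses the organisation, module, and version from the path shown above.

```shell
# Sketch only: helper names are illustrative, not part of soda or sbt.
# Assumption: sbt publishes to its default local Ivy repository, $HOME/.ivy2/local.

# Build the expected ivy.xml path.
# Arguments: organisation, module name (with Scala suffix), version.
ivy_xml_path() {
  echo "$HOME/.ivy2/local/$1/$2/$3/ivys/ivy.xml"
}

# Report whether the soda-etl descriptor exists after `sbt publishLocal`.
check_published() {
  published_path=$(ivy_xml_path de.tao soda-etl_2.13 0.0.1)
  if [ -f "$published_path" ]; then
    echo "found: $published_path"
  else
    echo "missing: $published_path" >&2
    return 1
  fi
}
```

Run `check_published` after `sbt publishLocal`. A non-zero exit status means the descriptor is not where the default layout would put it.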

Licence

MIT

Contributors

tao-pr
