This project forked from troy-west/apache-kafka-three-ways

AK3W: Open Source Apache Kafka Workshop

License: MIT License

Apache Kafka Three Ways 🚀

An Open Source Apache Kafka Workshop provided by Troy-West.

By developers, for developers, our workshop takes new starters through the nuts and bolts of Kafka. We explain the big ideas, layered abstractions, and gradual evolution of Kafka over the years.

Morning materials provided as a reveal.js presentation and Q+A (theory).

Afternoon materials provided as Java and Clojure projects (practice) where we solve the mystery of the Number Stations.

We offer a guided, full day workshop with your team - or use and adapt these MIT licensed materials as you see fit.

Goals

To learn the fundamentals, ergonomics, trade-offs, and maturity of Apache Kafka as:

  1. A Message Broker.
  2. A Streaming Compute Platform.
  3. A Distributed Database.

To build a streaming compute application that processes 1M+ messages and solves the mystery of the Number Stations.

To operate the streaming compute application with a local Kafka cluster, understanding the tiered abstractions of Apache Kafka and how they enable availability, scalability, and (near) real-time compute.

Time permitting, we dabble with vendor tooling (KSQL, Kafka-Connect).

Prerequisites

Solutions

Completed solutions to both the Java and Clojure programming exercises are available:

Morning Session (Theory)

A deep dive into the history and fundamentals of Apache Kafka as a Message Broker.

We learn about Kafka via the lens of three key project decisions and their trade-offs:

  1. High Availability
  2. Linear Scalability
  3. Near Real-Time Compute

Throughout we focus on real-time data's unifying abstraction, the log.

After the presentation and Q+A we use troy-west/apache-kafka-cli-tools to start a local, 3-node Kafka Cluster. We operate that cluster via the shell scripts provided by Kafka, exploring repartitioning, offsets, consumer groups, and more.
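
To make the terms offsets and consumer groups concrete before touching the cluster, here is a minimal plain-Java sketch of the log abstraction: an append-only list of records addressed by offset, plus one committed offset per consumer group. It uses no Kafka client classes. `MiniLog`, `append`, `poll`, and `commit` are hypothetical names chosen for illustration, and a real broker layers partitions, replication, and retention on top of this idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a single-partition, in-memory log. Not Kafka's implementation.
class MiniLog {
    private final List<String> records = new ArrayList<>();
    // Committed offset per consumer group: the next offset that group should read.
    private final Map<String, Integer> committed = new HashMap<>();

    // Appends a record and returns the offset it was written at.
    int append(String value) {
        records.add(value);
        return records.size() - 1;
    }

    // Returns up to maxRecords starting at the group's committed offset.
    // Reading never removes anything from the log.
    List<String> poll(String group, int maxRecords) {
        int from = committed.getOrDefault(group, 0);
        int to = Math.min(records.size(), from + maxRecords);
        return new ArrayList<>(records.subList(from, to));
    }

    // Advances the group's committed offset by the number of records processed.
    void commit(String group, int processed) {
        int from = committed.getOrDefault(group, 0);
        committed.put(group, Math.min(records.size(), from + processed));
    }

    int committedOffset(String group) {
        return committed.getOrDefault(group, 0);
    }

    public static void main(String[] args) {
        MiniLog log = new MiniLog();
        log.append("a");
        log.append("b");
        log.append("c");

        List<String> batch = log.poll("group-1", 2);
        log.commit("group-1", batch.size());

        System.out.println(batch);                    // prints [a, b]
        System.out.println(log.poll("group-1", 2));   // prints [c]
        System.out.println(log.poll("group-2", 10));  // prints [a, b, c]
    }
}
```

Each group tracks its own position, so `group-2` still sees every record after `group-1` has moved on. That independence is what the offset and consumer group exercises explore on the real cluster.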

Afternoon Session (Practice)

Solve the mystery of the Number Stations!

We talk briefly about the big ideas behind the Processor API and Kafka Streams, and how Kafka facilitates highly available, linearly scalable, near real-time compute. We share our experience with Apache Storm in this space, and comment on similarities to Clojure regarding immutable streams of data and functional composition.

Then we introduce the mystery of troy-west/apache-kafka-number-stations (or the Clojure variant).

We inspect the secret radio, send 1.5M messages to local Kafka, then progressively fix unit-tests until we have built a streaming compute application that filters, branches, maps, groups, windows, and aggregates data - decoding the secret message and solving the mystery!
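
The filter, group, window, and aggregate steps can be illustrated without a Kafka cluster. The following is a minimal sketch using only `java.util.stream`. It is not the Kafka Streams API and not the workshop's solution. The `Reading` class, its fields, the station names, and the 10-second tumbling window are all assumptions made for the example, and timestamps are assumed to be non-negative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: batch processing over a list, not a streaming topology.
class WindowedSum {
    // Hypothetical message shape; the real workshop messages may differ.
    static final class Reading {
        final String station;
        final long timestampMs;
        final int value;

        Reading(String station, long timestampMs, int value) {
            this.station = station;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    // Assumed tumbling window size of 10 seconds.
    static final long WINDOW_MS = 10_000L;

    // Start of the tumbling window that contains the given timestamp.
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_MS);
    }

    // Filters out negative values, groups by station and window, sums each group.
    static Map<String, Integer> sumPerWindow(List<Reading> readings) {
        return readings.stream()
                .filter((Reading r) -> r.value >= 0)
                .collect(Collectors.groupingBy(
                        (Reading r) -> r.station + "@" + windowStart(r.timestampMs),
                        TreeMap::new,
                        Collectors.summingInt((Reading r) -> r.value)));
    }

    public static void main(String[] args) {
        List<Reading> readings = Arrays.asList(
                new Reading("E-test", 1_000, 3),
                new Reading("E-test", 9_999, 4),
                new Reading("E-test", 10_000, 5),
                new Reading("G-test", 2_000, -1),   // dropped by the filter
                new Reading("G-test", 12_000, 7));

        System.out.println(sumPerWindow(readings));
        // prints {E-test@0=7, E-test@10000=5, G-test@10000=7}
    }
}
```

In a streaming application the same steps run continuously over unbounded input, and the per-window totals are kept in local state rather than computed in one pass.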

Once the tests are green we build and deploy multiple versions of the system, discussing the expected and observed impact on local state partitioning.

Finally we discuss Interactive Queries and the possibility of using Kafka as a distributed database. What data should we expect to source from Kafka: just the logs, or derived / materialized / computed data?
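
Querying computed data first requires knowing which running instance holds the local state for a given key. Here is a rough plain-Java sketch of that routing, under loud assumptions: `String.hashCode` stands in for the real key hash, and partitions are handed to instances round-robin. Kafka's actual hashing and task assignment differ in detail, and `StateRouting` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows why changing the instance count moves local state.
class StateRouting {
    // Maps a key to a partition. Assumption: String.hashCode replaces the real hash.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Maps a partition to an application instance. Assumption: round-robin.
    static int instanceFor(int partition, int numInstances) {
        return partition % numInstances;
    }

    // Lists the partitions whose local state a given instance would hold.
    static List<Integer> partitionsOwnedBy(int instance, int numPartitions, int numInstances) {
        List<Integer> owned = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            if (instanceFor(p, numInstances) == instance) {
                owned.add(p);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        int partitions = 6;
        int p = partitionFor("d", partitions);  // "d".hashCode() is 100, and 100 mod 6 is 4

        System.out.println(p);                                   // prints 4
        System.out.println(instanceFor(p, 2));                   // prints 0
        System.out.println(instanceFor(p, 3));                   // prints 1
        System.out.println(partitionsOwnedBy(0, partitions, 2)); // prints [0, 2, 4]
        System.out.println(partitionsOwnedBy(0, partitions, 3)); // prints [0, 3]
    }
}
```

Going from two instances to three moves partition 4 from instance 0 to instance 1, so a query for key `d` must be routed to a different instance. This is the kind of effect to look for when deploying multiple versions of the system.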

Self Guided

If you are experienced with Kafka you might find these workshop materials useful for your own team.

Start with the index.html in this project, or view at kafka.troywest.com.

The presentation moves left to right, then vertically for each of the broker, compute, db, and vendor parts.


Copyright © 2019 Troy-West, Pty Ltd. MIT Licensed. Contributions welcome.

apache-kafka-three-ways's People

Contributors

d-t-w
