quixio / quix-streams
Quix Streams - A library for data streaming and Python Stream Processing
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
As a software developer I wish to have tests exercising the code where I use either of these clients.
Describe the solution you'd like
Add an interface to both. Check if a common interface is possible, but it is not mandatory.
Describe alternatives you've considered
Use of
Additional context
This interface might need to be treated primarily as a testing aid rather than an unbreakable contract. Avoiding breaking changes is still strongly preferred, however.
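One way the "testing aid" interface could look, as a minimal sketch: define a structural protocol that both real clients and an in-memory fake satisfy. All names here (`StreamPublisher`, `publish`, `process_and_publish`) are hypothetical, not the real client API:

```python
from typing import Any, List, Protocol, Tuple


class StreamPublisher(Protocol):
    """Hypothetical common surface for the producer clients (assumed, not the real API)."""
    def publish(self, stream_id: str, payload: Any) -> None: ...


class FakePublisher:
    """In-memory stand-in for unit tests; records what would have been sent."""
    def __init__(self) -> None:
        self.published: List[Tuple[str, Any]] = []

    def publish(self, stream_id: str, payload: Any) -> None:
        self.published.append((stream_id, payload))


def process_and_publish(publisher: StreamPublisher, stream_id: str, value: float) -> None:
    # Business logic under test depends only on the protocol, not a concrete client.
    publisher.publish(stream_id, {"value": value})
```

Because `Protocol` checks structurally, the real clients would not need to inherit from anything, which keeps the change non-breaking.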
Is your feature request related to a problem? Please describe.
As part of the migration from 0.4.x, the StreamPackage class was not yet fully implemented. This issue is to track that functionality restoration.
Describe the solution you'd like
Work as in 0.4.x
Additional context
For the time being it is not included in the automatic imports, so previous code using it has to reference it explicitly, like this: https://quix.discourse.group/t/streampackage-class-not-recognized/57
Tell us about the bug
When a timezone is not specified, the current timezone is assumed (correctly), but the unit tests are hard-coded to specific expected nanosecond values. This results in failing tests when the timezone isn't exactly UTC, for example under daylight saving time or in locations not using UTC.
What did you expect to see?
Not failing unit tests
Is your feature request related to a problem? Please describe.
The following use case is otherwise very difficult:

```python
def on_dataframe_received_handler(stream_consumer: qx.StreamConsumer, data: qx.TimeseriesData):
    for row in data.timestamps:
        if row.parameters["Speed"].numeric_value > 200:
            topic_producer.get_or_create_stream(stream_consumer.stream_id).timeseries.publish(row)
```
Describe the solution you'd like
Add overload for this class.
Is your feature request related to a problem? Please describe.
```python
old_way = row.parameters["EngineRPM"].numeric_value
new_way = row.parameters["EngineRPM"]
```
The timestamp column name standard seems to be "time" (for instance when reading parameter data as a dataframe), but when I query that same data from InfluxDB it's called "Timestamp".
It's fine to be flexible and understand several names, but can we have the same one for SDK-produced timestamps and InfluxDB?
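Until the names are unified, a small caller-side normalization shim is a workable stopgap. This is a sketch; `normalize_time_column` and the alias list are assumptions, not part of the SDK:

```python
import pandas as pd

# Spellings seen in the wild: "time" from the SDK, "Timestamp" from InfluxDB.
TIME_ALIASES = ("time", "Timestamp", "timestamp")


def normalize_time_column(df: pd.DataFrame, target: str = "time") -> pd.DataFrame:
    """Rename whichever timestamp column is present to a single canonical name."""
    for alias in TIME_ALIASES:
        if alias in df.columns:
            return df.rename(columns={alias: target})
    raise KeyError(f"no timestamp column found among {TIME_ALIASES}")
```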
Is your feature request related to a problem? Please describe.
Need to add an Apple silicon build to releases. Whether it is a pre-built or built-from-source Python package is open to debate, but each release should include an Apple silicon compatible package.
Documentation for installing on apple silicon will need to be updated according to this comment: #93 (comment)
Describe the solution you'd like
A pre-baked wheel would be preferable to have the simple pip-install experience. Availability of build agents might make this more difficult than a source build.
Describe alternatives you've considered
Source build, but we have no example of that yet, so would be a first. Also not the most optimal experience for quick install.
Additional context
The pre-req .net 8 compatibility was introduced in this PR: #93
Tell us about the bug
On Ubuntu 20.04 with the latest Python 3 (3.8.10) installed, a segfault occurs when an exception is set via the Python C API from the C# codebase.
Example code to produce segfault:
```python
from quixstreams import QuixStreamingClient
client = QuixStreamingClient("some-invalid-token")  # throws an exception via the Python C API in C#, but segfaults
```
What did you expect to see?
No segfault; instead, a Python exception thrown about the invalid token.
What tools and platforms are you using (e.g. iOS, Chrome, Desktop)
Ubuntu 20.04 (including Windows WSL, Google Colab)
Anything else we should know?
Works fine in other distributions or when python is compiled from source.
Is your feature request related to a problem? Please describe.
The API for RAW events in the custom schema is very hidden in the current implementation. It is also not used by the Quix platform, so the data is not surfaced in SaaS.
Describe the solution you'd like
RAW events surfaced with existing events API.
Describe alternatives you've considered
Flattening messages into the existing dataframe solution; this will come later.
Is your feature request related to a problem? Please describe.
I would like us to have generated documentation similar to other libraries that is automatically updated based on code docs rather than always relying on manual documentation being up-to-date.
Describe the solution you'd like
Whatever is acceptable for the given language, such as the Microsoft style for c#.
Is your feature request related to a problem? Please describe.
Our platform requires the Quix custom schema. If it is not used, the platform is not that helpful.
Describe the solution you'd like
Custom events get materialised as tabular data.
Additional context
https://quix.notion.site/Merge-events-and-time-series-endpoints-d5ea49e51492498e90d2935decc64c24?pvs=4
Is your feature request related to a problem? Please describe.
Would like to have the option of Redis as an alternative to the existing LocalFileStateStorage.
Describe the solution you'd like
Implement the IStateStorage interface and expose C#/Python bindings to use it where IStateStorage can be used.
Is your feature request related to a problem? Please describe.
If a producer is idle for longer than ~8,000,000 ms (possibly only in Azure environments), it logs a "Kafka Producer Exception". This can incorrectly lead to the assumption that the connection is broken for good and that it won't publish the next time it needs to.
However, this is not true: automatic reconnection happens underneath.
Describe the solution you'd like
Tune down the exception, so it is only shown when relevant, such as failure on next publish.
Describe alternatives you've considered
A keep-alive, but that is excessive and isn't a guarantee either. We have only been able to reproduce this with either the producer or Kafka running in Azure cloud.
If you send timeseries messages through a buffer and then immediately call the flush function, the messages most recently added to the buffer may not get sent.
However, if I sleep for even 0.1 seconds before calling the flush function, all messages are sent as expected.
The same behaviour happens in both the C# and Python clients.
Reproducible example in Benchmark kafka libraries repo
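The race can be modeled without the client at all. The toy buffer below (all names hypothetical, not the real API) shows the behaviour the report expects: `flush()` blocking until the intake queue is fully drained, rather than returning while messages are still queued behind a background worker:

```python
import queue
import threading
import time


class ToyBuffer:
    """Toy model of the reported race: publishes go to an intake queue that a
    background worker drains. A correct flush() must wait until that queue is
    empty, not just until the worker's current message is out."""

    def __init__(self) -> None:
        self._intake: "queue.Queue" = queue.Queue()
        self.sent = []
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def publish(self, msg) -> None:
        self._intake.put(msg)

    def _drain(self) -> None:
        while True:
            msg = self._intake.get()
            time.sleep(0.01)  # simulate network latency
            self.sent.append(msg)
            self._intake.task_done()

    def flush(self) -> None:
        # join() blocks until every queued item has been processed -- the
        # semantics the bug report expects, making the sleep workaround unnecessary.
        self._intake.join()
```

With a flush that only signals the worker without waiting for the queue to empty, the last messages would be lost exactly as described.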
Let's move the rolling-window Python code out of the benchmark so anybody can use it.
Move it to QuixStreamingClient if anything there is not already present.
Is your feature request related to a problem? Please describe.
The goal is to not have to have code changes like this: #85
Describe the solution you'd like
Just building from a tag should set these according to the tag name.
I understand that in streaming it is quite uncommon to publish big chunks (batches) of data, so a long dataframe with lots of rows is unlikely.
Anyhow, right now the dataframe-to-TimeseriesDataRaw conversion is done row by row and column by column, which is inefficient. I'll be proposing a new vectorized version to improve speed (especially with big dataframes).
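For illustration, the difference between per-cell and vectorized extraction can be sketched in plain pandas. The function names are mine, not the proposed API:

```python
import pandas as pd


def to_columns_slow(df: pd.DataFrame) -> dict:
    """Row-by-row, column-by-column: one Python-level access per cell."""
    out = {col: [] for col in df.columns}
    for _, row in df.iterrows():
        for col in df.columns:
            out[col].append(row[col])
    return out


def to_columns_fast(df: pd.DataFrame) -> dict:
    """Vectorized: one NumPy array extraction per column, no per-cell loop."""
    return {col: df[col].to_numpy() for col in df.columns}
```

Both produce the same column-oriented data, but the vectorized form avoids the per-row Python overhead that dominates with large dataframes.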
Is your feature request related to a problem? Please describe.
Add support for this in state helpers:
```python
def on_dataframe_received_handler(stream_consumer: qx.StreamConsumer, data: qx.TimeseriesData):
    stream_state = stream_consumer.get_scalar_state("total_rpm", int, lambda x: 0)  # default value for state name
    for row in data.timestamps:
        rpm = row.parameters["EngineRPM"].numeric_value
        stream_state += rpm
        row.add_value("total_rpm", stream_state)
```
Ok, that title is bad, but the idea around stream properties is that they are delayed a bit before being sent, to avoid sending them multiple times when several changes occur in a very short interval.
Once this interval elapses (or someone calls .Flush()), they get set as the latest set of properties and sent to the wire.
In addition to this, we have a periodic properties re-broadcast. To avoid re-broadcasting on its own and filling up the topic with nothing else, we have logic to only do this when there is another message type to send, such as parameter data.
In this scenario, the properties get sent first, then the actual data, if the re-broadcast condition is fulfilled.
Now, the problem: when a stream is initially created, properties are set, and data gets sent, the re-broadcast logic is triggered, but with the initial null stream properties instead of the configured ones, because the properties flush has not yet been triggered.
Use the schema registry for stream metadata, so we don't have to send it periodically and it is immediately available.
@PatrickMiraP and @peter-quix to pair on this.
Tell us about the bug
from_dataframe() breaks when dealing with columns with complex numbers:
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in
in from_dataframe(data_frame, epoch)
    111     isnumeric = (label_type == int or label_type == float or label_type == complex)
    112     if isnumeric:
--> 113         if math.isnan(content):
    114             continue  # ignore it, as pandas uses NaN instead of None, unable to detect difference
    115         if panda_col_label.startswith('TAG__'):  # in case user of lib didn't put it in quotes, don't throw err
TypeError: can't convert complex to float
```
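A possible fix, sketched outside the library: a NaN check that special-cases complex values, since `math.isnan` raises `TypeError` on them while `cmath.isnan` accepts them. The helper name is mine:

```python
import cmath
import math


def is_missing(value) -> bool:
    """NaN check that also tolerates complex values and non-numeric types."""
    if isinstance(value, complex):
        # True if the real or imaginary part is NaN
        return cmath.isnan(value)
    try:
        return math.isnan(value)
    except TypeError:
        # Strings, None, etc. are not NaN floats
        return False
```

Swapping such a check in for the bare `math.isnan(content)` on line 113 of the traceback would let complex columns pass through instead of raising.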
One could (by mistake) add the parent multiple times, and it then travels in the message multiple times. This is completely unnecessary.
Is your feature request related to a problem? Please describe.
```python
def on_dataframe_received_handler(stream_consumer: qx.StreamConsumer, data: qx.TimeseriesData):
    stream_state = stream_consumer.get_rolling_window_state("average_rpm_10s_state", duration_ms=10000)
    for row in data.timestamps:
        stream_state.add_and_purge(row)
        rpm_values = list(map(lambda x: x.parameters["EngineRPM"], stream_state.items))
        row.add_value("average_rpm_10s", sum(rpm_values) / len(rpm_values))
```
Hi! I'm getting an error when adding an empty tag.
The code actually does .add_tags(row.tags), so it's copying the tags from the received TimeseriesData into the outgoing timeseries buffer.
```
Listening to streams. Press CTRL-C to exit.
Time: 1678882097309000000
Tags: {'room': 'foo', 'role': 'Customer', 'name': 'steve', 'phone': '', 'email': ''}
Params:
  chat-message: ii
{'label': 'POSITIVE', 'score': 0.9442011713981628}
Exception: System.ArgumentNullException: Tag (phone) value can't be null or empty (Parameter 'tagValue')
   at QuixStreams.Streaming.Models.TimeseriesDataTimestamp.AddTag(String, String) + 0x141
   at QuixStreams.Streaming.Models.StreamProducer.Interop.TimeseriesDataBuilderInterop.AddTag(IntPtr, IntPtr, IntPtr) + 0x11b
```
Discussed here: https://stream-processing.slack.com/archives/C04SDM5UG9F/p1678883613307439
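Until the producer tolerates empty values, a caller-side workaround is to drop them before forwarding. A sketch; `clean_tags` is a hypothetical helper, not SDK API:

```python
def clean_tags(tags: dict) -> dict:
    """Drop tags whose value is None or an empty string, since the producer
    rejects those with ArgumentNullException."""
    return {k: v for k, v in tags.items() if v not in (None, "")}
```

With it, the copy becomes `.add_tags(clean_tags(row.tags))`, which would survive the `'phone': ''` tag in the log above.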
Maybe in the same go, rename Quix.Process to Quix.Telemetry.
Essentially your layer 2 introduces telemetry; it is a whole lot more descriptive than "Process".
Is your feature request related to a problem? Please describe.
Would like to have the option of RocksDB as an alternative to the existing LocalFileStateStorage.
Describe the solution you'd like
Implement the IStateStorage interface and expose C#/Python bindings to use it where IStateStorage can be used.
Tell us about the bug
from_dataframe() conversion seems to break if there are nulls in the time column of the dataframe.
What did you expect to see?
Here, as with any null-handling situation, there is a decision to make. Either drop the rows with nulls in the time column:

```python
dataframe = dataframe.dropna(subset=time_label)
```

Or interpolate the nulls in the time column using the surrounding values:

```python
dataframe[time_label] = dataframe[time_label].interpolate(method='linear')
```
This is verifying that the QuixStreamingClient is properly picking up the workspace based on the credential provided (either from normal token + WS_ID env or SDK Token)
Is your feature request related to a problem? Please describe.
Python (and, in odd places, C#) code documentation — not the .md files — is incorrect, outdated, or grammatically wrong. Some of this resulted from the 0.4.x -> 0.5.0 migration.
Describe the solution you'd like
Code documentation is updated to reflect the latest version of the actual code it is documenting.
Is your feature request related to a problem? Please describe.
When doing streaming processing, I find myself needing to keep windows of data in memory very VERY often. Like, if I want to calculate average speed over the last 5 min, I have to keep the last 5 mins of speed data in memory (speed + timestamp) and update that window on every new message (both appending the new data to it and trimming out the now older than 5 mins data). I do this so often that I've ended up creating a python class that I keep using on my projects. This is fine, but I'm sure you guys can do it better :)
Describe the solution you'd like
I'd love that the sdk had some window functionality. Maybe as a TimeseriesData method. As a first feature, I propose that the window is created with:
Describe alternatives you've considered
I was trying to think whether it was worth doing this using Kafka pointers and some of its characteristics (segments, etc.), but probably not. Even if there were a way to create a rolling window with pointers, you would be querying a big chunk of data every time. Doing it from the client side seems a better idea.
Additional context
I can share my current python class for reference:
```python
import pandas as pd


class Window:
    def __init__(self, app_abv: str, window_nanoseconds: int):
        self.df_window = pd.DataFrame()
        self.window_size = window_nanoseconds

    def calculate(self, df: pd.DataFrame):
        # Add new data to the window
        self._update_mini_df_window(df)
        # Here mathematical operations are done with the updated window
        # Fill df with new values
        df = pd.merge(df, self.df_window[["time", "new_col"]], how='left', on=["time"])
        return df

    def _update_mini_df_window(self, df: pd.DataFrame):
        # If df_window is empty, start it with df
        if self.df_window.empty:
            self.df_window = df
        # If df_window has more recent data than what we are getting in real time,
        # something has been messed up, maybe with replays :S.
        # Erase any data newer than the newly received df
        elif self.df_window["time"].iloc[-1] > df["time"].iloc[-1]:
            self.df_window = self.df_window[self.df_window["time"] <= df["time"].iloc[-1]]
            self.df_window = pd.concat([self.df_window, df]).drop_duplicates()
            self.df_window = self.df_window.sort_values("time", ascending=True).reset_index(drop=True)
            # Interpolate to fill nulls in key columns (cat_col / num_col are defined elsewhere)
            self.df_window[cat_col] = self.df_window[cat_col].fillna(method="ffill", axis=0)
            self.df_window[num_col] = self.df_window.set_index('time')[num_col].interpolate('index').values
        else:
            # Else, concat data and ensure chronological order
            self.df_window = pd.concat([self.df_window, df]).drop_duplicates()
            self.df_window = self.df_window.sort_values("time", ascending=True).reset_index(drop=True)
            # Interpolate to fill nulls in key columns
            self.df_window[cat_col] = self.df_window[cat_col].fillna(method="ffill", axis=0)
            self.df_window[num_col] = self.df_window.set_index('time')[num_col].interpolate('index').values
        # TRIM WINDOW
        """
        # If the df window is getting big in size, we just trim it
        bytes_in_a_GB = 1073741824
        if self.df_window.memory_usage(deep=True).sum() / bytes_in_a_GB > 0.25:  # if bigger than 0.25 GB
            self.df_window = self.df_window.iloc[-1000:]  # restart window to last 1000 rows
            if self.df_window.memory_usage(deep=True).sum() / bytes_in_a_GB > 0.25:  # if still big, restart
                self.df_window = self.df_window.iloc[[-1]]
        """
        # Keep only needed rows
        self.df_window = self.df_window[self.df_window["time"] > (self.df_window['time'].iloc[-1] - self.window_size)]
        self.df_window["TEMP_timeDelta"] = self.df_window["time"] - self.df_window["time"].shift(1)
```
Is your feature request related to a problem? Please describe.
In case of a restart, no message should get sent twice to the output topic.
Describe the solution you'd like
TBD
Tell us about the bug
What did you expect to see?
A correct type hint, and EventDataBuilder having add_tags similarly to TimeseriesDataBuilder.
Confluent.Kafka.KafkaException: Broker: Specified group generation id is not valid
tulios/kafkajs#1009
Maybe we can provide a workaround/handling for users.
Is there anything we need to add regarding licensing @PatrickMiraP?
Is your feature request related to a problem? Please describe.
Make QuixStreams available via Conda.
Tell us about the bug
The Python implementation in 0.4.x used to convert types via pythonnet in a more automated fashion.
The new code results in an "Invalid type passed as parameter value." exception when passing values such as None or numpy.int64.
What did you expect to see?
Investigate what pythonnet did and how it picked the preferred type; maybe that is something we can do.
Alternatively, None values could be silently ignored when an attempt is made to add them. Maybe better ideas in the comments?
More details
Previous code:
https://github.com/quixio/quix-streams/blob/v0.4.10/src/PythonClient/lib/quixstreaming/models/parameterdatatimestamp.py#L103
new code:
https://github.com/quixio/quix-streams/blob/v0.5.1/src/PythonClient/src/quixstreams/models/timeseriesdatatimestamp.py#L130
Any workaround?
For the time being, the only workaround is to make sure the given value falls into one of str, float, int, bytes, bytearray. So, for example, None would have to be converted (e.g. to a NaN float) beforehand.
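The workaround can be wrapped in a small helper, hypothetical and not part of the library, that coerces values to a plain float and returns None for missing values so the caller can skip the add:

```python
import math


def sanitize_numeric(value):
    """Coerce a value to a plain Python float the binding accepts.

    Returns None for None/NaN inputs so the caller can skip adding them.
    numpy scalars such as numpy.int64 coerce cleanly through float().
    """
    if value is None:
        return None
    value = float(value)
    return None if math.isnan(value) else value
```

Usage would look like: `v = sanitize_numeric(raw)` followed by `if v is not None: ts.add_value(name, v)`.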
We should interrogate the broker for the maximum allowed message size and automatically configure the splitter accordingly, rather than relying on a value configured on the client.
A potential improvement could be adding staging support to IStateStorage: stage with OnCommitting and finalize with OnCommitted, to reduce the chance of an unsuccessful storage save to the minimum the storage's own limitations allow.
When a consumed stream is closed, revocation doesn't include it in the list. This can cause state to be saved even when it shouldn't be.
There are possible threading issues when modifying a state from multiple threads. Attempted to reproduce using unit tests in C# and Python without luck. Left without locking implemented for now.
Enable Avro support - https://github.com/apache/avro
Is your feature request related to a problem? Please describe.
Depending on my business logic I might want to guarantee my data is sent without closing down the topic for future publishing.
Describe the solution you'd like
Having the option to flush a topic would be beneficial, similar to how there is a flush for a stream or parts of a stream. With this I wouldn't be forced to close/dispose the topic to guarantee a send.
Describe alternatives you've considered
Other ideas that occurred to me are less mainstream. Flush sounds like the ideal thing considering it is already how it is done elsewhere in the lib.
Is your feature request related to a problem? Please describe.
Future builds now support Apple silicon. As we're not yet using GitHub for our builds, this is not reflected as a code change.
Describe the solution you'd like
Update all aspects of the Apple silicon support documentation, such as the readme (landing page for the PyPI wheel) and the M1 install guide readme.
This support will be available in 0.5.4; until then Rosetta must be used (unless we backfill the builds).
Is your feature request related to a problem? Please describe.
When creating the same callback to react to different topics, I'd sometimes like to have the topic name available within the callback function (for example, to create a TAG in the output recording which input topic a specific message came from).
Describe the solution you'd like
I'd like to have access to the topic name property through the InputTopic, OutputTopic, StreamReader and StreamWriter objects. So, just as I can do input_stream.stream_id, I'd also like input_topic.topic_id and input_stream.topic_id to be possible.
Is your feature request related to a problem? Please describe.
While the title sounds counterintuitive, the problem is real. When no consumer group is in use, librdkafka still requires a consumer group to connect to Kafka, though it will not commit anything. Because of this quirk of the underlying library, we need to be able to specify a prefix to use when a random consumer group is generated for this purpose.
Describe the solution you'd like
Add consumer_prefix="" in Python, and the equivalent in C#, when opening a topic for consumption.
Describe alternatives you've considered
Not an alternative solution, but a poor workaround: specify a consumer group with your prefix and a random id. It won't behave exactly the same, as commit is enabled by default, but future runs won't reuse it if you always generate a new one.
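The workaround amounts to generating a prefixed random group id per run; a sketch, with names of my choosing:

```python
import uuid


def make_consumer_group(prefix: str) -> str:
    """Combine the caller's prefix with a random suffix so each run gets a
    fresh consumer group and commits from prior runs are never reused."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"
```

The generated value would then be passed wherever the client accepts a consumer group name; the proposed `consumer_prefix` option would do this internally.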
The .NET 8 preview is causing some issues, due to its level of support, when onboarding developers who only work with the C# codebase. Given we only use .NET 8 features (Native AOT) for Python, this should not be a problem.
The suggestion is to use LTS for C#, and only use .NET 8 for the Native AOT builds for now.
Is your feature request related to a problem? Please describe.
Stream properties are de facto state, but they are not treated as state in the library, and therefore using them is pretty hard.
Describe the solution you'd like
Store them internally in state, so that after a restart they remain as they were before the restart.