Welcome to this ready-to-run repository to get started with the Apache Airflow Kafka provider!
This repository assumes you have basic knowledge of Apache Kafka and Apache Airflow. You can find resources on these tools in the Resources section below.
This repository contains three DAGs:

- `produce_consume_treats`: This DAG produces `NUMBER_OF_TREATS` messages to a local Kafka cluster. Run it manually to produce and consume new messages.
- `listen_to_the_stream`: This DAG continuously listens to a topic in a local Kafka cluster and runs the `event_triggered_function` whenever a message causes the `apply_function` to return a value. Unpause this DAG to have it run continuously.
- `walking_your_pet`: This DAG is the downstream DAG that the `event_triggered_function` in the `listen_to_the_stream` DAG will trigger. Unpause this DAG to have it ready to be triggered by a TriggerDagRunOperator in the upstream DAG.
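To illustrate the pattern the `listen_to_the_stream` DAG relies on, here is a minimal sketch of what an `apply_function` might look like. This is not the repository's actual code: the payload shape and the `"mood"` field are assumptions for illustration. The key idea is that returning a value signals "trigger the `event_triggered_function`", while returning `None` means "keep listening".

```python
import json


def apply_function(message_value):
    """Hypothetical apply_function: return a value only for messages
    that should trigger the event_triggered_function.

    Assumes (for illustration) that each Kafka message value is a JSON
    string with a "mood" field.
    """
    payload = json.loads(message_value)
    if payload.get("mood") == "happy":
        # A returned value causes the event_triggered_function to run.
        return payload
    # Returning None means the listener keeps waiting for more messages.
    return None


print(apply_function('{"pet": "dog", "mood": "happy"}'))
```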
This repository is designed to spin up both a local Kafka cluster and a local Astro project and connect them automatically. Note that it sometimes takes a minute longer for the Kafka cluster to be fully started.
Run this Airflow project without installing anything locally.
1. Fork this repository.
2. Create a new GitHub Codespaces project on your fork. Make sure it uses at least 4 cores!
3. After creating the Codespaces project, the Astro CLI will automatically start up all necessary Airflow components as well as the local Kafka cluster, using the instructions in the `docker-compose.override.yml`. This can take a few minutes.
4. Once the Airflow project has started, access the Airflow UI by clicking on the **Ports** tab and opening the forwarded URL for port 8080.
5. Unpause all DAGs. Manually run the `produce_consume_treats` DAG to see the pipeline in action. Note that a random function generates parts of the message sent to Kafka, which determines whether the `listen_for_mood` task triggers the downstream `walking_your_pet` DAG. You might need to run the `produce_consume_treats` DAG several times to see the full pipeline in action!
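The steps above mention that a random function decides parts of the Kafka message, so triggering the downstream DAG is not guaranteed on every run. A minimal sketch of that kind of randomness (the names `pick_mood` and `MOODS` are illustrative, not the repository's actual code):

```python
import random

# Hypothetical set of moods a produced message could carry; only some of
# them would cause the listener's apply_function to fire.
MOODS = ["happy", "neutral", "grumpy"]


def pick_mood(seed=None):
    """Pick a mood at random; pass a seed for reproducible results."""
    return random.Random(seed).choice(MOODS)


print(pick_mood())
```

Because the choice is random, several manual runs of `produce_consume_treats` may be needed before a triggering mood appears.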
Download the Astro CLI to run Airflow locally in Docker. `astro` is the only package you will need to install.

1. Run `git clone https://github.com/astronomer/airflow-quickstart.git` on your computer to create a local clone of this repository.
2. Install the Astro CLI by following the steps in the Astro CLI documentation. Docker Desktop/Docker Engine is a prerequisite, but you don't need in-depth Docker knowledge to run Airflow with the Astro CLI.
3. Run `astro dev start` in your cloned repository.
4. After your Astro project has started, view the Airflow UI at `localhost:8080`.
5. Unpause all DAGs. Manually run the `produce_consume_treats` DAG to see the pipeline in action. Note that a random function generates parts of the message sent to Kafka, which determines whether the `listen_for_mood` task triggers the downstream `walking_your_pet` DAG. You might need to run the `produce_consume_treats` DAG several times to see the full pipeline in action!