Learn how to connect Airflow, Presto, and Cassandra, all in your browser via Gitpod! This demo can also be run locally with Docker.
IMPORTANT: Remember to make the ports public when the dialog appears in the bottom right-hand corner!
bash setup.sh
docker ps
username: admin
password: password
hostname -I | awk '{print $2}'
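The pipeline above prints the second address reported by `hostname -I`. As a minimal sketch of what `awk` does here, the snippet below runs the same field selection on a made-up address list; both IPs are placeholders, not values from this demo.

```shell
# Hypothetical output of `hostname -I`: space-separated addresses on one line.
sample_addresses="10.0.5.2 172.17.0.1 "

# awk splits the line on whitespace and prints the second field.
presto_host=$(printf '%s\n' "$sample_addresses" | awk '{print $2}')
echo "$presto_host"
```

If the machine reports only one address, the second field is empty and nothing is printed, so check the value before copying it into the connection form.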
3.2 Fill in the presto_default connection with the following values, then confirm by testing the connection
Connection Type: Presto
Host: value copied from 3.1
Schema: remove hive and leave blank
Login: admin
Port: 8080
sed -i "s/{hostname}/$(hostname -I | awk '{print $2}')/" cassandra.properties
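The `sed` command above replaces the `{hostname}` placeholder in cassandra.properties in place. One way to preview the result first is to run the same substitution without `-i`. The sketch below does that on assumed file contents: the two property lines and the IP address are illustrative, and the demo's actual file may differ.

```shell
# Assumed contents of cassandra.properties; the demo's real file may differ.
sample_props='connector.name=cassandra
cassandra.contact-points={hostname}'

# Placeholder address standing in for $(hostname -I | awk '{print $2}').
host_ip="172.17.0.1"

# Same substitution as above, but without -i, so the result is only printed.
preview=$(printf '%s\n' "$sample_props" | sed "s/{hostname}/${host_ip}/")
echo "$preview"
```

If the address lookup prints nothing, the placeholder is replaced with an empty string, so it is worth echoing the value before editing the real file.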
docker cp cassandra.properties $(docker container ls | grep 'presto' | awk '{print $1}'):/opt/presto-server/etc/catalog/cassandra.properties
docker exec -it presto sh -c "ls /opt/presto-server/etc/catalog"
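The `docker cp` command above resolves the container ID by filtering the output of `docker container ls`. The sketch below applies the same `grep` and `awk` steps to a made-up listing; the IDs, image names, and column layout are placeholders chosen for illustration.

```shell
# Made-up `docker container ls` output; IDs and image names are placeholders.
sample_listing='CONTAINER ID   IMAGE              NAMES
1a2b3c4d5e6f   example/presto     presto
6f5e4d3c2b1a   example/cassandra  cassandra'

# grep keeps the rows mentioning "presto"; awk prints the first column (the ID).
presto_id=$(printf '%s\n' "$sample_listing" | grep 'presto' | awk '{print $1}')
echo "$presto_id"
```

If more than one row matches, the substitution yields several IDs and `docker cp` fails. Since the other steps address the container by the name `presto`, passing that name to `docker cp` directly is likely an equivalent alternative.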
docker exec -it presto presto-cli
show catalogs ;
If cassandra is not listed, restart the container and check again
docker restart presto
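The catalog check can also be scripted rather than read by eye. The sketch below defines a hypothetical helper, `has_catalog`, that looks for a catalog name in a listing supplied on standard input. The sample listing is made up and stands in for real CLI output.

```shell
# has_catalog NAME: succeed if NAME appears as a whole line on stdin.
# (Hypothetical helper, not part of the demo repository.)
has_catalog() {
  grep -qxF -- "$1"
}

# Made-up catalog listing standing in for the output of `show catalogs`.
sample_catalogs='cassandra
system'

if printf '%s\n' "$sample_catalogs" | has_catalog cassandra; then
  catalog_status="cassandra catalog is registered"
else
  catalog_status="cassandra catalog missing; restart the presto container"
fi
echo "$catalog_status"
```

With the containers running, the listing could presumably come from `docker exec presto presto-cli --execute "show catalogs;"`, assuming the CLI in this image supports `--execute`. Batch output may wrap each value in double quotes, in which case strip them (for example with `tr -d '"'`) before piping into the helper.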
docker cp setup.cql $(docker container ls | grep 'cassandra' | awk '{print $1}'):/
docker exec -it cassandra cqlsh -f setup.cql
mkdir -p ~/airflow/dags && mv presto_read_from_cassandra.py presto_join_and_xcom.py presto_write_to_cassandra.py ~/airflow/dags/
key: presto_query
value: select * from cassandra.demo.spacecraft_journey_catalog;
select cassandra.demo.spacecraft_journey_catalog.spacecraft_name, cassandra.demo.spacecraft_journey_catalog.summary, cassandra.demo.spacecraft_speed_over_time.speed from cassandra.demo.spacecraft_journey_catalog inner join cassandra.demo.spacecraft_speed_over_time on cassandra.demo.spacecraft_journey_catalog.journey_id = cassandra.demo.spacecraft_speed_over_time.journey_id;
7.6.1 Click on the logs to view the results of the Presto query and see how they change from the first task to the second task
docker exec -it cassandra cqlsh -e "select * from demo.spacecraft_journey_summary_and_speed"