Explore and test open-source tools for data engineering, either in standalone mode or integrated into an ecosystem
Using K8s
We should have:
- MinIO as data lake
- SparkOperator as Spark engine with S3 and Delta-OSS enabled
- Dremio-OSS / Trino as data lake engine working with Delta without issues on refresh
- Airflow as orchestrator (operator?)
- Superset as BI tool
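
As a rough illustration of the "Spark with S3 and Delta-OSS enabled" item, here is a minimal sketch of the Spark settings that usually point the S3A connector at MinIO and turn on Delta Lake. The helper name `minio_delta_spark_conf`, the endpoint URL and the credentials are hypothetical placeholders, not values from this project; the configuration keys themselves are the standard Hadoop S3A and Delta Lake ones.

```python
def minio_delta_spark_conf(endpoint: str, access_key: str, secret_key: str) -> dict:
    """Build Spark settings for S3A access to MinIO plus Delta Lake support.

    Hypothetical helper for illustration; it only assembles a dict and does
    not start Spark.
    """
    use_ssl = endpoint.startswith("https://")
    return {
        # S3A connector pointed at MinIO instead of AWS S3
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # MinIO is normally addressed path-style (endpoint/bucket/key)
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        # Delta Lake SQL extension and catalog
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


# Placeholder endpoint and credentials (assumptions, replace with real ones)
conf = minio_delta_spark_conf(
    "http://minio.example.svc.cluster.local:9000",
    "example-access-key",
    "example-secret-key",
)

# Print the non-secret settings for inspection
for key in sorted(conf):
    if "key" not in key.rsplit(".", 1)[-1]:
        print(f"{key}={conf[key]}")
```

These key/value pairs can be passed to `SparkSession.builder.config(key, value)` or placed under `sparkConf` in a SparkApplication manifest. This assumes the `hadoop-aws` and Delta Lake jars matching the Spark version are already on the classpath of the Spark image.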