- This repo contains the files for setting up a data ingestion pipeline for COVID-19 tweets.
- The tweets were collected from the streaming endpoint that Twitter provides through its developer program.
- The following files are stored in this repo:
  - Code for hydrating the tweet IDs (Python)
  - Code for taking a snapshot of the ES index (Python)
  - Code for basic input/output for the ES domain (Python)
  - CloudFormation template for setting up the data ingestion pipeline (YAML)
  - Lambda function for processing the tweets (Python)
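
Hydration means turning a list of tweet IDs back into full tweet objects through a lookup endpoint. As a minimal sketch of the batching step only, and not the code stored in this repo, the helper below groups IDs into fixed-size batches before any request is made. The name `batch_tweet_ids` and the default batch size of 100 are assumptions for illustration; check the lookup endpoint's documentation for the actual per-request limit.

```python
from typing import Iterable, Iterator, List

# Assumed per-request ID limit of the tweet lookup endpoint (illustrative).
BATCH_SIZE = 100


def batch_tweet_ids(
    tweet_ids: Iterable[str], batch_size: int = BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield lists of at most batch_size tweet IDs, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for tweet_id in tweet_ids:
        tweet_id = tweet_id.strip()
        if not tweet_id:
            continue  # skip blank lines, e.g. from an ID file
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch
```

Each yielded batch can then be passed to whatever client performs the actual lookup; keeping the batching separate from the network call makes it easy to test without credentials.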
Repository: kumarkapil/data-engineering-covid19-etl (forked from skyprince999/data-engineering-covid19-etl)