- This repo contains the files for setting up a data ingestion pipeline for Covid19 tweets.
- The tweets were collected from the streaming endpoint that Twitter provides through its developer program.
- The following files are stored in this repo:
  - Code for hydrating the tweet IDs (python)
  - Code for taking a snapshot of the ES index (python)
  - Code for basic input/output against the ES domain (python)
  - CloudFormation template for setting up the data ingestion pipeline (yaml)
  - Lambda function for processing the tweets (python)
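
As a rough illustration of the hydration step listed above, here is a minimal sketch of how tweet IDs could be grouped into batches before being sent to a lookup endpoint. This is not the code from this repo: the function name `batch_tweet_ids` and the constant `BATCH_SIZE` are hypothetical, and the limit of 100 IDs per request is an assumption based on Twitter's tweet lookup endpoints.

```python
from typing import Iterable, Iterator, List

# Assumption: the lookup endpoint accepts at most 100 IDs per request.
BATCH_SIZE = 100


def batch_tweet_ids(
    tweet_ids: Iterable[str], batch_size: int = BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield lists of at most `batch_size` cleaned tweet IDs (hypothetical helper)."""
    batch: List[str] = []
    for tweet_id in tweet_ids:
        tweet_id = tweet_id.strip()  # drop newlines when reading IDs from a file
        if not tweet_id:
            continue  # skip blank lines
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch
```

Each yielded batch could then be passed to whatever client performs the actual lookup request; that part depends on the credentials and library in use and is left out here.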
skyprince999/Data-Engineering-Covid19-ETL