Towards a Server-less ETL and Data Storage Solution
Abstract
Traditionally, moving raw data from source to destination has been a tedious process of planning infrastructure and predicting scalability, availability, and costs. The solution demonstrated in this thesis is a serverless ETL pipeline that takes semi-structured data and transforms it into a query-compatible format. The pipeline applies data acquisition techniques using AWS services, particularly Lambda and S3, and implements data warehouse concepts in Redshift. Adopting these services keeps the main focus on developing the solution and on testing it by answering the common BI questions asked by clients. The result is a prototype that could encourage customers to adopt a more flexible, data-driven approach.