Wizeline ETL Capstone Project
Project information
- Category: Data
- Client: Wizeline Academy
- Project date: 30 September, 2023
- Project URL: https://github.com/RhythmBear/wizeline-capstone-project
The goal is to create an end-to-end solution. The problem reflects a real-world scenario in which a Data Engineer must configure the environment and resources, integrate data from external sources, clean it, and process it for further analysis. Once the data is ready, it is transformed to reflect business rules and to deliver knowledge in the form of insights.
Problem Description.
I was tasked with building a data pipeline to populate the fact_movie_analytics table, an OLAP table. The data in fact_movie_analytics is useful for analysts and for tools such as dashboard software. The pipeline draws on three data sources:
- A PostgreSQL table named user_purchase.
- Daily data from an external vendor in a CSV file named movie_review.csv, which populates the classified_movie_review table. This file contains a customer id, a review id, and the message from the movie review.
- Daily data from an external vendor in a CSV file named log_reviews.csv. This file contains the review id and metadata about the session in which the movie review was written, such as log date, device (mobile, computer), OS (Windows, Linux), region, browser, IP, and phone number.
Image showing relationships between tables.
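To make the relationship between the two vendor files concrete, here is a minimal sketch of joining movie_review.csv rows to log_reviews.csv rows on the review id. It is an illustration only, not the pipeline's actual code: the column names (customer_id, review_id, review_text, and the log fields), the helper functions, and the sample rows are all assumptions made up for this example, since the real file headers are not listed here.

```python
import csv
import io

# Made-up sample data standing in for the vendor files.
# Column names are assumptions; the real headers may differ.
MOVIE_REVIEW_CSV = """customer_id,review_id,review_text
101,1,Great movie with a good plot
102,2,Too long and boring
"""

LOG_REVIEWS_CSV = """review_id,log_date,device,os,region,browser,ip,phone_number
1,2023-09-01,mobile,Linux,Texas,Chrome,10.0.0.1,555-0100
2,2023-09-02,computer,Windows,Ohio,Firefox,10.0.0.2,555-0101
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_reviews_with_logs(reviews, logs):
    """Inner-join review rows with their session logs on review_id."""
    logs_by_id = {row["review_id"]: row for row in logs}
    joined = []
    for review in reviews:
        log = logs_by_id.get(review["review_id"])
        if log is None:
            continue  # skip reviews that have no matching log entry
        joined.append({**review, **log})
    return joined


joined = join_reviews_with_logs(
    read_rows(MOVIE_REVIEW_CSV), read_rows(LOG_REVIEWS_CSV)
)
for row in joined:
    print(row["review_id"], row["customer_id"], row["device"], row["os"])
```

In a real pipeline this join would more likely happen in the warehouse or in a distributed processing job rather than in memory, but the key point is the same: the review id is what ties a review to its session metadata.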