Exercise: Incremental Load

Explore incremental loading methods by building an ETL load step that imports daily log activity data into PostgreSQL. Learn to create tables, write loading functions with the COPY command, and automate deletion of data older than one month to optimize storage and performance.

In this exercise, we’ll practice using the COPY command to load data incrementally. In incremental loading, we load only the changes made in the source since the previous load into the destination repository. As always, the exact logic depends on the business requirements and context.
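To make the idea concrete, here is a minimal Python sketch of one common way to pick out "recent changes": compare each row's timestamp against a high-water mark recorded at the previous load. The row layout, the `created_at` field, and the `select_new_rows` helper are assumptions made up for this illustration, not part of the exercise itself.

```python
from datetime import datetime


def select_new_rows(rows, last_loaded_at):
    """Return only the rows created strictly after the previous load."""
    return [row for row in rows if row["created_at"] > last_loaded_at]


# Hypothetical source rows; a real pipeline would read these from the source system.
source_rows = [
    {"id": 1, "created_at": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "created_at": datetime(2024, 1, 2, 9, 0)},
    {"id": 3, "created_at": datetime(2024, 1, 3, 9, 0)},
]

# High-water mark: everything up to this moment is assumed to be loaded already.
last_loaded_at = datetime(2024, 1, 2, 0, 0)

new_rows = select_new_rows(source_rows, last_loaded_at)
print([row["id"] for row in new_rows])  # [2, 3]
```

A timestamp filter is only one option. Depending on the business context, the changes might instead be identified by a batch file per day, an auto-incrementing ID, or a change log.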

Example

As data engineers working for a retail company, we’re tasked with creating an ETL pipeline for loading customers’ log activity data from a transactional database into a PostgreSQL database for analysis.

According to the business requirements, new batches of data should be loaded at the end of each day. Also, to save storage costs, the PostgreSQL database should only contain data from the last month. After creating ...
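As a rough sketch of how these two requirements might translate into code, the Python below appends one day's batch with COPY and then deletes rows older than one month. It assumes a psycopg2-style cursor (one that provides `copy_expert` and `execute`), and the table name `log_activity`, its columns, and both helper functions are hypothetical names chosen for illustration, not the exercise's actual schema or solution.

```python
# Assumed table and columns; adjust to the real schema.
COPY_SQL = (
    "COPY log_activity (user_id, activity, created_at) "
    "FROM STDIN WITH (FORMAT csv, HEADER true)"
)

# Keep only the last month of data, measured from the time the query runs.
DELETE_SQL = (
    "DELETE FROM log_activity "
    "WHERE created_at < NOW() - INTERVAL '1 month'"
)


def load_daily_batch(cursor, csv_file):
    """Append one day's batch to the table by streaming a CSV file through COPY."""
    cursor.copy_expert(COPY_SQL, csv_file)


def purge_old_rows(cursor):
    """Delete rows older than one month to limit storage use."""
    cursor.execute(DELETE_SQL)


def run_daily_load(cursor, csv_file):
    """Run the end-of-day step: load the new batch, then purge old rows."""
    load_daily_batch(cursor, csv_file)
    purge_old_rows(cursor)
```

With psycopg2, `run_daily_load` could be called with a cursor from an open connection and a file object opened on the day's CSV export, followed by a commit on the connection. Running both statements in the same transaction means a failed load does not leave the table purged but not refreshed.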