Slow Changing Dimensions Type 2 (SCD T2) is a pattern in data warehouses where you store every version of a record, in order to keep track of historic data. In other words, when a record changes, we insert a new row in a history table. This allows users to go back in time and see what the data looked like on a specific date & time.
Example table Customers:
Id Name City
1 ACME Inc NY
2 Pepsi WA
3 Cola BR
Example history table for Customers using SCD T2 where the name of ACME was updated on 15th of Jan from “ACME Ltd” to “ACME Inc”:
Id Name City Timestamp History id
1 ACME Ltd NY 2024-01-01 1_2024-01-01
1 ACME Inc NY 2024-01-15 1_2024-01-15
2 Pepsi WA 2024-01-01 2_2024-01-01
3 Cola BR 2024-01-01 3_2024-01-01
The “History id” column is the so called “surrogate key”, which is unique for each line in the history table. The PK of the original table (id) is called the “natural key”, it is unique in the main table but of course not unique in the history table.
Activating a history pipeline in Peliqan
We can implement this pattern in Peliqan with a low-code script, see below.
Make sure the script is executed after each refresh of the source table, using e.g. a regular schedule or by including this in a pipeline. For example set an hourly schedule on the script when the source table is from a daily scheduled pipeline.
Note: this script does not compare the historic and current record, it will fetch new and updated records incrementally (based on a timestamp) and write those to a history table. For Singer target tables the timestamp field _sdc_batched_at
is used, this timestamp is only updated if there is an actual change in the source record.
Viewing historic data snapshots
The history tables can be used to view historic snapshots of your data.
Here’s an example of an SQL query you can use (in Peliqan or in your BI tool) to see data as it was on a given date in the past (24 December 2024 in the example below):
SELECT DISTINCT ON (id) *
FROM history.deals_history
WHERE _sdc_batched_at < '2024-12-25T00:00:00'
ORDER BY id, _sdc_batched_at DESC;
Creating a change log
The below script can be used to generate an SQL query, that will convert a history table into a change log. The change log will list out only the columns (cells) that changed. This is useful when you have tables with many columns, to show in a condensed way which values actually changed on a given date. A change log is useful in reporting to explain e.g. changes in forecasts over time.
Example of a changelog for a history table from “tickets” with columns Name, Description, Color etc:
Id Date Changes
1 2025-04-21 13:08:51 Name changed to: Ticket V2 | Color changed to: NULL
5 2025-04-20 13:08:51 Color changed to: Red
7 2025-04-21 13:08:51 Name changed to: Ticket V2 | Color changed to: Yellow
7 2025-04-25 13:08:51 Name changed to: Ticket V3 | Color changed to: NULL
By using snapshots (see above) in combination with changelogs, it’s possible to provide clear information to a user on what changed when.