Slowly changing dimensions

Common data models for creating type-2 slowly changing dimensions tables from mutable data sources in Dataform.

Supported warehouses

  • BigQuery
  • Redshift/PG
  • Snowflake

If you would like us to add support for another warehouse, please get in touch via email or Slack

Installation

Add the package to your package.json file in your Dataform project. You can find the most up to package version on the releases page.

Configure the package

Create a new JS file in your definitions/ folder create an SCD table with the following example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
const scd = require("dataform-scd");

scd("source_data_scd", {
  // A unique identifier for rows in the table.
  uniqueKey: "user_id",
  // A field that stores a timestamp or date of when the row was last changed.
  timestamp: "updated_at",
  // The source table to build slowly changing dimensions from.
  source: {
    schema: "dataform_scd_example",
    name: "source_data",
  },
  // Any configuration parameters to apply to the incremental table that will be created.
  incrementalConfig: {
    bigquery: {
      partitionBy: "updated_at",
    },
  },
});

For more advanced customization of outputs, see the example.js.

Scheduling

Slowly changing dimensions can only by updated as quickly as these models are run. These models should typically be scheduled to run every day or every hour, depending on the granularity of changes you want to capture.

Data models

This package will create two relations in the warehouse, for a given name these will be:

  • {name} - a view with scd_valid_from and scd_valid_to fields
  • {name}_updates - an incremental table that stores the change history of the source table

What's next

Create new packages

Learn how to create new packages on Dataform.

BigQuery Audit Logs

Segment

Sitemap