Amazon Redshift

Authentification, configuration options, and content for Redshift.

Authentification

Redshift projects require the following configuration settings:

  • Hostname in the form [name].[id].[region].redshift.amazonaws.com
  • Port (usually 5439 )
  • Username and password
  • Database name
Dataform's IP addresses must be whitelisted in order to access your Redshift cluster. Please follow these instructions. Dataform's IP addresses are 35.233.106.210 and 104.196.10.242 .

How to find Redshift credentials

  1. Go to Redshift in your AWS console.
  2. Select your cluster under Clusters .
  3. The hostname is the endpoint listed at the top of the page. Username and database name are listed under cluster database properties.

The Redshift user should have permissions to CREATE schemas and SELECT from INFORMATION_SCHEMAS.TABLES and INFORMATION_SCHEMAS.COLUMNS . Please contact our team via slack if you need help.

Configuration options

Redshift specific options can be applied to tables using the redshift configuration parameter.

Distributing data

You can configure how Redshift distributes data in your cluster by configuring the distStyle and distKey properties.

1
2
3
4
5
6
7
8
config {
  type: "table",
  redshift: {
    distKey: "user_id",
    distStyle: "key"
  }
}
SELECT user_id FROM ...

This query compiles to the following statement:

1
2
3
4
CREATE TABLE "dataform"."example"
DISTKEY(user_id)
DISTSTYLE even
AS SELECT user_id FROM ...

Sorting data

You can also configure how Redshift sorts data in your datasets with the sortKeys and sortStyle properties.

1
2
3
4
5
6
7
config {
  redshift: {
    sortKeys: [ "ts" ],
    sortStyle: "compound"
  }
}
SELECT 1 AS ts

Binding views

By default, all views in Redshift are created as late binding views. This can be changed by setting the bind property in the redshift configuration block.

1
2
3
4
5
6
7
config {
  type: "view",
  redshift: {
    bind: true
  }
}
SELECT 1 AS ts

Dataform web features for Redshift

Real time query validation

Dataform validates the compiled script you are editing against Redshift in real time. It will let you know if the query is valid (or won’t run) before having to run it.

Sample Dataform project with Redshift

We prepared the following sample project of a fictional ecommerce store using using Redshift.

Sample bigquery Dataform project DAG
Dependency tree of the Redshift sample project

Blog posts

Import data from S3 to Redshift using Dataform

The blog post offers a walkthrough to load data from S3 to Redshift.

Getting help

If you are using Dataform web and are having trouble connecting to Redshift, please reach out to us by using the intercom messenger icon at the bottom right of the app.

If you have other questions related to Redshift, you can join our slack community and ask question on the #Redshift channel.

What's next

Snowflake

Authentification, configuration options, and content for Snowflake.

Google BigQuery

Authentification, configuration options, and content for BigQuery.

Azure SQL Data Warehouse

Authentification and configuration options for Azure SQL Data Warehouse.

Postgres

Authentification and configuration options for Postgres.