API Reference

IAssertionConfig

Configuration options for assertion action types.

database
string

The database where the corresponding view for this assertion should be created.

description
string

A description for this assertion.

disabled
boolean

If set to true, this action will not be executed. However, the action may still be depended upon. Useful for temporarily turning off broken actions.

hermetic
boolean

Declares whether or not this action is hermetic. An action is hermetic if all of its dependencies are explicitly declared.

If this action depends on data from a source which has not been declared as a dependency, then hermetic should be explicitly set to false. Otherwise, if this action only depends on data from explicitly-declared dependencies, then it should be set to true.

schema
string

The schema where the corresponding view for this assertion should be created.

tags
string[]

A list of user-defined tags with which the action should be labeled.
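
As a rough illustration, the options above could be combined as in the sketch below. The AssertionConfigSketch interface is declared locally only to keep the snippet self-contained; it mirrors the documented fields and is not imported from Dataform. Treating every field as optional is an assumption, and all values are hypothetical.

```typescript
// Local mirror of the fields documented above (not imported from Dataform).
// Marking every field optional is an assumption.
interface AssertionConfigSketch {
  database?: string;
  description?: string;
  disabled?: boolean;
  hermetic?: boolean;
  schema?: string;
  tags?: string[];
}

// Hypothetical values, for illustration only.
const assertionConfig: AssertionConfigSketch = {
  database: "analytics-project",
  schema: "dataform_assertions",
  description: "Order totals are never negative.",
  tags: ["daily", "orders"],
  hermetic: true, // every input is explicitly declared as a dependency
  disabled: false, // set to true to skip execution temporarily
};
```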

IAssertionConfigProperties

IColumnsDescriptor

Describes columns in a dataset.

{ [name]: string | IRecordDescriptor }

ICommonContext

Context methods are available when evaluating contextable SQL code, such as within SQLX files, or when using a Contextable argument with the JS API.

name
() => string

Returns the name of this dataset.

ref
(ref: Resolvable | string[], rest: string[]) => string

References another action, adding it as a dependency to this action, returning valid SQL to be used in a from expression.

This function can be called with a Resolvable object, for example:

${ref({ name: "name", schema: "schema", database: "database" })}

This function can also be called using individual arguments for the "database", "schema", and "name" values. When only two values are provided, the default database will be used and the values will be interpreted as "schema" and "name". When only one value is provided, the default database schema will be used, with the provided value interpreted as "name".

${ref("database", "schema", "name")}
${ref("schema", "name")}
${ref("name")}

resolve
(ref: Resolvable | string[], rest: string[]) => string

Similar to ref except that it does not add a dependency, but just resolves the provided reference so that it can be used in SQL, for example in a from expression.

See the ref function for example usage.

self
() => string

Equivalent to resolve(name()).

Returns a valid SQL string that can be used to reference the dataset produced by this action.
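
The way ref and resolve interpret one, two, or three string arguments can be sketched as a small normalizing function. This is a model of the rules described above, not Dataform's implementation: the toTarget helper, the TargetSketch interface, and both default values are hypothetical, and filling missing fields of the object form from the defaults is an assumption.

```typescript
// Local stand-in for a target; database and schema are optional here
// so that partial references type-check.
interface TargetSketch {
  database?: string;
  schema?: string;
  name: string;
}

// Hypothetical project defaults.
const DEFAULT_DATABASE = "default_db";
const DEFAULT_SCHEMA = "default_schema";

// Normalizes the documented call forms into a fully qualified target.
function toTarget(
  first: TargetSketch | string,
  second?: string,
  third?: string
): Required<TargetSketch> {
  if (typeof first !== "string") {
    // Object form. Falling back to the defaults is an assumption.
    return {
      database: first.database ?? DEFAULT_DATABASE,
      schema: first.schema ?? DEFAULT_SCHEMA,
      name: first.name,
    };
  }
  if (second === undefined) {
    // One value: interpreted as "name".
    return { database: DEFAULT_DATABASE, schema: DEFAULT_SCHEMA, name: first };
  }
  if (third === undefined) {
    // Two values: interpreted as "schema" and "name".
    return { database: DEFAULT_DATABASE, schema: first, name: second };
  }
  // Three values: "database", "schema", and "name".
  return { database: first, schema: second, name: third };
}

const oneValue = toTarget("orders");
const twoValues = toTarget("sales", "orders");
const threeValues = toTarget("warehouse", "sales", "orders");
```

The difference between ref and resolve lies outside this sketch: ref additionally records the target as a dependency, while resolve only produces the reference.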

IRecordDescriptor

Describes a struct, object or record in a dataset that has nested columns.

bigqueryPolicyTags
string | string[]

BigQuery policy tags that should be applied to this column.

These should be the fully qualified identifier of the tag, including the project name, location, and taxonomy, which can be copied from the policy tags page in GCP.

For example: "projects/1/locations/eu/taxonomies/2/policyTags/3"

Currently BigQuery supports only a single tag per column.

columns
IColumnsDescriptor

A description of columns within the struct, object or record.

description
string

A description of the struct, object or record.
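
IColumnsDescriptor and IRecordDescriptor nest recursively: a column maps either to a plain description string or to a record that carries its own columns. The sketch below shows one possible shape. Both interfaces are local mirrors of the documented types rather than imports from Dataform, and the column names are hypothetical.

```typescript
// Local mirrors of the two documented types (not imported from Dataform).
interface RecordDescriptorSketch {
  description?: string;
  columns?: ColumnsDescriptorSketch;
  bigqueryPolicyTags?: string | string[];
}

interface ColumnsDescriptorSketch {
  [columnName: string]: string | RecordDescriptorSketch;
}

// Hypothetical columns: one flat column and one nested record.
const columnDocs: ColumnsDescriptorSketch = {
  order_id: "Unique identifier of the order.",
  customer: {
    description: "Details of the customer who placed the order.",
    columns: {
      country: "Country of the billing address.",
      email: {
        description: "Contact email address.",
        // Fully qualified policy tag, in the format shown above.
        bigqueryPolicyTags: "projects/1/locations/eu/taxonomies/2/policyTags/3",
      },
    },
  },
};
```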

ITarget

A reference to a dataset within the warehouse.

database
string

name
string

schema
string

Contextable

Contextable arguments can either pass a plain value for their generic type T or can pass a function that will be called with the context object for this type of operation.

T | (ctx: Context) => T
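
One way to picture how a Contextable argument is consumed is the sketch below. The ContextSketch interface is a hypothetical, minimal context with a single method, and resolveContextable is an illustrative helper, not a Dataform function.

```typescript
// Hypothetical, minimal context with a single method.
interface ContextSketch {
  name: () => string;
}

// T | (ctx: Context) => T, as described above.
type ContextableSketch<T> = T | ((ctx: ContextSketch) => T);

// Returns the plain value, or calls the function with the context to get it.
function resolveContextable<T>(value: ContextableSketch<T>, ctx: ContextSketch): T {
  return typeof value === "function"
    ? (value as (ctx: ContextSketch) => T)(ctx)
    : (value as T);
}

const sketchContext: ContextSketch = { name: () => "orders" };

// A plain value is passed through unchanged.
const plainSql = resolveContextable<string>("select 1", sketchContext);

// A function is called with the context object.
const computedSql = resolveContextable<string>(
  (ctx: ContextSketch) => `select * from ${ctx.name()}`,
  sketchContext
);
```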

Resolvable

A resolvable can be either the name of a dataset as string, or an object that describes the full path to the relation.

string | ITarget

IDeclarationConfig

Configuration options for declaration action types.

columns
IColumnsDescriptor

A description of columns within the dataset.

database
string

The database in which the output of this action should be created.

description
string

A description of the dataset.

schema
string

The schema in which the output of this action should be created.

IDeclarationConfigProperties

IOperationConfig

Configuration options for operations action types.

columns
IColumnsDescriptor

A description of columns within the dataset.

database
string

The database in which the output of this action should be created.

description
string

A description of the dataset.

disabled
boolean

If set to true, this action will not be executed. However, the action may still be depended upon. Useful for temporarily turning off broken actions.

hasOutput
boolean

Declares that this operations action creates a dataset which should be referenceable using the ref function.

If set to true, this action should create a dataset with its configured name, using the self() context function.

For example:

create or replace table ${self()} as select ...

hermetic
boolean

Declares whether or not this action is hermetic. An action is hermetic if all of its dependencies are explicitly declared.

If this action depends on data from a source which has not been declared as a dependency, then hermetic should be explicitly set to false. Otherwise, if this action only depends on data from explicitly-declared dependencies, then it should be set to true.

schema
string

The schema in which the output of this action should be created.

tags
string[]

A list of user-defined tags with which the action should be labeled.
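
The interplay between hasOutput and self() can be sketched as follows. The interfaces are local stand-ins for a subset of the documented fields, the stub context returns a fixed reference, and the backtick quoting of that reference is an assumption; real output depends on the warehouse.

```typescript
// Local mirror of a subset of the documented fields.
interface OperationConfigSketch {
  schema?: string;
  description?: string;
  hasOutput?: boolean;
  tags?: string[];
}

// Hypothetical stand-in for the context available to the operation.
interface OperationContextSketch {
  self: () => string;
}

const operationConfig: OperationConfigSketch = {
  schema: "reporting",
  description: "Rebuilds the daily revenue table.",
  hasOutput: true, // the operation creates a dataset under its own name
};

// A Contextable-style function: receives the context, returns the SQL to run.
const operationSql = (ctx: OperationContextSketch): string =>
  `create or replace table ${ctx.self()} as select 1 as revenue`;

// Stub context, for illustration only.
const stubContext: OperationContextSketch = {
  self: () => "`reporting.daily_revenue`",
};

const renderedSql = operationSql(stubContext);
```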

IIOperationConfigProperties

IBigQueryOptions

BigQuery-specific warehouse options.

additionalOptions

Key-value pairs for table, view, and materialized view options.

Some options (e.g. partitionExpirationDays) have dedicated fields that are checked for type and validity; prefer using those. String values need double-quotes, e.g. additionalOptions: {numeric_option: "5", string_option: '"string-value"'}. If the option name contains special characters, e.g. hyphens, then quote its name, e.g. additionalOptions: { "option-name": "value" }.

clusterBy
string[]

The keys by which to cluster partitions.

For more information, read the BigQuery clustered tables docs.

labels

Key-value pairs for BigQuery labels.

If the label name contains special characters, e.g. hyphens, then quote its name, e.g. labels: { "label-name": "value" }.

partitionBy
string

The key with which to partition the table. Typically the name of a timestamp or date column.

For more information, read the BigQuery partitioned tables docs.

partitionExpirationDays
number

This setting specifies how long BigQuery keeps the data in each partition. The setting applies to all partitions in the table, but is calculated independently for each partition based on the partition time.

For more information, see our docs.

requirePartitionFilter
boolean

When you create a partitioned table, you can require that all queries on the table include a predicate filter (a WHERE clause) that filters on the partitioning column. This setting can improve performance and reduce costs, because BigQuery can use the filter to prune partitions that don't match the predicate.

For more information, see our docs.

updatePartitionFilter
string

SQL-based filter for when incremental updates are applied.

For more information, see our incremental dataset docs.
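
A combined BigQuery options object might look like the sketch below. The interface is a local mirror of the documented fields; the string value types used for labels and additionalOptions are an assumption based on the examples above, and the column names and filter expression are hypothetical.

```typescript
// Local mirror of the documented fields (not imported from Dataform).
interface BigQueryOptionsSketch {
  partitionBy?: string;
  clusterBy?: string[];
  partitionExpirationDays?: number;
  requirePartitionFilter?: boolean;
  updatePartitionFilter?: string;
  labels?: { [labelName: string]: string }; // value type is an assumption
  additionalOptions?: { [optionName: string]: string }; // value type is an assumption
}

const bigqueryOptions: BigQueryOptionsSketch = {
  partitionBy: "order_date",
  clusterBy: ["customer_id", "country"],
  partitionExpirationDays: 90,
  requirePartitionFilter: true,
  // Hypothetical filter on the partitioning column.
  updatePartitionFilter: "order_date >= date_sub(current_date(), interval 7 day)",
  // Label names containing hyphens are quoted.
  labels: { "cost-center": "analytics" },
  additionalOptions: {
    numeric_option: "5", // numeric values are written as plain strings
    string_option: '"string-value"', // string values carry their own double-quotes
    "option-name": "value", // names with special characters are quoted
  },
};
```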

IPrestoOptions

Options for creating tables within Presto projects.

partitionBy
string[]

The key with which to partition the table. Typically the name of a timestamp or date column.

For more information, read the partitioning documentation for the Presto connection in use.

IRedshiftOptions

Redshift-specific warehouse options.

distKey
string

Sets the DISTKEY property when creating tables.

For more information, read the Redshift create table docs.

distStyle
string

Sets the DISTSTYLE property when creating tables.

For more information, read the Redshift create table docs.

sortKeys
string[]

A list of string values that will configure the SORTKEY property when creating tables.

For more information, read the Redshift create table docs.

sortStyle
string

Sets the style of the sort key when using sort keys.

For more information, read the Redshift sort style article.
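
As a sketch, the four Redshift options could be set together as shown below. The interface is a local mirror of the documented fields. The lowercase "key" and "compound" values are assumptions modeled on Redshift's own KEY distribution style and COMPOUND sort style, and the column names are hypothetical.

```typescript
// Local mirror of the documented fields (not imported from Dataform).
interface RedshiftOptionsSketch {
  distKey?: string;
  distStyle?: string;
  sortKeys?: string[];
  sortStyle?: string;
}

const redshiftOptions: RedshiftOptionsSketch = {
  distKey: "customer_id", // column used for the DISTKEY property
  distStyle: "key", // assumed spelling of the KEY distribution style
  sortKeys: ["created_at"], // columns used for the SORTKEY property
  sortStyle: "compound", // assumed spelling of the COMPOUND sort style
};
```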

ISQLDataWarehouseOptions

Azure SQL Data Warehouse-specific warehouse options.

distribution
string

The distribution option value.

For more information, read the Azure CTAS docs.

ISnowflakeOptions

Snowflake-specific warehouse options.

clusterBy
string[]

A list of clustering keys to cluster the table by. Only applicable to actions of type "table" or "incremental".

For more information, read the Snowflake clustering docs.

secure
boolean

If set to true, a secure view will be created.

For more information, read the Snowflake Secure Views docs.

transient
boolean

If set to true, a transient table will be created. Only applicable to actions of type "table".

For more information, read the Snowflake docs.
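
Because each Snowflake option applies only to certain action types, it can help to look at them per type. The sketch below uses a locally declared interface that mirrors the documented fields; the clustering column names are hypothetical.

```typescript
// Local mirror of the documented fields (not imported from Dataform).
interface SnowflakeOptionsSketch {
  clusterBy?: string[];
  secure?: boolean;
  transient?: boolean;
}

// For an action of type "table": a transient table with clustering keys.
const snowflakeTableOptions: SnowflakeOptionsSketch = {
  transient: true,
  clusterBy: ["region", "order_date"],
};

// For an action of type "view": a secure view.
const snowflakeViewOptions: SnowflakeOptionsSketch = {
  secure: true,
};
```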

ITableAssertions

Options for creating assertions as part of a dataset definition.

nonNull
string | string[]

Column(s) which may never be NULL.

If set, the resulting assertion will fail if any row contains NULL values for these column(s).

rowConditions
string[]

General condition(s) which should hold true for all rows in the dataset.

If set, the resulting assertion will fail if any row violates any of these condition(s).

uniqueKey
string | string[]

Column(s) which constitute the dataset's unique key index.

If set, the resulting assertion will fail if there is more than one row in the dataset with the same values for all of these column(s).

uniqueKeys
string[][]

Combinations of column(s), each of which should constitute a unique key index for the dataset.

If set, the resulting assertion(s) will fail if there is more than one row in the dataset with the same values for all of the column(s) in the unique key(s).
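
The failure conditions for nonNull and uniqueKey can be modeled on in-memory rows, as in the sketch below. This only illustrates the semantics described above; it is not how Dataform evaluates assertions, which run in the warehouse. The helper names and sample rows are hypothetical, and rowConditions is left out because its conditions are SQL expressions.

```typescript
type SampleRow = { [column: string]: string | number | null };

// Rows that contain NULL in any of the given columns (these would fail nonNull).
function nonNullViolations(rows: SampleRow[], nonNull: string | string[]): SampleRow[] {
  const cols = Array.isArray(nonNull) ? nonNull : [nonNull];
  return rows.filter((row) =>
    cols.some((col) => row[col] === null || row[col] === undefined)
  );
}

// Key values shared by more than one row (these would fail uniqueKey).
function duplicateKeys(rows: SampleRow[], uniqueKey: string | string[]): string[] {
  const cols = Array.isArray(uniqueKey) ? uniqueKey : [uniqueKey];
  const counts: { [key: string]: number } = {};
  for (const row of rows) {
    // Serialize the key columns so that composite keys compare by value.
    const key = JSON.stringify(cols.map((col) => row[col] ?? null));
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return Object.keys(counts).filter((key) => (counts[key] ?? 0) > 1);
}

const sampleRows: SampleRow[] = [
  { order_id: 1, email: "a@example.com" },
  { order_id: 2, email: null },
  { order_id: 2, email: "c@example.com" },
];

const nullFailures = nonNullViolations(sampleRows, "email");
const keyFailures = duplicateKeys(sampleRows, ["order_id"]);
```

With these sample rows, one row violates the nonNull check on email and the order_id value 2 appears twice, so both assertions would fail.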

ITableConfig

Configuration options for dataset actions, including table, view and incremental action types.

assertions
ITableAssertions

Assertions to be run on the dataset.

If configured, relevant assertions will automatically be created and run as a dependency of this dataset.

bigquery
IBigQueryOptions

BigQuery-specific warehouse options.

columns
IColumnsDescriptor

A description of columns within the dataset.

database
string

The database in which the output of this action should be created.

description
string

A description of the dataset.

disabled
boolean

If set to true, this action will not be executed. However, the action may still be depended upon. Useful for temporarily turning off broken actions.

hermetic
boolean

Declares whether or not this action is hermetic. An action is hermetic if all of its dependencies are explicitly declared.

If this action depends on data from a source which has not been declared as a dependency, then hermetic should be explicitly set to false. Otherwise, if this action only depends on data from explicitly-declared dependencies, then it should be set to true.

materialized
boolean

Only valid when the table type is view. Only valid when using Snowflake or BigQuery.

If set to true, will make the view materialized.

For more information, read the BigQuery materialized view docs or the Snowflake materialized view docs.

presto
IPrestoOptions

Presto-specific options.

protected
boolean

Only allowed when the table type is incremental.

If set to true, running this action will ignore the full-refresh option. This is useful for tables which are built from transient data, to ensure that historical data is never lost.

redshift
IRedshiftOptions

Redshift-specific warehouse options.

schema
string

The schema in which the output of this action should be created.

snowflake
ISnowflakeOptions

Snowflake-specific options.

sqldatawarehouse
ISQLDataWarehouseOptions

Azure SQL Data Warehouse-specific options.

tags
string[]

A list of user-defined tags with which the action should be labeled.

type
TableType

The type of the dataset. For more information on how this setting works, check out some of the guides on publishing different types of datasets with Dataform.

uniqueKey
string[]

Unique keys for merge criteria for incremental tables.

If configured, records with matching unique key(s) will be updated, rather than new rows being inserted.
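
Putting several of these options together, a config for an incremental table might look like the sketch below. The interface is a local mirror of a subset of the documented fields, and the three type values are taken from the TableType description in this reference. All names and values are hypothetical.

```typescript
// Local mirror of a subset of the documented fields (not imported from Dataform).
interface TableConfigSketch {
  type?: "table" | "view" | "incremental";
  schema?: string;
  description?: string;
  tags?: string[];
  uniqueKey?: string[];
  protected?: boolean;
  assertions?: {
    nonNull?: string | string[];
    uniqueKey?: string | string[];
    rowConditions?: string[];
  };
  bigquery?: { partitionBy?: string; clusterBy?: string[] };
}

const tableConfig: TableConfigSketch = {
  type: "incremental",
  schema: "reporting",
  description: "One row per order, updated incrementally.",
  tags: ["daily"],
  uniqueKey: ["order_id"], // matching rows are updated instead of inserted
  protected: true, // ignore the full-refresh option for this table
  assertions: {
    nonNull: ["order_id", "created_at"],
    rowConditions: ["total >= 0"],
  },
  bigquery: { partitionBy: "created_at", clusterBy: ["customer_id"] },
};
```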

ITableContext

Context methods are available when evaluating contextable SQL code, such as within SQLX files, or when using a Contextable argument with the JS API.

incremental
() => boolean

Indicates whether the config marks this file as defining an incremental table.

name
() => string

Returns the name of this dataset.

ref
(ref: Resolvable | string[], rest: string[]) => string

References another action, adding it as a dependency to this action, returning valid SQL to be used in a from expression.

This function can be called with a Resolvable object, for example:

${ref({ name: "name", schema: "schema", database: "database" })}

This function can also be called using individual arguments for the "database", "schema", and "name" values. When only two values are provided, the default database will be used and the values will be interpreted as "schema" and "name". When only one value is provided, the default database schema will be used, with the provided value interpreted as "name".

${ref("database", "schema", "name")}
${ref("schema", "name")}
${ref("name")}

resolve
(ref: Resolvable | string[], rest: string[]) => string

Similar to ref except that it does not add a dependency, but just resolves the provided reference so that it can be used in SQL, for example in a from expression.

See the ref function for example usage.

self
() => string

Equivalent to resolve(name()).

Returns a valid SQL string that can be used to reference the dataset produced by this action.

when
(cond: boolean, trueCase: string, falseCase: string) => string

Shorthand for an if condition. Equivalent to cond ? trueCase : falseCase. falseCase is optional, and defaults to an empty string.
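
The behavior described for when, including the empty-string default for falseCase, can be sketched in a few lines. The whenSketch function is a stand-alone model rather than the context method itself, and isIncremental is a hypothetical stand-in for the result of incremental(). The table and column names are also hypothetical.

```typescript
// Stand-alone model of the documented behavior: cond ? trueCase : falseCase,
// with falseCase defaulting to an empty string.
function whenSketch(cond: boolean, trueCase: string, falseCase: string = ""): string {
  return cond ? trueCase : falseCase;
}

// Hypothetical stand-in for the result of incremental().
const isIncremental = true;

// Append a filter only when running incrementally.
const incrementalQuery =
  "select * from source_table " +
  whenSketch(isIncremental, "where updated_at > (select max(updated_at) from target_table)");
```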

DistStyleType

Valid types for setting the distribution style for Redshift tables.

View the Redshift documentation for more information.

SortStyleType

Valid types for setting the sort style for Redshift tables.

View the Redshift documentation for more information.

TableType

Supported types of table actions.

Tables of type view will be created as views.

Tables of type table will be created as tables.

Tables of type incremental must have a where clause provided. For more information, see the incremental tables guide.

ITableConfigProperties

ITestConfig

Configuration options for unit tests.

dataset
Resolvable

The dataset that this unit test tests.
