Write custom SQL operations

Learn how to define custom SQL operations in Dataform.

Introduction

An operations file defines a set of SQL commands that will be executed in order against your warehouse. Operations can be used to run custom SQL that doesn't necessarily fit into the model of publishing a dataset or writing an assertion.

Simple example

To define a new operations action, create a file such as definitions/example_operation.sqlx :

definitions/example_operation.sqlx
1
CREATE OR REPLACE VIEW someschema.someview AS (SELECT 1 AS test)

Multiple statements can be separated with a single line containing only 3 dashes --- :

1
2
3
CREATE OR REPLACE VIEW someschema.someview AS (SELECT 1 AS test)
---
DROP VIEW IF EXISTS someschema.someview

These statements will be run without modification against the warehouse. You can use warehouse specific commands in these files, such as BigQuery's DML or DDL statements, or Redshift/Postgres specific commands.

BigQuery and SQL Data Warehouse run all operations for a file in the same context; the executed SQL is created by joining all operations with a semi-colon ; . For other warehouse types, operations are run as separate queries.

Custom dataset builds

In some cases, you may want to create a dataset manually rather than relying on Dataform's built-in logic. This can be achieved by writing an operations file and specifying the full CREATE TABLE/VIEW commands manually:

1
CREATE OR REPLACE VIEW sometable.someschema AS (SELECT 1 AS TEST)

Declaring and referencing outputs

If an operation creates a dataset or view that you would like to make available to other scripts, you can reference this operation as you would a normal dataset by using the ref() function. Note that operations may use self() to create a dataset or view that matches the current file name.

For example, in definitions/custom_view.sqlx :

definitions/custom_view.sqlx
1
2
config { hasOutput: true }
CREATE OR REPLACE VIEW ${self()} AS (SELECT 1 AS TEST)

References to "customview" will now resolve to "defaultschema.customview" and can be used in other SQL files, for example:

1
SELECT * FROM ${ref("customview")}
The output dataset created by the operation must match the name of the file in order for references to work properly. We recommend using self() to enforce this requirement. If you would like to create a dataset in a custom schema, or override the dataset's name, use the{" "} schema or name configuration settings.

Example: Running VACUUM commands (in Postgres or Redshift)

Postgres and Redshift have a VACUUM command that can be used to improve the performance of some datasets. This is a common use case for operations:

1
2
3
VACUUM DELETE ONLY ${ref("sometable")} TO 75 PERCENT
---
VACUUM REINDEX ${ref("sometable")}

What's next

Publish data tables and views

Learn how to configure, publish and document data tables in your warehouse.

SQLX

Learn about the structure and features of SQLX files.

Test data quality with assertions

Learn how to test data quality with assertions.

Declare external datasets with declarations

Learn how to declare external datasets with declarations.

Configure your project

Learn how to configure your Dataform project.

Power your code with JavaScript

Learn how you can use JavaScript to re-use code across your scripts and define several actions.

Organise your project with tags

Learn how to organise your project with tags.

Run unit tests on your queries

Learn how to run unit tests on your queries.

Configure CI/CD

Configure continuous integration/deployment workflows for your Dataform project.

Sitemap