Command line interface

Installation

Install the Dataform CLI globally with npm, which ships with Node.js:

npm i -g @dataform/cli

Create new project

To create a new project in the folder new_project:

dataform init new_project --warehouse bigquery

The currently supported warehouse types are bigquery, redshift, snowflake, and postgres.
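If you script project creation, it can help to reject an unsupported warehouse type before calling the CLI. The following is a minimal Python sketch, not part of Dataform itself: the helper name build_init_command is hypothetical, and the list of warehouses is copied from this guide, so it may not match the CLI version you have installed.

```python
import shlex
from typing import List

# Warehouse types listed in this guide; the installed CLI may accept others.
SUPPORTED_WAREHOUSES = ("bigquery", "redshift", "snowflake", "postgres")


def build_init_command(project_dir: str, warehouse: str) -> List[str]:
    """Return the argument list for `dataform init`, rejecting unknown warehouses."""
    if warehouse not in SUPPORTED_WAREHOUSES:
        raise ValueError(
            f"unsupported warehouse {warehouse!r}; "
            f"expected one of {', '.join(SUPPORTED_WAREHOUSES)}"
        )
    # Only the positional folder and the --warehouse flag shown above are used.
    return ["dataform", "init", project_dir, "--warehouse", warehouse]


if __name__ == "__main__":
    # Print the command as it would be typed in a shell.
    print(shlex.join(build_init_command("new_project", "bigquery")))
```

The returned list can be passed to subprocess.run once the CLI is installed.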

Project structure

The default project structure is as follows:

project-dir
├── definitions
├── includes
├── package.json
└── dataform.json
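
To confirm that a folder matches this layout, a quick check along the following lines may be useful. This is an illustrative Python sketch: the helper name missing_entries is hypothetical, and the expected entries are taken only from the tree above.

```python
from pathlib import Path
from typing import List

# Entries from the default project layout shown above.
EXPECTED_DIRS = ("definitions", "includes")
EXPECTED_FILES = ("package.json", "dataform.json")


def missing_entries(project_dir: str) -> List[str]:
    """Return the expected directories and files that are absent from project_dir."""
    root = Path(project_dir)
    # Directories are reported with a trailing slash to tell them apart from files.
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    # Check the current directory and report anything that is missing.
    problems = missing_entries(".")
    print("layout ok" if not problems else "missing: " + ", ".join(problems))
```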

Define a table

The definitions/ directory contains files that define tables, assertions, and operations.

To create a new table, create a new file such as definitions/simplemodel.sql with the following contents:

select 1 as test

Compile project

To check that everything has worked, run the following command at the root of your project directory to get a JSON dump of the compiled project:

dataform compile

You should see output along the lines of:

{
  "tables": [
    {
      "name": "simplemodel",
      "type": "view",
      "target": {
        "schema": "dataform",
        "name": "simplemodel"
      },
      "query": "select 1 as test\n",
      "parsedColumns": [
        "test"
      ],
      "fileName": "definitions/simplemodel.sql"
    }
  ]
}
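
Because the compiled project is plain JSON, it is easy to inspect from a script. The sketch below, in Python, summarises each table's target, type, and source file. It assumes the output has the shape shown above; the sample string and the helper name summarize_tables are illustrative and not part of Dataform.

```python
import json
from typing import Dict, List

# Sample mirroring the compile output above (an assumption about its shape).
SAMPLE_COMPILE_OUTPUT = r"""
{
  "tables": [
    {
      "name": "simplemodel",
      "type": "view",
      "target": {"schema": "dataform", "name": "simplemodel"},
      "query": "select 1 as test\n",
      "parsedColumns": ["test"],
      "fileName": "definitions/simplemodel.sql"
    }
  ]
}
"""


def summarize_tables(compiled_json: str) -> List[Dict[str, str]]:
    """Return one summary dict per entry in the "tables" list."""
    graph = json.loads(compiled_json)
    return [
        {
            "target": f'{t["target"]["schema"]}.{t["target"]["name"]}',
            "type": t["type"],
            "file": t["fileName"],
        }
        for t in graph.get("tables", [])
    ]


if __name__ == "__main__":
    for row in summarize_tables(SAMPLE_COMPILE_OUTPUT):
        print(f'{row["target"]}  ({row["type"]})  <- {row["file"]}')
```

In practice you would feed the function the captured standard output of dataform compile instead of the sample string.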

Dry run

To take this a step further and see exactly which statements will be run, and as which graph nodes, execute the following:

dataform build

You should see something similar to the following:

{
  "nodes": [
    {
      "name": "simplemodel",
      "tasks": [
        {
          "statement": "create or replace view `dataform.simplemodel`\n           as select * from (select 1 as test)"
        }
      ]
    }
  ]
}
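
To review the dry run as a flat list of SQL statements, you could walk the nodes and their tasks. The following Python sketch assumes the build output has the shape shown above; the sample string and the helper name iter_statements are illustrative only.

```python
import json
from typing import Iterator, Tuple

# Sample mirroring the build output above (an assumption about its shape).
SAMPLE_BUILD_OUTPUT = r"""
{
  "nodes": [
    {
      "name": "simplemodel",
      "tasks": [
        {"statement": "create or replace view `dataform.simplemodel`\n as select * from (select 1 as test)"}
      ]
    }
  ]
}
"""


def iter_statements(build_json: str) -> Iterator[Tuple[str, str]]:
    """Yield (node name, SQL statement) pairs in the order they appear."""
    graph = json.loads(build_json)
    for node in graph.get("nodes", []):
        for task in node.get("tasks", []):
            # Collapse runs of whitespace so each statement fits on one line.
            yield node["name"], " ".join(task["statement"].split())


if __name__ == "__main__":
    for name, statement in iter_statements(SAMPLE_BUILD_OUTPUT):
        print(f"[{name}] {statement}")
```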

Create a profile

A profile defines the connection parameters to your warehouse.

TODO

Run project

If you are happy with the statements that will be executed, you can run them against your data warehouse by providing the path to the profile.json file created in the previous step:

dataform run --profile=<path-to-profile.json>
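
When the run is launched from a script, it may be worth checking the profile file before invoking the CLI. The Python sketch below only verifies that the file exists and parses as JSON; it does not validate any connection fields, since the profile format is not described in this guide. The helper name build_run_command is hypothetical.

```python
import json
from pathlib import Path
from typing import List


def build_run_command(profile_path: str) -> List[str]:
    """Return the argument list for `dataform run` after basic profile checks."""
    path = Path(profile_path)
    if not path.is_file():
        raise FileNotFoundError(f"profile not found: {path}")
    # Raises json.JSONDecodeError (a ValueError) if the file is not valid JSON.
    json.loads(path.read_text())
    # Only the --profile flag shown above is used.
    return ["dataform", "run", f"--profile={path}"]
```

The returned list can be passed to subprocess.run from the root of your project directory.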