The CLI enables you to initialize, compile, test, and run Dataform projects directly from your local machine or as part of other systems.
The Dataform CLI can be installed using NPM:
npm i -g @dataform/cli
To create a new bigquery, postgres, redshift, snowflake, or sqldatawarehouse project in the new_project directory, run the respective command:
dataform init bigquery new_project --default-database <your-google-cloud-project-id>
- or -
dataform init postgres new_project
- or -
dataform init redshift new_project
- or -
dataform init snowflake new_project
- or -
dataform init sqldatawarehouse new_project
- or, if you've cloned a pre-existing project -
dataform install
Change into the newly created new_project directory and take a look at your project files:
cd new_project
ls
You should see the following structure:
project-dir
├── definitions
├── includes
├── package.json
└── dataform.json
The definitions/ directory should be used for files that define tables, assertions, and operations.
To create a new dataset, create a new file definitions/example.sqlx:
echo "config { type: 'view' } SELECT 1 AS test" > definitions/example.sqlx
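The one-line echo above is compact but hard to read once a definition grows. As a minimal sketch, the same file can be written with a shell heredoc, which keeps the config block and the query on separate lines. The file content is identical to the example above; only the layout differs.

```shell
# Ensure the definitions/ directory exists (it already does in a freshly initialized project).
mkdir -p definitions

# Write the SQLX file. The quoted 'EOF' delimiter stops the shell from
# expanding anything inside the heredoc body.
cat > definitions/example.sqlx <<'EOF'
config { type: 'view' }

SELECT 1 AS test
EOF
```

Run this from the root of your project directory so the file lands in the project's own definitions/ directory.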
To check that your Dataform code compiles, run the compile command at the root of your project directory:
dataform compile
You should see output similar to the following:
Compiling...

Compiled 1 action(s).
1 dataset(s):
  dataform.example [view]
To see the output of the compilation process as a JSON object, add the --json option:
dataform compile --json
If your project uses custom compilation variables, you can pass their values using the --vars flag:
dataform compile --vars=exampleVar=exampleValue,foo=bar
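The --vars value is a comma-separated list of key=value pairs. When the pairs come from a script, it can help to build that string programmatically. The sketch below uses a hypothetical helper, join_vars, which is not part of the Dataform CLI; it only assembles the argument and prints the resulting command rather than running it.

```shell
# join_vars is a hypothetical helper (not part of the Dataform CLI).
# It joins its arguments with commas; the function body runs in a subshell
# so the IFS change does not leak into the caller.
join_vars() (
  IFS=,
  echo "$*"
)

vars="$(join_vars exampleVar=exampleValue foo=bar)"

# Print the command that would be run.
echo "dataform compile --vars=${vars}"
```

This sketch does not escape commas or equals signs inside values, so it assumes simple values like the ones shown.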
Dataform requires a credentials file in order to connect to your warehouse. Run the init-creds command and Dataform will guide you through credentials file creation:
dataform init-creds bigquery
- or -
dataform init-creds postgres
- or -
dataform init-creds redshift
- or -
dataform init-creds snowflake
- or -
dataform init-creds sqldatawarehouse
A .df-credentials.json file will be written to disk containing your provided details.
Check out our data warehouse setup guide if you need help with the init-creds wizard.
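Because .df-credentials.json holds warehouse credentials, you will usually want to keep it out of version control. The sketch below assumes the project root is a git repository, which this guide does not state. It adds the file to .gitignore only if the entry is not already there.

```shell
# Assumption: the project is tracked in git and this runs from the project root.
# Create .gitignore if it does not exist yet.
touch .gitignore

# Append the entry only when no line matches it exactly (-x whole line, -F fixed string).
grep -qxF '.df-credentials.json' .gitignore || echo '.df-credentials.json' >> .gitignore
```

Running the snippet a second time leaves .gitignore unchanged.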
To run your code, Dataform needs access to your data warehouse so that it can determine the warehouse's current state and tailor the resulting SQL accordingly. If you'd like to see the final SQL that Dataform will run on your warehouse without actually running it, you can perform a dry run:
dataform run --dry-run
You should see something similar to the following:
Compiling...

Compiled successfully.

Dry run (--dry-run) mode is turned on; not running the following actions against your warehouse:

1 dataset(s):
  dataform.example [table]
Removing the --dry-run option will result in the SQL being run in your warehouse:
dataform run
The run command's output will now include the run's execution status, including any errors encountered during the run:
Compiling...

Compiled successfully.

Running...

Dataset created: dataform.example [view]
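When the CLI runs as part of another system, such as a scheduled job, the compile, dry-run, and run steps are often chained so that a failure stops the sequence. The following is a sketch under two assumptions: that the CLI exits with a non-zero status when a command fails, and that options may follow the project-dir positional. DATAFORM_BIN and compile_then_run are hypothetical names introduced here; they are not part of the Dataform CLI.

```shell
# DATAFORM_BIN is a hypothetical variable that lets you point at a different
# binary (or a stub for testing). It defaults to the installed CLI.
DATAFORM_BIN="${DATAFORM_BIN:-dataform}"

# compile_then_run is a hypothetical helper: it compiles the project, previews
# the run with --dry-run, and only then runs against the warehouse.
# Each step is skipped if the previous one returned a non-zero status.
compile_then_run() {
  project_dir="${1:-.}"
  "$DATAFORM_BIN" compile "$project_dir" || return 1
  "$DATAFORM_BIN" run "$project_dir" --dry-run || return 1
  "$DATAFORM_BIN" run "$project_dir"
}

# Usage (requires the Dataform CLI and a credentials file):
#   compile_then_run new_project
```

Keeping the binary name in a variable makes the wrapper easy to exercise without a warehouse, for example by setting DATAFORM_BIN=echo to print the commands instead of running them.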
In addition to this guide, you can run the help command to get a short description of any Dataform command or option. For example, you can type:
dataform help
This will list all of the available commands and options:
Commands:
  dataform help [command]  Show help. If [command] is specified, the help is for the given command.
  dataform init <warehouse> [project-dir]  Create a new dataform project.
  dataform init-creds <warehouse> [project-dir]  Create a .df-credentials.json file for dataform to use when accessing your warehouse.
  dataform compile [project-dir]  Compile the dataform project. Produces JSON output describing the non-executable graph.
  dataform test [project-dir]  Run the dataform project's unit tests on the configured data warehouse.
  dataform run [project-dir]  Run the dataform project's scripts on the configured data warehouse.
  dataform listtables <warehouse>  List tables on the configured data warehouse.
  dataform gettablemetadata <warehouse> <schema> <table>  Fetch metadata for a specified table.

Options:
  --help     Show help  [boolean]
  --version  Show version number  [boolean]
If you want to get help for a specific command, you can type:
dataform help compile
You should see something similar to the following:
dataform compile [project-dir]

Compile the dataform project. Produces JSON output describing the non-executable graph.

Positionals:
  project-dir  The Dataform project directory.  [default: "."]

Options:
  --help           Show help  [boolean]
  --version        Show version number  [boolean]
  --watch          Whether to watch the changes in the project directory.  [boolean] [default: false]
  --schema-suffix  A suffix to be appended to output schema names.
  --verbose        If true, the full contents of command output will be output (containing fully compiled SQL, etc).  [boolean] [default: false]
You have now seen how to use the Dataform CLI to initialize, compile, and run a simple project. Next, how about publishing a dataset?