Getting set up

Learn how to create a new BigQuery project and generate warehouse credentials.

Welcome to the Dataform Getting Started Tutorial. This tutorial is for people who are new to Dataform and would like to learn how to set up a new project. We will show you how to create your own data model, test and document it, and configure schedules. A working knowledge of SQL will be helpful for this tutorial.

For this tutorial we’re going to pretend we are a fictional e-commerce shop. We already have three main data sources in our data warehouse:

  • Information about our customers coming from Salesforce
  • Orders information from Shopify
  • Payment information from Stripe

The aim of this tutorial is to create two new tables in our warehouse, one called order_stats and one called customers, which are:

  • Updated every hour
  • Tested for data quality
  • Well documented

Create a new BigQuery project

For this tutorial we’ll use BigQuery. Anyone with a Google Account can use it, and it has a free tier. We’ve created a public dataset in BigQuery that anyone can access for the purpose of this tutorial.

  1. First you will need to create a new BigQuery project:

    • Go to the BigQuery console (if you don’t already have a GCP account, you’ll need to create one first).
    • If you’ve just created a new account, you’ll be asked to create a new project straight away. If you already have an account, open the project drop-down in the header bar and create a new project from there.

Generate warehouse credentials

For Dataform to connect to your BigQuery warehouse, you’ll need to use either Application Default Credentials or a service account with a JSON key.

Using Application Default Credentials

If Dataform is running on Google Compute Engine (GCE) or Google Kubernetes Engine (GKE), Application Default Credentials are available automatically. Otherwise, use [gcloud auth application-default](https://cloud.google.com/sdk/gcloud/reference/auth/application-default) to authenticate.
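To check whether a local credentials file is in place before connecting, something like the following sketch may help. It mirrors only the file-based part of the Application Default Credentials lookup: the GOOGLE_APPLICATION_CREDENTIALS environment variable first, then the file that the gcloud command writes in its default configuration directory. The function name find_adc_file is hypothetical, and the sketch assumes a default gcloud install; it does not cover the credentials that GCE or GKE provide automatically.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ADC_FILENAME = "application_default_credentials.json"


def find_adc_file(env: Mapping[str, str], home: Path) -> Optional[Path]:
    """Return the local credentials file ADC would likely use, or None.

    Hypothetical helper: a simplified view of the file-based lookup only.
    """
    # 1. An explicit path in GOOGLE_APPLICATION_CREDENTIALS takes priority.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        # Simplification: report a missing file as None instead of raising.
        return path if path.is_file() else None

    # 2. Otherwise look for the file written by
    #    `gcloud auth application-default login` (default locations assumed).
    if os.name == "nt" and env.get("APPDATA"):
        well_known = Path(env["APPDATA"]) / "gcloud" / ADC_FILENAME
    else:
        well_known = home / ".config" / "gcloud" / ADC_FILENAME
    return well_known if well_known.is_file() else None


if __name__ == "__main__":
    found = find_adc_file(os.environ, Path.home())
    print(found or "No local ADC file found.")
```

If the script reports that no file was found and you are not on GCE or GKE, authenticating with the gcloud command above should create one.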

Create a service account with JSON key

You’ll need to create a service account from your Google Cloud Console and assign it permissions to access BigQuery.

  1. To create a new service account in Google Cloud Console you need to:

    • Go to the Service Accounts page.
    • Make sure the new project you created is selected and click Open.
    • Click on Create Service Account and give it a name.
    • Grant the new account the BigQuery Admin role.
  2. Once you’ve done this you need to create a key for your new service account (in JSON format):

    • On the Service Accounts page, find the row of the service account that you want to create a key for and click the More button.
    • Then click Create key.
    • Select JSON key type and click Create.
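Once the key has been downloaded, a quick sanity check can catch a truncated or wrong file before you hand it to Dataform. The sketch below uses only the Python standard library and checks for a few fields that a service account JSON key contains (type, project_id, private_key, client_email). The helper name check_key_file and the choice of which fields to check are assumptions made for illustration; this is not part of Dataform.

```python
import json
from pathlib import Path
from typing import List

# A subset of the fields found in a downloaded service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path: Path) -> List[str]:
    """Return a list of problems found in the key file (empty if none).

    Hypothetical helper for illustration only.
    """
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # Covers missing files, unreadable files, and invalid JSON.
        return [f"cannot read key file: {exc}"]
    if not isinstance(data, dict):
        return ["key file is not a JSON object"]

    # Flag fields that are absent or empty.
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)
    ]
    # A service account key declares its type explicitly.
    key_type = data.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type}")
    return problems
```

An empty list means the file has the expected shape; it does not confirm that the key is valid or that the account has the BigQuery Admin role.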

Now that you've created a new BigQuery project and generated your warehouse credentials, you're ready to create your Dataform project!

For more detailed info on generating credentials for BigQuery, see our docs.

What's next

Building your data model

Learn how to connect to a warehouse and create and publish your first dataset.

Managing dependencies

Learn how to use the ref function in Dataform and how to view your project in the Dependency tree.

Setting up a schedule

Learn how to set up a schedule and alerts in Dataform.

Data quality tests and documenting datasets

Learn how to set up data quality tests using assertions and how to document your datasets.

Committing your changes

Learn how to commit changes you've made in your Dataform project.
