Welcome to the Dataform Getting Started Tutorial. This tutorial is for people who are new to Dataform and would like to learn how to set up a new project. We will show you how to create your own data model, how to test and document it, and how to configure schedules. A working knowledge of SQL will be helpful for this tutorial.
For this tutorial we’re going to pretend we are a fictional e-commerce shop. We already have 3 main data sources in our data warehouse:
The aim of this tutorial is to create two new tables in our warehouse, one called `order_stats` and one called `customers`, which are:
For this tutorial we’ll use BigQuery. Anyone with a Google Account can use it and it has a free tier. We’ve created a public dataset in BigQuery that anyone can access for the purpose of this tutorial.
First you will need to create a new BigQuery project:
In order for Dataform to connect to your BigQuery warehouse you’ll need to use Application Default Credentials or a service account and JSON key.
If you are running on GCE or GKE, Application Default Credentials will be available automatically. If not, use [gcloud auth application-default](https://cloud.google.com/sdk/gcloud/reference/auth/application-default) to authenticate.
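To see which credential source would likely be picked up on your machine, a minimal Python sketch along these lines can help. It is not part of Dataform: the function name `find_adc_source` is hypothetical, and the sketch only mirrors the usual Application Default Credentials lookup order (the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, then the file written by `gcloud auth application-default login`, then the GCE/GKE metadata server). It assumes the default gcloud configuration directory and does not contact the metadata server.

```python
import os
from pathlib import Path


def find_adc_source(env=None, home=None):
    """Describe where Application Default Credentials would likely come from.

    Illustrative only: assumes the default gcloud config directory and
    does not probe the GCE/GKE metadata server.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. An explicit key file named by GOOGLE_APPLICATION_CREDENTIALS comes first.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        if Path(explicit).is_file():
            return f"key file from GOOGLE_APPLICATION_CREDENTIALS: {explicit}"
        return f"GOOGLE_APPLICATION_CREDENTIALS is set but the file is missing: {explicit}"

    # 2. User credentials written by `gcloud auth application-default login`.
    if os.name == "nt":
        base = Path(env.get("APPDATA", home / "AppData" / "Roaming")) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    well_known = base / "application_default_credentials.json"
    if well_known.is_file():
        return f"gcloud user credentials: {well_known}"

    # 3. Nothing local; on GCE or GKE the metadata server would supply credentials.
    return "no local credentials found (on GCE/GKE the metadata server would be used)"


if __name__ == "__main__":
    print(find_adc_source())
```

If the script reports that no local credentials were found and you are not on GCE or GKE, authenticating with the gcloud command linked above should create the well-known file.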
To use a service account instead, you’ll need to create one from your Google Cloud Console and assign it permissions to access BigQuery.
To create a new service account in Google Cloud Console you need to:

- Click Open.
- Click Create Service Account and give it a name.

Once you’ve done this you need to create a key for your new service account (in JSON format):

- Click the More button.
- Click Create key.
- Click Create.

Now that you’ve created a new BigQuery project and generated your warehouse credentials, you’re ready to create your Dataform project!
For more detailed info on generating credentials for BigQuery, see our docs.
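As a quick sanity check on the JSON key you downloaded, the sketch below verifies that the file looks like a service account key before you hand it to Dataform. It is only an illustration: the helper names `check_service_account_key` and `check_key_file` are hypothetical, and the field list is an assumption based on the fields a Google service account key normally contains.

```python
import json

# Fields a downloaded service account JSON key normally contains (assumed list).
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_service_account_key(key):
    """Return a list of problems in a parsed key dict; an empty list means it looks fine."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A user credential file has a different "type" and is not a service account key.
    if key.get("type") not in (None, "", "service_account"):
        problems.append(f"unexpected type: {key['type']!r}")
    return problems


def check_key_file(path):
    """Load a key file from disk and check it without printing any secrets."""
    with open(path, encoding="utf-8") as handle:
        return check_service_account_key(json.load(handle))
```

Calling `check_key_file` with the path to your downloaded key should return an empty list. Any entries it returns point to a field worth checking before you continue.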