Welcome to the Dataform Getting Started Tutorial. This tutorial is for people who are new to Dataform and would like to learn how to set up a new project. We will show you how to create your own data model, how to test and document it, and how to configure schedules. A working knowledge of SQL will be helpful for this tutorial.
For this tutorial, we’re going to pretend we run a fictional e-commerce shop. We already have three main data sources in our data warehouse:
The aim of this tutorial is to create two new tables in our warehouse, one called order_stats and one called customers, which are:
Dataform connects to many different warehouses, but for this tutorial we’ll use BigQuery, since anyone with a Google Account can use it and it has a free tier. We’ve created a public dataset in BigQuery that anyone can access for the purpose of this tutorial.
First, you will need to create a new BigQuery project:
In order for Dataform to connect to your BigQuery warehouse you’ll need to generate some credentials. Dataform will connect to BigQuery using a service account. You’ll need to create a service account from your Google Cloud Console and assign it permissions to access BigQuery.
To create a new service account in Google Cloud Console you need to:
Create Service Account and give it a name.
Once you’ve done this you need to create a key for your new service account (in JSON format):
Now that you've created a new BigQuery project and generated your warehouse credentials, you're ready to create your Dataform project!
For more detailed info on generating credentials for BigQuery, see our docs.
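Before uploading the JSON key, it can be useful to check that the downloaded file looks like a service account key. The sketch below is not part of Dataform; it is a small, hypothetical helper that uses only the Python standard library. The function names (check_service_account_key, load_and_check) are made up for illustration, and the field list is an assumption based on the fields a Google Cloud service account key file normally contains.

```python
import json

# Fields a Google Cloud service account JSON key normally contains.
# This list is an assumption for illustration, not a Dataform requirement.
REQUIRED_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "token_uri",
)


def check_service_account_key(key):
    """Return a list of problems found in a parsed key (empty list = looks fine)."""
    # Flag every expected field that is absent or empty.
    problems = [
        "missing field: " + name for name in REQUIRED_FIELDS if not key.get(name)
    ]
    # A service account key identifies itself with type == "service_account".
    key_type = key.get("type")
    if key_type and key_type != "service_account":
        problems.append(
            "unexpected type: " + repr(key_type) + " (expected 'service_account')"
        )
    return problems


def load_and_check(path):
    """Read a JSON key file from disk and run the checks above on it."""
    with open(path, encoding="utf-8") as f:
        return check_service_account_key(json.load(f))
```

Calling load_and_check with the path to your downloaded key file returns an empty list when the file has the expected shape, or a list of human-readable problems otherwise. The check only inspects the structure of the file; it does not contact Google Cloud or verify that the service account has BigQuery permissions.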