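In this step you'll learn how to manage dependencies in Dataform.

Dataform lets you reference another dataset in your project using the ref function. This provides two advantages: you don't have to write out the fully qualified table name, and any dataset referenced with ref is automatically added as a dependency of the query.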
You'll now create a second table called customers, following the same process as before:
Click New Dataset and select the table template.
Name the table customers.
Click Create.

Define your dataset:
Paste the following into customers.sqlx, below the config block:

    SELECT
      customers.id AS id,
      customers.first_name AS first_name,
      customers.last_name AS last_name,
      customers.email AS email,
      customers.country AS country,
      COUNT(orders.id) AS order_count,
      SUM(orders.amount) AS total_spent

    FROM
      dataform-demos.dataform_tutorial.crm_customers AS customers
      LEFT JOIN ${ref('order_stats')} orders
        ON customers.id = orders.customer_id

    WHERE
      customers.id IS NOT NULL
      AND customers.first_name <> 'Internal account'
      AND country IN ('UK', 'US', 'FR', 'ES', 'NG', 'JP')

    GROUP BY 1, 2, 3, 4, 5

The query uses the ref function. The ref function enables you to reference any other table defined in a Dataform project. When the query is compiled, the ref function is replaced with the fully qualified table name.

Once you can see that your query is valid, you can publish the table to your warehouse by clicking Publish Table.
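To make the behavior of ref more concrete, here is a minimal sketch in JavaScript, the language of the `${...}` expressions in a SQLX file. It is a toy model and not Dataform's implementation: the `compile` and `makeRef` helpers and the project and schema names are hypothetical. It shows a short table name being swapped for a fully qualified one while the name is recorded as a dependency.

```javascript
// Toy model of ref(): NOT Dataform's implementation.
// PROJECT and SCHEMA are hypothetical placeholders.
const PROJECT = "my-project";
const SCHEMA = "my_schema";

// Build a ref() that resolves a short name to a fully qualified one
// and records that name in the given dependency set.
function makeRef(dependencies) {
  return function ref(name) {
    dependencies.add(name);
    return "`" + PROJECT + "." + SCHEMA + "." + name + "`";
  };
}

// Replace every ${ref('name')} placeholder in a query template.
function compile(template) {
  const dependencies = new Set();
  const ref = makeRef(dependencies);
  const sql = template.replace(
    /\$\{ref\('([^']+)'\)\}/g,
    (_, name) => ref(name)
  );
  return { sql, dependencies: [...dependencies] };
}

const compiled = compile("SELECT * FROM ${ref('order_stats')} orders");
console.log(compiled.sql);
console.log(compiled.dependencies);
```

In this sketch a single call does both jobs, which is why referencing a table with ref is enough to declare the dependency.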
View the dependency tree:
Click the Dependency Tree tab.

You now have two tables created in your warehouse, one called order_stats and one called customers. customers depends on order_stats and will start running when order_stats is completed.
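The run order described above can be sketched as a depth-first walk over the dependency graph. This is an illustrative JavaScript model under stated assumptions, not Dataform's scheduler: the `graph` object and the `runOrder` function are hypothetical names, and the graph simply lists, for each table, the tables it references with ref.

```javascript
// Toy model of dependency ordering: NOT Dataform's scheduler.
// Each key is a table; each value lists the tables it references via ref().
const graph = {
  customers: ["order_stats"],
  order_stats: [],
};

// Return the tables in an order where every table comes after its dependencies.
function runOrder(deps) {
  const order = [];
  const state = {}; // undefined = unvisited, 1 = in progress, 2 = done

  function visit(name) {
    if (state[name] === 2) return;
    if (state[name] === 1) {
      throw new Error("Circular dependency at " + name);
    }
    state[name] = 1;
    for (const dep of deps[name] || []) {
      visit(dep);
    }
    state[name] = 2;
    order.push(name);
  }

  Object.keys(deps).forEach((name) => visit(name));
  return order;
}

const order = runOrder(graph);
console.log(order.join(" -> "));
```

Even though customers is listed first in the graph, order_stats is placed ahead of it, because a table is only added to the order once everything it depends on has been added.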
For more detailed info on managing dependencies in Dataform, see our docs.