Stack Overflow data on BigQuery

A simple project that turns raw public Stack Overflow data into reporting tables in a BigQuery data warehouse.


This project transforms four raw datasets (posts_answers, posts_questions, badges, and users) into two summary reporting tables.

  • posts_combined brings Stack Overflow questions and answers into a single table, with a “type” field to distinguish between the two.
  • user_stats provides an overview of each user's engagement: when they signed up, how many badges they have, and how many posts and answers they’ve made.
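
The two reporting tables described above can be sketched locally. This is a minimal illustration, not the project's actual definitions: it uses Python's built-in sqlite3 module as a stand-in for BigQuery, a handful of made-up rows, and assumed column names (id, owner_user_id, user_id, creation_date, and the output names badge_count, question_count, answer_count). Reading "posts and answers" as questions and answers is also an assumption.

```python
import sqlite3

# In-memory SQLite database standing in for the BigQuery warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy versions of the four raw tables; column names are assumed.
    CREATE TABLE posts_questions (id INTEGER, owner_user_id INTEGER, creation_date TEXT);
    CREATE TABLE posts_answers   (id INTEGER, owner_user_id INTEGER, creation_date TEXT);
    CREATE TABLE badges          (id INTEGER, user_id INTEGER, name TEXT);
    CREATE TABLE users           (id INTEGER, display_name TEXT, creation_date TEXT);

    -- Made-up sample rows, for illustration only.
    INSERT INTO posts_questions VALUES (1, 10, '2020-01-02'), (2, 11, '2020-01-03');
    INSERT INTO posts_answers   VALUES (3, 10, '2020-01-04'), (4, 10, '2020-01-05'),
                                       (5, 11, '2020-01-06');
    INSERT INTO badges          VALUES (1, 10, 'Teacher'), (2, 10, 'Student'),
                                       (3, 11, 'Editor');
    INSERT INTO users           VALUES (10, 'alice', '2019-12-01'),
                                       (11, 'bob',   '2019-12-15'),
                                       (12, 'carol', '2020-02-01');

    -- posts_combined: questions and answers in one table, tagged by "type".
    CREATE VIEW posts_combined AS
    SELECT id, owner_user_id, creation_date, 'question' AS type FROM posts_questions
    UNION ALL
    SELECT id, owner_user_id, creation_date, 'answer' AS type FROM posts_answers;

    -- user_stats: one row per user with sign-up date and activity counts.
    -- Correlated subqueries keep the counts independent of each other,
    -- which avoids the row fan-out a multi-table join would cause.
    CREATE VIEW user_stats AS
    SELECT
        u.id            AS user_id,
        u.creation_date AS signed_up,
        (SELECT COUNT(*) FROM badges b WHERE b.user_id = u.id) AS badge_count,
        (SELECT COUNT(*) FROM posts_combined p
          WHERE p.owner_user_id = u.id AND p.type = 'question') AS question_count,
        (SELECT COUNT(*) FROM posts_combined p
          WHERE p.owner_user_id = u.id AND p.type = 'answer') AS answer_count
    FROM users u;
""")

# Row counts per post type in the combined table.
combined_rows = conn.execute(
    "SELECT type, COUNT(*) FROM posts_combined GROUP BY type ORDER BY type"
).fetchall()

# One summary row per user.
stats = conn.execute("SELECT * FROM user_stats ORDER BY user_id").fetchall()

for row in stats:
    print(row)
```

In a real warehouse the same shape would be written in BigQuery SQL against the public Stack Overflow tables; only the UNION ALL with a literal "type" column and the per-user counts are the point here.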

Dependency tree of the project

Figure: Dependency tree (DAG) of the sample BigQuery Dataform project.


