Ask AI

Integrating BigQuery & dbt with Dagster+ Insights#

BigQuery costs can be integrated into the Dagster Insights UI. The dagster-cloud package contains utilities for capturing and submitting BigQuery cost metrics about data operations to Dagster+.

If you use dbt to materialize tables in BigQuery, use this guide to integrate BigQuery cost metrics into the Insights UI. For instructions on integrating direct BigQuery queries, see Integrating Direct BigQuery Usage with Dagster+ Insights.


Prerequisites#

To complete the steps in this guide, you'll need:

  • A Dagster+ account on the Pro plan

  • Access to the Dagster+ Insights feature

  • BigQuery credentials which have access to the INFORMATION_SCHEMA.JOBS table (e.g. BigQuery Resource viewer role). These credentials should be provided used by your dbt profile. For more information on granting access to this table, see the BigQuery documentation.

  • To install the following libraries:

    pip install dagster dagster-cloud dagster-dbt dagster-gcp
    

    Note: If you already have dagster-cloud installed, make sure you're using version 1.7.0 or newer.


Step 1: Instrument your Dagster code#

First, instrument the Dagster @dbt_assets function with dbt_with_bigquery_insights:

from dagster_cloud.dagster_insights import dbt_with_bigquery_insights


@dbt_assets(...)
def my_asset(context: AssetExecutionContext, dbt: DbtCliResource):
    # Typically you have a `yield from dbt_resource.cli(...)`.
    # Wrap the original call with `dbt_with_bigquery_insights` as below.
    dbt_cli_invocation = dbt_resource.cli(["build"], context=context)
    yield from dbt_with_bigquery_insights(context, dbt_cli_invocation)

This passes through all underlying events and emits additional AssetObservations with BigQuery cost metrics. These metrics are obtained by querying the underlying INFORMATION_SCHEMA.JOBS table, using the BigQuery client from the dbt adapter.


Step 2: Update dbt_project.yml#

Next, add the following to your dbt project's dbt_project.yml:

query-comment:
  comment: "bigquery_dagster_dbt_v1_opaque_id[[[{{ node.unique_id }}:{{ invocation_id }}]]]"
  append: true

This allows you to add a comment, containing the dbt invocation ID and unique ID, to every query recorded in BigQuery's INFORMATION_SCHEMA.JOBS table. Using this data, Insights will attribute cost metrics in BigQuery to the corresponding Dagster jobs and assets.


Step 3: View BigQuery usage in the Dagster UI#

Typically within 24 hours, the BigQuery metrics should be available in the Insights tab in the Dagster UI:

BigQuery metrics in the Insights tab of the Dagster UI

The BigQuery cost metric is based off of the bytes billed for queries wrapped with dbt_with_bigquery_insights, based on a unit price of $6.25 USD per TiB.