Integrating BigQuery & dbt with Dagster+ Insights
BigQuery costs can be integrated into the Dagster Insights UI. The dagster-cloud package contains utilities for capturing and submitting BigQuery cost metrics about data operations to Dagster+.
If you use dbt to materialize tables in BigQuery, use this guide to integrate BigQuery cost metrics into the Insights UI. For instructions on integrating direct BigQuery queries, see Integrating Direct BigQuery Usage with Dagster+ Insights.
Before you begin, you'll need BigQuery credentials that have access to the INFORMATION_SCHEMA.JOBS table (for example, through the BigQuery Resource Viewer role). Your dbt profile should use these credentials. For more information on granting access to this table, see the BigQuery documentation.
First, instrument the Dagster @dbt_assets function with dbt_with_bigquery_insights:
from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets
from dagster_cloud.dagster_insights import dbt_with_bigquery_insights


@dbt_assets(...)
def my_asset(context: AssetExecutionContext, dbt: DbtCliResource):
    # Typically you have a `yield from dbt.cli(...)`.
    # Wrap the original call with `dbt_with_bigquery_insights` as below.
    dbt_cli_invocation = dbt.cli(["build"], context=context)
    yield from dbt_with_bigquery_insights(context, dbt_cli_invocation)
This passes through all underlying events and emits additional AssetObservations with BigQuery cost metrics. These metrics are obtained by querying the underlying INFORMATION_SCHEMA.JOBS table, using the BigQuery client from the dbt adapter.
If you run dbt from an op rather than from @dbt_assets, instrument the op function with dbt_with_bigquery_insights instead:
from dagster import OpExecutionContext, job, op
from dagster_dbt import DbtCliResource
from dagster_cloud.dagster_insights import dbt_with_bigquery_insights


@op(out={})
def my_dbt_op(context: OpExecutionContext, dbt: DbtCliResource):
    # Typically you have a `yield from dbt.cli(...)`.
    # Wrap the original call with `dbt_with_bigquery_insights` as below.
    # `dbt_manifest_path` is assumed to be defined elsewhere in your project.
    dbt_cli_invocation = dbt.cli(
        ["build"], context=context, manifest=dbt_manifest_path
    )
    yield from dbt_with_bigquery_insights(context, dbt_cli_invocation)


@job
def my_dbt_job():
    ...
    my_dbt_op()
    ...
As in the asset-based version, this passes through all underlying events and emits additional AssetObservations with BigQuery cost metrics obtained from the INFORMATION_SCHEMA.JOBS table.
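To make the role of INFORMATION_SCHEMA.JOBS more concrete, here is a minimal sketch of the kind of query that reads billed bytes for recent jobs. It is not the query that dbt_with_bigquery_insights runs internally. The function name `build_jobs_cost_query`, the region argument, and the lookback window are assumptions made for illustration; only the view name and its `job_id`, `query`, `job_type`, `creation_time`, and `total_bytes_billed` columns come from BigQuery itself.

```python
import re


def build_jobs_cost_query(region: str, lookback_hours: int = 24) -> str:
    """Build SQL listing recent query jobs with their billed bytes.

    Illustrative only: the wrapper in dagster-cloud may query the view
    differently. `region` is a BigQuery location such as "us" or "eu".
    """
    # Reject anything that is not a plain region name, since the value is
    # interpolated into the SQL text.
    if not re.fullmatch(r"[a-z0-9-]+", region):
        raise ValueError(f"unexpected region name: {region!r}")
    if lookback_hours <= 0:
        raise ValueError("lookback_hours must be positive")
    return (
        "SELECT job_id, query, total_bytes_billed\n"
        f"FROM `region-{region}`.INFORMATION_SCHEMA.JOBS\n"
        "WHERE job_type = 'QUERY'\n"
        "  AND creation_time >= TIMESTAMP_SUB("
        f"CURRENT_TIMESTAMP(), INTERVAL {lookback_hours} HOUR)\n"
        "ORDER BY creation_time DESC"
    )


if __name__ == "__main__":
    print(build_jobs_cost_query("us", lookback_hours=6))
```

Running the resulting SQL with the same credentials as your dbt profile is one way to check that those credentials can read the view.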
Then, configure dbt's query-comment setting in your dbt_project.yml. This adds a comment, containing the dbt invocation ID and unique ID, to every query recorded in BigQuery's INFORMATION_SCHEMA.JOBS table. Using this data, Insights attributes cost metrics in BigQuery to the corresponding Dagster jobs and assets.
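The attribution idea can be sketched as tagging each query with an identifier pair and recovering it later from the recorded query text. The marker string `example_dagster_dbt` and the bracket layout below are hypothetical, chosen only for this illustration; the comment format that Insights actually expects may differ.

```python
import re
from typing import Optional, Tuple

# Hypothetical marker and layout, for illustration only.
MARKER = "example_dagster_dbt"
_PATTERN = re.compile(re.escape(MARKER) + r"\[\[\[([^:\]]+):([^:\]]+)\]\]\]")


def make_comment(unique_id: str, invocation_id: str) -> str:
    """Return a SQL comment carrying the dbt unique ID and invocation ID."""
    return f"/* {MARKER}[[[{unique_id}:{invocation_id}]]] */"


def parse_comment(query_text: str) -> Optional[Tuple[str, str]]:
    """Extract (unique_id, invocation_id) from query text, or None if untagged."""
    match = _PATTERN.search(query_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    tagged = make_comment("model.my_project.orders", "1234-abcd") + "\nSELECT 1"
    print(parse_comment(tagged))
```

Because the tag travels with the query text, a job row read back from INFORMATION_SCHEMA.JOBS can be matched to the dbt node and invocation that produced it.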
The BigQuery metrics typically become available in the Insights tab in the Dagster UI within 24 hours.
The BigQuery cost metric is calculated from the bytes billed for queries wrapped with dbt_with_bigquery_insights, at a unit price of $6.25 USD per TiB.
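As a worked example of this arithmetic, the sketch below converts billed bytes into an estimated cost at $6.25 per TiB. The function name `estimate_cost_usd` is an assumption made for illustration, and the exact rounding Insights applies is not stated here.

```python
BYTES_PER_TIB = 2**40  # 1 TiB = 1,099,511,627,776 bytes
USD_PER_TIB = 6.25     # unit price stated in this guide


def estimate_cost_usd(bytes_billed: int) -> float:
    """Estimate query cost in USD from the bytes billed."""
    if bytes_billed < 0:
        raise ValueError("bytes_billed must be non-negative")
    return bytes_billed / BYTES_PER_TIB * USD_PER_TIB


if __name__ == "__main__":
    print(estimate_cost_usd(2**40))  # 1 TiB billed -> 6.25
    print(estimate_cost_usd(2**39))  # 0.5 TiB billed -> 3.125
```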