Integrating Google BigQuery usage with Dagster+ Insights
External metrics, such as Google BigQuery usage, can be integrated into the Dagster Insights UI. The dagster-cloud package contains utilities for capturing and submitting external metrics about data operations to Dagster+ via an API.
Before you begin, you'll need BigQuery credentials that have access to the INFORMATION_SCHEMA.JOBS table. For more information on granting access to this table, see the BigQuery documentation.
You'll also need to install the following libraries:
pip install dagster dagster-cloud dagster-gcp
Note: If you already have dagster-cloud installed, make sure you're using version 1.7.0 or newer.
The first step is to replace any existing BigQuery resources with InsightsBigQueryResource. This resource is a drop-in replacement for BigQueryResource that also emits BigQuery usage metrics to the Dagster+ Insights API.
from dagster import Definitions
from dagster_cloud.dagster_insights import InsightsBigQueryResource

defs = Definitions(
    resources={"bigquery": InsightsBigQueryResource(project="my-project")}
)
Once the pipeline runs, BigQuery usage will be visible in the Insights tab in the Dagster+ UI.
The BigQuery cost metric is calculated from the bytes billed for queries run through the InsightsBigQueryResource, at a unit price of $6.25 USD per TiB.
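To make the pricing arithmetic concrete, here is a minimal sketch of how bytes billed translate to a dollar figure at the stated rate. The function name and constants are hypothetical and are not part of dagster-cloud. The sketch assumes 1 TiB = 2**40 bytes and applies no rounding or minimum charge, so values shown in Insights may differ slightly.

```python
# Illustrative only: this helper is NOT part of dagster-cloud.
BYTES_PER_TIB = 2**40   # assumption: binary tebibyte
USD_PER_TIB = 6.25      # unit price stated in this guide


def estimate_query_cost_usd(bytes_billed: int, usd_per_tib: float = USD_PER_TIB) -> float:
    """Estimate the cost of a query from its bytes billed."""
    if bytes_billed < 0:
        raise ValueError("bytes_billed must be non-negative")
    # Convert bytes to TiB, then multiply by the per-TiB price.
    return bytes_billed / BYTES_PER_TIB * usd_per_tib


if __name__ == "__main__":
    half_tib = 2**39  # 512 GiB billed
    print(estimate_query_cost_usd(half_tib))  # 3.125
```

Under these assumptions, a query that bills 512 GiB would appear as roughly $3.13 of BigQuery cost.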