BigQuery Connection
This feature is in the preview phase and is available in Dagster+ in limited early access. Functionality and APIs may change as we continue development. To get early access to this feature, reach out to your Dagster account team. For more information, see the API lifecycle stages documentation.
This guide covers connecting Dagster+ to Google BigQuery to automatically discover and sync dataset, table, and view metadata.
Overview
To create a BigQuery Connection in Dagster+, you will need to:
- Create a GCP service account.
- Set up authentication in Dagster+.
- Create the BigQuery Connection in Dagster+.
Step 1: Create a GCP service account and grant permissions
Dagster requires read-only access to BigQuery metadata. We recommend creating a dedicated GCP service account for Dagster Connections.
Step 1.1: Create a GCP service account
- Open the GCP Console.
- Navigate to IAM & Admin > Service Accounts.
- Click Create Service Account.
- Enter a name for the account, such as dagster-connection.
- Click Create and Continue.
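The account can also be created from the command line. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated. The function names and the display name are illustrative, and the email format assumes a standard (not domain-scoped) project ID.

```shell
# Account name suggested in the steps above.
SA_NAME="dagster-connection"

# Print the email address GCP assigns to the service account:
# NAME@PROJECT_ID.iam.gserviceaccount.com
# Usage: service_account_email PROJECT_ID
service_account_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$SA_NAME" "$1"
}

# Create the service account in the given project.
# Usage: create_service_account PROJECT_ID
create_service_account() {
  gcloud iam service-accounts create "$SA_NAME" \
    --project="$1" \
    --display-name="Dagster Connections"
}
```

Calling create_service_account with your project ID creates the account, and service_account_email with the same ID prints the email to use when granting roles.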
Step 1.2: Grant required permissions
The service account needs two sets of permissions: permissions on target projects (projects with data to sync) and permissions on the extractor project (where the service account resides).
Step 1.2.1: Grant permissions on target projects (projects with data to sync)
Grant the BigQuery Metadata Viewer role, which includes:
- bigquery.datasets.get - Read dataset metadata
- bigquery.datasets.getIamPolicy - Access dataset permissions
- bigquery.tables.list - List tables in datasets
- bigquery.tables.get - Read table metadata
- bigquery.routines.get and bigquery.routines.list - Access stored procedures
To grant this role:
- Navigate to IAM & Admin > IAM in the GCP Console
- Click Grant Access
- Enter your service account email
- Select role: BigQuery Metadata Viewer (roles/bigquery.metadataViewer)
- Click Save
Repeat this for each project containing data you want to sync.
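When there are many target projects, the grant can be scripted instead of repeated in the Console. This sketch assumes the gcloud CLI is authenticated as a user allowed to change IAM policy on each project. The function name is illustrative.

```shell
# Grant BigQuery Metadata Viewer to one service account on each target project.
# Usage: grant_metadata_viewer SA_EMAIL PROJECT_ID [PROJECT_ID ...]
grant_metadata_viewer() {
  sa_email="$1"
  shift
  for project in "$@"; do
    gcloud projects add-iam-policy-binding "$project" \
      --member="serviceAccount:${sa_email}" \
      --role="roles/bigquery.metadataViewer"
  done
}
```

Pass the service account email first, followed by every project ID that contains data you want to sync.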
Step 1.2.2: Grant permissions on the extractor project (where the service account resides)
The service account needs to execute queries for metadata extraction. Grant the BigQuery Job User role, which includes:
- bigquery.jobs.create - Execute metadata queries
- bigquery.jobs.list - List job status
- bigquery.readsessions.create - Create read sessions for large results
- bigquery.readsessions.getData - Read session data
To grant this role:
- In the project where your service account was created, navigate to IAM.
- Find your service account.
- Add role: BigQuery Job User (roles/bigquery.jobUser).
Step 1.3: (Optional) Enable lineage and usage tracking
To track table lineage and usage statistics, add:
- bigquery.jobs.listAll - View all jobs for lineage extraction
- logging.logEntries.list - Access audit logs for usage tracking
These are available in the BigQuery Resource Viewer role (roles/bigquery.resourceViewer).
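After granting roles, it can help to confirm what the service account actually holds in a given project. The sketch below uses the gcloud CLI to list the roles bound to one member. The function name is illustrative.

```shell
# Print every role bound to the service account in one project's IAM policy.
# Usage: list_service_account_roles PROJECT_ID SA_EMAIL
list_service_account_roles() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:$2" \
    --format="value(bindings.role)"
}
```

Run it once per project and compare the output with the roles named in this step.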
Step 1.4: Enable required APIs
Ensure these APIs are enabled in your GCP project:
gcloud services enable bigquery.googleapis.com
gcloud services enable bigquerystorage.googleapis.com
Or enable them in the GCP Console.
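To check whether the APIs are already enabled before changing anything, something like the following may be useful. It is a sketch that assumes the gcloud CLI is authenticated. The variable and function names are illustrative.

```shell
# APIs named in this step.
REQUIRED_APIS="bigquery.googleapis.com bigquerystorage.googleapis.com"

# Print each required API that is not enabled in the given project.
# Returns non-zero if any are missing.
# Usage: missing_apis PROJECT_ID
missing_apis() {
  enabled="$(gcloud services list --enabled --project="$1" --format="value(config.name)")"
  rc=0
  for api in $REQUIRED_APIS; do
    # -x matches the whole line, -F treats the API name as a literal string.
    if ! printf '%s\n' "$enabled" | grep -qxF "$api"; then
      echo "$api"
      rc=1
    fi
  done
  return "$rc"
}
```

An empty result with a zero exit status means both APIs are enabled in that project.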
Step 2: Set up authentication in Dagster+
Step 2.1: Create and download service account key from GCP
- In IAM & Admin > Service Accounts, find your dagster-connection service account
- Click the service account email to open details
- Navigate to the Keys tab
- Click Add Key > Create new key
- Choose JSON format
- Click Create - the key file will download automatically
A service account key grants anyone who holds it the same access as the service account itself. Store key files securely and never commit them to version control.
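As a basic precaution, the downloaded file can be restricted to your own user and checked before it is used. This is a minimal sketch. The function name is illustrative, and the check relies only on the "type": "service_account" field that service account key files contain.

```shell
# Restrict the key file to the current user and confirm it looks like a
# service account key.
# Usage: check_key_file /path/to/your-key-file.json
check_key_file() {
  key_file="$1"
  [ -r "$key_file" ] || { echo "cannot read $key_file" >&2; return 1; }
  # Owner read/write only.
  chmod 600 "$key_file"
  grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file" || {
    echo "$key_file does not look like a service account key" >&2
    return 1
  }
}
```

A non-zero exit status means the file is unreadable or is not a service account key, for example because the wrong file was downloaded.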
Step 2.2: Encode credentials and store them in Dagster+
BigQuery credentials must be base64-encoded before storing in Dagster+:
- Encode your JSON key file. On macOS:
  base64 -i /path/to/your-key-file.json
  Or on Linux, where -w 0 disables line wrapping:
  base64 -w 0 /path/to/your-key-file.json
- Copy the base64-encoded output
- In Dagster+, navigate to Deployment > Environment variables
- Create a new environment variable:
  - Name: BIGQUERY_CONNECTION_CREDENTIALS (or any name you prefer)
  - Value: Paste the base64-encoded string
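Because the base64 flags differ between macOS and Linux, a single form that works on both may be convenient. The sketch below reads the file from standard input and strips newlines, then decodes the result to confirm it matches the original. The function names are illustrative.

```shell
# Encode a key file as a single line of base64.
# Reading from stdin and stripping newlines avoids the flag differences
# between the macOS and GNU versions of base64.
# Usage: encode_key_file /path/to/your-key-file.json
encode_key_file() {
  base64 < "$1" | tr -d '\n'
}

# Decode the encoded value and compare it with the original file.
# Returns zero if they match byte for byte.
# Usage: verify_encoding /path/to/your-key-file.json
verify_encoding() {
  encode_key_file "$1" | base64 --decode | cmp -s - "$1"
}
```

The output of encode_key_file is the value to paste into the environment variable. A stray line break in the pasted value is a common cause of credential errors, so checking the round trip first can save time.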
Step 3: Create the BigQuery connection in Dagster+
- In Dagster+, click Connections in the left sidebar
- Click Create Connection
- Select BigQuery as the connection type
- Configure the connection details
Required fields
- Connection name: A unique name for this Connection (e.g., bigquery_analytics)
  - This will become the name of the code location containing synced assets
- Google application credentials environment variable: Name of the Dagster+ environment variable containing your base64-encoded service account JSON (e.g., BIGQUERY_CONNECTION_CREDENTIALS)
Optional: Configure region qualifiers
Specify which BigQuery regions to scan. If you do not specify any, region-us and region-eu are scanned. For example:
{
"region_qualifiers": ["region-us", "region-eu", "region-asia-northeast1"]
}
Region qualifiers help optimize scanning for multi-region datasets.
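To decide which qualifiers to list, it may help to see which regions actually contain datasets. Region qualifiers use the same syntax as BigQuery's regional INFORMATION_SCHEMA views, so the bq CLI can query one region at a time. This is a sketch for manual inspection. Whether the Connection itself issues queries of this shape is an assumption, and the function names are illustrative.

```shell
# Build a query listing the datasets visible in one region.
# Usage: schemata_query REGION_QUALIFIER
schemata_query() {
  printf 'SELECT schema_name, location FROM `%s`.INFORMATION_SCHEMA.SCHEMATA' "$1"
}

# Run the query with the bq CLI in the given project.
# Usage: list_datasets_in_region PROJECT_ID REGION_QUALIFIER
list_datasets_in_region() {
  bq --project_id="$1" query --use_legacy_sql=false "$(schemata_query "$2")"
}
```

A region that returns no rows for any target project probably does not need to be listed in region_qualifiers.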
Optional: Configure asset filtering
Use filtering to control which projects, datasets, tables, and views are synced. Patterns use regular expressions.