Databricks Connection
This feature is in preview phase and is available in Dagster+ in limited early access. Functionality and APIs may change as we continue development. To get early access to this feature, reach out to your Dagster account team. For more information, see the API lifecycle stages documentation.
This guide covers connecting Dagster+ to Databricks Unity Catalog to automatically discover and sync catalog, schema, table, and view metadata.
Overview
To create a Databricks Connection in Dagster+, you will need to:
- Generate an authentication token with appropriate permissions.
- Add the authentication token as an environment variable in Dagster+.
- Create the Databricks Connection in Dagster+.
Step 1: Generate an authentication token with appropriate permissions
Dagster Connections require read-only access to Databricks Unity Catalog metadata. We recommend using a dedicated service principal, but personal access tokens (PATs) are also supported.
Option A: Create a Service principal (recommended for production)
Service principals provide more secure, auditable access without tying that access to a specific user account.
Step 1A.1: Create service principal
- In your Databricks workspace, navigate to Settings > Admin Console.
- Click Service principals in the left sidebar.
- Click Add service principal.
- Enter a name like dagster-connection.
- Click Add.
Step 1A.2: Grant Unity Catalog permissions
Grant these privileges on the catalogs and schemas you want to sync:
-- Grant catalog access
GRANT USE CATALOG ON CATALOG <catalog_name> TO `dagster-connection`;
-- Grant schema access
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `dagster-connection`;
-- Grant read access to tables and views
GRANT SELECT ON SCHEMA <catalog_name>.<schema_name> TO `dagster-connection`;
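If you want to sync several schemas, writing the three grants by hand for each one gets repetitive. The following is a minimal sketch that generates the same statements for one catalog and a list of schemas. The function name build_grants and the example catalog and schema names are hypothetical, and the sketch only builds the SQL text; you would still run the output in a Databricks SQL editor or notebook. Depending on your workspace, the principal in a GRANT statement may need to be the service principal's application ID rather than its display name, so check which identifier your workspace accepts.

```python
def build_grants(principal: str, catalog: str, schemas: list[str]) -> list[str]:
    """Return the GRANT statements from this step for one catalog and several schemas."""
    # One catalog-level grant, then two schema-level grants per schema.
    statements = [f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`;"]
    for schema in schemas:
        target = f"{catalog}.{schema}"
        statements.append(f"GRANT USE SCHEMA ON SCHEMA {target} TO `{principal}`;")
        statements.append(f"GRANT SELECT ON SCHEMA {target} TO `{principal}`;")
    return statements


if __name__ == "__main__":
    # Hypothetical catalog and schema names, for illustration only.
    for statement in build_grants("dagster-connection", "main", ["sales", "marketing"]):
        print(statement)
```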
Step 1A.3: Generate access token for service principal
- In the Admin Console, find your dagster-connection service principal
- Click the Generate token button
- Set an expiration period (or "no expiration" for long-term use)
- Copy the generated token - it will only be shown once
Option B: Personal access token
For simpler setups or development environments, you can use a PAT tied to your user account.
Step 1B.1: Ensure your user has required permissions
Your user account needs these Unity Catalog privileges:
- USE CATALOG on target catalogs
- USE SCHEMA on target schemas
- SELECT on tables and views
Step 1B.2: Create personal access token
- Click your username in the top-right corner of the Databricks workspace
- Select User Settings
- Navigate to the Developer tab
- Click Manage next to Access tokens
- Click Generate new token
- Enter a comment like "Dagster Connection"
- Set expiration (or leave blank for no expiration)
- Click Generate
- Copy the token immediately - it won't be shown again
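Whichever option you chose, it can be useful to confirm that the token can read Unity Catalog metadata before storing it in Dagster+. The sketch below calls the Databricks Unity Catalog REST endpoint that lists catalogs, using only the Python standard library. The function names are hypothetical, and this check is not part of the Dagster+ setup; it is only a quick way to see which catalogs the token can access.

```python
import json
import urllib.request


def build_catalog_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Unity Catalog 'list catalogs' endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.1/unity-catalog/catalogs"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_catalog_names(workspace_url: str, token: str) -> list[str]:
    """Send the request and return the names of the catalogs visible to the token."""
    request = build_catalog_request(workspace_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [catalog["name"] for catalog in payload.get("catalogs", [])]


# Example call (requires network access and a real token):
# print(list_catalog_names("https://dbc-1234abcd-56ef.cloud.databricks.com", "<your token>"))
```

If the call fails with an HTTP 401 or 403 error, or the catalogs you granted access to are missing from the result, revisit the token and permissions before continuing.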
Step 2: Store access token in Dagster+
- In Dagster+, navigate to Deployment > Environment variables
- Create a new environment variable:
  - Name: DATABRICKS_CONNECTION_TOKEN (or any name you prefer)
  - Value: Paste your service principal token or PAT
Step 3: Create the Databricks Connection
- In Dagster+, click Connections in the left sidebar
- Click Create Connection
- Select Databricks as the connection type
- Configure the connection details
Required fields
- Connection name: A unique name for this Connection (e.g., databricks_unity_catalog)
  - This will become the name of the code location containing synced assets
- Workspace URL: Your Databricks workspace URL
  - Format: https://dbc-1234abcd-56ef.cloud.databricks.com
  - Find this in your browser address bar when logged into Databricks
- Personal access token environment variable: Name of the Dagster+ environment variable containing your token (e.g., DATABRICKS_CONNECTION_TOKEN)
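A URL copied from the browser address bar usually carries a path, query string, or fragment after the host, while the format shown above is just the scheme and host. The sketch below trims a copied URL down to that form. The function name normalize_workspace_url is hypothetical, and the assumption that the Workspace URL field expects exactly this form is based only on the example format above.

```python
from urllib.parse import urlsplit


def normalize_workspace_url(raw: str) -> str:
    """Reduce a URL copied from the address bar to 'https://<host>'."""
    parts = urlsplit(raw.strip())
    # Reject values without an https scheme or without a host name.
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"expected an https:// workspace URL, got {raw!r}")
    return f"https://{parts.hostname}"


if __name__ == "__main__":
    copied = "https://dbc-1234abcd-56ef.cloud.databricks.com/?o=123#workspace"
    print(normalize_workspace_url(copied))
```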
Optional: Configure asset filtering
Use filtering to control which catalogs, schemas, tables, views, and notebooks are synced. Patterns use regular expressions.
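Because filter patterns are regular expressions, it can help to try them against a list of names locally before saving the Connection. The sketch below is only an illustration of how include and exclude patterns behave. The function name select_names, the include and exclude parameters, and the choice of full-string matching are all assumptions; Dagster+ may name its filter fields differently and may match patterns in a different way, so treat the results as a rough preview.

```python
import re
from typing import Optional


def select_names(names: list[str], include: str, exclude: Optional[str] = None) -> list[str]:
    """Keep names that fully match `include` and do not fully match `exclude`."""
    include_re = re.compile(include)
    exclude_re = re.compile(exclude) if exclude else None
    kept = []
    for name in names:
        if not include_re.fullmatch(name):
            continue
        if exclude_re and exclude_re.fullmatch(name):
            continue
        kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical schema names, for illustration only.
    schemas = ["sales", "sales_staging", "marketing", "tmp_scratch"]
    print(select_names(schemas, r"sales.*"))
    print(select_names(schemas, r".*", exclude=r"tmp_.*|.*_staging"))
```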