dbt patterns and best practices

This guide covers advanced patterns and best practices for integrating dbt with Dagster, helping you build more maintainable data pipelines.

Preventing concurrent dbt snapshots

dbt snapshots track changes to data over time by comparing current data to previous snapshots. Running snapshots concurrently can corrupt these tables, so it's critical to ensure only one snapshot operation runs at a time.

1. Separate snapshots from other models

Create separate dbt component definitions to isolate snapshots from your regular dbt models. First, scaffold two dbt components:

# Create component for regular models
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_models

# Create component for snapshots
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_snapshots

Configure the regular models component to exclude snapshots:

my_project/defs/dbt_models/defs.yaml
type: dagster_dbt.DbtProjectComponent

attributes:
  project: '{{ project_root }}/dbt'
  exclude: "resource_type:snapshot"

Configure the snapshots component with concurrency control:

my_project/defs/dbt_snapshots/defs.yaml
type: dagster_dbt.DbtProjectComponent

attributes:
  project: '{{ project_root }}/dbt'
  select: "resource_type:snapshot"

post_processing:
  assets:
    - target: "*"
      attributes:
        pool: "dbt-snapshots"
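
To check that the two selectors split the project as intended, you can list what each one matches with `dbt ls`. This is a minimal sketch, assuming the dbt project lives in `./dbt` relative to where you run it and that a working dbt profile is available. The variable names and the fallback message are illustrative only.

```shell
# Assumption: the dbt project directory is ./dbt, matching '{{ project_root }}/dbt'.
DBT_PROJECT_DIR="dbt"
SNAPSHOT_SELECTOR="resource_type:snapshot"

if command -v dbt >/dev/null 2>&1 && [ -d "$DBT_PROJECT_DIR" ]; then
  # Resources the snapshots component should pick up.
  dbt ls --project-dir "$DBT_PROJECT_DIR" --select "$SNAPSHOT_SELECTOR" || true
  # Resources the regular models component should pick up (no snapshots expected).
  dbt ls --project-dir "$DBT_PROJECT_DIR" --exclude "$SNAPSHOT_SELECTOR" || true
else
  echo "dbt CLI or $DBT_PROJECT_DIR/ not found; skipping selector check"
fi
```

If a snapshot shows up in the second listing, the regular models component would also materialize it outside the pool.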

2. Configure concurrency pools

Configure your Dagster instance so that the snapshot pool has a maximum concurrency of 1. Add this configuration to your dagster.yaml (for Dagster Open Source) or to your deployment settings (for Dagster+):

dagster.yaml
concurrency:
  pools:
    dbt-snapshots:
      limit: 1
    granularity: 'op'

Then set the pool limit for the snapshot pool:

# Set pool limit using CLI
dagster instance concurrency set dbt-snapshots 1

3. Manage multiple snapshot groups with Dagster components

For large projects with many snapshots, you can create multiple snapshot groups while still preventing concurrency issues within each group. Create separate Dagster components for different business domains:

# Create component for sales snapshots
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_snapshots_sales

# Create component for inventory snapshots
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_snapshots_inventory

Sales snapshots component:

my_project/defs/dbt_snapshots_sales/defs.yaml
type: dagster_dbt.DbtProjectComponent

attributes:
  project: '{{ project_root }}/dbt'
  select: "resource_type:snapshot,path:snapshots/sales/*"

post_processing:
  assets:
    - target: "*"
      attributes:
        pool: "sales-snapshots"

Inventory snapshots component:

my_project/defs/dbt_snapshots_inventory/defs.yaml
type: dagster_dbt.DbtProjectComponent

attributes:
  project: '{{ project_root }}/dbt'
  select: "resource_type:snapshot,path:snapshots/inventory/*"

post_processing:
  assets:
    - target: "*"
      attributes:
        pool: "inventory-snapshots"

Set a limit of 1 on each domain's pool (sales-snapshots and inventory-snapshots), the same way as for dbt-snapshots in step 2. Snapshots from different business domains can then run in parallel, while snapshots within a domain still run one at a time. This limits the risk of corruption without serializing every snapshot in the project.
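
As a minimal sketch, the CLI command from step 2 can be repeated once per domain pool. The pool names come from the two components above. The loop, the variable names, and the fallback message are illustrative assumptions.

```shell
# Pool names match the `pool` values in the two snapshot components.
POOLS="sales-snapshots inventory-snapshots"
# A limit of 1 lets only one snapshot run at a time within each pool.
LIMIT=1

for pool in $POOLS; do
  if command -v dagster >/dev/null 2>&1; then
    # Same command as in step 2, applied to each domain pool.
    dagster instance concurrency set "$pool" "$LIMIT" || true
  else
    echo "dagster CLI not found; would run: dagster instance concurrency set $pool $LIMIT"
  fi
done
```

Each pool is limited independently, so a running sales snapshot does not block an inventory snapshot.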