Review dbt project structure

With our base tables populated, we can turn to the dbt project. The analytics/models directory, located alongside our Dagster code in the src/project_dbt directory, contains a small but representative dbt project:

analytics
├── marts
│   ├── daily_metrics.sql
│   └── location_metrics.sql
├── sources
│   └── raw_taxis.yml
└── staging
    ├── staging.yml
    ├── stg_trips.sql
    └── stg_zones.sql

It’s usually simpler to keep your dbt project in the same repository as your Dagster code. This makes it easier for Dagster to parse the metadata files generated by dbt commands, and enables tighter integration between the two tools.
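As a rough illustration of the metadata involved, the sketch below reads a dbt `manifest.json` and lists each model with its materialization. It assumes the manifest layout dbt writes today (a top-level `nodes` mapping whose entries carry `resource_type`, `name`, and `config.materialized`); the helper name `list_models` is hypothetical, and Dagster's own integration does considerably more than this.

```python
import json
from pathlib import Path


def list_models(manifest_path):
    """Return (model name, materialization) pairs found in a dbt manifest.json.

    Hypothetical helper for illustration only; not part of Dagster or dbt.
    """
    manifest = json.loads(Path(manifest_path).read_text())
    models = []
    for node in manifest.get("nodes", {}).values():
        # The manifest also contains tests, seeds, and so on; keep only models.
        if node.get("resource_type") == "model":
            materialized = node.get("config", {}).get("materialized")
            models.append((node["name"], materialized))
    # Sort by model name so the output is stable.
    return sorted(models, key=lambda pair: pair[0])
```

After a dbt command such as `dbt parse` has run, calling `list_models` on the generated `target/manifest.json` (the default output location, assumed here) would return one pair per model in the project.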

dbt sources

This dbt project includes four models and two sources. The sources correspond to the taxi_zones and taxi_trips tables created in the previous step:

src/project_dbt/analytics/models/sources/raw_taxis.yml
version: 2

sources:
  - name: raw_taxis
    schema: main
    tables:
      - name: zones
      - name: trips
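To make the role of the source declaration concrete, here is a simplified sketch of how a reference such as `source('raw_taxis', 'trips')` maps onto a schema-qualified table. The `SOURCES` dict is a hand-copied mirror of raw_taxis.yml and `resolve_source` is a hypothetical helper; real dbt also handles the database name, quoting, and optional identifier overrides, which are ignored here.

```python
# Hand-copied mirror of raw_taxis.yml, for illustration only.
SOURCES = {
    "raw_taxis": {"schema": "main", "tables": ["zones", "trips"]},
}


def resolve_source(source_name, table_name):
    """Approximate the relation a dbt source reference points to: schema.table.

    Hypothetical helper; dbt's real resolution also covers database and quoting.
    """
    source = SOURCES[source_name]
    if table_name not in source["tables"]:
        raise KeyError(f"{table_name!r} is not declared in source {source_name!r}")
    return f"{source['schema']}.{table_name}"
```

Under these assumptions, `resolve_source("raw_taxis", "trips")` gives `main.trips`, while a table that is not declared in the YAML raises an error, which is roughly the safety net a source declaration provides.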

The connection for the dbt project is defined in profiles.yml. Here, we point it to the same DuckDB storage layer used by our Dagster DuckDB resource, /var/tmp/duckdb.db:

src/project_dbt/analytics/profiles.yml
dbt_project:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: '{{ env_var("DUCKDB_DATABASE", "/var/tmp/duckdb.db") }}'
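The `env_var` call in the profile falls back to the default path when `DUCKDB_DATABASE` is unset. A minimal Python sketch of the same lookup, useful if you want your own code to resolve the path the same way, might look like this; `resolve_duckdb_path` and `DEFAULT_DUCKDB_PATH` are hypothetical names, not part of dbt or Dagster.

```python
import os

# Default copied from profiles.yml above.
DEFAULT_DUCKDB_PATH = "/var/tmp/duckdb.db"


def resolve_duckdb_path(environ=None):
    """Return DUCKDB_DATABASE if it is set, otherwise the default path.

    Hypothetical helper mirroring env_var("DUCKDB_DATABASE", "<default>").
    """
    # Allow a custom mapping to be passed in so the lookup is easy to test.
    environ = os.environ if environ is None else environ
    return environ.get("DUCKDB_DATABASE", DEFAULT_DUCKDB_PATH)
```

Resolving the path in one place like this is one way to keep dbt and any other code pointed at the same database file.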

dbt models

In addition to the sources, the project defines several models that capture the business logic. Most models are materialized as tables, one is incremental, and one includes lightweight tests (configured in staging.yml).

Model             Materialization  Tests
daily_metrics     Incremental      No
location_metrics  Table            No
stg_trips         Table            No
stg_zones         Table            Yes

The SQL inside each model isn’t the focus here. Instead, this compact project highlights many of the core dbt patterns you’ll encounter: staging layers, source declarations, different materializations, and basic testing.

Next, we’ll turn these dbt models into Dagster assets so they can be orchestrated, tracked, and monitored as part of your data pipeline.
