dbt + Dagster

Using dbt Cloud? Check out the dbt Cloud with Dagster guide!

Dagster orchestrates dbt alongside other technologies, so you can combine dbt with Spark, Python, and other tools in a single workflow. Dagster's software-defined asset abstractions make it simple to define data assets that depend on specific dbt models, or to define the computation that produces the sources your dbt models depend on. You could, for example:

  • Run your dbt models after ingesting data into your data warehouse
  • Selectively materialize dbt models and their dependencies

Dagster has built-in support for loading dbt models, seeds, and snapshots as software-defined assets, enabling you to:

  • Visualize and orchestrate a graph of dbt assets, and execute them with a single dbt invocation
  • Version your dbt models by their defining SQL code, allowing Dagster to indicate when a model has changed
  • View detailed historical metadata and logs for each asset
  • Define Python computations that depend directly on tables updated using dbt
  • Track data lineage through dbt and your other tools

Using dbt with Dagster

Dagster’s software-defined assets (SDAs) bear several similarities to dbt models. A software-defined asset contains an asset key, a set of upstream asset keys, and an operation that is responsible for computing the asset from its upstream dependencies. Models defined in a dbt project are similar to Dagster SDAs in that:

  • The asset key for a dbt model is (by default) the name of the model.
  • The upstream dependencies of a dbt model are defined with ref or source calls within the model's definition.
  • The computation that produces the asset from its upstream dependencies is the SQL within the model's definition.
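
To make the mapping in the list above concrete, here is a minimal standard-library sketch of how a dbt model's file name and SQL could be read as the three parts of an asset. It is an illustration only, not how the dagster-dbt library works internally. The names AssetSketch and asset_from_dbt_model are hypothetical, the regular expressions handle only the simplest single-argument ref and two-argument source calls, and keying a source by its table name alone is a simplifying assumption.

```python
import re
from dataclasses import dataclass
from pathlib import PurePath

# A single- or double-quoted Jinja string argument, captured without quotes.
_Q = r"['\"]([^'\"]+)['\"]"

# Simplified patterns: {{ ref('model') }} and {{ source('src', 'table') }}.
_REF = re.compile(r"\{\{\s*ref\(\s*" + _Q + r"\s*\)\s*\}\}")
_SOURCE = re.compile(r"\{\{\s*source\(\s*" + _Q + r"\s*,\s*" + _Q + r"\s*\)\s*\}\}")


@dataclass(frozen=True)
class AssetSketch:
    """Hypothetical stand-in for the three parts of a software-defined asset."""

    key: str                  # the asset key
    upstream_keys: frozenset  # keys of the assets this one depends on
    computation: str          # for a dbt model, the SQL itself


def asset_from_dbt_model(path: str, sql: str) -> AssetSketch:
    # By default, the model (and so the asset key) is named after the file.
    key = PurePath(path).stem
    # Upstream dependencies come from ref(...) and source(...) calls.
    upstream = set(_REF.findall(sql))
    # Assumption: a source is keyed by its table name only.
    upstream.update(table for _source_name, table in _SOURCE.findall(sql))
    return AssetSketch(key=key, upstream_keys=frozenset(upstream), computation=sql.strip())


if __name__ == "__main__":
    sketch = asset_from_dbt_model(
        "models/orders.sql",
        "select * from {{ ref('raw_orders') }}",
    )
    print(sketch.key, sorted(sketch.upstream_keys))
```

The file name supplies the key, the ref and source calls supply the upstream keys, and the SQL text is the computation, which is the same decomposition the list describes.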

These similarities make it natural to interact with dbt models as SDAs. Let’s take a look at a dbt model and an SDA, in code:

Comparison of a dbt model and Dagster asset in code

Here's what's happening in this example:

  • The first code block is a dbt model
    • As dbt models are named using file names, this model is named orders
    • The data for this model comes from a dependency named raw_orders
  • The second code block is a Dagster asset
    • The asset key corresponds to the name of the dbt model, orders
    • raw_orders is provided as an argument to the asset, defining it as a dependency

To learn how to load dbt models into Dagster as assets, check out the tutorial below or the quick version in the dagster-dbt reference.

dbt and Dagster software-defined assets tutorial

In this tutorial, we'll walk you through integrating dbt with Dagster using dbt's example jaffle shop project, the dagster-dbt library, and a data warehouse, such as DuckDB.

By the end of the tutorial, you'll have a working dbt and Dagster integration and a handful of materialized Dagster assets, including a plotly chart powered by data computed from your dbt models.