This quickstart gets your dbt project up and running with Dagster. By the end of this guide, you'll have an integrated Dagster and dbt project that you can view in the Dagster UI.
Prerequisites
To complete the steps in this guide, you'll need:
A valid dbt project
A dbt project must contain the dbt_project.yml and profiles.yml files
Note: We strongly recommend installing Dagster inside a Python virtualenv. Refer to the Dagster installation docs for more information.
Install dbt, Dagster, and the Dagster webserver by running the following:
pip install dagster-dbt dagster-webserver
The dagster-dbt library installs both dbt-core and dagster as dependencies. Refer to the dbt and Dagster installation docs for more information.
Your dbt project may require additional packages. In most cases, you'll also need to install the library that supports your dbt adapter: for instance, dbt-duckdb if you are using DuckDB.
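Before wiring anything into Dagster, it can help to confirm that the dbt project directory contains the two files listed in the prerequisites. The following is a minimal sketch using only the Python standard library; the helper name missing_dbt_files and the ./my_dbt_project path are illustrative assumptions, not part of dbt or Dagster.

```python
from pathlib import Path

# The two files this guide says a valid dbt project must contain.
REQUIRED_FILES = ("dbt_project.yml", "profiles.yml")


def missing_dbt_files(project_dir: Path) -> list[str]:
    """Return the required file names that are absent from project_dir."""
    return [name for name in REQUIRED_FILES if not (project_dir / name).is_file()]


# Hypothetical project location; adjust it to wherever your dbt project lives.
missing = missing_dbt_files(Path("./my_dbt_project"))
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("Both required files are present.")
```

This check only looks for the files by name. It does not validate their contents or confirm that the adapter named in profiles.yml is installed.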
After creating a single file, you can run your dbt project with Dagster. For this example, consider a basic use case: you want to represent your dbt models as Dagster assets and run them daily at midnight.
With your text editor of choice, create a Python file in the directory that contains your dbt project directory and add the following code. Because this file contains all the Dagster definitions required for your code location, we recommend naming it definitions.py.
The following code assumes that your Python file and dbt project directory are adjacent in the same directory. If that's not the case, make sure to update the RELATIVE_PATH_TO_MY_DBT_PROJECT constant so that it points to your dbt project.
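To see how a relative path such as RELATIVE_PATH_TO_MY_DBT_PROJECT turns into an absolute project directory, the sketch below resolves it against the location of a Python file using the standard pathlib module. The function name resolve_dbt_project_dir and the example file location are hypothetical and used only for illustration; in your own file you would typically pass Path(__file__).

```python
from pathlib import Path

RELATIVE_PATH_TO_MY_DBT_PROJECT = "./my_dbt_project"


def resolve_dbt_project_dir(python_file: Path, relative_path: str) -> Path:
    """Resolve relative_path against the directory that contains python_file."""
    # Joining ".." onto the file path steps from the file up to its directory,
    # and resolve() then collapses the ".." and "." segments.
    return python_file.joinpath("..", relative_path).resolve()


# Hypothetical location of definitions.py, used only for illustration.
example_file = Path("/work/analytics/definitions.py")
project_dir = resolve_dbt_project_dir(example_file, RELATIVE_PATH_TO_MY_DBT_PROJECT)
print(project_dir)
```

If your dbt project lives elsewhere, for example one level up, changing the constant to "../my_dbt_project" is enough. The resolution logic stays the same.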
This approach uses the dagster-dbt CLI's project scaffold command to create a new Dagster project and wrap it around a dbt project. The command requires two arguments:
--project-name - The name of the Dagster project to be created. In our example, this will be my_dagster_project.
--dbt-project-dir - The path to the dbt project. In our example, this will be ./my_dbt_project, which means our current location is in the directory where my_dbt_project is located.
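Using the example values above, the command is:
dagster-dbt project scaffold --project-name my_dagster_project --dbt-project-dir ./my_dbt_project
This command creates a new directory called my_dagster_project/ inside the current directory. The new my_dagster_project/ directory contains a set of files that define a Dagster project that loads the dbt project in ./my_dbt_project.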
Add the new objects to your Dagster project's Definitions object
Note: This example assumes that your existing Dagster project includes both assets.py and definitions.py files, among other required files like setup.py and pyproject.toml.
Change directories to the Dagster project directory:
cd my_dagster_project/
Create a Python file named project.py and add the following code:
from pathlib import Path

from dagster_dbt import DbtProject

RELATIVE_PATH_TO_MY_DBT_PROJECT = "./my_dbt_project"

my_project = DbtProject(
    project_dir=Path(__file__).joinpath("..", RELATIVE_PATH_TO_MY_DBT_PROJECT).resolve(),
)
my_project.prepare_if_dev()
The DbtProject object is a representation of the dbt project that assists with manifest.json preparation.
In your project's assets.py file, add the following code:
from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets

from .project import my_project


@dbt_assets(manifest=my_project.manifest_path)
def my_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()
The @dbt_assets decorator allows Dagster to create a definition for how to compute a set of dbt resources, described by a manifest.json.
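To get a feel for what a manifest.json describes, you can inspect one without Dagster. The following is a minimal sketch that assumes a tiny hand-written stand-in for the manifest: the project and model names are hypothetical, and a real manifest.json contains many more fields and resource types. It lists the model nodes, which is only a simplified picture of the information dagster-dbt reads, not its actual implementation.

```python
import json

# A hand-written stand-in for manifest.json (hypothetical names, heavily trimmed).
EXAMPLE_MANIFEST = json.loads(
    """
    {
      "nodes": {
        "model.my_dbt_project.customers": {"resource_type": "model", "name": "customers"},
        "model.my_dbt_project.orders": {"resource_type": "model", "name": "orders"},
        "test.my_dbt_project.not_null_orders_id": {"resource_type": "test", "name": "not_null_orders_id"}
      }
    }
    """
)


def model_names(manifest: dict) -> list[str]:
    """Return the sorted names of all nodes whose resource_type is 'model'."""
    return sorted(
        node["name"]
        for node in manifest["nodes"].values()
        if node["resource_type"] == "model"
    )


print(model_names(EXAMPLE_MANIFEST))  # prints ['customers', 'orders']
```

To try this on a real project, load the file with json.loads(Path(...).read_text()), pointing at the manifest.json that dbt writes to your project's target directory.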
In your project's definitions.py file, update the Definitions object to include the newly created objects:
from dagster import Definitions
from dagster_dbt import DbtCliResource

from .assets import my_dbt_assets
from .project import my_project

defs = Definitions(
    ...,
    assets=[
        ...,
        # Add the dbt assets alongside your other assets
        my_dbt_assets,
    ],
    resources={
        ...: ...,
        # Add the dbt resource alongside your other resources
        "dbt": DbtCliResource(project_dir=my_project),
    },
)
With these changes, your existing Dagster project is ready to run your dbt project.
Locate the Dagster file containing your definitions. If you created a single Dagster file in the previous section (Option 1), this file will be definitions.py.
To start Dagster's UI, run the following:
dagster dev -f definitions.py
This produces output similar to:
Serving dagster-webserver on http://127.0.0.1:3000 in process 70635
If you created a new Dagster project or used an existing one instead, change directories to the Dagster project directory:
cd my_dagster_project/
To start Dagster's UI, run the following:
dagster dev
This produces output similar to:
Serving dagster-webserver on http://127.0.0.1:3000 in process 70635
In your browser, navigate to http://127.0.0.1:3000. The page displays the asset graph for the job created by the schedule definition.
In Dagster, running a dbt model corresponds to materializing an asset. The schedule definition included in your Dagster project's code location (Definitions object) will materialize the assets at its next cron tick.
Assets can also be materialized manually by clicking the Materialize all button near the top right corner of the page.
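To estimate when the next scheduled materialization will happen, you can compute the next tick yourself. The sketch below assumes the schedule is the daily-at-midnight one this guide describes, which corresponds to the cron expression 0 0 * * *. The helper next_midnight is a hypothetical illustration, not a Dagster API, and it ignores time zones and daylight saving changes.

```python
from datetime import datetime, timedelta


def next_midnight(now: datetime) -> datetime:
    """Return the first midnight strictly after now (next tick of '0 0 * * *')."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_today + timedelta(days=1)


# Example: at 14:30 on 17 May 2024, the next tick falls at the start of 18 May.
print(next_midnight(datetime(2024, 5, 17, 14, 30)))  # prints 2024-05-18 00:00:00
```

If you don't want to wait for that tick, the Materialize all button described above runs the same assets immediately.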