Build your first ETL pipeline

In this tutorial, you'll build a full ETL pipeline with Dagster that:

- Ingests data into DuckDB
- Transforms data into reports with dbt
- Runs scheduled reports automatically
- Generates one-time reports on demand
- Visualizes the data with Evidence

You will learn to:

- Set up a Dagster project with the recommended project structure
- Integrate with other tools
- Create and materialize assets and dependencies
- Ensure data quality with asset checks
- Create and materialize partitioned assets
- Automate the pipeline
- Create and materialize assets with sensors

Prerequisites

To follow the steps in this tutorial, you'll need:

- Python 3.9+ and uv installed. For more information, see the Installation guide.
- Familiarity with Python and SQL.
- A basic understanding of data pipelines and the extract, transform, and load (ETL) process.
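
As a quick sanity check before scaffolding, the sketch below tests the two tool prerequisites from Python. It is an illustration rather than part of the tutorial: the function name check_prerequisites is hypothetical, and the check assumes that uv, if installed, is discoverable on your PATH.

```python
import shutil
import sys


def check_prerequisites(min_version=(3, 9)):
    """Return a dict mapping each prerequisite to True (satisfied) or False."""
    return {
        # Compare only (major, minor) of the running interpreter.
        "python": sys.version_info[:2] >= min_version,
        # shutil.which returns None when the executable is not on PATH.
        "uv": shutil.which("uv") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

The check only covers the interpreter that runs the script, so run it with the Python you intend to use for the project.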

Set up your Dagster project

Open your terminal and scaffold a new project with uv:

uvx create-dagster project etl-tutorial

Creating a Dagster project at <YOUR PATH>/etl-tutorial.
Scaffolded files for Dagster project at <YOUR PATH>/etl-tutorial.
...

Change directory into your new project:
```
cd etl-tutorial
```
Activate the project virtual environment:
- macOS: source .venv/bin/activate
- Windows: .venv\Scripts\activate
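
If you're unsure whether activation worked, one minimal check is to compare Python's prefix paths. This assumes a standard venv-style environment such as the .venv directory above, and the helper name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should end in the project's .venv directory.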
To make sure Dagster and its dependencies were installed correctly, start the Dagster webserver:
```
dg dev
```
In your browser, navigate to http://127.0.0.1:3000
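
If the page doesn't load, it can help to check from a second terminal whether anything is answering at that address. The following sketch uses only the Python standard library. The function name webserver_is_up is made up for this example, and it only tests that some HTTP server responds, not that the server is Dagster.

```python
import urllib.error
import urllib.request


def webserver_is_up(url="http://127.0.0.1:3000", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # which is what we want for a server on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False
```

Calling webserver_is_up() with no arguments checks the address used in this tutorial. A False result while dg dev is still starting is expected, so retry after a few seconds.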

At this point the project is empty. You'll add to it throughout the tutorial.

Next steps

Continue this tutorial with Extract data.
