Components ETL pipeline tutorial
This feature is in preview and under active development, so expect API changes and feature gaps. To report issues or give feedback, use the #dg-components channel in our Slack.
Setup
1. Install project dependencies
First, install duckdb for a local database and tree to visualize project structure:
- Mac
- Windows
- Linux
tree is optional and is only used to produce a nicely formatted representation of the project structure on the command line. You can also use find, ls, dir, or any other directory listing command.
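If none of those commands is convenient, a short Python script can produce a comparable listing. The following is a minimal sketch, not part of dg or this tutorial's tooling; the function name render_tree and the set of skipped directories are illustrative assumptions.

```python
from pathlib import Path

# Directories left out of the listing; an illustrative choice, not a dg setting.
SKIPPED_NAMES = frozenset({".venv", ".git", "__pycache__"})


def render_tree(root: Path, skip: frozenset = SKIPPED_NAMES) -> str:
    """Return an indented listing of `root`, similar in spirit to `tree`."""
    lines = [root.name or str(root)]

    def walk(directory: Path, prefix: str) -> None:
        # Sort entries so the output is stable across runs and platforms.
        entries = sorted(p for p in directory.iterdir() if p.name not in skip)
        for index, entry in enumerate(entries):
            last = index == len(entries) - 1
            lines.append(f"{prefix}{'└── ' if last else '├── '}{entry.name}")
            # Recurse into real directories only, to avoid symlink loops.
            if entry.is_dir() and not entry.is_symlink():
                walk(entry, prefix + ("    " if last else "│   "))

    walk(root, "")
    return "\n".join(lines)
```

Calling print(render_tree(Path.cwd())) from the project root prints the listing while skipping the virtual environment directory.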
2. Scaffold a new project
After installing dependencies, scaffold a components-ready project. The flow for scaffolding a project will depend on your package manager/environment management strategy.
- uv
- pip
- Mac: brew install uv
- Windows: powershell -ExecutionPolicy ByPass -c 'irm https://astral.sh/uv/install.ps1 | iex'
- Linux: curl -LsSf https://astral.sh/uv/install.sh | sh
For more detailed uv installation instructions, see the uv docs.
Ensure you have dg installed globally as a uv tool:
uv tool install dagster-dg
Now run the command below. Answer yes when prompted to run uv sync after scaffolding:
dg init jaffle-platform
The dg init command builds a project at jaffle-platform. Running uv sync after scaffolding creates a virtual environment and installs the dependencies listed in pyproject.toml, along with jaffle-platform itself as an editable install. Now let's enter the directory and activate the virtual environment:
cd jaffle-platform && source .venv/bin/activate
Because pip does not support global installations, you will install dg inside your project virtual environment.
We'll create and enter our project directory, initialize and activate a virtual environment, and install the dagster-dg package into it:
mkdir jaffle-platform && cd jaffle-platform
python -m venv .venv
source .venv/bin/activate
pip install dagster-dg
The dg executable is now available via the activated virtual environment. Let's run dg init . to scaffold a new project. The . tells dg to scaffold the project in the current directory.
dg init .
Finally, install the newly created project package into the virtual environment as an editable install:
pip install -e .
To learn more about the files, directories, and default settings in a project scaffolded with dg init, see "Creating a project with components".
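Both flows end with an activated virtual environment, but dg itself lives in different places: as a global uv tool in the uv flow, and inside the virtual environment in the pip flow. The following sketch shows one way to check this from Python. The helper names are hypothetical and nothing here is part of dg.

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def executable_in_prefix(name: str, prefix: str = sys.prefix) -> Optional[bool]:
    """Report whether `name` on PATH lives under `prefix`; None if not on PATH."""
    found = shutil.which(name)
    if found is None:
        return None
    # Resolve both paths so symlinked locations compare consistently.
    return Path(prefix).resolve() in Path(found).resolve().parents
```

Run with the project's interpreter, executable_in_prefix("dg") should be True after the pip flow and False after the uv flow, since a uv tool is installed outside the project environment.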
Ingest data
1. Add the Sling component type to your environment
To ingest data, you must set up Sling. We can list the component types available to our project with dg list plugins. If we run this now, the Sling component won't appear, since the dagster package doesn't contain components for specific integrations (like Sling):
dg list plugins
Using /.../jaffle-platform/.venv/bin/dagster-components
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Plugin ┃ Objects ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━ ━━╇━━━━━━━━━━━━━━━━━━━━━┩ │
│ │ │ dagster.asset │ Create a │ [scaffold-target] │ │
│ │ │ │ definition for how │ │ │
│ │ │ │ to compute an │ │ │
│ │ │ │ asset. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼────────────────────┼─────────────────────┤ │
│ │ │ dagster.components.DefinitionsComponent │ An arbitrary set │ [component, │ │
│ │ │ │ of dagster │ scaffold-target] │ │
│ │ │ │ definitions. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼────────────────────┼─────────────────────┤ │
│ │ │ dagster.components.DefsFolderComponent │ A folder which may │ [component, │ │
│ │ │ │ contain multiple │ scaffold-target] │ │
│ │ │ │ submodules, each │ │ │
│ │ │ │ which define │ │ │
│ │ │ │ components. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼────────────────────┼─────────────────────┤ │
│ │ │ dagster.components.PipesSubprocessScriptCollectionComponent │ Assets that wrap │ [component, │ │
│ │ │ │ Python scripts │ scaffold-target] │ │
│ │ │ │ executed with │ │ │
│ │ │ │ Dagster's │ │ │
│ │ │ │ PipesSubprocessCl… │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼────────────────────┼─────────────────────┤ │
│ │ │ dagster.schedule │ Creates a schedule │ [scaffold-target] │ │
│ │ │ │ following the │ │ │
│ │ │ │ provided cron │ │ │
│ │ │ │ schedule and │ │ │
│ │ │ │ requests runs for │ │ │
│ │ │ │ the provided job. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼────────────────────┼─────────────────────┤ │
│ │ │ dagster.sensor │ Creates a sensor │ [scaffold-target] │ │
│ │ │ │ where the │ │ │
│ │ │ │ decorated function │ │ │
│ │ │ │ is used as the │ │ │
│ │ │ │ sensor's │ │ │
│ │ │ │ evaluation │ │ │
│ │ │ │ function. │ │ │
│ │ └─────────────────────────────────────────────────────────────┴────────────────────┴─────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
To make the Sling component available in your environment, install the dagster-sling package:
- uv: uv add dagster-sling
- pip: pip install dagster-sling
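Before listing plugins again, you can quickly confirm that the package landed in the active environment. This is a minimal sketch using only the standard library. It checks that the dagster_sling module is importable, which is a precondition for the component appearing, and is not a substitute for dg list plugins. The helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # dagster_sling is the import name of the dagster-sling package.
    print("dagster_sling available:", is_installed("dagster_sling"))
```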
2. Confirm availability of the Sling component type
To confirm that the dagster_sling.SlingReplicationCollectionComponent component type is now available, run the dg list plugins command again:
dg list plugins
Using /.../jaffle-platform/.venv/bin/dagster-components
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Plugin ┃ Objects ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩ │
│ │ │ dagster.asset │ Create a │ [scaffold-targ… │ │
│ │ │ │ definition for │ │ │
│ │ │ │ how to compute │ │ │
│ │ │ │ an asset. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────────┼─────────────────┤ │
│ │ │ dagster.components.DefinitionsComponent │ An arbitrary set │ [component, │ │
│ │ │ │ of dagster │ scaffold-targe… │ │
│ │ │ │ definitions. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────────┼─────────────────┤ │
│ │ │ dagster.components.DefsFolderComponent │ A folder which │ [component, │ │
│ │ │ │ may contain │ scaffold-targe… │ │
│ │ │ │ multiple │ │ │
│ │ │ │ submodules, each │ │ │
│ │ │ │ which define │ │ │
│ │ │ │ components. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────────┼─────────────────┤ │
│ │ │ dagster.components.PipesSubprocessScriptCollectionComponent │ Assets that wrap │ [component, │ │
│ │ │ │ Python scripts │ scaffold-targe… │ │
│ │ │ │ executed with │ │ │
│ │ │ │ Dagster's │ │ │
│ │ │ │ PipesSubprocess… │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────────┼─────────────────┤ │
│ │ │ dagster.schedule │ Creates a │ [scaffold-targ… │ │
│ │ │ │ schedule │ │ │
│ │ │ │ following the │ │ │
│ │ │ │ provided cron │ │ │
│ │ │ │ schedule and │ │ │
│ │ │ │ requests runs │ │ │
│ │ │ │ for the provided │ │ │
│ │ │ │ job. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────────┼─────────────────┤ │
│ │ │ dagster.sensor │ Creates a sensor │ [scaffold-targ… │ │
│ │ │ │ where the │ │ │
│ │ │ │ decorated │ │ │
│ │ │ │ function is used │ │ │
│ │ │ │ as the sensor's │ │ │
│ │ │ │ evaluation │ │ │
│ │ │ │ function. │ │ │
│ │ └─────────────────────────────────────────────────────────────┴──────────────────┴─────────────────┘ │
│ dagster_sling │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │