Dagster & Sling with components
dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.
The dagster-sling library provides a SlingReplicationCollectionComponent, which can be used to represent a collection of Sling replications as assets in Dagster.
1. Prepare a Dagster project
To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:
create-dagster project my-project && cd my-project/src
Activate the project virtual environment:
source ../.venv/bin/activate
Finally, add the dagster-sling library to the project. We will also add duckdb to use as a destination for our Sling replication.
uv add dagster-sling duckdb
2. Scaffold a Sling component
Now that you have a Dagster project, you can scaffold a Sling component:
dg scaffold defs dagster_sling.SlingReplicationCollectionComponent sling_ingest
Creating a component at /.../my-project/src/my_project/defs/sling_ingest.
The scaffold call will generate a defs.yaml file and an unpopulated Sling replication.yaml file:
tree my_project/defs
my_project/defs
├── __init__.py
└── sling_ingest
    ├── defs.yaml
    └── replication.yaml

2 directories, 3 files
In its scaffolded form, the defs.yaml file contains the configuration for your Sling workspace:
type: dagster_sling.SlingReplicationCollectionComponent
attributes:
  replications:
    - path: replication.yaml
The generated replication.yaml file is a template, which still needs to be configured:
source: {}
streams: {}
target: {}
3. Configure Sling replications
In the defs.yaml file, you can directly specify a list of Sling connections to use in your replications. Here, you can specify a connection to DuckDB:
type: dagster_sling.SlingReplicationCollectionComponent
attributes:
  sling:
    connections:
      - name: DUCKDB
        type: duckdb
        instance: /tmp/my_project.duckdb
  replications:
    - path: ./replication.yaml
For this example replication, we will ingest a set of CSV files to DuckDB. You can use curl to download some sample data:
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv &&
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_orders.csv &&
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_payments.csv
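Before configuring the replication, it can help to confirm that the three downloads landed in the current directory and that each file starts with a header row. The following is a minimal sketch using only the Python standard library; the `check_sample_files` helper is a hypothetical name introduced for illustration and is not part of Dagster or Sling.

```python
import csv
from pathlib import Path

# File names taken from the curl commands above.
SAMPLE_FILES = ["raw_customers.csv", "raw_orders.csv", "raw_payments.csv"]


def check_sample_files(directory="."):
    """Return a dict mapping each sample file name to its header columns.

    Raises FileNotFoundError if a file is missing and ValueError if a file
    has no header row. (Hypothetical helper, for illustration only.)
    """
    headers = {}
    for name in SAMPLE_FILES:
        path = Path(directory) / name
        if not path.is_file():
            raise FileNotFoundError(f"missing sample file: {path}")
        with path.open(newline="") as f:
            # Read only the first row; the rest of the file is not needed here.
            first_row = next(csv.reader(f), None)
        if not first_row:
            raise ValueError(f"sample file has no header row: {path}")
        headers[name] = first_row
    return headers
```

Calling `check_sample_files()` from the directory where you ran curl returns the column names of each file, or raises an error naming the file that is missing or empty.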
Next, you can configure Sling replications for each CSV file in replication.yaml:
source: LOCAL
target: DUCKDB
defaults:
  mode: full-refresh
  object: "{stream_table}"
streams:
  file://raw_customers.csv:
    object: "main.raw_customers"
  file://raw_orders.csv:
    object: "main.raw_orders"
  file://raw_payments.csv:
    object: "main.raw_payments"
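If you have more than a handful of CSV files, writing one stream entry per file by hand becomes repetitive. As a rough sketch, the replication file above could be generated from a list of file names. The `build_replication_yaml` function and its `schema` parameter are hypothetical names introduced here; the sketch assumes every file is loaded into a table named after the file, in a single schema.

```python
def build_replication_yaml(csv_files, schema="main"):
    """Render replication.yaml text for local CSV files loaded into DuckDB.

    Hypothetical helper: assumes each file maps to <schema>.<file name without extension>.
    """
    lines = [
        "source: LOCAL",
        "target: DUCKDB",
        "defaults:",
        "  mode: full-refresh",
        '  object: "{stream_table}"',
        "streams:",
    ]
    for filename in csv_files:
        table = filename.rsplit(".", 1)[0]  # raw_customers.csv -> raw_customers
        lines.append(f"  file://{filename}:")
        lines.append(f'    object: "{schema}.{table}"')
    return "\n".join(lines) + "\n"


# Print the configuration for the three sample files used in this guide.
print(build_replication_yaml(["raw_customers.csv", "raw_orders.csv", "raw_payments.csv"]))
```

The output can be redirected into `replication.yaml`. Plain string formatting is used instead of a YAML library to keep the sketch dependency-free, so it is only suitable for simple file names like the ones here.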
Our newly configured Sling component will produce an asset for each replicated file:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                            ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key                       ┃ Group   ┃ Deps                   ┃ Kinds ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ file_raw_customers/csv    │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ file_raw_orders/csv       │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ file_raw_payments/csv     │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_customers │ default │ file_raw_customers/csv │ sling │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_orders    │ default │ file_raw_orders/csv    │ sling │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_payments  │ default │ file_raw_payments/csv  │ sling │             │ │
│         │ └───────────────────────────┴─────────┴────────────────────────┴───────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────┘
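To see how the keys in this listing relate to the entries in replication.yaml, the sketch below reproduces the six keys from the stream names and target objects. It is an illustration inferred from the listing alone, not the dagster-sling implementation: the function names and the exact substitution rules are assumptions, and other stream names may be handled differently by the library.

```python
def source_asset_key(stream_name):
    """Approximate the upstream key shown for a stream, e.g. file_raw_customers/csv."""
    # Assumption: "://" collapses to "_" and "." separates the key parts.
    return "/".join(stream_name.replace("://", "_").split("."))


def target_asset_key(target_object, prefix="target"):
    """Approximate the key shown for a replicated table, e.g. target/main/raw_customers."""
    # Assumption: the key is a fixed prefix followed by the schema and table name.
    return "/".join([prefix, *target_object.split(".")])


# Stream name -> target object, as configured in replication.yaml.
streams = {
    "file://raw_customers.csv": "main.raw_customers",
    "file://raw_orders.csv": "main.raw_orders",
    "file://raw_payments.csv": "main.raw_payments",
}

for stream, target_object in streams.items():
    print(f"{source_asset_key(stream)} -> {target_asset_key(target_object)}")
```

Each printed pair corresponds to one row in the Deps and Key columns: the upstream file asset on the left and the replicated table asset that depends on it on the right.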
4. Customize Sling assets
Properties of the assets emitted by each replication can be customized in the defs.yaml file using the translation key:
type: dagster_sling.SlingReplicationCollectionComponent
attributes:
  sling:
    connections:
      - name: DUCKDB
        type: duckdb
        instance: /tmp/my_project.duckdb
  replications:
    - path: ./replication.yaml
      translation:
        group_name: sling_data
        description: "Loads data from Sling replication {{ stream_definition.name }}"
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│         │ ┃ Key                       ┃ Group      ┃ Deps                   ┃ Kinds ┃ Description                  ┃ │
│         │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│         │ │ file_raw_customers/csv    │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ file_raw_orders/csv       │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ file_raw_payments/csv     │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_customers │ sling_data │ file_raw_customers/csv │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_customers.csv     │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_orders    │ sling_data │ file_raw_orders/csv    │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_orders.csv        │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_payments  │ sling_data │ file_raw_payments/csv  │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_payments.csv      │ │
│         │ └───────────────────────────┴────────────┴────────────────────────┴───────┴──────────────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
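The description values in this listing come from filling the `{{ stream_definition.name }}` placeholder once per stream. The sketch below mimics that substitution with a small regular-expression renderer so you can preview a template before putting it in defs.yaml. It is an approximation under stated assumptions: the `render` function is hypothetical, Dagster's own template engine may support more syntax than a dotted lookup, and the dictionary used for `stream_definition` is a stand-in that carries only a `name`.

```python
import re

DESCRIPTION_TEMPLATE = "Loads data from Sling replication {{ stream_definition.name }}"


def render(template, **scope):
    """Replace each {{ dotted.path }} placeholder with the value it names.

    Hypothetical helper: supports only dotted lookups into dicts or attributes.
    """

    def lookup(match):
        root, *parts = match.group(1).split(".")
        value = scope[root]
        for part in parts:
            # Accept either mapping keys or object attributes for each step.
            value = value[part] if isinstance(value, dict) else getattr(value, part)
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


for name in ["file://raw_customers.csv", "file://raw_orders.csv", "file://raw_payments.csv"]:
    stream_definition = {"name": name}  # stand-in for the real stream definition
    print(render(DESCRIPTION_TEMPLATE, stream_definition=stream_definition))
```

Each printed line matches the Description column for the corresponding replicated table, which in the listing is wrapped across three rows by the terminal width.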