Dagster & Sling with components

info

dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.

The dagster-sling library provides a SlingReplicationCollectionComponent, which represents a collection of Sling replications as assets in Dagster.

1. Prepare a Dagster project

To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:

create-dagster project my-project && cd my-project/src

Activate the project virtual environment:

source ../.venv/bin/activate

Finally, add the dagster-sling library to the project. We will also add duckdb to use as a destination for our Sling replication.

uv add dagster-sling duckdb

2. Scaffold a Sling component

Now that you have a Dagster project, you can scaffold a Sling component:

dg scaffold defs dagster_sling.SlingReplicationCollectionComponent sling_ingest
Creating a component at /.../my-project/src/my_project/defs/sling_ingest.

The scaffold call generates a defs.yaml file and an unpopulated Sling replication.yaml file:

tree my_project/defs
my_project/defs
├── __init__.py
└── sling_ingest
    ├── defs.yaml
    └── replication.yaml

2 directories, 3 files

In its scaffolded form, the defs.yaml file contains the configuration for your Sling workspace:

my_project/defs/sling_ingest/defs.yaml
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  replications:
    - path: replication.yaml

The generated replication.yaml file is an empty template that still needs to be configured:

my_project/defs/sling_ingest/replication.yaml
source: {}
streams: {}
target: {}

3. Configure Sling replications

In the defs.yaml file, you can directly specify a list of Sling connections for your replications to use. Here, we define a connection to DuckDB:

my_project/defs/sling_ingest/defs.yaml
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  sling:
    connections:
      - name: DUCKDB
        type: duckdb
        instance: /tmp/my_project.duckdb
  replications:
    - path: ./replication.yaml

For this example replication, we will ingest a set of CSV files to DuckDB. You can use curl to download some sample data:

curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv &&
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_orders.csv &&
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_payments.csv

Next, you can configure Sling replications for each CSV file in replication.yaml:

my_project/defs/sling_ingest/replication.yaml
source: LOCAL
target: DUCKDB

defaults:
  mode: full-refresh
  object: "{stream_table}"

streams:
  file://raw_customers.csv:
    object: "main.raw_customers"
  file://raw_orders.csv:
    object: "main.raw_orders"
  file://raw_payments.csv:
    object: "main.raw_payments"
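As a rough illustration of how the defaults block relates to the per-stream settings above, each stream can be thought of as inheriting the default values and overriding any key it sets itself. The sketch below models this with plain dictionaries standing in for the parsed YAML. The helper effective_stream_config is hypothetical (it is not part of Sling or dagster-sling), and the shallow merge is an assumption about the behavior rather than Sling's actual implementation.

```python
# Plain dicts standing in for the parsed replication.yaml shown above.
replication = {
    "source": "LOCAL",
    "target": "DUCKDB",
    "defaults": {"mode": "full-refresh", "object": "{stream_table}"},
    "streams": {
        "file://raw_customers.csv": {"object": "main.raw_customers"},
        "file://raw_orders.csv": {"object": "main.raw_orders"},
        "file://raw_payments.csv": {"object": "main.raw_payments"},
    },
}


def effective_stream_config(replication: dict) -> dict:
    """Hypothetical helper: merge `defaults` into every stream (assumed shallow merge)."""
    defaults = replication.get("defaults", {})
    return {
        name: {**defaults, **(overrides or {})}  # stream keys win over defaults
        for name, overrides in replication["streams"].items()
    }


effective = effective_stream_config(replication)
for name, config in effective.items():
    print(f"{name}: mode={config['mode']}, object={config['object']}")
```

Under this model, every stream runs in full-refresh mode, while the explicit object values replace the "{stream_table}" default.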

Our newly configured Sling component will produce an asset for each replicated file:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                            ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key                       ┃ Group   ┃ Deps                   ┃ Kinds ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ file_raw_customers/csv    │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ file_raw_orders/csv       │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ file_raw_payments/csv     │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_customers │ default │ file_raw_customers/csv │ sling │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_orders    │ default │ file_raw_orders/csv    │ sling │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_payments  │ default │ file_raw_payments/csv  │ sling │             │ │
│         │ └───────────────────────────┴─────────┴────────────────────────┴───────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────┘
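To make the relationship between replication.yaml and the keys in this listing easier to see, here is a minimal sketch that reproduces the displayed keys from the stream names and object values. It is derived only from the output above and is not dagster-sling's actual translation logic. The function names, the character-collapsing rule, and the "target" prefix are assumptions chosen to match the listing.

```python
import re


def source_asset_key(stream_name: str) -> str:
    """Approximate the upstream key shown for a stream (assumed rule, see lead-in)."""
    # Collapse each run of characters other than letters, digits, "_" and "." into "_".
    cleaned = re.sub(r"[^a-zA-Z0-9_.]+", "_", stream_name)
    # Each dot-separated piece becomes one component of the key.
    return "/".join(cleaned.split("."))


def target_asset_key(object_name: str, prefix: str = "target") -> str:
    """Approximate the key shown for a replicated table such as main.raw_orders."""
    return "/".join([prefix, *object_name.split(".")])


streams = {
    "file://raw_customers.csv": "main.raw_customers",
    "file://raw_orders.csv": "main.raw_orders",
    "file://raw_payments.csv": "main.raw_payments",
}

for stream_name, object_name in streams.items():
    print(f"{source_asset_key(stream_name)} -> {target_asset_key(object_name)}")
```

Each printed pair corresponds to one row of the Deps and Key columns for the replicated tables.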

4. Customize Sling assets

Properties of the assets emitted by each replication can be customized in the defs.yaml file using the translation key:

my_project/defs/sling_ingest/defs.yaml
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  sling:
    connections:
      - name: DUCKDB
        type: duckdb
        instance: /tmp/my_project.duckdb
  replications:
    - path: ./replication.yaml
      translation:
        group_name: sling_data
        description: "Loads data from Sling replication {{ stream_definition.name }}"

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│         │ ┃ Key                       ┃ Group      ┃ Deps                   ┃ Kinds ┃ Description                  ┃ │
│         │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│         │ │ file_raw_customers/csv    │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ file_raw_orders/csv       │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ file_raw_payments/csv     │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_customers │ sling_data │ file_raw_customers/csv │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_customers.csv     │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_orders    │ sling_data │ file_raw_orders/csv    │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_orders.csv        │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_payments  │ sling_data │ file_raw_payments/csv  │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_payments.csv      │ │
│         │ └───────────────────────────┴────────────┴────────────────────────┴───────┴──────────────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
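The Description column shows the description template from defs.yaml expanded once per stream. The sketch below imitates that expansion with a small regex-based stand-in so the substitution is easy to follow. It is not the templating engine Dagster uses, and the render helper and the shape of the context dictionary are assumptions made for illustration.

```python
import re

TEMPLATE = "Loads data from Sling replication {{ stream_definition.name }}"


def render(template: str, context: dict) -> str:
    """Hypothetical stand-in: replace {{ dotted.path }} with a value from `context`."""

    def lookup(match: re.Match) -> str:
        value = context
        # Walk the dotted path, e.g. stream_definition -> name.
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


stream_names = [
    "file://raw_customers.csv",
    "file://raw_orders.csv",
    "file://raw_payments.csv",
]

descriptions = [
    render(TEMPLATE, {"stream_definition": {"name": name}}) for name in stream_names
]
for description in descriptions:
    print(description)
```

Each printed line matches the text wrapped across the Description cells in the listing above.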