# ELT pipeline with Sling and dlt
In this example, you'll build a full ELT pipeline that processes e-commerce order data alongside GitHub repository activity. The pipeline ingests production database tables (users, orders, products) and GitHub API data (issues, pull_requests) into DuckDB, then transforms them into analytics-ready summaries (customer_order_summary, product_revenue) — all using Dagster components.
The pipeline:
- Ingests database tables from Postgres into DuckDB using Sling
- Ingests API data from the GitHub API into DuckDB using dlt
- Registers both ingestion pipelines as Dagster assets using Dagster components
- Transforms the ingested data into analytics-ready tables using `TemplatedSqlComponent`
The entire pipeline — ingestion and transformation — is built with Dagster components. Instead of writing @asset functions by hand, you describe each stage in YAML and Dagster generates the asset graph. This keeps your pipeline declarative, consistent, and easy to extend.
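To give a flavor of the declarative style, here is a minimal sketch of what a component definition can look like, assuming the `dagster_sling.SlingReplicationCollectionComponent` component type and a sibling `replication.yaml` file (the actual files and component types you create later in this example may differ):

```yaml
# defs.yaml — registers a Sling replication as Dagster assets
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  replications:
    # Sling's own config file describing sources, targets, and streams
    - path: replication.yaml
```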
## Prerequisites
To follow the steps in this guide, you'll need:
- Python 3.10+ and `uv` installed. For more information, see the Installation guide.
- A GitHub personal access token (for the dlt GitHub ingestion).
- A running Postgres instance (for the Sling ingestion).
## Step 1: Set up your Dagster environment
First, set up a new Dagster project.
1. Clone the Dagster repo and navigate to the project:

   ```bash
   cd examples/docs_projects/project_elt_pipeline
   ```

2. Install the required dependencies with `uv`:

   ```bash
   uv sync
   ```

3. Activate the virtual environment:

   - MacOS: `source .venv/bin/activate`
   - Windows: `.venv\Scripts\activate`

4. Ensure the following environment variables have been populated in your `.env` file. Start by copying the template:

   ```bash
   cp .env.example .env
   ```

   Then populate the fields.
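Since this pipeline reads from Postgres and the GitHub API, the `.env` file typically holds credentials for both. The variable names below are hypothetical placeholders; use the actual names defined in `.env.example`:

```bash
# Hypothetical .env contents — the real variable names come from .env.example
GITHUB_ACCESS_TOKEN=your-github-personal-access-token   # for the dlt GitHub source
POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/yourdb  # for the Sling source
```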
## Step 2: Launch the Dagster webserver
To make sure Dagster and its dependencies were installed correctly, navigate to the project root directory and start the Dagster webserver:

```bash
dg dev
```

Then open the Dagster UI in your browser (by default at http://localhost:3000).
## Next steps
- Continue this example by adding Sling ingestion