ELT pipeline with Sling and dlt

In this example, you'll build a full ELT pipeline that processes e-commerce order data alongside GitHub repository activity. The pipeline ingests production database tables (users, orders, products) and GitHub API data (issues, pull_requests) into DuckDB, then transforms them into analytics-ready summaries (customer_order_summary, product_revenue) — all using Dagster components.

The pipeline:

  • Ingests database tables from Postgres into DuckDB using Sling
  • Ingests API data from the GitHub API into DuckDB using dlt
  • Registers both ingestion pipelines as Dagster assets using Dagster components
  • Transforms the ingested data into analytics-ready tables using TemplatedSqlComponent
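
The ingestion stages above are driven by tool-native configuration. As a rough sketch, the Sling step might use a replication file like the following (the schema and table names here are assumptions; the project's actual replication.yaml may differ):

```yaml
# replication.yaml — minimal Sling replication sketch (hypothetical stream names)
source: POSTGRES    # source connection, resolved from environment variables
target: DUCKDB      # target connection

defaults:
  mode: full-refresh  # re-sync each table in full on every run

streams:
  # each stream maps a Postgres table to a DuckDB table of the same name
  public.users:
  public.orders:
  public.products:
```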

The entire pipeline — ingestion and transformation — is built with Dagster components. Instead of writing @asset functions by hand, you describe each stage in YAML and Dagster generates the asset graph. This keeps your pipeline declarative, consistent, and easy to extend.
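As an illustration of the declarative style, the Sling stage might be registered with a YAML component definition along these lines (the component type and file layout are assumptions based on the dagster-sling integration; consult the project's actual defs.yaml):

```yaml
# defs.yaml — registers a Sling replication as Dagster assets (sketch)
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  replications:
    # each entry points at a Sling replication config; Dagster derives
    # one asset per stream defined in that file
    - path: replication.yaml
```

Because the component expands into assets automatically, adding a new table to the pipeline is a one-line change to the replication config rather than a new `@asset` function.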

Prerequisites

To follow the steps in this guide, you'll need:

Step 1: Set up your Dagster environment

First, set up a new Dagster project.

  1. Clone the Dagster repo and navigate to the project:

    git clone https://github.com/dagster-io/dagster.git
    cd dagster/examples/docs_projects/project_elt_pipeline
  2. Install the required dependencies with uv:

    uv sync
  3. Activate the virtual environment:

    source .venv/bin/activate

  4. Ensure the required environment variables have been populated in your .env file. Start by copying the template:

    cp .env.example .env

    And then populate the fields.
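
    The authoritative list of fields is in .env.example; as a rough illustration only (these variable names are hypothetical, not the project's actual keys), the file typically holds the Postgres connection details and a GitHub API token:

    # .env — hypothetical keys; use the names from .env.example
    POSTGRES_CONNECTION_STRING="postgresql://user:password@localhost:5432/ecommerce"
    GITHUB_ACCESS_TOKEN="ghp_your_token_here"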

Step 2: Launch the Dagster webserver

To make sure Dagster and its dependencies were installed correctly, navigate to the project root directory and start the Dagster webserver:

dg dev

Next steps