Using environment variables with components
dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.
With dg and components, you can configure components differently depending on the environment in which they run. To demonstrate this, we'll walk through setting up an example ELT pipeline with a Sling component that reads Snowflake credentials from environment variables.
1. Create a new Dagster components project
First, we'll set up a basic ELT pipeline using Sling in an empty Dagster components project:
uvx -U create-dagster project ingestion
cd ingestion && source .venv/bin/activate
We'll install dagster-sling and scaffold an empty Sling connection component:
uv add dagster-sling
dg list components
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Key                                               ┃ Summary                                                     ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster.DefinitionsComponent                      │ An arbitrary set of Dagster definitions.                    │
├───────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────┤
│ dagster.DefsFolderComponent                       │ A folder which may contain multiple submodules, each        │
│                                                   │ which define components.                                    │
├───────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────┤
│ dagster_sling.SlingReplicationCollectionComponent │ Expose one or more Sling replications to Dagster as assets. │
└───────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────┘
dg scaffold defs dagster_sling.SlingReplicationCollectionComponent ingest_files
Creating a component at /.../ingestion/src/ingestion/defs/ingest_files.
2. Use environment variables in a component
Next, we will configure a Sling connection that syncs a local CSV file to a Snowflake database, with credentials supplied through environment variables:
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv
source: LOCAL
target: SNOWFLAKE

defaults:
  mode: full-refresh
  object: "{stream_table}"

streams:
  file://raw_customers.csv:
    object: "sandbox.raw_customers"
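To make the relationship between defaults and streams in the replication file above concrete, here is a minimal Python sketch. It assumes that keys set on an individual stream override the same keys in defaults, which is how we understand Sling replication files to behave. The function name effective_stream_config is hypothetical, and this is not Sling's implementation.

```python
# Illustrative only: models stream-level keys overriding `defaults`.
# `effective_stream_config` is a hypothetical helper, not part of Sling or Dagster.

def effective_stream_config(replication: dict) -> dict:
    """Return {stream_name: merged_config}, with stream keys overriding defaults."""
    defaults = replication.get("defaults") or {}
    merged = {}
    for name, overrides in (replication.get("streams") or {}).items():
        # Later keys win, so stream-level settings replace the defaults.
        merged[name] = {**defaults, **(overrides or {})}
    return merged


# The replication file above, written out as a plain Python dict.
replication = {
    "source": "LOCAL",
    "target": "SNOWFLAKE",
    "defaults": {"mode": "full-refresh", "object": "{stream_table}"},
    "streams": {"file://raw_customers.csv": {"object": "sandbox.raw_customers"}},
}

config = effective_stream_config(replication)
```

Under that assumption, the raw_customers.csv stream inherits mode: full-refresh from defaults while writing to sandbox.raw_customers instead of the default object pattern.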
We will use env templating to inject credentials into the Sling configuration in our defs.yaml file. Running dg check yaml will highlight that we need to explicitly declare these environment dependencies at the bottom of the file:
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  sling:
    connections:
      - name: SNOWFLAKE
        type: snowflake
        account: "{{ env.SNOWFLAKE_ACCOUNT }}"
        user: "{{ env.SNOWFLAKE_USER }}"
        password: "{{ env.SNOWFLAKE_PASSWORD }}"
        database: "{{ env.SNOWFLAKE_DATABASE }}"
  replications:
    - path: replication.yaml
dg check yaml
/.../ingestion/src/ingestion/defs/ingest_files/defs.yaml:1 - requirements.env Component uses environment variables that are not specified in the component file: SNOWFLAKE_ACCOUNT, SNOWFLAKE_DATABASE, SNOWFLAKE_PASSWORD, SNOWFLAKE_USER
     |
   1 | type: dagster_sling.SlingReplicationCollectionComponent
     | ^ Component uses environment variables that are not specified in the component file: SNOWFLAKE_ACCOUNT, SNOWFLAKE_DATABASE, SNOWFLAKE_PASSWORD, SNOWFLAKE_USER
   2 |
   3 | attributes:
   4 |   sling:
   5 |     connections:
   6 |       - name: SNOWFLAKE
   7 |         type: snowflake
   8 |         account: "{{ env.SNOWFLAKE_ACCOUNT }}"
   9 |         user: "{{ env.SNOWFLAKE_USER }}"
  10 |         password: "{{ env.SNOWFLAKE_PASSWORD }}"
  11 |         database: "{{ env.SNOWFLAKE_DATABASE }}"
  12 |   replications:
  13 |     - path: replication.yaml
     |
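The kind of check reported above can be approximated in a few lines of Python. This is a rough sketch of the idea, not how dg implements it: it scans the component file's text for {{ env.NAME }} references with a regular expression and reports any name that is not in a declared set. The helper names find_env_refs and undeclared_env_vars are hypothetical.

```python
import re

# Matches references such as {{ env.SNOWFLAKE_ACCOUNT }}.
ENV_REF = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def find_env_refs(text: str) -> set:
    """Collect every variable name referenced as {{ env.NAME }} in the text."""
    return set(ENV_REF.findall(text))


def undeclared_env_vars(text: str, declared: set) -> list:
    """Return referenced names missing from `declared`, sorted alphabetically."""
    return sorted(find_env_refs(text) - declared)


defs_yaml = '''
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  sling:
    connections:
      - name: SNOWFLAKE
        type: snowflake
        account: "{{ env.SNOWFLAKE_ACCOUNT }}"
        user: "{{ env.SNOWFLAKE_USER }}"
        password: "{{ env.SNOWFLAKE_PASSWORD }}"
        database: "{{ env.SNOWFLAKE_DATABASE }}"
  replications:
    - path: replication.yaml
'''

# Nothing is declared yet, so all four references are reported.
missing = undeclared_env_vars(defs_yaml, declared=set())
```

With an empty declared set, missing contains the same four variable names that dg check yaml lists in its error message.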
After adding the environment dependencies, running dg check yaml again will confirm that the file is valid:
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  sling:
    connections:
      - name: SNOWFLAKE
        type: snowflake
        account: "{{ env.SNOWFLAKE_ACCOUNT }}"
        user: "{{ env.SNOWFLAKE_USER }}"
        password: "{{ env.SNOWFLAKE_PASSWORD }}"
        database: "{{ env.SNOWFLAKE_DATABASE }}"
  replications:
    - path: replication.yaml

requirements:
  env:
    - SNOWFLAKE_ACCOUNT
    - SNOWFLAKE_USER
    - SNOWFLAKE_PASSWORD
    - SNOWFLAKE_DATABASE
dg check yaml
All components validated successfully.
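To see what the {{ env.NAME }} placeholders amount to once the component is loaded, the following sketch substitutes them from a mapping of variable names to values. It is a simplified stand-in for the real templating engine, and raising an error on an unset variable is a choice made for this sketch rather than documented dg behavior. The function name render_env is hypothetical.

```python
import re
from typing import Mapping

# Matches references such as {{ env.SNOWFLAKE_DATABASE }}.
ENV_REF = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_env(text: str, environ: Mapping[str, str]) -> str:
    """Replace each {{ env.NAME }} with environ[NAME]; raise KeyError if NAME is unset."""

    def substitute(match: "re.Match") -> str:
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return ENV_REF.sub(substitute, text)


# In real use the mapping would be os.environ; a literal dict keeps the example deterministic.
rendered = render_env(
    'database: "{{ env.SNOWFLAKE_DATABASE }}"',
    {"SNOWFLAKE_DATABASE": "sandbox"},
)
```

Passing an explicit mapping rather than reading os.environ inside the function keeps the sketch easy to test and makes the source of each value obvious.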
Next, you can invoke dg list env, which shows all environment variables configured or used by components in the project. Here we can see all of the Snowflake credentials we must configure in our shell or .env file in order to run our project:
dg list env
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Env Var            ┃ Value ┃ Components   ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━┩
│ SNOWFLAKE_ACCOUNT  │       │ ingest_files │
│ SNOWFLAKE_DATABASE │       │ ingest_files │
│ SNOWFLAKE_PASSWORD │       │ ingest_files │
│ SNOWFLAKE_USER     │       │ ingest_files │
└────────────────────┴───────┴──────────────┘
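A similar report can be produced by hand when debugging which variables are visible to a process. The sketch below is an assumption-laden approximation, not dg's implementation: it parses simple KEY=VALUE lines from .env-style text, combines them with a shell environment mapping (letting the shell win, which is a choice made for this sketch), and reports whether each required name is set. The helper names parse_dotenv and env_status are hypothetical.

```python
import os
from typing import Iterable, Mapping, Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Deliberately minimal: surrounding quotes are stripped, but escapes,
    `export` prefixes, and multi-line values are not handled.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def env_status(
    required: Iterable[str],
    dotenv_text: str = "",
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Map each required name to True if it is set in the shell mapping or the .env text."""
    environ = os.environ if environ is None else environ
    # Assumption for this sketch: shell values take precedence over .env values.
    available = {**parse_dotenv(dotenv_text), **environ}
    return {name: name in available for name in required}


required = ["SNOWFLAKE_ACCOUNT", "SNOWFLAKE_DATABASE", "SNOWFLAKE_PASSWORD", "SNOWFLAKE_USER"]
status = env_status(required, dotenv_text="SNOWFLAKE_DATABASE=sandbox\n", environ={})
```

Here only SNOWFLAKE_DATABASE is reported as set, because it is the only name present in the sample .env text and the shell mapping is empty.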
You can edit the .env file in your project root to specify environment variables for Dagster to use when running the project locally. You can run dg list env again to see that they are now set:
echo 'SNOWFLAKE_ACCOUNT=...' >> .env
echo 'SNOWFLAKE_USER=...' >> .env
echo 'SNOWFLAKE_PASSWORD=...' >> .env
echo 'SNOWFLAKE_DATABASE=sandbox' >> .env
dg list env
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Env Var ┃ Value ┃ Components ┃