Dagster & Airbyte (Component)

The dagster-airbyte library provides an AirbyteWorkspaceComponent, which you can use to represent Airbyte connections as assets in Dagster.

1. Prepare a Dagster project

To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:

create-dagster project my-project && cd my-project/src

Activate the project virtual environment:

source ../.venv/bin/activate

Finally, add the dagster-airbyte library to the project:

uv add dagster-airbyte

2. Scaffold an Airbyte component

Now that you have a Dagster project, you can scaffold an Airbyte component. You'll need to provide your Airbyte workspace ID and API credentials:

dg scaffold defs dagster_airbyte.AirbyteWorkspaceComponent airbyte_ingest \
--workspace-id test_workspace --client-id "{{ env.AIRBYTE_CLIENT_ID }}" --client-secret "{{ env.AIRBYTE_CLIENT_SECRET }}"
Creating defs at /.../my-project/src/my_project/defs/airbyte_ingest.

The scaffold command generates a defs.yaml file:

tree my_project/defs
my_project/defs
├── __init__.py
└── airbyte_ingest
    └── defs.yaml

2 directories, 2 files

In its scaffolded form, the defs.yaml file contains the configuration for your Airbyte workspace:

my_project/defs/airbyte_ingest/defs.yaml
type: dagster_airbyte.AirbyteWorkspaceComponent

attributes:
  workspace:
    workspace_id: test_workspace
    client_id: '{{ env.AIRBYTE_CLIENT_ID }}'
    client_secret: '{{ env.AIRBYTE_CLIENT_SECRET }}'
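
The `{{ env.AIRBYTE_CLIENT_ID }}` and `{{ env.AIRBYTE_CLIENT_SECRET }}` templates are filled in from environment variables, so both need to be set wherever the definitions are loaded. The following is a minimal sketch of a pre-flight check you could run before `dg list defs`. The helper name `missing_airbyte_env` is hypothetical and not part of dagster-airbyte; only the two variable names come from the configuration above.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the defs.yaml templates above.
REQUIRED_VARS = ("AIRBYTE_CLIENT_ID", "AIRBYTE_CLIENT_SECRET")


def missing_airbyte_env(environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_airbyte_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Airbyte credentials are present in the environment.")
```

Passing a mapping explicitly, instead of always reading `os.environ`, keeps the check easy to exercise without touching the real environment.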

To check the component's configuration, list the definitions it produces:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ account │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ company │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ contact │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ opportunity │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ task │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ user │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ └─────────────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────┘

3. Configuration for Airbyte OSS or Self-Managed Enterprise

To configure your Airbyte component for Airbyte OSS or Self-Managed Enterprise, provide both the REST API URL and the Configuration API URL.

The REST API is exposed at https://<airbyte-server-hostname>/api/public/v1 and the Configuration API is exposed at https://<airbyte-server-hostname>/api/v1.

Airbyte OSS and Self-Managed Enterprise support several authentication methods. See "Authentication in Self-Managed" in the Airbyte API docs for details.

my_project/defs/airbyte_ingest/defs.yaml
type: dagster_airbyte.AirbyteWorkspaceComponent

attributes:
  workspace:
    rest_api_base_url: http://localhost:8000/api/public/v1
    configuration_api_base_url: http://localhost:8000/api/v1
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
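
Both URLs share the same server address and differ only in their path, so they can be derived from a single hostname. The sketch below shows that relationship; `airbyte_api_urls` is a hypothetical helper written for illustration, and the only details taken from this guide are the two path suffixes and the two configuration key names.

```python
def airbyte_api_urls(hostname: str, scheme: str = "https") -> dict[str, str]:
    """Build the two base URLs for a self-managed Airbyte server (hypothetical helper)."""
    base = f"{scheme}://{hostname}"
    return {
        # Path suffixes as documented above.
        "rest_api_base_url": f"{base}/api/public/v1",
        "configuration_api_base_url": f"{base}/api/v1",
    }


if __name__ == "__main__":
    # A local deployment, matching the values in the defs.yaml example.
    for key, url in airbyte_api_urls("localhost:8000", scheme="http").items():
        print(f"{key}: {url}")
```

For a local deployment this yields the same two values used in the defs.yaml example.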

4. Select specific connections

You can select specific Airbyte connections to include in your component using the connection_selector key. This allows you to filter which connections are represented as assets:

my_project/defs/airbyte_ingest/defs.yaml
type: dagster_airbyte.AirbyteWorkspaceComponent

attributes:
  workspace:
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
  connection_selector:
    by_name:
      - salesforce_to_snowflake

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ account │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ opportunity │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ task │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│ │ │ user │ default │ │ airbyte │ │ │
│ │ │ │ │ │ snowflake │ │ │
│ │ └─────────────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────┘
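
Conceptually, selecting by name keeps only the connections whose names appear in the by_name list, which is why fewer assets are listed than before. The sketch below illustrates that idea in plain Python. It is not the component's implementation: `select_connections_by_name` is a hypothetical function, exact-match comparison is an assumption, and `hubspot_to_snowflake` is an invented name standing in for whichever other connection exists in the workspace.

```python
from typing import Iterable


def select_connections_by_name(
    connection_names: Iterable[str], by_name: Iterable[str]
) -> list[str]:
    """Keep only connections whose name is listed in by_name (assumes exact matches)."""
    wanted = set(by_name)
    return [name for name in connection_names if name in wanted]


if __name__ == "__main__":
    # The second connection name is hypothetical.
    workspace_connections = ["salesforce_to_snowflake", "hubspot_to_snowflake"]
    selected = select_connections_by_name(
        workspace_connections, ["salesforce_to_snowflake"]
    )
    print(selected)
```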

5. Customize Airbyte assets

You can customize the properties of the assets emitted by each connection with the translation key in the defs.yaml file:

my_project/defs/airbyte_ingest/defs.yaml
type: dagster_airbyte.AirbyteWorkspaceComponent

attributes:
  workspace:
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
  connection_selector:
    by_name:
      - salesforce_to_snowflake
  translation:
    group_name: airbyte_data
    description: "Loads data from Airbyte connection {{ props.connection_name }}"

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│ │ │ account │ airbyte_data │ │ airbyte │ Loads data from Airbyte connection │ │
│ │ │ │ │ │ snowflake │ salesforce_to_snowflake │ │
│ │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│ │ │ opportunity │ airbyte_data │ │ airbyte │ Loads data from Airbyte connection │ │
│ │ │ │ │ │ snowflake │ salesforce_to_snowflake │ │
│ │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│ │ │ task │ airbyte_data │ │ airbyte │ Loads data from Airbyte connection │ │
│ │ │ │ │ │ snowflake │ salesforce_to_snowflake │ │
│ │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│ │ │ user │ airbyte_data │ │ airbyte │ Loads data from Airbyte connection │ │
│ │ │ │ │ │ snowflake │ salesforce_to_snowflake │ │
│ │ └─────────────┴──────────────┴──────┴───────────┴────────────────────────────────────────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
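
The description column shows the `{{ props.connection_name }}` placeholder replaced with the name of the connection that produced each asset. The sketch below mimics that substitution with a regular expression so the effect is easy to see. It is a simplified stand-in rather than the templating engine the component uses, and `render_props_template` is a hypothetical name.

```python
import re

# Matches placeholders of the form {{ props.<name> }}.
_PROPS_PATTERN = re.compile(r"\{\{\s*props\.(\w+)\s*\}\}")


def render_props_template(template: str, props: dict[str, str]) -> str:
    """Replace each {{ props.<name> }} placeholder with props[<name>] (illustrative only)."""
    return _PROPS_PATTERN.sub(lambda match: str(props[match.group(1)]), template)


if __name__ == "__main__":
    template = "Loads data from Airbyte connection {{ props.connection_name }}"
    print(render_props_template(template, {"connection_name": "salesforce_to_snowflake"}))
    # prints: Loads data from Airbyte connection salesforce_to_snowflake
```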