Dagster & Airbyte Cloud with components
dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.
The dagster-airbyte library provides an AirbyteCloudWorkspaceComponent, which you can use to represent Airbyte Cloud connections as assets in Dagster.
1. Prepare a Dagster project
To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:
create-dagster project my-project && cd my-project/src
Activate the project virtual environment:
source ../.venv/bin/activate
Finally, add the dagster-airbyte library to the project:
uv add dagster-airbyte
2. Scaffold an Airbyte Cloud component
Now that you have a Dagster project, you can scaffold an Airbyte Cloud component. You'll need to provide your Airbyte Cloud workspace ID and API credentials:
dg scaffold defs dagster_airbyte.AirbyteCloudWorkspaceComponent airbyte_ingest \
--workspace-id test_workspace --client-id "{{ env.AIRBYTE_CLIENT_ID }}" --client-secret "{{ env.AIRBYTE_CLIENT_SECRET }}"
Creating a component at /.../my-project/src/my_project/defs/airbyte_ingest.
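The scaffold command above stores templated references to AIRBYTE_CLIENT_ID and AIRBYTE_CLIENT_SECRET rather than the secrets themselves, so both variables presumably need to be set in the environment where dg runs. The following is a minimal preflight sketch in Python that assumes only those two variable names from the command. The missing_env_vars helper is hypothetical and is not part of Dagster or dagster-airbyte:

```python
import os
from typing import Mapping, Sequence

# Variable names taken from the scaffold command above.
REQUIRED_VARS = ("AIRBYTE_CLIENT_ID", "AIRBYTE_CLIENT_SECRET")


def missing_env_vars(
    names: Sequence[str] = REQUIRED_VARS,
    environ: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names that are unset or empty in the given environment."""
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Airbyte Cloud credentials are present in the environment.")
```

Running a check like this before dg list defs can make a missing credential easier to spot than a failed API call.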
The scaffold call will generate a defs.yaml file:
tree my_project/defs
my_project/defs
├── __init__.py
└── airbyte_ingest
    └── defs.yaml
2 directories, 2 files
In its scaffolded form, the defs.yaml file contains the configuration for your Airbyte Cloud workspace:
type: dagster_airbyte.AirbyteCloudWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: '{{ env.AIRBYTE_CLIENT_ID }}'
    client_secret: '{{ env.AIRBYTE_CLIENT_SECRET }}'
You can check the configuration of your component:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key         ┃ Group   ┃ Deps ┃ Kinds     ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ account     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ company     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ contact     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ opportunity │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ task        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ user        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ └─────────────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────┘
3. Select specific connections
You can select specific Airbyte Cloud connections to include in your component using the connection_selector key. This allows you to filter which connections are represented as assets:
type: dagster_airbyte.AirbyteCloudWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
  connection_selector:
    by_name:
      - salesforce_to_snowflake
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key         ┃ Group   ┃ Deps ┃ Kinds     ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ account     │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ opportunity │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ task        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ ├─────────────┼─────────┼──────┼───────────┼─────────────┤ │
│         │ │ user        │ default │      │ airbyte   │             │ │
│         │ │             │         │      │ snowflake │             │ │
│         │ └─────────────┴─────────┴──────┴───────────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────┘
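The effect of by_name can be pictured as a simple filter over the workspace's connections. The sketch below is a conceptual illustration in Python, not the component's actual implementation. The Connection class, the select_by_name helper, the second connection name hubspot_to_snowflake, and the assignment of tables to connections are all hypothetical, chosen to mirror the two listings above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for an Airbyte Cloud connection."""

    name: str
    tables: tuple[str, ...]


def select_by_name(connections, names):
    """Keep only the connections whose name is listed under by_name."""
    wanted = set(names)
    return [conn for conn in connections if conn.name in wanted]


# Assumed workspace contents; only salesforce_to_snowflake comes from the example.
workspace = [
    Connection("salesforce_to_snowflake", ("account", "opportunity", "task", "user")),
    Connection("hubspot_to_snowflake", ("company", "contact")),
]

selected = select_by_name(workspace, ["salesforce_to_snowflake"])
asset_keys = sorted(table for conn in selected for table in conn.tables)
print(asset_keys)
# ['account', 'opportunity', 'task', 'user']
```

Under these assumptions, the tables of the unselected connection (company and contact) no longer appear among the asset keys.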
4. Customize Airbyte Cloud assets
Properties of the assets emitted by each connection can be customized in the defs.yaml file using the translation key:
type: dagster_airbyte.AirbyteCloudWorkspaceComponent
attributes:
  workspace:
    workspace_id: test_workspace
    client_id: "{{ env.AIRBYTE_CLIENT_ID }}"
    client_secret: "{{ env.AIRBYTE_CLIENT_SECRET }}"
  connection_selector:
    by_name:
      - salesforce_to_snowflake
  translation:
    group_name: airbyte_data
    description: "Loads data from Airbyte connection {{ props.connection_name }}"
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│         │ ┃ Key         ┃ Group        ┃ Deps ┃ Kinds     ┃ Description                                            ┃ │
│         │ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│         │ │ account     │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│         │ │ opportunity │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│         │ │ task        │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ ├─────────────┼──────────────┼──────┼───────────┼────────────────────────────────────────────────────────┤ │
│         │ │ user        │ airbyte_data │      │ airbyte   │ Loads data from Airbyte connection                     │ │
│         │ │             │              │      │ snowflake │ salesforce_to_snowflake                                │ │
│         │ └─────────────┴──────────────┴──────┴───────────┴────────────────────────────────────────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
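To make the templated description concrete, the following Python sketch renders a {{ props.connection_name }} placeholder for a given connection. It only illustrates the substitution and is not how Dagster resolves templates internally. The Props class and the render_description helper are hypothetical; only connection_name and the description text are taken from the example above:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Props:
    """Hypothetical stand-in for the props available to the template."""

    connection_name: str


# Matches placeholders of the form {{ props.<attribute> }}.
_PLACEHOLDER = re.compile(r"\{\{\s*props\.(\w+)\s*\}\}")


def render_description(template: str, props: Props) -> str:
    """Replace each {{ props.<attribute> }} with the matching attribute value."""
    return _PLACEHOLDER.sub(lambda m: str(getattr(props, m.group(1))), template)


template = "Loads data from Airbyte connection {{ props.connection_name }}"
print(render_description(template, Props("salesforce_to_snowflake")))
# Loads data from Airbyte connection salesforce_to_snowflake
```

Because the template is evaluated per connection, every asset from the same connection ends up with the same description, which matches the listing above.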