The dagster-airflow library provides interoperability between Dagster and Airflow. The main scenarios for using the Dagster Airflow integration are:
This integration is designed to help support users who have existing Airflow usage and are interested in exploring Dagster.
Interested in migrating from Airflow to Dagster? Check out the migration guide for a step-by-step walkthrough.
If you're not sure how to map Airflow concepts to Dagster, check out the cheatsheet in the next section before you begin.
While Airflow and Dagster have some significant differences, there are many concepts that overlap. Use this cheatsheet to understand how Airflow concepts map to Dagster.
Airflow concept | Dagster concept | Notes |
---|---|---|
Directed Acyclic Graphs (DAGs) | Jobs | |
Tasks | Ops | |
Datasets | Software-defined Assets (SDAs) | SDAs are more powerful and mature than datasets and include support for things like partitioning. |
Connections/Variables | | |
DagBags | Code locations | Multiple isolated code locations with different system and Python dependencies can exist within the same Dagster instance. |
DAG runs | Job runs | |
depends_on_past | | An asset can depend on earlier partitions of itself. When this is the case, backfills and auto-materialize will only materialize later partitions after earlier partitions have completed. |
Executors | Executors | |
Hooks | Resources | Dagster resources contain a superset of the functionality of hooks and have much stronger composition guarantees. |
Instances | Instances | |
Operators | None | Dagster uses normal Python functions instead of framework-specific operator classes. For off-the-shelf functionality with third-party tools, Dagster provides integration libraries. |
Pools | Run coordinators | |
Plugins/Providers | Integrations | |
Schedulers | Schedules | |
Sensors | Sensors | |
SubDAGs/TaskGroups | | Dagster provides rich, searchable metadata and tagging support beyond what’s offered by Airflow. |
task_concurrency | Asset/op-level concurrency limits | |
Trigger | Dagster UI Launchpad | Triggering and configuring ad-hoc runs is easier in Dagster, which allows them to be initiated through the Dagster UI, the GraphQL API, or the CLI. |
XComs | I/O managers | I/O managers are more powerful than XComs and allow passing large datasets between jobs. |