Migrating from Airflow to Dagster, or integrating Dagster into your existing workflow orchestration stack, can be accomplished in many ways. The Pick your own journey guide provides a variety of suggestions for migrating your Airflow pipelines to Dagster, or for building a platform where both tools coexist.
While Airflow and Dagster have some significant differences, there are many concepts that overlap. Use this cheatsheet to understand how Airflow concepts map to Dagster.
Want a look at this in code? Check out the Learning Dagster from Airflow guide.
Airflow concept | Dagster concept | Notes |
---|---|---|
Directed Acyclic Graphs (DAG) | Jobs | |
Tasks | Ops | |
Datasets | Assets | Dagster assets are more powerful and mature than datasets and include support for things like partitioning. |
Connections/Variables | | |
DagBags | Code locations | Multiple isolated code locations with different system and Python dependencies can exist within the same Dagster instance. |
DAG runs | Job runs | |
depends_on_past | | An asset can depend on earlier partitions of itself. When this is the case, backfills and Declarative Automation will only materialize later partitions after earlier partitions have completed. |
Executors | Executors | |
Hooks | Resources | Dagster resources contain a superset of the functionality of hooks and have much stronger composition guarantees. |
Instances | Instances | |
Operators | None | Dagster uses normal Python functions instead of framework-specific operator classes. For off-the-shelf functionality with third-party tools, Dagster provides integration libraries. |
Pools | Run coordinators | |
Plugins/Providers | Integrations | |
Schedulers | Schedules | |
Sensors | Sensors | |
SubDAGs/TaskGroups | | Dagster provides rich, searchable metadata and tagging support beyond what’s offered by Airflow. |
task_concurrency | Asset/op-level concurrency limits | |
Trigger | Dagster UI Launchpad | Triggering and configuring ad-hoc runs is easier in Dagster, which allows them to be initiated through the Dagster UI, the GraphQL API, or the CLI. |
XComs | I/O managers | I/O managers are more powerful than XComs and allow large datasets to be passed between jobs. |