Using Dagster with Airflow

You can find the code for this example on GitHub.

The dagster-airflow library provides interoperability between Dagster and Airflow. The main scenarios for using the Dagster Airflow integration are:

  • You want to do a lift-and-shift migration of all your existing Airflow DAGs into Dagster jobs/Software-defined Assets (SDAs)
  • You want to trigger Dagster job runs from Airflow

This integration is designed to support users who already run Airflow and are interested in exploring Dagster.

Migrating to Dagster

Interested in migrating from Airflow to Dagster? Check out the migration guide for a step-by-step walkthrough.

If you're not sure how to map Airflow concepts to Dagster, check out the cheatsheet in the next section before you begin.

Mapping Airflow concepts to Dagster

While Airflow and Dagster have some significant differences, there are many concepts that overlap. Use this cheatsheet to understand how Airflow concepts map to Dagster.

| Airflow concept | Dagster concept | Notes |
|---|---|---|
| Directed Acyclic Graphs (DAG) | Jobs | |
| Task | Ops | |
| Datasets | Software-defined Assets (SDAs) | SDAs are more powerful and mature than datasets and include support for things like partitioning. |
| DagBags | Code locations | Multiple isolated code locations with different system and Python dependencies can exist within the same Dagster instance. |
| DAG runs | Job runs | |
| depends_on_past | | An asset can depend on earlier partitions of itself. When this is the case, backfills and auto-materialize will only materialize later partitions after earlier partitions have completed. |
| Executors | Executors | |
| Hooks | Resources | Dagster resources contain a superset of the functionality of hooks and have much stronger composition guarantees. |
| Instances | Instances | |
| Operators | None | Dagster uses normal Python functions instead of framework-specific operator classes. For off-the-shelf functionality with third-party tools, Dagster provides integration libraries. |
| Pools | Run coordinators | |
| Plugins/Providers | Integrations | |
| Schedulers | Schedules | |
| Sensors | Sensors | |
| SubDAGs/TaskGroups | | Dagster provides rich, searchable metadata and tagging support beyond what’s offered by Airflow. |
| task_concurrency | Asset/op-level concurrency limits | |
| Trigger | Dagster UI Launchpad | Triggering and configuring ad-hoc runs is easier in Dagster, which allows them to be initiated through the Dagster UI, the GraphQL API, or the CLI. |
| XComs | I/O managers | I/O managers are more powerful than XComs and allow passing large datasets between jobs. |