Dagster offers several ways to run data pipelines without manual intervention, including traditional scheduling and event-based triggers. Automating your Dagster pipelines can boost efficiency and ensure that data is produced consistently and reliably.
When one of Dagster's automation methods is triggered, a tick is created, which indicates that a run should occur. The tick will kick off a run, which is a single instance of a pipeline being executed.
In this guide, we'll cover the available automation methods Dagster provides and when to use each one.
Before continuing, you should be familiar with:
In this section, we'll touch on each of the automation methods currently supported by Dagster. After that, we'll discuss what to consider when selecting a method.
Schedules are Dagster's imperative approach to automation: they let you specify when a job should run, such as Mondays at 9:00 AM. Jobs triggered by schedules can contain a subset of assets or ops. Refer to the Schedules documentation to learn more.
You can use sensors to run a job or materialize an asset in response to specific events. A sensor periodically runs logic that you define to decide whether to kick off a run. Sensors are commonly used when you want to materialize an asset after some externally observable event happens, such as:
You can also use sensors to act on the status of a job run. Refer to the Sensors documentation to learn more.
Declarative Automation allows you to automatically materialize assets when specified criteria are met. Using Declarative Automation, you could update assets:
Materialization conditions are declared on an asset-by-asset basis. Refer to the Declarative Automation documentation to learn more.
Asset sensors trigger jobs when a specified asset is materialized. Using asset sensors, you can instigate runs across jobs and code locations and keep downstream assets up-to-date with ease.
Refer to the Asset Sensor documentation to learn more.
Before you dive into automating your pipelines, you should think about:
The following cheatsheet contains high-level details about each of the automation methods we covered, along with when to use each one.
Method | How it works | May be a good fit if... | Works with
---|---|---|---
Schedules | Starts a job at a specified time | |
Sensors | Starts a job or materializes a selection of assets when a specific event occurs | You want to trigger runs based on an event |
Declarative Automation | Automatically materializes an asset when specified criteria (e.g., upstream changes) are met | | Assets only
Asset Sensors | Starts a job when a materialization occurs for a specific asset or selection of assets | | Assets only