Automation#

Dagster offers several ways to run data pipelines without manual intervention, including traditional scheduling and event-based triggers. Automating your Dagster pipelines can boost efficiency and ensure that data is produced consistently and reliably.

When one of Dagster's automation methods is triggered, a tick is created, indicating that a run should occur. The tick then kicks off a run, which is a single instance of a pipeline being executed.

In this guide, we'll cover the available automation methods Dagster provides and when to use each one.


Prerequisites#

Before continuing, you should be familiar with:


Available methods#

In this section, we'll touch on each of the automation methods currently supported by Dagster. After that, we'll discuss what to consider when selecting a method.

Schedules#

Schedules are Dagster's imperative approach to automation, allowing you to specify exactly when a job should run, such as Mondays at 9:00 AM. Jobs triggered by schedules can contain a subset of assets or ops. Refer to the Schedules documentation to learn more.

Sensors#

You can use sensors to run a job or materialize an asset in response to specific events. Sensors periodically check and execute logic to know whether to kick off a run. They are commonly used for situations where you want to materialize an asset after some externally observable event happens, such as:

  • A new file arrives in a specific location, such as Amazon S3
  • A webhook notification is received
  • An external system frees up a worker slot

You can also use sensors to act on the status of a job run. Refer to the Sensors documentation to learn more.

Declarative Automation (Experimental)#

Declarative Automation allows you to automatically materialize assets when specified criteria are met. Using Declarative Automation, you could update assets:

  • When the asset hasn't yet been materialized
  • When an asset's upstream dependency has been updated
  • After an asset's parents have been updated since a cron tick
  • ... based on your own custom conditions

Materialization conditions are declared on an asset-by-asset basis. Refer to the Declarative Automation documentation to learn more.

Asset Sensors (Experimental)#

Asset sensors trigger jobs when a specified asset is materialized. Using asset sensors, you can instigate runs across jobs and code locations and keep downstream assets up-to-date with ease.

Refer to the Asset Sensor documentation to learn more.


Selecting a method#

Before you dive into automating your pipelines, you should think about:

  • Is my pipeline made up of assets, ops, graphs, or a combination of all three?
  • How often does the data need to be refreshed?
  • Is the data partitioned, and do old records require updates?
  • Should updates occur in batches? Or should updates start when specific events occur?

The following cheatsheet contains high-level details about each of the automation methods we covered, along with when to use each one.

| Method | How it works | May be a good fit if... | Works with |
| --- | --- | --- | --- |
| Schedules | Starts a job at a specified time | You're using jobs, and you want to run the job at a specific time | Assets, ops, graphs |
| Sensors | Starts a job or materializes a selection of assets when a specific event occurs | You want to trigger runs based off an event | Assets, ops, graphs |
| Declarative Automation | Automatically materializes an asset when specified criteria (ex: upstream changes) are met | You're not using jobs, you want a declarative approach, and you're comfortable with experimental APIs | Assets only |
| Asset Sensors | Starts a job when a materialization occurs for a specific asset or selection of assets | You're using jobs, you want to trigger a job in response to asset materialization(s), and you're comfortable with experimental APIs | Assets only |