Automation#

Dagster offers several ways to run data pipelines without manual intervention, including traditional scheduling and event-based triggers. Automating your Dagster pipelines can boost efficiency and ensure that data is produced consistently and reliably.

When one of Dagster's automation methods is triggered, a tick is created, indicating that a run should occur. The tick then kicks off a run, which is a single instance of a pipeline being executed.

In this guide, we'll cover the available automation methods Dagster provides and when to use each one.


Prerequisites#

Before continuing, you should be familiar with:


Available methods#

In this section, we'll touch on each of the automation methods currently supported by Dagster. After that, we'll discuss what to consider when selecting a method.

Schedules#

Schedules are Dagster's imperative approach to automation, allowing you to specify exactly when a job should run, such as Mondays at 9:00 AM. Jobs triggered by schedules can contain a subset of assets or ops. Refer to the Schedules documentation to learn more.

Sensors#

You can use sensors to run a job or materialize an asset in response to specific events. Sensors periodically check and execute logic to know whether to kick off a run. They are commonly used for situations where you want to materialize an asset after some externally observable event happens, such as:

  • A new file arrives in a specific location, such as Amazon S3
  • A webhook notification is received
  • An external system frees up a worker slot

You can also use sensors to act on the status of a job run. Refer to the Sensors documentation to learn more.

Declarative Automation (Experimental)#

Declarative Automation allows you to automatically materialize assets when specified criteria are met. Using Declarative Automation, you could update assets:

  • When the asset hasn't yet been materialized
  • When an asset's upstream dependency has been updated
  • After an asset's parents have been updated since a cron tick
  • ... based on your own custom conditions

Materialization conditions are declared on an asset-by-asset basis. Refer to the Declarative Automation documentation to learn more.

Asset Sensors (Experimental)#

Asset sensors trigger jobs when a specified asset is materialized. Using asset sensors, you can instigate runs across jobs and code locations and keep downstream assets up-to-date with ease.

Refer to the Asset Sensor documentation to learn more.


Selecting a method#

Before you dive into automating your pipelines, you should think about:

  • Is my pipeline made up of assets, ops, graphs, or a combination of all three?
  • How often does the data need to be refreshed?
  • Is the data partitioned, and do old records require updates?
  • Should updates occur in batches? Or should updates start when specific events occur?

The following cheatsheet contains high-level details about each of the automation methods we covered, along with when to use each one.

| Method | How it works | May be a good fit if... | Works with |
| --- | --- | --- | --- |
| Schedules | Starts a job at a specified time | You're using jobs, and you want to run the job at a specific time | Assets, ops, graphs |
| Sensors | Starts a job or materializes a selection of assets when a specific event occurs | You want to trigger runs based off an event | Assets, ops, graphs |
| Declarative Automation | Automatically materializes an asset when specified criteria (ex: upstream changes) are met | You're not using jobs, you want a declarative approach, and you're comfortable with experimental APIs | Assets only |
| Asset Sensors | Starts a job when a materialization occurs for a specific asset or selection of assets | You're using jobs, you want to trigger a job in response to asset materialization(s), and you're comfortable with experimental APIs | Assets only |