Concepts

Dagster provides a variety of abstractions for building and orchestrating data pipelines. These concepts enable a modular, declarative approach to data engineering, making it easier to manage dependencies, monitor execution, and ensure data quality.

Asset

An asset represents a logical unit of data such as a table, dataset, or machine learning model. Assets can have dependencies on other assets, forming the data lineage for your pipelines. As the core abstraction in Dagster, assets can interact with many other Dagster entities to facilitate certain tasks. When you define an asset, either with the @dg.asset decorator or via a component, the definition is automatically added to a top-level Definitions object.

Concept	Relationship
asset check	`asset` may use an `asset check`
asset spec	`asset` is described by an `asset spec`
component	`asset` may be programmatically built by a component
config	`asset` may use a `config`
definitions	`asset` is added to a top-level `Definitions` object to be deployed
io manager	`asset` may use an `io manager`
partition	`asset` may use a `partition`
resource	`asset` may use a `resource`
job	`asset` may be used in a `job`
schedule	`asset` may be used in a `schedule`
sensor	`asset` may be used in a `sensor`

Asset check

An asset_check is associated with an asset to ensure it meets certain expectations around data quality, freshness, or completeness. Asset checks run when the asset is executed and store metadata about the related run and whether all the conditions of the check were met.

Concept	Relationship
asset	`asset check` may be used by an `asset`
definitions	`asset check` is added to a top-level `Definitions` object to be deployed

Asset spec

Specs are standalone objects that describe the identity and metadata of Dagster entities without defining their behavior. For example, an AssetSpec contains essential information like the asset's key (its unique identifier) and tags (labels for organizing and annotating the asset), but it doesn't include the logic for materializing that asset.

Concept	Relationship
asset	`asset spec` may describe the identity and metadata of an `asset`

Code location

A code location is a collection of Dagster entity definitions deployed in a specific environment. A code location determines the Python environment (including the version of Dagster being used as well as any other Python dependencies). A Dagster project can have multiple code locations, helping isolate dependencies.

Concept	Relationship
definitions	`code location` must contain at least one top-level `Definitions` object

Component

Components are objects that programmatically build assets and other Dagster entity definitions, such as asset_checks, schedules, resources, and sensors. They accept schematized configuration parameters (which are specified using YAML or lightweight Python) and use them to build the actual definitions you need. Components are designed to help you quickly bootstrap parts of your Dagster project and serve as templates for repeatable patterns.

Concept	Relationship
asset	`component` builds `assets` and other `definitions`
asset check	`component` builds `asset_checks` and other `definitions`
definitions	`component` builds `assets` and other `definitions`
job	`component` builds `jobs` and other `definitions`
schedule	`component` builds `schedules` and other `definitions`
sensor	`component` builds `sensors` and other `definitions`
resource	`component` builds `resources` and other `definitions`

Config

A config is used to specify config schema for assets, jobs, schedules, and sensors. A RunConfig is a container for all the configuration that can be passed to a run. This allows for parameterization and the reuse of pipelines to serve multiple purposes.

Concept	Relationship
asset	`config` may be used by an `asset`
job	`config` may be used by a `job`
schedule	`config` may be used by a `schedule`
sensor	`config` may be used by a `sensor`

Definitions

In Dagster, "definitions" means two things:

The objects that combine metadata about Dagster entities with Python functions that define how they behave, for example, asset, ScheduleDefinition, and resource definitions.
The top-level Definitions object that contains references to all the definitions in a Dagster project. Entities included in the Definitions object will be deployed and visible within the Dagster UI.

Concept	Relationship
asset	Top-level `Definitions` object may contain one or more `asset` definitions
asset check	Top-level `Definitions` object may contain one or more `asset check` definitions
io manager	Top-level `Definitions` object may contain one or more `io manager` definitions
job	Top-level `Definitions` object may contain one or more `job` definitions
resource	Top-level `Definitions` object may contain one or more `resource` definitions
schedule	Top-level `Definitions` object may contain one or more `schedule` definitions
sensor	Top-level `Definitions` object may contain one or more `sensor` definitions
component	`definitions` may be the output of a `component`
code location	`definitions` must be deployed in a `code location`

Graph

A GraphDefinition connects multiple ops together to form a DAG. If you are using assets, you will not need to use graphs directly.

Concept	Relationship
config	`graph` may use a `config`
op	`graph` must include one or more `ops`
job	`graph` must be part of `job` to execute

IO manager

An IOManager defines how data is stored and retrieved between the execution of assets and ops. This lets you customize the storage location and format at any step in a pipeline.

Concept	Relationship
asset	`io manager` may be used by an `asset`
definitions	`io manager` is added to a top-level `Definitions` object to be deployed

Job

A job is a selection of assets or a GraphDefinition of ops. Jobs are the main form of execution in Dagster.

Concept	Relationship
asset	`job` may contain a selection of `assets`
config	`job` may use a `config`
graph	`job` may contain a `graph`
schedule	`job` may be used by a `schedule`
sensor	`job` may be used by a `sensor`
definitions	`job` is added to a top-level `Definitions` object to be deployed

Op

An op is a computational unit of work. Ops are arranged into a GraphDefinition to dictate their order. Ops have largely been replaced by assets.

Concept	Relationship
type	`op` may use a `type`
graph	`op` must be contained in `graph` to execute

Partition

A PartitionsDefinition represents a logical slice of a dataset or computation mapped to certain segments (such as increments of time). Partitions enable incremental processing, making workflows more efficient by only running on relevant subsets of data.

Concept	Relationship
asset	`partition` may be used by an `asset`

Resource

A ResourceDefinition is a way to make external resources (like database or API connections) available to Dagster entities (like assets, schedules, or sensors) during job execution, and to clean up after execution resolves. A ConfigurableResource is a resource that uses structured configuration. For more information, see Configuring resources.

Concept	Relationship
asset	`resource` may be used by an `asset`
schedule	`resource` may be used by a `schedule`
sensor	`resource` may be used by a `sensor`
definitions	`resource` is added to a top-level `Definitions` object to be deployed

Type

A type is a way to define and validate the data passed between ops.

Concept	Relationship
op	`type` may be used by an `op`

Schedule

A ScheduleDefinition is a way to automate jobs or assets to run on a specified interval. If a job or asset is parameterized, the schedule can also be set with a matching run configuration (RunConfig).

Concept	Relationship
asset	`schedule` may include a `job` or selection of `assets`
config	`schedule` may include a `config` if the `job` or `assets` include a `config`
job	`schedule` may include a `job` or selection of `assets`
definitions	`schedule` is added to a top-level `Definitions` object to be deployed

Sensor

A sensor is a way to trigger jobs or assets when an event occurs, such as a file being uploaded or a push notification. If a job or asset is parameterized, the sensor can also be set with a matching run configuration (RunConfig).

Concept	Relationship
asset	`sensor` may include a `job` or selection of `assets`
config	`sensor` may include a `config` if the `job` or `assets` include a `config`
job	`sensor` may include a `job` or selection of `assets`
definitions	`sensor` is added to a top-level `Definitions` object to be deployed
