Solids
The core abstraction of Dagster is the solid. A solid is a functional unit of computation. It has defined inputs and
outputs, and multiple solids can be wired together to form a Pipeline by defining dependencies between solid inputs
and outputs.
A solid has a number of properties:
- Coarse-grained and for use in batch computations.
- Defines inputs and outputs, optionally typed within the Dagster type system.
- Embeddable in a dependency graph (pipeline) that is constructed by connecting the inputs and outputs of multiple solids.
- Emits a stream of typed, structured events — such as expectations and materializations — corresponding to the semantics of its computation.
- Exposes self-describing, strongly typed configuration.
- Testable and reusable.
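The idea of wiring units of computation together by their inputs and outputs can be sketched in plain Python. This is only an illustration of the concept, not Dagster's API: the `PIPELINE` wiring format and the `run_pipeline` helper are hypothetical names invented for this sketch, and it assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def load_numbers():
    return [1, 2, 3]


def double(numbers):
    return [n * 2 for n in numbers]


def total(numbers):
    return sum(numbers)


# Hypothetical wiring format: unit name -> (function, {input name: upstream unit}).
PIPELINE = {
    "load_numbers": (load_numbers, {}),
    "double": (double, {"numbers": "load_numbers"}),
    "total": (total, {"numbers": "double"}),
}


def run_pipeline(pipeline):
    """Run each unit after the units it depends on and return all results."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps.values()) for name, (_, deps) in pipeline.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn, deps = pipeline[name]
        kwargs = {arg: results[upstream] for arg, upstream in deps.items()}
        results[name] = fn(**kwargs)
    return results


results = run_pipeline(PIPELINE)
print(results["total"])  # 12
```

Each function knows nothing about the others; the dependency mapping alone decides execution order, which is the property that makes such units reusable and testable in isolation.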
Defining a solid
There are two ways to define a solid:
- Wrap a Python function in the @solid decorator [Preferred]
- Construct a SolidDefinition object
Method 1: Using the decorator
To use the @solid decorator, apply it to a function that takes a context argument as its first parameter.
The context provides access to system information such as resources and solid configuration.
See Solid Context for more information.
from dagster import solid

@solid
def my_solid(context):
    return 1
Method 2: Constructing the SolidDefinition object
To construct a SolidDefinition object, you need to pass the constructor a solid name, input definitions,
output definitions, and a compute_fn. The compute function plays the same role as the function you would
decorate using the @solid decorator, but it receives the input values as a single second argument and
yields its results as Output objects.
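from dagster import Int, Output, OutputDefinition, SolidDefinition


def _return_one(_context, inputs):
    # compute_fn receives the context and the input values (none here).
    yield Output(1)


solid = SolidDefinition(
    name="my_solid",
    input_defs=[],
    output_defs=[OutputDefinition(Int)],
    compute_fn=_return_one,
)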
Solid inputs and outputs
Dependencies between solids in Dagster are defined using InputDefinitions and OutputDefinitions.
Input and Output definitions are:
- Named
- Optionally typed
- Optionally given human-readable descriptions
Inputs:
Inputs are arguments to a solid's compute_fn, and are specified using InputDefinitions.
They can be passed from outputs of other solids, or stubbed using config.
A solid only executes once all of its inputs have been resolved, which means that all of the outputs that the solid depends on have been successfully yielded.
The argument names of the compute_fn must match the InputDefinitions names, and must be in the same order after the context argument.
For example, if we wanted a solid with an input of type str and an input of type int:
from dagster import InputDefinition, solid


@solid(
    input_defs=[
        InputDefinition(name="a", dagster_type=str),
        InputDefinition(name="b", dagster_type=int),
    ]
)
def my_input_example_solid(context, a, b):
    pass
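The naming rule above (argument names match the input names, in the same order, after the context argument) can be expressed as a small check in plain Python. This is a sketch of the rule only, not how Dagster performs its own validation; `check_input_names` is a hypothetical helper, and `add` is an ordinary undecorated function used for illustration.

```python
import inspect


def check_input_names(compute_fn, input_names):
    """Raise ValueError unless the parameters after the first match input_names in order."""
    params = list(inspect.signature(compute_fn).parameters)
    arg_names = params[1:]  # skip the leading context parameter
    if arg_names != list(input_names):
        raise ValueError(
            f"expected arguments {list(input_names)} after the context, got {arg_names}"
        )


def add(context, a, b):
    return a + b


check_input_names(add, ["a", "b"])  # matches, so nothing is raised

try:
    check_input_names(add, ["b", "a"])  # same names, wrong order
except ValueError as err:
    print(err)
```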
Outputs:
Outputs are yielded from a solid's compute_fn. A solid can yield multiple outputs.
from dagster import InputDefinition, Output, OutputDefinition, solid


@solid(
    input_defs=[
        InputDefinition(name="a", dagster_type=int),
        InputDefinition(name="b", dagster_type=int),
    ],
    output_defs=[
        OutputDefinition(name="sum", dagster_type=int),
        OutputDefinition(name="difference", dagster_type=int),
    ],
)
def my_input_output_example_solid(context, a, b):
    yield Output(a + b, output_name="sum")
    yield Output(a - b, output_name="difference")
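To see why yielding works well for multiple outputs, here is a dependency-free sketch of a consumer that gathers named outputs from a generator. It does not reproduce Dagster's internals: `FakeOutput` is a hypothetical stand-in for Dagster's Output, and `collect_outputs` is an invented helper whose checks (undeclared or repeated names) are assumptions made for the illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for Dagster's Output, used only to keep this sketch dependency-free.
FakeOutput = namedtuple("FakeOutput", ["value", "output_name"])


def sum_and_difference(a, b):
    yield FakeOutput(a + b, "sum")
    yield FakeOutput(a - b, "difference")


def collect_outputs(generator, declared_names):
    """Gather yielded outputs by name, rejecting undeclared or repeated names."""
    collected = {}
    for out in generator:
        if out.output_name not in declared_names:
            raise ValueError(f"undeclared output: {out.output_name}")
        if out.output_name in collected:
            raise ValueError(f"output yielded twice: {out.output_name}")
        collected[out.output_name] = out.value
    return collected


outputs = collect_outputs(sum_and_difference(5, 3), {"sum", "difference"})
print(outputs)  # {'sum': 8, 'difference': 2}
```

Because each yielded value carries its output name, a downstream consumer can route "sum" and "difference" to different places without relying on yield order.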
Solid context
A context object is passed as the first parameter to a solid's compute_fn.
The context is an instance of SystemComputeExecutionContext, and provides access to:
- solid configuration (context.solid_config)
- loggers (context.log)
- resources (context.resources)
- run ID (context.run_id)
For example, to access the logger:
from dagster import solid

@solid
def my_logging_solid(context):
    context.log.info("Hello world")
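Because the compute function only touches the context through a handful of attributes, its logic can be exercised with a simple stand-in object. The sketch below is plain Python and does not use Dagster's own testing utilities: `make_fake_context` is a hypothetical helper, `greet` is an undecorated function, and treating `solid_config` as a dict is an assumption.

```python
import logging
from types import SimpleNamespace


def make_fake_context(solid_config=None, resources=None, run_id="test-run"):
    """Build a hypothetical object exposing the four attributes listed above."""
    return SimpleNamespace(
        solid_config=solid_config or {},  # assumed to be a dict in this sketch
        log=logging.getLogger("fake_solid_context"),
        resources=resources or SimpleNamespace(),
        run_id=run_id,
    )


def greet(context):
    # Undecorated function with the same shape as a solid's compute function.
    name = context.solid_config.get("name", "world")
    context.log.info("Greeting %s in run %s", name, context.run_id)
    return f"Hello {name}"


context = make_fake_context(solid_config={"name": "Dagster"})
print(greet(context))  # Hello Dagster
```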