Modeling dependencies without inputs and outputs

You can find the code for this example on GitHub.

Dagster thinks of a pipeline's dependency graph as a data flow graph.

To tell Dagster that one solid should execute after another, we declare that one of the inputs of the downstream solid depends on one of the outputs of the upstream solid. If the downstream solid doesn't consume anything produced by the upstream solid, then in a pure data flow model it has no reason to wait for it.

However, sometimes it doesn't make sense to use Dagster's inputs/outputs to model the dependency. For example, if one solid creates a particular table in a database and another solid consumes that table, then the data is flowing through the database, not through inputs and outputs defined in Dagster.
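
To make the database scenario concrete, here is a minimal sketch that uses Python's built-in sqlite3 module in place of a real database connection. It does not use Dagster. The function names mirror the example on this page, while the table contents and the run_in_wrong_order / run_in_right_order helpers are assumptions made for illustration. Neither function passes anything to the other, yet the second one fails if the first has not already run.

```python
import sqlite3


def create_table_1(conn: sqlite3.Connection) -> None:
    # Writes its result into the database and returns nothing.
    conn.execute("create table table_1 as select 1 as id union all select 2")


def create_table_2(conn: sqlite3.Connection) -> None:
    # Reads table_1 from the database, not from a function argument.
    conn.execute("create table table_2 as select * from table_1")


def run_in_wrong_order() -> str:
    # Hypothetical helper: runs the consumer first and reports the error text.
    conn = sqlite3.connect(":memory:")
    try:
        create_table_2(conn)
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()
    return "no error"


def run_in_right_order() -> int:
    # Hypothetical helper: runs the producer first, then counts the copied rows.
    conn = sqlite3.connect(":memory:")
    try:
        create_table_1(conn)
        create_table_2(conn)
        return conn.execute("select count(*) from table_2").fetchone()[0]
    finally:
        conn.close()
```

The dependency is real, but it is invisible to anything that only inspects arguments and return values.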

In this situation, we can use the Nothing Dagster type to model the dependency between the two solids. We are passing "nothing" via Dagster between the two solids. By hooking up a "nothing" output of the first solid to a "nothing" input of the second solid, Dagster understands that the second should execute after the first.

repo.py
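from dagster import InputDefinition, Nothing, pipeline, solid

# get_database_connection() is assumed to be defined elsewhere and to return
# a connection object with an execute() method.


@solid
def create_table_1(_) -> Nothing:
    # Produces no value; the Nothing output only signals completion.
    get_database_connection().execute(
        "create table table_1 as select * from some_source_table"
    )


@solid(input_defs=[InputDefinition("start", Nothing)])
def create_table_2(_):
    # The Nothing input "start" carries no data, so it is not a function argument.
    get_database_connection().execute("create table table_2 as select * from table_1")


@pipeline
def my_pipeline():
    create_table_2(create_table_1())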
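
The ordering-only idea behind the pipeline above can also be sketched without Dagster. The snippet below is a toy illustration, not Dagster's implementation: it records "run after" edges that carry no data and uses the standard library's graphlib.TopologicalSorter (Python 3.9+) to choose an execution order. The steps and run_after dictionaries and the run_all helper are hypothetical names, and the step bodies are empty stand-ins.

```python
from graphlib import TopologicalSorter


def create_table_1() -> None:
    # Stand-in for a step whose only effect is a side effect, such as writing a table.
    pass


def create_table_2() -> None:
    # Stand-in for a step that relies on that side effect.
    pass


# Hypothetical registry of steps, keyed by name.
steps = {"create_table_1": create_table_1, "create_table_2": create_table_2}

# Each step maps to the set of steps it must run after. No values travel
# along these edges; they only constrain ordering.
run_after = {"create_table_1": set(), "create_table_2": {"create_table_1"}}


def run_all() -> list:
    # static_order() yields every step after all of its predecessors.
    order = list(TopologicalSorter(run_after).static_order())
    for name in order:
        steps[name]()
    return order
```

Calling run_all() executes create_table_1 before create_table_2 even though no value is handed from one to the other, which is the role the Nothing output and input play in the Dagster pipeline.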

Download

curl https://codeload.github.com/dagster-io/dagster/tar.gz/master | tar -xz --strip=2 dagster-master/examples/nothing
cd nothing