Runs instigated from the Dagit UI, the scheduler, or the
dagster pipeline launch CLI command are "launched" in
Dagster. This is a distinct operation from "executing" a pipeline using the
Python API or the CLI execute command.
A "launch" operation allocates computational resources
(e.g. a process, a container, or a Kubernetes pod) to carry out a run execution and then
instigates the execution.
The core abstraction in the launch process is the run launcher, which is configured as part of the Dagster Instance. The run launcher is the interface to the computational resources that will be used to actually execute Dagster runs. It receives the ID of a created run and a representation of the pipeline that is about to undergo execution.
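To make the shape of this abstraction concrete, here is a minimal sketch of what such an interface could look like. It is not Dagster's actual API: the names RunLauncherSketch, PipelineHandle, RecordingLauncher, and launch_run are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineHandle:
    """Hypothetical stand-in for the representation of the pipeline to run."""

    repository_location: str
    pipeline_name: str


class RunLauncherSketch(ABC):
    """Hypothetical interface: allocate resources for a run, then start it."""

    @abstractmethod
    def launch_run(self, run_id: str, pipeline: PipelineHandle) -> None:
        """Receive the ID of an already-created run plus the pipeline handle."""


class RecordingLauncher(RunLauncherSketch):
    """Toy implementation that only records what it was asked to launch."""

    def __init__(self) -> None:
        self.launched = []

    def launch_run(self, run_id: str, pipeline: PipelineHandle) -> None:
        # A real launcher would allocate a process, container, or pod here.
        self.launched.append((run_id, pipeline.pipeline_name))
```

The point of the sketch is the division of labor: the caller creates the run and hands over its ID, and the launcher decides where the run will execute.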
The simplest run launcher is the built-in run launcher,
which spawns a new process on the same node as the pipeline's repository
location. It also provides the ability to terminate launched runs.
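The process-per-run behavior described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the built-in launcher's implementation: ProcessLauncherSketch and its method names are hypothetical, and the command to run is supplied by the caller.

```python
import subprocess
import sys


class ProcessLauncherSketch:
    """Hypothetical launcher: one local child process per run."""

    def __init__(self):
        self._workers = {}  # run_id -> subprocess.Popen

    def launch_run(self, run_id, command):
        # Allocate the resource (a process) and start execution in it.
        self._workers[run_id] = subprocess.Popen(command)

    def can_terminate(self, run_id):
        # Only a known run whose process is still alive can be terminated.
        worker = self._workers.get(run_id)
        return worker is not None and worker.poll() is None

    def terminate(self, run_id):
        if not self.can_terminate(run_id):
            return False
        worker = self._workers[run_id]
        worker.terminate()
        worker.wait()  # reap the child so it does not linger as a zombie
        return True


if __name__ == "__main__":
    launcher = ProcessLauncherSketch()
    # Placeholder workload standing in for a run execution.
    launcher.launch_run("run-1", [sys.executable, "-c", "import time; time.sleep(30)"])
    launcher.terminate("run-1")
```

Keeping a handle to each spawned process is what makes termination possible; a launcher that loses track of its workers cannot stop them later.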
Another example is the
K8sRunLauncher, which allocates a
Kubernetes Job per run.
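As a rough illustration of "one Kubernetes Job per run", the sketch below builds a Job manifest as a plain dictionary. The manifest fields follow the standard batch/v1 Job schema, but the function name, the job naming scheme, the container name, and the image and command arguments are all assumptions made for this example; it does not show what K8sRunLauncher actually submits.

```python
def job_manifest_for_run(run_id, image, command):
    """Build a hypothetical batch/v1 Job manifest for a single run."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"run-{run_id}"},  # naming scheme is an assumption
        "spec": {
            "backoffLimit": 0,  # do not let Kubernetes retry the run itself
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "run-worker",
                            "image": image,
                            "command": list(command),
                        }
                    ],
                }
            },
        },
    }
```

A launcher built this way would submit one such manifest per run to the cluster, so each run gets its own isolated pod.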
A few examples of when a custom run launcher is needed:
- You want to allocate different computational resources for different pipelines or pipeline runs (e.g. GPUs for some, more cores or memory for others). These decisions should be made in the run launcher.
- You have custom infrastructure or custom APIs for allocating nodes for execution.
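The first case, choosing resources per run, might look like the following inside a custom launcher. This is a sketch only: ResourceRequest, resources_for_run, and the tag keys "example/needs_gpu" and "example/size" are hypothetical and not part of Dagster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    """Hypothetical description of what to allocate for one run."""

    cpu_cores: int
    memory_gb: int
    gpus: int = 0


DEFAULT_REQUEST = ResourceRequest(cpu_cores=1, memory_gb=2)


def resources_for_run(tags):
    """Pick resources from a run's tags; the tag keys are made up."""
    if tags.get("example/needs_gpu") == "true":
        return ResourceRequest(cpu_cores=4, memory_gb=16, gpus=1)
    if tags.get("example/size") == "large":
        return ResourceRequest(cpu_cores=8, memory_gb=32)
    return DEFAULT_REQUEST
```

A custom launcher could call a function like this before allocating the run worker, which keeps the sizing decision in one place.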
Colloquially, we refer to the process or computational resource created by the run launcher as the run worker. The run launcher only determines the behavior of the run worker. Once execution starts within the run worker, it is the executor (an in-memory abstraction in the run worker process) that takes over management of computational resources.