Executors are responsible for executing steps within a pipeline run. Once a run has launched and the process for the run (the run worker) has been allocated and started, the executor assumes responsibility for execution. Executors range from single-process serial executors to systems that manage per-step computational resources with a sophisticated control plane.
Which executor is used is determined by two things. First, modes provide the set of executors that can be used; to set this, use the executor_defs property on the mode definition. Second, the execution config section of the run config determines the actual executor.
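The two-step selection described above can be sketched in plain Python. This is a minimal illustration, not Dagster's implementation: resolve_executor is a hypothetical helper, the set mode_executors stands in for the executors a mode lists in executor_defs, and the nested dictionary is assumed to mirror the shape of the execution section of the run config (the multiprocess and max_concurrent keys are used for illustration). Falling back to in_process when no execution section is given is also an assumption of this sketch.

```python
# Hypothetical model of executor selection; not Dagster's actual code.

def resolve_executor(available_executors, run_config, default="in_process"):
    """Return (name, config) for the executor selected by the run config."""
    execution = run_config.get("execution") or {}
    if not execution:
        # Assumption: with no execution section, use the serial default.
        name, settings = default, {}
    elif len(execution) == 1:
        # The single key under "execution" names the chosen executor.
        name, settings = next(iter(execution.items()))
        settings = settings or {}
    else:
        raise ValueError("the execution section must name exactly one executor")
    if name not in available_executors:
        raise ValueError(f"executor {name!r} is not offered by this mode")
    return name, settings.get("config", {})


# Step 1: the mode declares which executors may be used.
mode_executors = {"in_process", "multiprocess"}

# Step 2: the run config picks one of them and supplies its settings.
run_config = {"execution": {"multiprocess": {"config": {"max_concurrent": 4}}}}

selected_name, selected_config = resolve_executor(mode_executors, run_config)
print(selected_name, selected_config)  # prints: multiprocess {'max_concurrent': 4}
```

Requesting an executor that the mode does not offer (for example, dask here) raises an error in this sketch, which reflects the point that the mode bounds the choices and the run config only selects among them.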
Example executors include:
in_process_executor: The execution plan executes serially within the run worker itself.
multiprocess_executor: Each step executes within its own spawned process. The level of parallelism is configurable.
dask_executor: Executes each step within a Dask task.
celery_executor: Executes each step within a Celery task.
celery_docker_executor: Executes each step within a Docker container.
celery_k8s_job_executor: Executes each step within an ephemeral Kubernetes pod, using Celery as a control plane for prioritization, queuing, and so forth.
The executor system is pluggable, and it is possible to write your own executor to target a different execution substrate. Doing so is not well documented, however, and the internal APIs remain in flux.