Dask (dagster_dask)

See also the Dask deployment guide.

Python version support

Because Dask has dropped support for Python versions < 3.6, we do not test dagster-dask on Python 2.7 or 3.5.

dagster_dask.dask_executor ExecutorDefinition[source]

Dask-based executor.

The ‘cluster’ can be one of the following: (‘existing’, ‘local’, ‘yarn’, ‘ssh’, ‘pbs’, ‘moab’, ‘sge’, ‘lsf’, ‘slurm’, ‘oar’, ‘kube’).

If the Dask executor is used without providing executor-specific config, a local Dask cluster will be created (as when calling dask.distributed.Client() with dask.distributed.LocalCluster()).

The Dask executor optionally takes the following config:

cluster:
    {
        local?: # takes distributed.LocalCluster parameters
            {
                timeout?: 5,  # Timeout duration for initial connection to the scheduler
                n_workers?: 4  # Number of workers to start
                threads_per_worker?: 1 # Number of threads per each worker
            }
    }

If you’d like to configure a dask executor in addition to the default_executors, you should add it to the executor_defs defined on a ModeDefinition as follows:

from dagster import ModeDefinition, default_executors, pipeline
from dagster_dask import dask_executor

@pipeline(mode_defs=[ModeDefinition(executor_defs=default_executors + [dask_executor])])
def dask_enabled_pipeline():
    pass