Orchestration on Celery + Docker

APIs

dagster_celery_docker.celery_docker_executor ExecutorDefinition[source]

Celery-based executor which launches tasks in docker containers.

The Celery executor exposes config settings for the underlying Celery app under the config_source key. This config corresponds to the “new lowercase settings” introduced in Celery version 4.0 and the object constructed from config will be passed to the celery.Celery constructor as its config_source argument. (See https://docs.celeryproject.org/en/latest/userguide/configuration.html for details.)

The executor also exposes the broker, backend, and include arguments to the celery.Celery constructor.

In the most common case, you may want to modify the broker and backend (e.g., to use Redis instead of RabbitMQ). We expect that config_source will be less frequently modified, but that when solid executions are especially fast or slow, or when there are different requirements around idempotence or retry, it may make sense to execute pipelines with variations on these settings.

If you’d like to configure a Celery Docker executor in addition to the default_executors, you should add it to the executor_defs defined on a ModeDefinition as follows:

from dagster import ModeDefinition, default_executors, pipeline
from dagster_celery_docker.executor import celery_docker_executor

@pipeline(mode_defs=[
    ModeDefinition(executor_defs=default_executors + [celery_docker_executor])
])
def celery_enabled_pipeline():
    pass

Then you can configure the executor as follows:

execution:
  celery-docker:
    config:

      docker:
        image: 'my_repo.com/image_name:latest'
        registry:
          url: 'my_repo.com'
          username: 'my_user'
          password: {env: 'DOCKER_PASSWORD'}
        env_vars: ["DAGSTER_HOME"] # environment vars to pass from celery worker to docker

      broker: 'pyamqp://guest@localhost//'  # Optional[str]: The URL of the Celery broker
      backend: 'rpc://' # Optional[str]: The URL of the Celery results backend
      include: ['my_module'] # Optional[List[str]]: Modules every worker should import
      config_source: # Dict[str, Any]: Any additional parameters to pass to the
          #...       # Celery workers. This dict will be passed as the `config_source`
          #...       # argument of celery.Celery().

Note that the YAML you provide here must align with the configuration with which the Celery workers on which you hope to run were started. If, for example, you point the executor at a different broker than the one your workers are listening to, the workers will never be able to pick up tasks for execution.

In deployments where the celery_k8s_job_executor is used all appropriate celery and dagster_celery commands must be invoked with the -A dagster_celery_docker.app argument.