Skip to main content

Deploying Dagster using Docker Compose

This guide provides instructions for deploying Dagster using Docker Compose. This is useful when you want to, for example, deploy Dagster on an AWS EC2 host. A typical Dagster Docker deployment includes a several long-running containers: one for the webserver, one for the daemon, and one for each code location. It also typically executes each run in its own container.

The full example is available on GitHub.

Prerequisites
  • Familiarity with Docker and Docker Compose
  • Familiarity with dagster.yaml instance configuration
  • Familiarity with workspace.yaml code location configuration

Define a Docker image for the Dagster webserver and daemon

The Dagster webserver and daemon are the two host processes in a Dagster deployment. They typically each run in their own container, using the same Docker image. This image contains Dagster packages and configuration, but no user code.

To build this Docker image, use a Dockerfile like the following, with a name like Dockerfile_dagster:

FROM python:3.10-slim

RUN pip install \
dagster \
dagster-graphql \
dagster-webserver \
dagster-postgres \
dagster-docker

# Set $DAGSTER_HOME and copy dagster.yaml and workspace.yaml there
ENV DAGSTER_HOME=/opt/dagster/dagster_home/

RUN mkdir -p $DAGSTER_HOME

COPY dagster.yaml workspace.yaml $DAGSTER_HOME

WORKDIR $DAGSTER_HOME

Additionally, the following files should be in the same directory as the Docker file:

  • A workspace.yaml to tell the webserver and daemon the location of the code servers
  • A dagster.yaml to configure the Dagster instance

Define a Docker image for each code location

Each code location typically has its own Docker image, and that image is also used for runs launched for that code location.

To build a Docker image for a code location, use a Dockerfile like the following, with a name like Dockerfile_code_location_1:

FROM python:3.10-slim

RUN pip install \
dagster \
dagster-postgres \
dagster-docker

# Add code location code
WORKDIR /opt/dagster/app
COPY directory/with/your/code/ /opt/dagster/app

# Run dagster code server on port 4000
EXPOSE 4000

# CMD allows this to be overridden from run launchers or executors to execute runs and steps
CMD ["dagster", "code-server", "start", "-h", "0.0.0.0", "-p", "4000", "-f", "definitions.py"]

Write a Docker Compose file

The following docker-compose.yaml defines how to run the webserver container, daemon container, code location containers, and database container:

docker-compose.yaml
version: "3.7"

services:
# This service runs the postgres DB used by dagster for run storage, schedule storage,
# and event log storage. Depending on the hardware you run this Compose on, you may be able
# to reduce the interval and timeout in the healthcheck to speed up your `docker-compose up` times.
docker_example_postgresql:
image: postgres:11
container_name: docker_example_postgresql
environment:
POSTGRES_USER: "postgres_user"
POSTGRES_PASSWORD: "postgres_password"
POSTGRES_DB: "postgres_db"
networks:
- docker_example_network
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres_user -d postgres_db"]
interval: 10s
timeout: 8s
retries: 5

# This service runs the gRPC server that loads your user code, in both dagster-webserver
# and dagster-daemon. By setting DAGSTER_CURRENT_IMAGE to its own image, we tell the
# run launcher to use this same image when launching runs in a new container as well.
# Multiple containers like this can be deployed separately - each just needs to run on
# its own port, and have its own entry in the workspace.yaml file that's loaded by the
# webserver.
docker_example_user_code:
build:
context: .
dockerfile: ./Dockerfile_user_code
container_name: docker_example_user_code
image: docker_example_user_code_image
restart: always
environment:
DAGSTER_POSTGRES_USER: "postgres_user"
DAGSTER_POSTGRES_PASSWORD: "postgres_password"
DAGSTER_POSTGRES_DB: "postgres_db"
DAGSTER_CURRENT_IMAGE: "docker_example_user_code_image"
networks:
- docker_example_network

# This service runs dagster-webserver, which loads your user code from the user code container.
# Since our instance uses the QueuedRunCoordinator, any runs submitted from the webserver will be put on
# a queue and later dequeued and launched by dagster-daemon.
docker_example_webserver:
build:
context: .
dockerfile: ./Dockerfile_dagster
entrypoint:
- dagster-webserver
- -h
- "0.0.0.0"
- -p
- "3000"
- -w
- workspace.yaml
container_name: docker_example_webserver
expose:
- "3000"
ports:
- "3000:3000"
environment:
DAGSTER_POSTGRES_USER: "postgres_user"
DAGSTER_POSTGRES_PASSWORD: "postgres_password"
DAGSTER_POSTGRES_DB: "postgres_db"
volumes: # Make docker client accessible so we can terminate containers from the webserver
- /var/run/docker.sock:/var/run/docker.sock
- /tmp/io_manager_storage:/tmp/io_manager_storage
networks:
- docker_example_network
depends_on:
docker_example_postgresql:
condition: service_healthy
docker_example_user_code:
condition: service_started

# This service runs the dagster-daemon process, which is responsible for taking runs
# off of the queue and launching them, as well as creating runs from schedules or sensors.
docker_example_daemon:
build:
context: .
dockerfile: ./Dockerfile_dagster
entrypoint:
- dagster-daemon
- run
container_name: docker_example_daemon
restart: on-failure
environment:
DAGSTER_POSTGRES_USER: "postgres_user"
DAGSTER_POSTGRES_PASSWORD: "postgres_password"
DAGSTER_POSTGRES_DB: "postgres_db"
volumes: # Make docker client accessible so we can launch containers using host docker
- /var/run/docker.sock:/var/run/docker.sock
- /tmp/io_manager_storage:/tmp/io_manager_storage
networks:
- docker_example_network
depends_on:
docker_example_postgresql:
condition: service_healthy
docker_example_user_code:
condition: service_started

networks:
docker_example_network:
driver: bridge
name: docker_example_network

Start your deployment

To start the deployment, run:

docker compose up