First, you need to ensure that your Dagster deployment has access to the Kubernetes cluster where you want to run your tasks. The PipesK8sClient accepts kubeconfig_file, kube_context, and env arguments to configure the Kubernetes client.
Here's an example of what this might look like when configuring the client to access an EKS cluster:
from dagster_k8s import PipesK8sClient

eks_client = PipesK8sClient(
    # The client will have automatic access to all
    # environment variables in the execution context.
    env={**AWS_CREDENTIALS, "AWS_REGION": "us-west-2"},
    kubeconfig_file="path/to/kubeconfig",
    kube_context="my-eks-cluster",
)
Step 2: Writing an asset that executes the task within a Kubernetes pod
Once you have access to the Kubernetes cluster, you can write an asset that executes the task within a Kubernetes pod using the PipesK8sClient. In comparison to the KubernetesPodOperator, the PipesK8sClient allows you to define the pod spec directly in your Python code.
In the parameter comparison section of this doc, you'll find a detailed comparison describing how to map the KubernetesPodOperator parameters to the PipesK8sClient parameters.
Here's a comparison of the parameters between the KubernetesPodOperator and the PipesK8sClient.
Directly supported arguments:
in_cluster (named load_incluster_config in the PipesK8sClient)
cluster_context (named kube_context in the PipesK8sClient)
config_file (named kubeconfig_file in the PipesK8sClient)
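As a minimal sketch of the renames listed above, the helper below translates the three directly supported arguments into their PipesK8sClient names. The helper name `translate_client_kwargs`, the mapping constant, and the sample values are hypothetical and exist only for illustration; they are not part of either library.

```python
# Hypothetical helper: renames KubernetesPodOperator connection arguments
# to the PipesK8sClient argument names listed above.
OPERATOR_TO_CLIENT_ARGS = {
    "in_cluster": "load_incluster_config",
    "cluster_context": "kube_context",
    "config_file": "kubeconfig_file",
}


def translate_client_kwargs(operator_kwargs: dict) -> dict:
    """Return only the directly supported arguments, under their new names."""
    return {
        OPERATOR_TO_CLIENT_ARGS[name]: value
        for name, value in operator_kwargs.items()
        if name in OPERATOR_TO_CLIENT_ARGS
    }


client_kwargs = translate_client_kwargs(
    {
        "in_cluster": False,
        "cluster_context": "my-eks-cluster",
        "config_file": "path/to/kubeconfig",
        "image": "bash:latest",  # not a client argument, so it is dropped
    }
)
```

The resulting dictionary could then be unpacked into the client constructor, for example `PipesK8sClient(**client_kwargs)`.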
Many arguments are supported indirectly via the base_pod_spec argument. Each entry below gives the corresponding pod spec key in parentheses:
volumes: Volumes to be used by the Pod (key volumes)
affinity: Node affinity/anti-affinity rules for the Pod (key affinity)
node_selector: Node selection constraints for the Pod (key nodeSelector)
hostnetwork: Enable host networking for the Pod (key hostNetwork)
dns_config: DNS settings for the Pod (key dnsConfig)
dnspolicy: DNS policy for the Pod (key dnsPolicy)
hostname: Hostname of the Pod (key hostname)
subdomain: Subdomain for the Pod (key subdomain)
schedulername: Scheduler to be used for the Pod (key schedulerName)
service_account_name: Service account to be used by the Pod (key serviceAccountName)
priority_class_name: Priority class for the Pod (key priorityClassName)
security_context: Security context for the entire Pod (key securityContext)
tolerations: Tolerations for the Pod (key tolerations)
image_pull_secrets: Secrets for pulling container images (key imagePullSecrets)
termination_grace_period: Grace period for Pod termination (key terminationGracePeriodSeconds)
active_deadline_seconds: Deadline for the Pod's execution (key activeDeadlineSeconds)
host_aliases: Additional entries for the Pod's /etc/hosts (key hostAliases)
init_containers: Initialization containers for the Pod (key initContainers)
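To make the list above concrete, here is a sketch of a base_pod_spec dictionary that uses a few of these keys. Every value (service account name, labels, secret name, timeouts) is a placeholder chosen for illustration, and the sketch assumes the settings are written as plain dictionaries and lists rather than Kubernetes client model objects.

```python
# Pod-level settings expressed as a plain dict for base_pod_spec.
# Each trailing comment names the KubernetesPodOperator argument it replaces.
# All values are illustrative placeholders.
base_pod_spec = {
    "serviceAccountName": "my-service-account",  # service_account_name
    "nodeSelector": {"disktype": "ssd"},  # node_selector
    "tolerations": [  # tolerations
        {
            "key": "dedicated",
            "operator": "Equal",
            "value": "batch",
            "effect": "NoSchedule",
        }
    ],
    "imagePullSecrets": [{"name": "my-registry-secret"}],  # image_pull_secrets
    "terminationGracePeriodSeconds": 30,  # termination_grace_period
    "activeDeadlineSeconds": 600,  # active_deadline_seconds
    "volumes": [{"name": "scratch", "emptyDir": {}}],  # volumes
}
```

A dictionary like this would be passed as the base_pod_spec argument when the pod is launched.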
The following arguments are supported under the nested containers key of the base_pod_spec argument of the PipesK8sClient:
image: Docker image for the container (key image)
cmds: Entrypoint command for the container (key command)
arguments: Arguments for the entrypoint command (key args)
ports: List of ports to expose from the container (key ports)
volume_mounts: List of volume mounts for the container (key volumeMounts)
env_vars: Environment variables for the container (key env)
env_from: List of sources to populate environment variables (key envFrom)
image_pull_policy: Policy for pulling the container image (key imagePullPolicy)
container_resources: Resource requirements for the container (key resources)
container_security_context: Security context for the container (key securityContext)
termination_message_policy: Policy for the termination message (key terminationMessagePolicy)
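The container-level mapping above can be sketched as a small helper that builds one container entry and nests it under the containers key. The function `build_container`, the default container name `main`, and the sample values are hypothetical. The sketch also assumes env_vars arrives as a plain dictionary, which it converts to the name/value list that a Kubernetes container spec uses for env.

```python
# Hypothetical helper: builds one container entry for base_pod_spec from
# a few KubernetesPodOperator-style arguments, using the keys listed above.
CONTAINER_KEYS = {
    "image": "image",
    "cmds": "command",
    "arguments": "args",
    "image_pull_policy": "imagePullPolicy",
    "container_resources": "resources",
}


def build_container(operator_kwargs: dict, container_name: str = "main") -> dict:
    """Return a container dict; container_name is an illustrative default."""
    container = {"name": container_name}
    for old_name, new_key in CONTAINER_KEYS.items():
        if old_name in operator_kwargs:
            container[new_key] = operator_kwargs[old_name]
    # A container spec lists env as {"name": ..., "value": ...} entries.
    env_vars = operator_kwargs.get("env_vars", {})
    if env_vars:
        container["env"] = [
            {"name": key, "value": value} for key, value in env_vars.items()
        ]
    return container


base_pod_spec = {
    "containers": [
        build_container(
            {
                "image": "bash:latest",
                "cmds": ["bash", "-c"],
                "arguments": ["echo 'Hello from k8s'"],
                "env_vars": {"LOG_LEVEL": "info"},
                "image_pull_policy": "IfNotPresent",
            }
        )
    ]
}
```

Container-level settings sit inside the containers list, while pod-level keys such as volumes or tolerations stay at the top level of the same dictionary.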
For a full list, see the Kubernetes container spec documentation.
The following arguments are supported under the base_pod_meta argument, which configures the metadata of the pod: