Ask AI

Setting up alerts in Dagster Cloud#

This guide is applicable to Dagster Cloud.

In this guide, we'll walk you through configuring alerts in Dagster Cloud.

Dagster Cloud can send alerts to:

  • Email
  • Slack
  • Microsoft Teams

Understanding alert policies#

Alert policies define which events will trigger an alert, the conditions under which an alert will be sent, and how the alert will be sent.

  • Asset based alert policies can trigger on asset materialization failure or success, as well as asset check error, warn, passed, or failure to execute. An asset group or asset key can be provided to asset based alert policies, which limits notifications to only fire if the asset group or asset key matches the materialized asset. In the case of checks, notifications will only be sent if the asset group/key to which the check is attached matches. Note: Asset based alert policies are still experimental, and may be subject to change as we gather user feedback.
  • Job run based alert policies include a set of configured tags. If an alert policy has no configured tags, all jobs will be eligible for that alert. Otherwise, only jobs that contain all the tags for a given alert policy are eligible for that alert.
  • Alert policies created for schedule/sensor tick failure will apply to all schedules/sensors. However, you will only receive alerts when the schedule/sensor changes from a state of succeeding to failing, so subsequent failures will not trigger new alerts.
  • Code location error alert policies will trigger when a code location fails to load due to an error.
  • Agent downtime alert policies will trigger when a Hybrid agent hasn't heartbeated within the last 5 minutes.

Alert policies are configured on a per-deployment basis. For example, asset alerts configured in a prod deployment are only applicable to assets in that deployment.


Managing alert policies in Dagster Cloud#

Organization Admin, Admin, or Editor permissions are required to manage alerts in Dagster Cloud.

Creating alert policies#

  1. Sign in to your Dagster Cloud account.

  2. In the top navigation, click Deployment

  3. Click the Alerts tab.

  4. Click + Create alert policy.

  5. From the Alert Policy type drop-down, select the type of alert to create.

  6. In the Create alert policy window, fill in the following:

    • Alert policy name - Enter a name for the alert policy. For example, slack_urgent_failure

    • Description - Enter a description for the alert policy

    • Notification service - Select the service for the alert policy:

      • Slack - If you haven't connected Slack, click the Connect to Slack button to add the Dagster Cloud Slack app to your workspace. After the installation completes, invite the @Dagster Cloud bot user to the desired channel.

        You can then configure the alert policy to message this channel. Note: Only messaging one channel per alert policy is currently supported:

        Slack alert configured to alert the sales-notifications channel

      To disconnect Dagster Cloud from Slack, remove the Dagster Cloud app from your Slack workspace. Refer to Slack's documentation for more info and instructions. Once the app is removed, refresh the Alerts page in Dagster Cloud and the Connect to Slack option will be displayed.

      • Email - Email alerts can be sent to one or more recipients. For example:

        Email alert configured to alert two recipients
      • Microsoft Teams - Send alerts to Teams incoming webhooks.

        First, follow the instructions in the Microsoft Teams documentation to create an incoming webhook. Then, paste the url into the alert form.

        Microsoft Teams alert configured with a webhook url

    For asset-based alerts, fill out these additional options:

    • Asset group - Select the asset group to monitor. You will have the option to select from all asset groups in the deployment.
    • Asset key - Select the asset key to monitor. You will have the option to select from all asset keys in the deployment. Note: If you select an asset group, you will not be able to select an asset key.
    • Events - Select whether the alert should trigger on asset materialization failure, asset materialization success, asset check error, asset check warn, asset check passed, or asset check failure to execute
    • Notification Service - Asset alerts have an additional notification option to email asset owners (experimental). Email alert configured to alert asset owners

    For job-based alerts, fill out these additional options:

    • Tags - Add tag(s) for the alert policy. Jobs with these tags will trigger the alert. For example: level:critical or team:sales
    • Events - Select whether the alert should trigger on job success, failure, or both
  7. When finished, click Save policy.

Editing alert policies#

To edit an existing alert policy, click the Edit button next to the policy:

Highlighted Edit button next to an alert policy in Dagster Cloud

Enabling and disabling alert policies#

To enable or disable an alert, use the toggle on the left side of the alert policy.

Deleting alert policies#

To delete an alert policy, click the Delete button next to the policy. When prompted, confirm the deletion.

Highlighted Delete button next to an alert policy in Dagster Cloud

Managing alert policies with the dagster-cloud CLI#

With the dagster-cloud CLI, you can:

Setting a full deployment's policies#

A list of alert policies can be defined in a single YAML file. After declaring your policies, set them for the deployment using the following command:

dagster-cloud deployment alert-policies sync -a <ALERT_POLICIES_PATH>

Viewing a full deployment's policies#

List the policies currently configured on the deployment by running:

dagster-cloud deployment alert-policies list

Configuring a Slack alert policy#

In this example, we'll configure a Slack notification to trigger whenever a run of a job succeeds or fails. This job, named sales_job, has a team tag of sales:

@op
def sales_computation():
    ...


@job(tags={"team": "sales"})
def sales_job():
    sales_computation()

In the alert policies YAML file, we'll define a policy that listens for jobs with a team tag of sales to succeed or fail. When this occurs, a notification will be sent to the sales-notification channel in the hooli workspace:

alert_policies:
  - name: "slack-alert-policy"
    description: "An alert policy to send a Slack notification to sales on job failure or success."
    tags:
      - key: "team"
        value: "sales"
    event_types:
      - "JOB_SUCCESS"
      - "JOB_FAILURE"
    notification_service:
      slack:
        slack_workspace_name: "hooli"
        slack_channel_name: "sales-notifications"

Configuring an email alert policy#

In this example, we'll configure an email alert when a job fails. This job, named important_job, has a level tag of "critical":

def important_computation():
    ...


@job(tags={"level": "critical"})
def important_job():
    important_computation()

In the alert policies YAML file, we'll define a policy that listens for jobs with a level tag of critical to fail. When this occurs, an email notification will be sent to richard.hendricks@hooli.com and nelson.bighetti@hooli.com:

alert_policies:
  - name: "email-alert-policy"
    description: "An alert policy to email company executives during job failure."
    tags:
      - key: "level"
        value: "critical"
    event_types:
      - "JOB_FAILURE"
    notification_service:
      email:
        email_addresses:
          - "richard.hendricks@hooli.com"
          - "nelson.bighetti@hooli.com"

Configuring a Microsoft Teams alert policy#

In this example, we'll configure a Teams alert for when a Hybrid Agent goes down.

alert_policies:
  - name: "email-alert-policy"
    description: "An alert policy to email company executives during job failure."
    tags:
      - key: "level"
        value: "critical"
    event_types:
      - "AGENT_UNAVAILABLE"
    notification_service:
      microsoft_teams:
        webhook_url: "https://yourdomain.webhook.office.com/..."

If the agent stops heartbeating, a message will be sent via the incoming webhook URL.


Compatible event types#

When creating an alert policy using the CLI, only certain event_types can be specified together. You can specify multiple job run-based event types together (JOB_SUCCESS, JOB_FAILURE), or a tick-based event type (TICK_FAILURE), but attempting to mix these will result in an error.