Changelog#

1.5.14 / 0.21.14 (libraries)#

New#

  • Viewing logs for a sensor or schedule tick is now a generally available feature.
    • The feature flag to view sensor or schedule tick logs has been removed, as the feature is now enabled by default.
    • Logs can now be viewed even when the sensor or schedule tick fails.
    • The logs are now viewable in the sensor or schedule tick modal.
  • graph_multi_assets can now accept inputs as kwargs.
  • [ui] The tick timeline for schedules and sensors now defaults to showing all ticks, instead of excluding skipped ticks. The previous behavior can be enabled by unchecking the “Skipped” checkbox below the timeline view.
  • [ui] The updated asset graph is no longer behind an experimental flag. The new version features a searchable left sidebar, a horizontal DAG layout, context menus and collapsible groups!

Bugfixes#

  • [ui] Fix layout and scrolling issues that arise when a global banner alert is displayed in the app.
  • [ui] Use a larger version of the run config dialog in the Runs list in order to maximize the amount of visible config yaml.
  • [ui] When a software-defined asset is removed from a code location, it will now also be removed from global search.
  • [ui] When selecting assets in the catalog, you can now opt to materialize only “changed and missing” items in your selection.
  • [ui] The “Open in Launchpad” option on asset run pages has been updated to link to the graph of assets or asset job instead of an unusable launchpad page.
  • [ui] Partition status dots of multi-dimensional assets no longer wrap on the Asset > Partitions page.
  • [asset checks] Fixed a bug that caused the resource_defs parameter of @asset_check to not be respected.
  • [ui] Fixed an issue where schedules or sensors with the same name in two different code locations sometimes showed each other's runs in the list of runs for that schedule or sensor.
  • [pipes] Fixed an issue with the PipesFileMessageReader that could cause a crash on Windows.
  • Previously, calling context.log in different threads within a single op could result in some of those log messages being dropped. This has been fixed (thanks @quantum-byte!)
  • [dagster-dbt] On Dagster run termination, the dbt subprocess now exits gracefully to terminate any inflight queries that are materializing models.

Breaking Changes#

  • The file_manager property on OpExecutionContext and AssetExecutionContext has been removed. This is an ancient property that was deprecated prior to Dagster 1.0, and since then had been raising a NotImplementedError whenever invoked.

Community Contributions#

  • Added the Hashicorp Nomad integration to the documentation’s list of community integrations. Thanks, @ThomAub!
  • [dagster-deltalake] Fixed an error when passing non-string valued options. The arrow type handler now also supports pyarrow datasets, which allows delta tables to be loaded lazily. Thanks @roeap!

Experimental#

  • [dagster-pipes] The subprocess and databricks clients now forward termination to the external process if the orchestration process is terminated. A forward_termination argument is available for opting out.

Documentation#

  • Fixed an error in the asset checks factory code example.

Dagster Cloud#

  • The UI now correctly displays failed partitions after a single-run backfill occurs. Previously, if a single-run backfill failed, the corresponding partitions would not display as failed.
  • Several performance improvements when submitting Snowflake metrics to Dagster Cloud Insights.
  • Fixed an error which would occur when submitting Snowflake metrics for a removed or renamed asset to Dagster Cloud Insights.

1.5.13 / 0.21.13 (libraries)#

New#

  • The SensorEvaluationContext object has two new properties: last_sensor_start_time and is_first_tick_since_sensor_start. This enables sensor evaluation functions to vary behavior on the first tick vs subsequent ticks after the sensor has started.
  • The asset_selection argument to @sensor and SensorDefinition now accepts a sequence of AssetsDefinitions, a sequence of strings, or a sequence of AssetKeys, in addition to AssetSelections.
  • [dagster-dbt] Support for dbt-core==1.3.* has been removed.
  • [ui] In code locations view, link to git repo when it’s a valid URL.
  • [ui] To improve consistency and legibility, most places in the app no longer show milliseconds when displaying elapsed time.
  • [ui] Runs that were launched by schedules or sensors now show information about the relevant schedule or sensor in the header, with a link to view other runs associated with the same tick.
  • [dagster-gcp] Added a show_url_only parameter to GCSComputeLogManager that allows you to configure the compute log manager so that it displays a link to the GCS console rather than loading the logs from GCS, which can be useful if giving Dagster access to GCS credentials is undesirable.
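
As a rough illustration of how the two new SensorEvaluationContext properties listed above might be used, the sketch below branches on the first tick after a sensor start. StubSensorContext and files_to_process are hypothetical stand-ins written for this example so that it runs without Dagster installed. Only the two property names come from the entry above, and their types here (a float timestamp and a bool) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StubSensorContext:
    """Hypothetical stand-in that exposes the two property names from the changelog."""

    last_sensor_start_time: Optional[float]  # assumed to be a Unix timestamp
    is_first_tick_since_sensor_start: bool


def files_to_process(
    context: StubSensorContext, all_files: List[str], new_files: List[str]
) -> List[str]:
    """Rescan everything on the first tick after a start, otherwise only new files."""
    if context.is_first_tick_since_sensor_start:
        return sorted(all_files)
    return sorted(new_files)


first = StubSensorContext(
    last_sensor_start_time=1700000000.0, is_first_tick_since_sensor_start=True
)
later = StubSensorContext(
    last_sensor_start_time=1700000000.0, is_first_tick_since_sensor_start=False
)

first_tick_files = files_to_process(first, ["a.csv", "b.csv"], ["b.csv"])
later_tick_files = files_to_process(later, ["a.csv", "b.csv"], ["b.csv"])
```

In a real sensor the same branch would sit inside the evaluation function and read the properties from the context object that Dagster passes in.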

Bugfixes#

  • Fixed the behavior of loading partitioned parent assets when using the BranchingIOManager.
  • [ui] Fixed an unwanted scrollbar that sometimes appears on the code location list.

Community Contributions#

  • Fixed a bug where dagster would error on FIPS-enabled systems by explicitly marking callsites of hashlib.md5 as not used for security purposes (Thanks @jlloyd-widen!)
  • [dagster-k8s] Changed execute_k8s_job to be aware of run-termination and op failure by deleting the executing k8s job (Thanks @Taadas!).
  • [dagstermill] Fixed dagstermill integration with the Dagster web UI to allow locally-scoped static resources (required to show certain frontend-components like plotly graphs) when viewing dagstermill notebooks (Thanks @aebrahim!).
  • [dagster-dbt] Fixed type annotation typo in the DbtCliResource API docs (Thanks @akan72!)

Experimental#

  • [pipes] Methods have been added to facilitate passing non-Dagster data back from the external process (report_custom_message) to the orchestration process (get_custom_messages).
  • [ui] Added a “System settings” option for UI theming, which will use your OS preference to set light or dark mode.

Documentation#

  • [graphql] - Removed the experimental marker that was missed when the GraphQL client was fully released.
  • [assets] - Added an example for using retries with assets to the SDA concept page.
  • [general] - Fixed some typos and formatting issues.

1.5.12 / 0.21.12 (libraries)#

Bugfixes#

  • [dagster-embedded-elt] Fixed an issue where EnvVars used in Sling source and target configuration would not work properly in some circumstances.
  • [dagster-insights] Reworked the Snowflake insights ingestion pipeline to improve performance and increase observability.

1.5.11 / 0.21.11 (libraries)#

New#

  • [ui] Asset graph now displays active filters.
  • [ui] Asset graph can now be filtered by compute kind.
  • [ui] When backfilling failed and missing partitions of assets, a “Preview” button allows you to see which ranges will be materialized.
  • [dagster-dbt] When running DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1 dagster dev in a new scaffolded project from dagster-dbt project scaffold, dbt logs from creating dbt artifacts to loading the project are now silenced.
  • [dagster-airbyte] Added a new connection_meta_to_group_fn argument which allows configuring loaded asset groups based on the connection’s metadata dict.
  • [dagster-k8s] Debug information about failed run workers in errors surfaced by run monitoring now includes logs from sidecar containers, not just the main dagster container.

Bugfixes#

  • The QueuedRunCoordinatorDaemon has been refactored to paginate over runs when applying priority sort and tag concurrency limits. Previously, it loaded all runs into memory causing large memory spikes when many runs were enqueued.
  • Callable objects can once again be used to back sensor definitions.
  • UPathIOManager has been updated to use the correct path delimiter when interacting with cloud storages from a Windows process.
  • In the default multiprocess executor, the STEP_WORKER_STARTED event now fires before importing code in line with the other executors.
  • During execution, skipping a step now takes precedence over “abandoning” it due to upstream failure. This is expected to substantially improve the “retry from failure” workflow when conditional branching is in use.
  • Fixed an issue where default config values set to EnvVar did not work properly.
  • Fixed an issue where resources which implemented IAttachDifferentObjectToOpContext would pass the incorrect object to schedules and sensors.
  • Fixed a bug that caused auto-materialize failures when using the materialize_on_cron rule with dynamically partitioned assets.
  • Fixed an issue where sensor ticks would sporadically fail with a StopIteration exception.
  • [ui] For a job launchpad with a large number of tabs, the “Remove all” option was pushed offscreen. This has been fixed.
  • [ui] The asset backfill page now correctly shows backfills that target only unpartitioned assets.
  • [ui] Launching an asset job that was defined without_checks no longer fails by attempting to include the checks.
  • [dagster-databricks] Fixed a bug that caused a crash when polling a submitted job that is still in the Databricks queue (due to a concurrency limit).
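
The change above that makes skipping take precedence over abandoning can be pictured with a small decision function. This is a simplified model under stated assumptions, not Dagster's executor logic: StepState and resolve_downstream are hypothetical names, and "an upstream step was skipped" stands in for whatever condition the executor actually checks.

```python
from enum import Enum
from typing import Iterable


class StepState(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    SKIPPED = "skipped"
    ABANDONED = "abandoned"
    READY = "ready"


def resolve_downstream(upstream: Iterable[StepState]) -> StepState:
    """Decide what happens to a step given the states of its upstream steps."""
    states = set(upstream)
    # Skip is checked first, so it wins when a skip and a failure are both present.
    if StepState.SKIPPED in states:
        return StepState.SKIPPED
    if states & {StepState.FAILED, StepState.ABANDONED}:
        return StepState.ABANDONED
    return StepState.READY


mixed = resolve_downstream([StepState.SKIPPED, StepState.FAILED])
failed_only = resolve_downstream([StepState.SUCCEEDED, StepState.FAILED])
healthy = resolve_downstream([StepState.SUCCEEDED])
```

Under this model, a step on an untaken conditional branch ends up skipped rather than abandoned, which is presumably why a later retry from failure does not try to run it.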

Community Contributions#

  • Patched issue where the local compute log path exposed file content outside of the compute log base directory - thanks r1b!
  • [dagster-databricks] Added the ability to authenticate using an Azure service principal and fixed minor bugs involving authenticating with a service principal while DATABRICKS_HOST is set. Thanks @zyd14!

Experimental#

  • [ui] Dark mode is now available via the User Settings dialog, currently in an experimental state. By default, the app will use a “legacy” theme, closely matching our current colors. A new light mode theme is also available.
  • [ui] Asset graph group nodes can be collapsed/expanded by right clicking on the collapsed group node or the header of the expanded group node.
  • [ui] Asset graph group nodes can be all collapsed or all expanded by right clicking anywhere on the graph and selecting the appropriate action.
  • [ui] The tree view was removed from the asset graph.
  • [pipes] PipesLambdaClient, an AWS Lambda pipes client has been added to dagster_aws.
  • Fixed a performance regression introduced in the 1.5.10 release where auto-materializing multi-assets became slower.

Dagster Cloud#

  • When a Dagster Cloud agent starts up, it will now wait to display as Running on the Agents tab in the Dagster Cloud UI until it has launched all the code servers that it needs in order to serve requests.

1.5.10 / 0.21.10 (libraries)#

New#

  • Added a new MetadataValue.job metadata type, which can be used to link to a Dagster job from other objects in the UI.
  • [asset backfills] Previously, when partitions definitions were changed after backfill launch, the asset backfill page would be blank. Now, when partitions definitions are changed, the backfill page will display statuses by asset.
  • [dagster-bigquery, dagster-duckdb, dagster-snowflake] The BigQuery, DuckDB, and Snowflake I/O Managers will now determine the schema (dataset for BigQuery) in the following order of precedence: schema metadata set on the asset or op, I/O manager schema/dataset configuration, key_prefix set on the asset. Previously, all methods for setting the schema/dataset were mutually exclusive, and setting more than one would raise an exception.
  • [dagster-shell] Added option to exclude the shell command from logs.
  • [dagster-dbt] When running DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1 dagster dev in a new scaffolded project from dagster-dbt project scaffold, dbt artifacts for loading the project are now created in a static target/ directory.
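
The schema precedence described above for the BigQuery, DuckDB, and Snowflake I/O managers can be sketched as a plain function. This is an illustration of the ordering only, not the I/O managers' code: resolve_schema and its parameter names are hypothetical, the "public" fallback is an assumption, and taking the last component of key_prefix as the schema is also an assumption.

```python
from typing import Optional, Sequence


def resolve_schema(
    metadata_schema: Optional[str] = None,
    config_schema: Optional[str] = None,
    key_prefix: Sequence[str] = (),
    default_schema: str = "public",  # assumed fallback, for illustration only
) -> str:
    """Pick a schema using the order: metadata, then config, then key_prefix."""
    if metadata_schema:
        return metadata_schema
    if config_schema:
        return config_schema
    if key_prefix:
        # Assumption: the last prefix component names the schema.
        return key_prefix[-1]
    return default_schema


from_metadata = resolve_schema("analytics", "raw", ["warehouse", "staging"])
from_config = resolve_schema(None, "raw", ["warehouse", "staging"])
from_prefix = resolve_schema(key_prefix=["warehouse", "staging"])
```

Setting more than one source no longer conflicts in this model: the highest-precedence value that is present simply wins.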

Bugfixes#

  • Problematic inheritance that was causing pydantic warnings to be emitted has been corrected.
  • It's now possible to use the logger of ScheduleEvaluationContext when testing via build_schedule_context.
  • The metadata from a Failure exception is now hoisted up to the failure that culminates when retry limits are exceeded.
  • Fixed bug in which the second instance of an hour partition at a DST boundary would never be shown as “materialized” in certain UI views.
  • Fixed an issue where backfilling an hourly partition that occurred during a fall Daylight Savings Time transition sometimes raised an error.
  • [auto-materialize] Fix issue where assets which were skipped because required parent partitions did not exist would not be materialized once those partitions came into existence.
  • [dagster ecs] The exit code of failed containers is now included in the failure message.
  • [dagster pipes] The PipesK8sClient now correctly raises on failed containers.
  • [dagster pipes] Using pipes within ops instead of assets no longer enforces problematic constraints.
  • [helm] Added maxCatchupRuns and maxTickRetries configuration options for the scheduler in the Helm chart.
  • [embedded-elt] Fixed crashes for non-unicode logs.
  • [UI] Fixed an issue where the test sensor dialog for a sensor that targeted multiple jobs would claim that all of the runs were targeting the same job.
  • [UI] Asset keys, job names, and other strings in the Dagster UI no longer truncate unnecessarily in Firefox in some scenarios.
  • [UI] A larger “View prior events” button on the Asset > Partitions page makes it easier to see the historical materializations of a specific partition of an asset.
  • [asset-checks, dbt] Fixed a bug that caused asset checks to not execute when a run was not a subset. As part of the fix, the default dbt selection string will not be used for dbt runs, even when not in a subset. Instead we pass the explicit set of models and tests to execute, with DBT_INDIRECT_SELECTION=empty.
  • [asset-checks] Fixed a bug that caused asset checks defined with @asset(check_specs=...) to not cooperate with the key_prefix argument of the load_assets_from_modules method and its compatriots.
  • [asset-checks] Fixed a bug that caused errors when launching a job from the UI that excluded asset checks.
  • [asset-checks] Fixed a bug that caused UI errors when a check run was deleted.

Deprecations#

  • Marked the experimental Airbyte ingestion-as-code feature as deprecated, to be removed in a future release. We suggest users interested in managing their Airbyte connections in code use the Airbyte terraform provider.

Community Contributions#

  • define_asset_job now accepts an op_retry_policy argument, which specifies a default retry policy for all of the ops in the job. (thanks Eugenio Contreras!)
  • Fix IOManager not being able to load assets with MultiPartitionsDefinition - thanks @cyberosa!
  • [dagster-essentials] Three typo fixes in Lesson 8 - thanks Colton @cmpadden!

Experimental#

  • The observable_source_asset decorator now accepts a key argument.
  • [dagster pipes] An implicit_materializations argument has been added to get_results and get_materialize_result to control whether an implicit materialization event is created.
  • [embedded-elt] Added a new builder and SlingConnectionResource to allow reusing sources and targets interoperably.
  • [UI] Updated the experimental concurrency limits configuration page to show per-op runtime info and control.
  • [UI] The Auto-materialize history tab for each asset now only includes rows for evaluations where the result of evaluating the policy has changed. Previously, it would also show a row in the table representing periods of time where nothing changed.
  • [asset-checks, dbt] build_dbt_asset_selection now also selects asset checks based on their underlying dbt tests. E.g. build_dbt_asset_selection([my_dbt_assets], dbt_select="tag:data_quality") will select the assets and checks for any models and tests tagged with ‘data_quality’.

Dagster Cloud#

  • Branch deployments now use the same timeouts for starting and canceling runs that are set for their parent full deployment, instead of a fixed value of 10 minutes.
  • [k8s agent] Setting labels on a code location will now apply those labels to the kubernetes deployment and service for that code location, rather than just applying them to the pod for that code location.