Fixed a bug with load_assets_from_x functions where we began erroring when a spec and AssetsDefinition had the same key in a given module. We now only error in this case if include_specs=True.
[dagster-azure] Fixed a bug in 1.9.6 and 1.9.7 where the default behavior of the compute log manager switched from showing logs in the UI to showing a URL. Set the show_url_only option to True to enable the URL-only behavior.
[dagster-dbt] Fixed an issue where group names set on partitioned dbt assets created using the @dbt_assets decorator would be ignored.
Added the dagster-github library, a community contribution from @Ramshackle-Jamathon and
@k-mahoney!
dagster-celery
Simplified and improved config handling.
An engine event is now emitted when the engine fails to connect to a broker.
Bugfix
Fixes a file descriptor leak when running many concurrent dagster-graphql queries (e.g., for
backfill).
The @pyspark_solid decorator now handles inputs correctly.
The handling of solid compute functions that accept kwargs but which are decorated with explicit
input definitions has been rationalized.
Fixed race conditions in concurrent execution using SQLite event log storage,
uncovered by upstream improvements in the Python inotify library we use.
Documentation
Improved error messages when using system storages that don't fulfill executor requirements.
We are now more permissive when specifying configuration schema in order to make constructing
configuration schema more concise.
When specifying the value of scalar inputs in config, one can now specify that value directly
under the key of the input, rather than having to embed it within a value key.
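As a rough illustration of the scalar input shorthand described above, the sketch below shows the two config shapes as plain Python dicts. The solid name add_one, the input name num, and the helper input_value are hypothetical and exist only for this example; the helper is not a Dagster API, it just shows that both shapes carry the same value.

```python
# Hypothetical solid ("add_one") and input ("num") names, for illustration only.

# Verbose form: the scalar is embedded within a "value" key.
verbose_config = {
    "solids": {
        "add_one": {
            "inputs": {
                "num": {"value": 5},
            }
        }
    }
}

# Shorthand form: the scalar is given directly under the input's key.
concise_config = {
    "solids": {
        "add_one": {
            "inputs": {
                "num": 5,
            }
        }
    }
}


def input_value(config, solid_name, input_name):
    """Read a scalar input from either form (illustrative helper, not a Dagster API)."""
    raw = config["solids"][solid_name]["inputs"][input_name]
    if isinstance(raw, dict) and "value" in raw:
        return raw["value"]
    return raw
```

Both dicts describe the same input value; the shorthand simply drops one level of nesting.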
Breaking
The implementation of SQL-based event log storages has been consolidated,
which has entailed a schema change. If you have event logs stored in a
Postgres- or SQLite-backed event log storage, and you would like to maintain
access to these logs, you should run dagster instance migrate. To check
what event log storages you are using, run dagster instance info.
Type matches on both sides of an InputMapping or OutputMapping are now enforced.
New
Dagster is now tested on Python 3.8
Added the dagster-celery library, which implements a Celery-based engine for parallel pipeline
execution.
Added the dagster-k8s library, which includes a Helm chart for a simple Dagit installation on a
Kubernetes cluster.
Dagit
The Explore UI now allows you to render a subset of a large DAG via a new solid
query bar that accepts terms like solid_name+* and +solid_name+. When viewing
very large DAGs, nothing is displayed by default and * produces the original behavior.
Performance improvements in the Explore UI and config editor for large pipelines.
The Explore UI now includes a zoom slider that makes it easier to navigate large DAGs.
Dagit pages now render more gracefully in the presence of inconsistent run storage and event logs.
Improved handling of GraphQL errors and backend programming errors.
Minor display improvements.
dagster-aws
A default prefix is now configurable on APIs that use S3.
S3 APIs now parametrize region_name and endpoint_url.
dagster-gcp
A default prefix is now configurable on APIs that use GCS.
dagster-postgres
Performance improvements for Postgres-backed storages.
dagster-pyspark
Pyspark sessions may now be configured to be held open after pipeline execution completes, to
enable extended test cases.
dagster-spark
spark_outputs must now be specified when initializing a SparkSolidDefinition, rather than in
config.
Added new create_spark_solid helper and new spark_resource.
Improved EMR implementation.
Bugfix
Fixed an issue retrieving output values using SolidExecutionResult (e.g., in test) for
dagster-pyspark solids.
Fixes an issue when expanding composite solids in Dagit.
Better errors when solid names collide.
Config mapping in composite solids now works as expected when the composite solid has no top
level config.
Compute log filenames are now guaranteed not to exceed the POSIX limit of 255 chars.
Fixes an issue when copying and pasting solid names from Dagit.
Termination now works as expected in the multiprocessing executor.
The multiprocessing executor now executes parallel steps in the expected order.
The multiprocessing executor now correctly handles solid subsets.
Fixed a bad error condition in dagster_ssh.sftp_solid.
Fixed a bad error message giving incorrect log level suggestions.
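For readers curious how a filename length guarantee like the compute log fix above can be achieved in general, here is a minimal sketch. It is not Dagster's implementation: the function name bounded_filename, the truncate-and-hash strategy, and the assumption of ASCII names (so that characters and bytes coincide) are all choices made for this example.

```python
import hashlib

# Limit cited in the note above.
MAX_FILENAME_LENGTH = 255


def bounded_filename(name, extension):
    """Return "<name>.<extension>", shortened if it would exceed the limit.

    Hypothetical helper: long names are truncated and suffixed with a hash
    of the full name so that distinct long names stay distinct.
    Assumes ASCII input and a short extension.
    """
    candidate = f"{name}.{extension}"
    if len(candidate) <= MAX_FILENAME_LENGTH:
        return candidate
    # A SHA-1 hex digest is always 40 characters long.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    suffix = f"-{digest}.{extension}"
    keep = MAX_FILENAME_LENGTH - len(suffix)
    return name[:keep] + suffix
```

Hashing the full name rather than only truncating avoids collisions between long names that share a prefix.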
Documentation
Minor fixes and improvements.
Thank you
Thank you to all of the community contributors to this release!! In alphabetical order: @cclauss,
@deem0n, @irabinovitch, @pseudoPixels, @Ramshackle-Jamathon, @rparrapy, @yamrzou.
The selector argument to PipelineDefinition has been removed. This API made it possible to
construct a PipelineDefinition in an invalid state. Use PipelineDefinition.build_sub_pipeline
instead.
New
Added the dagster_prometheus library, which exposes a basic Prometheus resource.
Dagster Airflow DAGs may now use GCS instead of S3 for storage.
Expanded interface for schedule management in Dagit.
Dagit
Performance improvements when loading, displaying, and editing config for large pipelines.
Smooth scrolling zoom in the explore tab replaces the previous two-step zoom.
No longer depends on internet fonts to run, allowing fully offline dev.
Typeahead behavior in search has improved.
Invocations of composite solids remain visible in the sidebar when the solid is expanded.
The config schema panel now appears when the config editor is first opened.
Interface now includes hints for autocompletion in the config editor.
Improved display of solid inputs and outputs in the explore tab.
Provides visual feedback while filter results are loading.
Better handling of pipelines that aren't present in the currently loaded repo.
Bugfix
Dagster Airflow DAGs no longer crash while handling Python errors in DAG logic.
Step failures when running Dagster Airflow DAGs are now surfaced as task failures in
Airflow.
Dagit no longer gets into an invalid state when switching pipelines in the context of a
solid subselection.
frozenlist and frozendict now pass Dagster's parameter type checks for list and dict.
The GraphQL playground in Dagit is now working again.
Nits
Dagit now prints its pid when it loads.
Third-party dependencies have been relaxed to reduce the risk of version conflicts.
The interface for type checks has changed. Previously the type_check_fn on a custom type was
required to return None (=passed) or else raise Failure (=failed). Now, a type_check_fn may
return True/False to indicate success/failure in the ordinary case, or else return a
TypeCheck. The new success field on TypeCheck now indicates success/failure. This obviates
the need for the typecheck_metadata_fn, which has been removed.
Executions of individual composite solids (e.g. in test) now produce a
CompositeSolidExecutionResult rather than a SolidExecutionResult.
dagster.core.storage.sqlite_run_storage.SqliteRunStorage has moved to
dagster.core.storage.runs.SqliteRunStorage. Any persisted dagster.yaml files should be updated
with the new classpath.
is_secret has been removed from Field. It was not being used to any effect.
The environmentType and configTypes fields have been removed from the dagster-graphql
Pipeline type. The configDefinition field on SolidDefinition has been renamed to
configField.
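To make the type check change described above more concrete, the following sketch contrasts the two conventions using plain Python functions. It deliberately does not import Dagster: the function names are hypothetical, the single-argument signature is an assumption (check the API documentation for your version), and ValueError stands in for Dagster's Failure so that the sketch has no dependencies.

```python
def old_style_check(value):
    """Old convention: return None on success, raise on failure."""
    if not isinstance(value, int) or value < 0:
        # Dagster's Failure played this role; ValueError is a stand-in here.
        raise ValueError("expected a non-negative integer")
    return None


def new_style_check(value):
    """New convention: return True on success and False on failure."""
    return isinstance(value, int) and value >= 0
```

In the ordinary case the boolean form needs no exception handling; per the note above, returning a TypeCheck remains an option when more than a pass/fail result is needed.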
Bugfix
PresetDefinition.from_files is now guaranteed to give identical results across all Python
minor versions.
Nested composite solids with no config, but with config mapping functions, now behave as expected.
The dagster-airflow DagsterKubernetesPodOperator has been fixed.
Dagit is more robust to changes in repositories.
Improvements to Dagit interface.
New
dagster_pyspark now supports remote execution on EMR with the @pyspark_solid decorator.
Nits
Documentation has been improved.
The top level config field features in the dagster.yaml will no longer have any effect.
Third-party dependencies have been relaxed to reduce the risk of version conflicts.