Changelog#

0.9.3#

Breaking Changes

  • Removed deprecated --env flag from CLI
  • The --host CLI param has been renamed to --grpc_host to avoid conflict with the dagit --host param.

New

  • Descriptions for solid inputs and outputs will now be inferred from doc blocks if available (thanks @AndersonReyes!)
  • Various documentation improvements (thanks @jeriscc!)
  • Load inputs from pyspark dataframes (thanks @davidkatz-il!)
  • Added step-level run history for partitioned schedules on the schedule view
  • Added great_expectations integration, through the dagster_ge library. Example usage is under a new example, called ge_example, and documentation for the library can be found under the libraries section of the api docs.
  • PythonObjectDagsterType can now take a tuple of types as well as a single type, more closely mirroring isinstance and allowing Union types to be represented in Dagster (see the sketch after this list).
  • The configured API can now be used on all definition types (including CompositeDefinition). Example usage has been updated in the configuration documentation.
  • Updated Helm chart to include auto-generated user code configmap in user code deployment by default
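
A minimal sketch of the tuple form of PythonObjectDagsterType described in the list above; the Number name and the int/float pairing are illustrative, not taken from the release itself:

      from dagster import PythonObjectDagsterType

      # Accepts either ints or floats, mirroring isinstance(value, (int, float)).
      Number = PythonObjectDagsterType(python_type=(int, float), name="Number")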

Bugfixes

  • Databricks now checks intermediate storage instead of system storage
  • Fixes a bug where applying hooks on a pipeline with composite solids would flatten the top-level solids. Applying hooks to a pipeline or a composite solid now attaches them to every solid instance within that pipeline or composite solid.
  • Fixes the GraphQL playground hosted by dagit
  • Fixes a bug where K8s CronJobs were stopped unnecessarily during schedule reconciliation

Experimental

  • New dagster-k8s/config tag that lets users pass in custom configuration to the Kubernetes Job, Job metadata, JobSpec, PodSpec, and PodTemplateSpec metadata.
    • This allows users to specify settings like eviction policy annotations and node affinities.
    • Example:
      from dagster import solid

      # Custom Kubernetes configuration is attached through the dagster-k8s/config tag.
      @solid(
          tags={
              "dagster-k8s/config": {
                  "container_config": {
                      "resources": {
                          "requests": {"cpu": "250m", "memory": "64Mi"},
                          "limits": {"cpu": "500m", "memory": "2560Mi"},
                      }
                  },
                  "pod_template_spec_metadata": {
                      "annotations": {"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"}
                  },
                  "pod_spec_config": {
                      "affinity": {
                          "nodeAffinity": {
                              "requiredDuringSchedulingIgnoredDuringExecution": {
                                  "nodeSelectorTerms": [
                                      {
                                          "matchExpressions": [
                                              {
                                                  "key": "beta.kubernetes.io/os",
                                                  "operator": "In",
                                                  "values": ["windows", "linux"],
                                              }
                                          ]
                                      }
                                  ]
                              }
                          }
                      }
                  },
              },
          },
      )
      def my_solid(context):
          context.log.info("running")
    

0.9.2#

Breaking Changes

  • The --env flag no longer works for the pipeline launch or pipeline execute commands. Use --config instead.
  • The pipeline execute command no longer accepts the --workspace argument. To execute pipelines in a workspace, use pipeline launch instead.

New

  • Added ResourceDefinition.mock_resource helper for mocking resources with MagicMock objects; a sketch of example usage follows this list
  • Removed the row_count metadata entry from the Dask DataFrame type check (thanks @kinghuang!)
  • Added orient to the config options when materializing a Dask DataFrame to json (thanks @kinghuang!)
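
A minimal sketch of ResourceDefinition.mock_resource in a test mode; the warehouse resource key, persist_records solid, and records_pipeline are hypothetical names:

      from dagster import ModeDefinition, ResourceDefinition, execute_pipeline, pipeline, solid

      @solid(required_resource_keys={"warehouse"})
      def persist_records(context):
          # Calls against the mocked resource are recorded by a MagicMock.
          context.resources.warehouse.store_records([1, 2, 3])

      @pipeline(
          mode_defs=[
              ModeDefinition(
                  name="test",
                  resource_defs={"warehouse": ResourceDefinition.mock_resource()},
              )
          ]
      )
      def records_pipeline():
          persist_records()

      execute_pipeline(records_pipeline, mode="test")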

Bugfixes

  • Fixed a bug where applying configured to a solid definition would overwrite inputs from run config.
  • Fixed a bug where pipeline tags would not apply to solid subsets.
  • Improved error messages for repository-loading errors in CLI commands.
  • Fixed a bug where pipeline execution error messages were not being surfaced in Dagit.

0.9.1#

Bugfixes

  • Fixes an issue in the dagster-k8s-celery executor when executing solid subsets

Breaking Changes

  • Deprecated the IntermediateStore API. IntermediateStorage now wraps an ObjectStore, and TypeStoragePlugin now accepts an IntermediateStorage instance instead of an IntermediateStore instance. (Note that IntermediateStore and IntermediateStorage are both internal APIs that are used in some non-core libraries).

0.9.0 “Laundry Service”#

Breaking Changes

  • The dagit key is no longer part of the instance configuration schema and must be removed from dagster.yaml files before they can be used.
  • -d can no longer be used as a command-line argument to specify a mode. Use --mode instead.
  • Use --preset instead of --preset-name to specify a preset to the pipeline launch command.
  • We have removed the config argument to the ConfigMapping, @composite_solid, @solid, SolidDefinition, @executor, ExecutorDefinition, @logger, LoggerDefinition, @resource, and ResourceDefinition APIs, which we deprecated in 0.8.0. Use config_schema instead.

New

  • Python 3.8 is now fully supported.
  • -d or --working-directory can be used to specify a working directory in any command that takes in a -f or --python_file argument.
  • Removed the deprecation of create_dagster_pandas_dataframe_type. This is the currently supported API for custom pandas data frame type creation.
  • Removed gevent dependency from dagster
  • New configured API for predefining configuration for various definitions: https://legacy-docs.dagster.io/overview/configuration/#configured (a sketch follows this list)
  • Added hooks to enable success and failure handling policies on pipelines. This lets users set up policies on all solids within a pipeline or on a per-solid basis; a sketch of example usage follows this list.
  • New instance level view of Scheduler and running schedules
  • dagster-graphql is now only required in dagit images.
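
A minimal sketch of the configured API linked above, using a hypothetical s3_session resource with a region config field; make_s3_client stands in for real client construction:

      from dagster import configured, resource

      @resource(config_schema={"region": str})
      def s3_session(init_context):
          # Hypothetical helper standing in for real client construction.
          return make_s3_client(init_context.resource_config["region"])

      # Pre-apply config so pipelines that use this resource need no further resource config.
      east_s3_session = configured(s3_session)({"region": "us-east-1"})

And a minimal sketch of pipeline-level success/failure hooks; the hook and solid names are illustrative:

      from dagster import failure_hook, pipeline, solid, success_hook

      @success_hook
      def log_success(context):
          context.log.info("step succeeded")

      @failure_hook
      def log_failure(context):
          context.log.error("step failed")

      @solid
      def do_work(_):
          return 1

      # Hooks applied at the pipeline level attach to every solid in the pipeline.
      @log_success
      @log_failure
      @pipeline
      def my_pipeline():
          do_work()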

0.8.11#

Breaking Changes

  • AssetMaterializations no longer accepts a dagster_type argument. This reverts the change billed as "AssetMaterializations can now have type information attached as metadata." in the previous release.