Datahub (dagster-datahub)

This library provides an integration with Datahub, to support pushing metadata to Datahub from within Dagster ops.


This integration uses the Datahub Python library. Before using it, you'll need to start a Datahub instance; see the Datahub Quickstart Guide.


dagster_datahub.datahub_rest_emitter ResourceDefinition

Config Schema:
connection (dagster.StringSource)

Datahub GMS Server

token (Union[dagster.StringSource, None], optional)

Personal Access Token

Default Value: None

connect_timeout_sec (Union[Float, None], optional)

Default Value: None

read_timeout_sec (Union[Float, None], optional)

Default Value: None

retry_status_codes (Union[List[Int], None], optional)

Default Value: None

retry_methods (Union[List[String], None], optional)

Default Value: None

retry_max_times (Union[Int, None], optional)

Default Value: None

extra_headers (Union[dict, None], optional)

Default Value: None

ca_certificate_path (Union[String, None], optional)

Default Value: None

server_telemetry_id (Union[String, None], optional)

Default Value: None

disable_ssl_verification (Bool, optional)

Default Value: False
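
The fields above map onto a Dagster run config entry for the resource. As a minimal sketch, the helper below builds such an entry using only plain Python. The helper name `rest_emitter_run_config`, the resource key `datahub`, the URL `http://localhost:8080`, and the environment variable name `DATAHUB_TOKEN` are all assumptions chosen for illustration, not part of the library.

```python
from typing import Any, Dict, Optional


def rest_emitter_run_config(
    connection: str,
    token_env_var: Optional[str] = None,
    read_timeout_sec: Optional[float] = None,
    resource_key: str = "datahub",  # hypothetical resource key
) -> Dict[str, Any]:
    """Build a run config fragment for a resource bound to datahub_rest_emitter."""
    config: Dict[str, Any] = {"connection": connection}
    if token_env_var is not None:
        # StringSource fields accept {"env": NAME}, so the token is read from
        # the environment at run time instead of being stored in the config.
        config["token"] = {"env": token_env_var}
    if read_timeout_sec is not None:
        config["read_timeout_sec"] = float(read_timeout_sec)
    return {"resources": {resource_key: {"config": config}}}


# Assumed values: a local GMS server and a token exported as DATAHUB_TOKEN.
run_config = rest_emitter_run_config(
    "http://localhost:8080",
    token_env_var="DATAHUB_TOKEN",
    read_timeout_sec=30,
)
```

The resulting dictionary could then be supplied as the run config of a job whose resource definitions bind the chosen key to `datahub_rest_emitter`. Optional fields left unset keep their documented default of None.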

dagster_datahub.datahub_kafka_emitter ResourceDefinition

Config Schema:
connection (strict dict)
Config Schema:
bootstrap (dagster.StringSource)

Kafka Bootstrap Servers. Comma-delimited.

schema_registry_url (dagster.StringSource)

Schema Registry Location.

schema_registry_config (permissive dict, optional)

Extra Schema Registry Config.

Default Value:
{}
topic (String, optional)

Default Value: 'MetadataChangeEvent_v4'

topic_routes (dict, optional)
Default Value:
{
    "MetadataChangeEvent": "MetadataChangeEvent_v4",
    "MetadataChangeProposal": "MetadataChangeProposal_v1"
}
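
The Kafka emitter nests its broker settings under `connection`. As a rough sketch of how that nested shape might be assembled, the helper below builds a run config entry in plain Python. The helper name `kafka_emitter_run_config`, the resource key `datahub`, and the addresses `localhost:9092` and `http://localhost:8081` are assumptions for illustration only.

```python
from typing import Any, Dict, List, Optional


def kafka_emitter_run_config(
    bootstrap_servers: List[str],
    schema_registry_url: str,
    topic_routes: Optional[Dict[str, str]] = None,
    resource_key: str = "datahub",  # hypothetical resource key
) -> Dict[str, Any]:
    """Build a run config fragment for a resource bound to datahub_kafka_emitter."""
    connection: Dict[str, Any] = {
        # The schema expects a single comma-delimited string of brokers.
        "bootstrap": ",".join(bootstrap_servers),
        "schema_registry_url": schema_registry_url,
    }
    config: Dict[str, Any] = {"connection": connection}
    if topic_routes is not None:
        # Copy so later changes to the caller's dict do not leak into the config.
        config["topic_routes"] = dict(topic_routes)
    return {"resources": {resource_key: {"config": config}}}


# Assumed local broker and schema registry addresses.
run_config = kafka_emitter_run_config(
    ["localhost:9092"],
    "http://localhost:8081",
    topic_routes={
        "MetadataChangeEvent": "MetadataChangeEvent_v4",
        "MetadataChangeProposal": "MetadataChangeProposal_v1",
    },
)
```

Here `topic_routes` is passed explicitly with the same values as the documented default, purely to show the shape; omitting it leaves the default in place.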