Ask AI

Using Dagster with Sigma#

This feature is currently experimental.

This guide provides instructions for using Dagster with Sigma using the dagster-sigma library. Your Sigma assets, including datasets and workbooks, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between Sigma assets and upstream data assets you are already modeling in Dagster.

What you'll learn#

  • How to represent Sigma assets in the Dagster asset graph, including lineage to other Dagster assets.
  • How to customize asset definition metadata for these Sigma assets.
Prerequisites
  • The dagster-sigma library installed in your environment
  • Familiarity with asset definitions and the Dagster asset graph
  • Familiarity with Dagster resources
  • Familiarity with Sigma concepts, like datasets and workbooks
  • A Sigma organization
  • A Sigma client ID and client secret. For more information, see Generate API client credentials in the Sigma documentation.

Set up your environment#

To get started, you'll need to install the dagster and dagster-sigma Python packages:

pip install dagster dagster-sigma

Represent Sigma assets in the asset graph#

To load Sigma assets into the Dagster asset graph, you must first construct a SigmaOrganization resource, which allows Dagster to communicate with your Sigma organization. You'll need to supply your client ID and client secret alongside the base URL. See Identify your API request URL in the Sigma documentation for more information on how to find your base URL.

Dagster can automatically load all datasets and workbooks from your Sigma workspace as asset specs. Call the undefined.load_sigma_asset_specs function, which returns list of AssetSpecs representing your Sigma assets. You can then include these asset specs in your Definitions object:

from dagster_sigma import SigmaBaseUrl, SigmaOrganization, load_sigma_asset_specs

import dagster as dg

sigma_organization = SigmaOrganization(
    base_url=SigmaBaseUrl.AWS_US,
    client_id=dg.EnvVar("SIGMA_CLIENT_ID"),
    client_secret=dg.EnvVar("SIGMA_CLIENT_SECRET"),
)

sigma_specs = load_sigma_asset_specs(sigma_organization)
defs = dg.Definitions(assets=[*sigma_specs], resources={"sigma": sigma_organization})

Load Sigma assets from filtered workbooks#

It is possible to load a subset of your Sigma assets by providing a undefined.SigmaFilter to the undefined.load_sigma_asset_specs function. All workbooks contained in the folders provided to your undefined.SigmaFilter will be fetched.

Note that the content and size of Sigma organization may affect the performance of your Dagster deployments. Filtering the workbooks selection from which your Sigma assets will be loaded is particularly useful for improving loading times.

from dagster_sigma import (
    SigmaBaseUrl,
    SigmaFilter,
    SigmaOrganization,
    load_sigma_asset_specs,
)

import dagster as dg

sigma_organization = SigmaOrganization(
    base_url=SigmaBaseUrl.AWS_US,
    client_id=dg.EnvVar("SIGMA_CLIENT_ID"),
    client_secret=dg.EnvVar("SIGMA_CLIENT_SECRET"),
)

sigma_specs = load_sigma_asset_specs(
    organization=sigma_organization,
    sigma_filter=SigmaFilter(
        workbook_folders=[
            ["my_folder", "my_subfolder"],
            ["my_folder", "my_other_subfolder"],
        ]
    ),
)
defs = dg.Definitions(assets=[*sigma_specs], resources={"sigma": sigma_organization})

Customize asset definition metadata for Sigma assets#

By default, Dagster will generate asset keys for each Sigma asset based on its type and name and populate default metadata. You can further customize asset properties by passing a custom DagsterSigmaTranslator subclass to the undefined.load_sigma_asset_specs function. This subclass can implement methods to customize the asset keys or specs for each Sigma asset type.

from dagster_sigma import (
    DagsterSigmaTranslator,
    SigmaBaseUrl,
    SigmaOrganization,
    SigmaWorkbook,
    load_sigma_asset_specs,
)

import dagster as dg

sigma_organization = SigmaOrganization(
    base_url=SigmaBaseUrl.AWS_US,
    client_id=dg.EnvVar("SIGMA_CLIENT_ID"),
    client_secret=dg.EnvVar("SIGMA_CLIENT_SECRET"),
)


# A translator class lets us customize properties of the built Sigma assets, such as the owners or asset key
class MyCustomSigmaTranslator(DagsterSigmaTranslator):
    def get_asset_spec(self, data: SigmaWorkbook) -> dg.AssetSpec:
        # Adds a custom team owner tag for all Sigma assets
        return super().get_asset_spec(data)._replace(owners=["team:my_team"])


sigma_specs = load_sigma_asset_specs(
    sigma_organization, dagster_sigma_translator=MyCustomSigmaTranslator
)
defs = dg.Definitions(assets=[*sigma_specs], resources={"sigma": sigma_organization})

Load Sigma assets from multiple organizations#

Definitions from multiple Sigma organizations can be combined by instantiating multiple SigmaOrganization resources and merging their specs. This lets you view all your Sigma assets in a single asset graph:

from dagster_sigma import SigmaBaseUrl, SigmaOrganization, load_sigma_asset_specs

import dagster as dg

sales_team_organization = SigmaOrganization(
    base_url=SigmaBaseUrl.AWS_US,
    client_id=dg.EnvVar("SALES_SIGMA_CLIENT_ID"),
    client_secret=dg.EnvVar("SALES_SIGMA_CLIENT_SECRET"),
)

marketing_team_organization = SigmaOrganization(
    base_url=SigmaBaseUrl.AWS_US,
    client_id=dg.EnvVar("MARKETING_SIGMA_CLIENT_ID"),
    client_secret=dg.EnvVar("MARKETING_SIGMA_CLIENT_SECRET"),
)

sales_team_specs = load_sigma_asset_specs(sales_team_organization)
marketing_team_specs = load_sigma_asset_specs(marketing_team_organization)

# Merge the specs into a single set of definitions
defs = dg.Definitions(
    assets=[*sales_team_specs, *marketing_team_specs],
    resources={
        "marketing_sigma": marketing_team_organization,
        "sales_sigma": sales_team_organization,
    },
)