
Using Dagster with Looker#

This feature is currently experimental.

This guide explains how to use Dagster with Looker through the dagster-looker library. Your Looker assets, such as views, explores, and dashboards, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between them. You can also use Dagster to orchestrate Looker persistent derived tables (PDTs), triggering refreshes of these materialized tables on a cadence or in response to upstream data changes.

What you'll learn#

  • How to represent Looker assets in the Dagster asset graph.
  • How to customize asset definition metadata for these Looker assets.
  • How to materialize Looker PDTs from Dagster.

Prerequisites

  • The dagster-looker library installed in your environment
  • Familiarity with asset definitions and the Dagster asset graph
  • Familiarity with Dagster resources
  • Familiarity with Looker concepts, like views, explores, and dashboards
  • A Looker instance
  • Looker API credentials to access your Looker instance. For more information, see Looker API authentication in the Looker documentation.

Set up your environment#

To get started, you'll need to install the dagster and dagster-looker Python packages:

pip install dagster dagster-looker

Represent Looker assets in the asset graph#

To load Looker assets into the Dagster asset graph, you must first construct a LookerResource, which allows Dagster to communicate with your Looker instance. You'll need to supply your Looker instance URL and API credentials, which can be passed directly or accessed from the environment using EnvVar.

Dagster can automatically load all views, explores, and dashboards from your Looker instance as asset specs. Call the load_looker_asset_specs function, which returns a list of AssetSpecs representing your Looker assets. You can then include these asset specs in your Definitions object:

from dagster_looker import LookerResource, load_looker_asset_specs

import dagster as dg

looker_resource = LookerResource(
    client_id=dg.EnvVar("LOOKERSDK_CLIENT_ID"),
    client_secret=dg.EnvVar("LOOKERSDK_CLIENT_SECRET"),
    base_url=dg.EnvVar("LOOKERSDK_HOST_URL"),
)

looker_specs = load_looker_asset_specs(looker_resource=looker_resource)
defs = dg.Definitions(assets=[*looker_specs], resources={"looker": looker_resource})
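Because the credentials are read from the environment, it can help to confirm that the variables are set before Dagster loads the definitions. The following is a minimal sketch using only the Python standard library. The helper name missing_looker_env_vars is hypothetical and not part of dagster-looker; the variable names are the ones used in the example above.

```python
import os
from typing import List, Mapping, Sequence

# Variable names taken from the example above; adjust them if your
# deployment uses different names.
REQUIRED_VARS = (
    "LOOKERSDK_CLIENT_ID",
    "LOOKERSDK_CLIENT_SECRET",
    "LOOKERSDK_HOST_URL",
)


def missing_looker_env_vars(
    environ: Mapping[str, str] = os.environ,
    required: Sequence[str] = REQUIRED_VARS,
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]
```

One possible use is to call missing_looker_env_vars() in a deployment check and fail early with a readable message if the returned list is not empty.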

Load Looker assets from filtered dashboards and explores#

To load only a subset of your Looker assets, pass a LookerFilter to the load_looker_asset_specs function. All dashboards contained in the folders listed in the LookerFilter will be fetched. To additionally restrict the explores to those used in these dashboards, set only_fetch_explores_used_in_dashboards=True on the LookerFilter.

Note that the content and size of your Looker instance may affect the performance of your Dagster deployments. Restricting the dashboards and explores from which your Looker assets are loaded is particularly useful for improving loading times.

from dagster_looker import LookerFilter, LookerResource, load_looker_asset_specs

import dagster as dg

looker_resource = LookerResource(
    client_id=dg.EnvVar("LOOKERSDK_CLIENT_ID"),
    client_secret=dg.EnvVar("LOOKERSDK_CLIENT_SECRET"),
    base_url=dg.EnvVar("LOOKERSDK_HOST_URL"),
)

looker_specs = load_looker_asset_specs(
    looker_resource=looker_resource,
    looker_filter=LookerFilter(
        dashboard_folders=[
            ["my_folder", "my_subfolder"],
            ["my_folder", "my_other_subfolder"],
        ],
        only_fetch_explores_used_in_dashboards=True,
    ),
)
defs = dg.Definitions(assets=[*looker_specs], resources={"looker": looker_resource})
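In the example above, each entry of dashboard_folders is a list of folder names forming a path. If your folder paths are kept as slash-separated strings (for instance in a config file), a small helper can convert them to that shape. This is a sketch only: folder_paths is a hypothetical helper, not part of dagster-looker, and it assumes that folder names do not themselves contain the separator.

```python
from typing import Iterable, List


def folder_paths(paths: Iterable[str], sep: str = "/") -> List[List[str]]:
    """Split 'a/b'-style strings into lists of folder names.

    Empty segments (leading, trailing, or doubled separators) are dropped,
    and strings that contain no folder names are skipped.
    """
    result: List[List[str]] = []
    for path in paths:
        parts = [part.strip() for part in path.split(sep) if part.strip()]
        if parts:
            result.append(parts)
    return result


# Produces the same nested list as the literal in the example above.
dashboard_folders = folder_paths(
    ["my_folder/my_subfolder", "my_folder/my_other_subfolder"]
)
```

The resulting list could then be passed as LookerFilter(dashboard_folders=dashboard_folders, ...).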

Customize asset definition metadata for Looker assets#

By default, Dagster generates an asset spec for each Looker asset based on its type and populates default metadata. You can further customize asset properties by passing a custom DagsterLookerApiTranslator subclass to the load_looker_asset_specs function. This subclass can override the get_asset_spec method to customize the asset specs for each Looker asset type.

from dagster_looker import (
    DagsterLookerApiTranslator,
    LookerResource,
    LookerStructureData,
    LookerStructureType,
    load_looker_asset_specs,
)

import dagster as dg

looker_resource = LookerResource(
    client_id=dg.EnvVar("LOOKERSDK_CLIENT_ID"),
    client_secret=dg.EnvVar("LOOKERSDK_CLIENT_SECRET"),
    base_url=dg.EnvVar("LOOKERSDK_HOST_URL"),
)


class CustomDagsterLookerApiTranslator(DagsterLookerApiTranslator):
    def get_asset_spec(self, looker_structure: LookerStructureData) -> dg.AssetSpec:
        # We create the default asset spec using super()
        default_spec = super().get_asset_spec(looker_structure)
        # We customize the team owner tag for all assets,
        # and we customize the asset key prefix only for dashboards.
        return default_spec.replace_attributes(
            key=(
                default_spec.key.with_prefix("looker")
                if looker_structure.structure_type == LookerStructureType.DASHBOARD
                else default_spec.key
            ),
            owners=["team:my_team"],
        )


looker_specs = load_looker_asset_specs(
    looker_resource, dagster_looker_translator=CustomDagsterLookerApiTranslator
)
defs = dg.Definitions(assets=[*looker_specs], resources={"looker": looker_resource})

Note that super() is called in the overridden get_asset_spec method to generate the default asset spec. It is best practice to generate the default asset spec before customizing it.
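The "default first, then customize" pattern can be tried without a Looker instance. The sketch below uses plain dataclasses as stand-ins: Spec, BaseTranslator, and CustomTranslator are hypothetical and only mimic the shape of dg.AssetSpec and DagsterLookerApiTranslator, so it illustrates the pattern rather than the library's behavior.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for dg.AssetSpec."""

    key: Tuple[str, ...]
    owners: Tuple[str, ...] = ()


class BaseTranslator:
    """Hypothetical stand-in for DagsterLookerApiTranslator."""

    def get_spec(self, kind: str, name: str) -> Spec:
        return Spec(key=(name,))


class CustomTranslator(BaseTranslator):
    def get_spec(self, kind: str, name: str) -> Spec:
        # Build the default spec first, then override selected fields.
        default = super().get_spec(kind, name)
        # Prefix the key only for dashboards; leave other kinds unchanged.
        key = ("looker",) + default.key if kind == "dashboard" else default.key
        return replace(default, key=key, owners=("team:my_team",))


dashboard = CustomTranslator().get_spec("dashboard", "sales")
view = CustomTranslator().get_spec("view", "orders")
```

Starting from the default means any field the subclass does not touch keeps whatever the base class produced.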

Materialize Looker PDTs from Dagster#

You can use Dagster to orchestrate the materialization of Looker PDTs. To model PDTs as assets, build their asset definitions by passing a list of RequestStartPdtBuild objects to the build_looker_pdt_assets_definitions function.

from dagster_looker import (
    LookerResource,
    RequestStartPdtBuild,
    build_looker_pdt_assets_definitions,
    load_looker_asset_specs,
)

import dagster as dg

looker_resource = LookerResource(
    client_id=dg.EnvVar("LOOKERSDK_CLIENT_ID"),
    client_secret=dg.EnvVar("LOOKERSDK_CLIENT_SECRET"),
    base_url=dg.EnvVar("LOOKERSDK_HOST_URL"),
)

looker_specs = load_looker_asset_specs(looker_resource=looker_resource)

pdts = build_looker_pdt_assets_definitions(
    resource_key="looker",
    request_start_pdt_builds=[
        RequestStartPdtBuild(model_name="my_model", view_name="my_view")
    ],
)


defs = dg.Definitions(
    assets=[*pdts, *looker_specs],
    resources={"looker": looker_resource},
)
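When several PDTs need to be orchestrated, the request list can be generated from (model, view) pairs instead of being written out by hand. The sketch below builds only the keyword arguments, using the two fields shown in the example above (model_name and view_name). pdt_build_kwargs is a hypothetical helper, not part of dagster-looker, and the second pair is a made-up placeholder.

```python
from typing import Dict, Iterable, List, Tuple


def pdt_build_kwargs(pairs: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn (model_name, view_name) pairs into keyword-argument dicts.

    Repeated pairs are dropped; the first occurrence keeps its position.
    """
    seen = set()
    result: List[Dict[str, str]] = []
    for model_name, view_name in pairs:
        if (model_name, view_name) in seen:
            continue
        seen.add((model_name, view_name))
        result.append({"model_name": model_name, "view_name": view_name})
    return result


kwargs_list = pdt_build_kwargs(
    [
        ("my_model", "my_view"),
        ("my_model", "my_other_view"),  # placeholder name for illustration
        ("my_model", "my_view"),  # duplicate, dropped
    ]
)
```

Each dictionary could then be expanded into a request, for example [RequestStartPdtBuild(**kwargs) for kwargs in kwargs_list], and the resulting list passed as request_start_pdt_builds.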