To automatically attach code references to Python assets' function definitions, you can use the with_source_code_references utility. Any asset definitions passed to the utility will have their source file attached as metadata.
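As a rough illustration of the idea behind such a utility, the stdlib-only sketch below finds the file and line where a function is defined. It is not Dagster's implementation; `SourceReference` and `find_source_reference` are hypothetical names introduced here for illustration.

```python
import inspect
from typing import Callable, NamedTuple


class SourceReference(NamedTuple):
    """Hypothetical container for a function's source location."""

    file_path: str
    line_number: int


def find_source_reference(fn: Callable) -> SourceReference:
    """Return the file and first line recorded for a Python function."""
    # Follow __wrapped__ chains left by functools.wraps-style decorators.
    target = inspect.unwrap(fn)
    code = getattr(target, "__code__", None)
    if code is None:
        raise TypeError(f"{fn!r} has no Python code object to inspect")
    # The code object records where the function was compiled from.
    return SourceReference(code.co_filename, code.co_firstlineno)


def my_asset():
    ...


ref = find_source_reference(my_asset)
print(f"{ref.file_path}:{ref.line_number}")
```

A real utility would then store this location as metadata on the asset definition so that a UI can link to it.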
For example, given the following Python file with_source_code_references.py:
Dagster's dbt integration can automatically attach references to the SQL files backing your dbt assets. For more information, see the dagster-dbt integration reference.
Manually attaching local file code references to asset definitions
In some cases, you may want to manually attach code references to your asset definitions. Some assets may have a more complex source structure, such as an asset whose definition is spread across multiple Python source files or an asset which is partially defined with a .sql model file.
import os
from dagster import (
    CodeReferencesMetadataValue,
    Definitions,
    LocalFileCodeReference,
    asset,
    with_source_code_references,
)

@asset(
    metadata={
        "dagster/code_references": CodeReferencesMetadataValue(
            code_references=[
                LocalFileCodeReference(
                    file_path=os.path.join(os.path.dirname(__file__), "source.yaml"),
                    # Label and line number are optional
                    line_number=1,
                    label="Model YAML",
                )
            ]
        )
    }
)
def my_asset_modeled_in_yaml(): ...

defs = Definitions(assets=with_source_code_references([my_asset_modeled_in_yaml]))
Each of the code references to manual_references.py will be visible in the Asset details page in the Dagster UI.
Converting code references to link to a remote git repository
In a local context, it is useful to specify local code references in order to navigate directly to the source code of an asset. However, in a production environment, you may want to link to the source control repository where the code is stored.
If using Dagster Plus, you can use the link_code_references_to_git_if_cloud utility to conditionally convert local file code references to source control links. This utility will automatically detect if your code is running in a Dagster Cloud environment and convert local file code references to source control links, pointing at the commit hash of the code running in the current deployment.
import os
from pathlib import Path
from dagster_cloud.metadata.source_code import link_code_references_to_git_if_cloud
from dagster import (
    AnchorBasedFilePathMapping,
    Definitions,
    asset,
    with_source_code_references,
)

@asset
def my_asset(): ...

@asset
def another_asset(): ...

defs = Definitions(
    assets=link_code_references_to_git_if_cloud(
        assets_defs=with_source_code_references([my_asset, another_asset]),
        # Inferred from searching for .git directory in parent directories
        # of the module containing this code - may also be set explicitly
        file_path_mapping=AnchorBasedFilePathMapping(
            local_file_anchor=Path(__file__),
            file_anchor_path_in_repository="src/repo.py",
        ),
    )
)
The link_code_references_to_git utility allows you to convert local file code references to source control links. You'll need to provide the base URL of your git repository, the branch or commit hash, and a FilePathMapping which tells Dagster how to convert local file paths to paths in the repository. The simplest way to do so is with an AnchorBasedFilePathMapping, which uses a local file path and the corresponding path in the repository to infer the mapping for other files.
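To make the anchor idea concrete, here is a minimal stdlib-only sketch of how one known pair (a local file and its path in the repository) can be used to map other local files and build a link. It is not Dagster's implementation: `to_repo_path` and `build_source_url` are hypothetical helpers, the `/blob/<ref>/<path>#L<n>` layout assumes a GitHub-style host, and the sketch only handles files that live under the inferred repository root.

```python
from pathlib import PurePosixPath
from typing import Optional


def to_repo_path(local_file: str, local_anchor: str, anchor_in_repo: str) -> str:
    """Map a local file to its repository path using one known anchor pair."""
    anchor = PurePosixPath(local_anchor)
    repo_parts = PurePosixPath(anchor_in_repo).parts
    if not repo_parts:
        raise ValueError("anchor_in_repo must not be empty")
    # The local anchor must end with the repository path it corresponds to.
    if anchor.parts[-len(repo_parts):] != repo_parts:
        raise ValueError("local anchor does not end with the repository path")
    # Strip the repository path from the anchor to get the local checkout root.
    local_root = anchor.parents[len(repo_parts) - 1]
    # relative_to raises ValueError if the file is outside the checkout root.
    return PurePosixPath(local_file).relative_to(local_root).as_posix()


def build_source_url(
    git_url: str, git_ref: str, repo_path: str, line_number: Optional[int] = None
) -> str:
    """Build a GitHub-style link to a file, optionally at a specific line."""
    url = f"{git_url.rstrip('/')}/blob/{git_ref}/{repo_path}"
    if line_number is not None:
        url += f"#L{line_number}"
    return url


repo_path = to_repo_path(
    local_file="/home/user/project/assets/sales.py",  # hypothetical local path
    local_anchor="/home/user/project/src/repo.py",
    anchor_in_repo="src/repo.py",
)
print(build_source_url("https://github.com/dagster-io/dagster", "main", repo_path, 12))
```

Because a single anchor pair fixes the checkout root, every other file under that root can be mapped without listing it explicitly.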
You may choose to conditionally apply this transformation based on the environment in which your Dagster code is running. For example, you could use an environment variable to determine whether to link to local files or to a source control repository:
import os
from pathlib import Path
from dagster import (
    AnchorBasedFilePathMapping,
    Definitions,
    asset,
    link_code_references_to_git,
    with_source_code_references,
)

@asset
def my_asset(): ...

@asset
def another_asset(): ...

assets = with_source_code_references([my_asset, another_asset])

defs = Definitions(
    assets=link_code_references_to_git(
        assets_defs=assets,
        git_url="https://github.com/dagster-io/dagster",
        git_branch="main",
        file_path_mapping=AnchorBasedFilePathMapping(
            local_file_anchor=Path(__file__),
            file_anchor_path_in_repository="src/repo.py",
        ),
    )
    if bool(os.getenv("IS_PRODUCTION"))
    else assets
)
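One caveat with the environment check above: `bool(os.getenv("IS_PRODUCTION"))` is true for any non-empty string, including "0" and "false". If that matters in your setup, a stricter parser along the following lines may help. This is a sketch only; `env_flag` and its set of accepted values are hypothetical and not part of Dagster.

```python
import os

# Values treated as "on"; this set is an assumption and can be adjusted.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = os.getenv(name)
    if raw is None:
        # Variable is unset: fall back to the caller's default.
        return default
    return raw.strip().lower() in _TRUTHY


os.environ["IS_PRODUCTION"] = "false"
print(bool(os.getenv("IS_PRODUCTION")))  # non-empty string, so truthy
print(env_flag("IS_PRODUCTION"))  # parsed as an explicit "off" value
```

With such a helper, the conditional in the example could read `if env_flag("IS_PRODUCTION") else assets`.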