MLflow (dagster-mlflow)
- dagster_mlflow.mlflow_tracking ResourceDefinition
- beta
This API is currently in beta, and may have breaking changes in minor version releases, with behavior changes in patch releases.
This resource initializes an MLflow run that’s used for all steps within a Dagster run.
This resource provides access to all of MLflow's methods as well as the MLflow tracking client's methods.
Usage:
- Add the mlflow resource to any ops in which you want to invoke mlflow tracking APIs.
- Add the end_mlflow_on_run_finished hook to your job to end the MLflow run when the Dagster run is finished.
Examples:
import mlflow
from dagster import job, op
from dagster_mlflow import end_mlflow_on_run_finished, mlflow_tracking


@op(required_resource_keys={"mlflow"})
def mlflow_op(context):
    # some_params and some_model_name are placeholders for your own values
    mlflow.log_params(some_params)
    mlflow.tracking.MlflowClient().create_registered_model(some_model_name)


@end_mlflow_on_run_finished
@job(resource_defs={"mlflow": mlflow_tracking})
def mlf_example():
    mlflow_op()


# example using an mlflow instance with s3 storage
mlf_example.execute_in_process(run_config={
    "resources": {
        "mlflow": {
            "config": {
                "experiment_name": my_experiment,
                "mlflow_tracking_uri": "http://localhost:5000",
                # to run a nested run, provide parent_run_id
                "parent_run_id": an_existing_mlflow_run_id,
                # to resume a run or avoid creating a new run in the resource init,
                # provide mlflow_run_id
                "mlflow_run_id": an_existing_mlflow_run_id,
                # env variables to pass to mlflow
                "env": {
                    "MLFLOW_S3_ENDPOINT_URL": my_s3_endpoint,
                    "AWS_ACCESS_KEY_ID": my_aws_key_id,
                    "AWS_SECRET_ACCESS_KEY": my_secret,
                },
                # env variables you want to log as mlflow tags
                "env_to_tag": ["DOCKER_IMAGE_TAG"],
                # key-value tags to add to your experiment
                "extra_tags": {"super": "experiment"},
            }
        }
    }
})
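
The nested run config above is easy to get wrong by hand, so it may help to assemble it in one place. The following is a minimal sketch in plain Python, not part of dagster-mlflow: build_mlflow_run_config is a hypothetical helper name, and the sketch assumes that every key other than experiment_name can be left out when it is not needed. The key names themselves are the ones shown in the example above.

```python
from typing import Any, Dict, List, Optional


def build_mlflow_run_config(
    experiment_name: str,
    tracking_uri: Optional[str] = None,
    parent_run_id: Optional[str] = None,
    mlflow_run_id: Optional[str] = None,
    env: Optional[Dict[str, str]] = None,
    env_to_tag: Optional[List[str]] = None,
    extra_tags: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the run_config dict for the "mlflow" resource key."""
    optional = {
        "mlflow_tracking_uri": tracking_uri,
        "parent_run_id": parent_run_id,
        "mlflow_run_id": mlflow_run_id,
        "env": env,
        "env_to_tag": env_to_tag,
        "extra_tags": extra_tags,
    }
    config: Dict[str, Any] = {"experiment_name": experiment_name}
    # Only pass keys the caller explicitly set (assumes the rest are optional).
    config.update({key: value for key, value in optional.items() if value is not None})
    return {"resources": {"mlflow": {"config": config}}}


run_config = build_mlflow_run_config(
    experiment_name="my_experiment",
    tracking_uri="http://localhost:5000",
    env_to_tag=["DOCKER_IMAGE_TAG"],
    extra_tags={"super": "experiment"},
)
```

The resulting dictionary could then be passed as mlf_example.execute_in_process(run_config=run_config), with parent_run_id or mlflow_run_id supplied only when nesting under or resuming an existing MLflow run.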