Logging

Dagster includes a rich and extensible logging system.

Logging from a solid

Any solid can emit log messages at any point in its computation:

builtin_logger.py
@solid
def hello_logs(context):
    context.log.info("Hello, world!")


@pipeline
def demo_pipeline():
    hello_logs()

Built-in loggers

When you run the pipeline in the terminal, you'll find that the messages have been logged through a built-in logger.

The context object passed to every solid execution includes the built-in log manager, context.log. It exposes the usual debug, info, warning, error, and critical methods you would expect anywhere else in Python.
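For example, a single solid can log at several of these levels in one computation. The snippet below is a minimal sketch; the solid name hello_log_levels is illustrative and not part of the example files above:

from dagster import solid


@solid
def hello_log_levels(context):
    # Each call goes through Dagster's structured log manager on the context
    context.log.debug("Only shown when the log level is DEBUG")
    context.log.info("Hello, world!")
    context.log.warning("Something looks off")
    context.log.error("Something went wrong")
    context.log.critical("Something went badly wrong")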

When you run Dagster pipelines in your terminal, you'll notice that log messages appear as colored output in the console:

Logs also stream back to the Dagit frontend in real time:

Dagit log display

Dagit exposes a powerful facility for filtering log messages based on execution steps and log levels.

Dagit log filtering

Debugging with logs

What happens if we introduce an error into our solid logic?

builtin_logger_error.py
@solid
def hello_logs_error(context):
    raise Exception("Somebody set up us the bomb")


@pipeline
def demo_pipeline_error():
    hello_logs_error()

Errors in user code are caught by the Dagster machinery so that pipelines halt gracefully or continue to execute, but the error messages, including the original tracebacks, are still logged both to the console and back to Dagit.

Messages at level ERROR or above are highlighted both in Dagit and in the console logs, so we can easily pick them out of logs even without filtering.

Dagit error logs

In many cases, especially for local development, this log viewer, coupled with solid reexecution, is sufficient to enable a fast debug cycle for data pipelines.

Configuring the built-in loggers

Suppose that we've gotten the kinks out of our pipelines developing locally, and now we want to run in production—without all of the log spew from DEBUG messages that was helpful during development.

Just like solids, loggers can be configured when you run a pipeline. For example, to filter all messages below ERROR out of the colored console logger, add the following snippet to your config YAML:

config.yaml
loggers:
  console:
    config:
      log_level: ERROR

So when you execute the pipeline with that config, you'll only see the ERROR-level logs.

Dagit error logs
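The same config can also be supplied programmatically. A minimal sketch, assuming a Dagster 0.x API in which execute_pipeline accepts a run_config argument, and that demo_pipeline is importable from the example file above (the import path is hypothetical):

from dagster import execute_pipeline

from builtin_logger import demo_pipeline  # hypothetical import path for the example above

# Pass the same logger config shown in config.yaml as a Python dict
result = execute_pipeline(
    demo_pipeline,
    run_config={"loggers": {"console": {"config": {"log_level": "ERROR"}}}},
)
assert result.success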

Environment-specific logging using modes

Logging is environment-specific: you don't want messages generated by data scientists' local development loops to be aggregated with production messages; on the other hand, you may find that in production console logging is irrelevant or even counterproductive.

Dagster recognizes this by attaching loggers to modes so that you can seamlessly switch from, e.g., Cloudwatch logging in production to console logging in development and test, without changing any of your code.

logging_modes.py
from dagster_aws.cloudwatch.loggers import cloudwatch_logger

from dagster import ModeDefinition, pipeline, solid
from dagster.loggers import colored_console_logger


@solid
def hello_logs(context):
    context.log.info("Hello, world!")


@pipeline(
    mode_defs=[
        ModeDefinition(name="local", logger_defs={"console": colored_console_logger}),
        ModeDefinition(name="prod", logger_defs={"cloudwatch": cloudwatch_logger}),
    ]
)
def hello_modes():
    hello_logs()

From Dagit, you can switch your pipeline mode to 'prod' and edit the config to use the new Cloudwatch logger, for example:

config_modes.yaml
loggers:
  cloudwatch:
    config:
      log_level: ERROR
      log_group_name: /my/cool/cloudwatch/log/group
      log_stream_name: very_good_log_stream
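Outside of Dagit, you can also select a mode programmatically when executing the pipeline. A minimal sketch, assuming execute_pipeline accepts a mode argument, that hello_modes is importable from the example file above (the import path is hypothetical), and reusing the same placeholder log group and stream names as the config above:

from dagster import execute_pipeline

from logging_modes import hello_modes  # hypothetical import path for the example above

# Run the pipeline in the "prod" mode, which wires in the Cloudwatch logger
execute_pipeline(
    hello_modes,
    mode="prod",
    run_config={
        "loggers": {
            "cloudwatch": {
                "config": {
                    "log_level": "ERROR",
                    "log_group_name": "/my/cool/cloudwatch/log/group",
                    "log_stream_name": "very_good_log_stream",
                }
            }
        }
    },
)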