Python Logging#

Dagster is compatible and configurable with Python's logging module. At the moment, you can access the configuration options through your dagster.yaml file, which will apply the contained settings to any run launched from your instance. These settings include which python loggers you'd like to capture from, what log level to set your loggers to, and which handlers / formatters you'd like to use to process log messages produced from your runs.

Relevant APIs#

NameDescription
get_dagster_loggerA function that returns a python logger that will automatically be captured by Dagster.

Capturing Python Logs
Experimental
#

By default, logs generated using the python logging module are not captured into the Dagster ecosystem. This means that they are not stored in the Dagster event log, will not be associated with any Dagster metadata (such as step key, run id, etc.), and will not show up in the default view of Dagit.

For example, imagine you have the following code:

import logging
from dagster import op

@op
def ambitious_op():
    my_logger = logging.getLogger("my_logger")
    try:
        x = 1 / 0
        return x
    except ZeroDivisionError:
        my_logger.error("Couldn't divide by zero!")

    return None

With the default behavior, because this code uses a custom python logger (instead of context.log), this log statement will not be added as an event in the Dagster event log, and therefore won't show up in Dagit. However, sometimes it is desirable to change this default behavior, and treat these sorts of log statements identically to how Dagster treats context.log calls.

This can be accomplished by setting the managed_python_loggers key in your dagster.yaml file to a list of python logger names that you would like to capture:

python_logs:
  managed_python_loggers:
    - my_logger
    - my_other_logger

Once this key is set, Dagster will treat any normal python log call from one of the listed loggers in the exact same way as a context.log call, which means you should be able to see this log statement in Dagit:

log-python-error

Please note that you should generally be fairly selective about which logs you wish to capture, especially in a production context. It is possible to overload the event log storage with these events, which may cause certain pages in Dagit to take a long time to load. Consider only capturing the most critical logs, and avoid including debug information if you expect to maintain a large amount of run history.

Note: if a python_log_level is set (see: Configuring A Python Log Level), then the loggers listed here will be set to the given level before a run is launched.

Example: Creating a captured python logger without modifying dagster.yaml#

If you would like to create a logger that is captured by Dagster without modifying your dagster.yaml file, you can use the provided get_dagster_logger utility function. This pattern is useful when logging from inside of nested functions, or other cases where it would be inconvenient to thread through the context parameter to enable calls to context.log.

from dagster import get_dagster_logger, op

@op
def ambitious_op():
    my_logger = get_dagster_logger()
    try:
        x = 1 / 0
        return x
    except ZeroDivisionError:
        my_logger.error("Couldn't divide by zero!")

    return None

Note: The logging module retains global state, meaning the logger returned by this function will be identical if this function is called multiple times with the same arguments in the same process. This means that there may be unpredictable or unituitive results if you set the level of the returned python logger to different values in different parts of your code.

Example: Capturing logs from all python loggers#

If you would like to capture logs from ALL python loggers, you can simply include root in your list, as python loggers are arranged in a hierarchy, with root as the parent of all other loggers:

python_logs:
  managed_python_loggers:
    - root

Configuring A Python Log Level
Experimental
#

If you want to set a global log level for your Dagster instance, you can do this by setting the python_log_level in your dagster.yaml file. This will set the log level of all loggers managed by Dagster. By default, this will just be the context.log logger. If there are custom python loggers that you wish to capture, see Capturing Python Logs.

This allows you to filter out logs below a given level. For example, setting a log level of INFO will filter out all DEBUG level logs.

python_logs:
  python_log_level: INFO

Configuring Python Log Handlers
Experimental
#

In your dagster.yaml file, you can configure handlers and formatters that will apply to the Dagster instance, so all pipeline runs will share the same logging configuration.

python_logs:
  dagster_handler_config:
    handlers:
      myHandler:
        class: logging.StreamHandler
        level: INFO
        stream: ext://sys.stdout
        formatter: myFormatter
    formatters:
      myFormatter:
        format: "My formatted message: %(message)s"

Handler and formatter configuration follows the dictionary config schema format in the Python logging module. Only the handlers and formatters dictionary keys will be accepted, as Dagster creates loggers internally.

From there, standard context.log calls will output with your configured handlers and formatters.

Example: Outputting Dagster Logs to a File#

Suppose we'd like to output all of our Dagster logs to a file. We can use the Python logging module's built-in logging.FileHandler class to send log output to a disk file. We format our config YAML file by defining a new handler myHandler to be a logging.FileHandler object.

Optionally, we can configure a formatter to apply a custom format to our logs. Since we'd like our logs to appear with a timestamp, we define a custom formatter named timeFormatter, attaching it to myHandler.

python_logs:
  dagster_handler_config:
    handlers:
      myHandler:
        class: logging.FileHandler
        level: INFO
        filename: "my_dagster_logs.log"
        mode: "a"
        formatter: timeFormatter
    formatters:
      timeFormatter:
        format: "%(asctime)s :: %(message)s"

Then, we execute the following pipeline:

@solid
def file_log_solid(context):
    context.log.info("Hello world!")


@pipeline
def file_log_pipeline():
    file_log_solid()

After execution, we can read the output log file my_dagster_logs.log. As expected, the log file contains the formatted log!

log-file-output