Dagster is compatible with Python's logging module and can be configured through it. At the moment, you can access the configuration options through your
dagster.yaml file, which applies the contained settings to any run launched from your instance. These settings include which Python loggers to capture from, what log level to set those loggers to, and which handlers and formatters to use to process log messages produced by your runs.
get_dagster_logger: A function that returns a Python logger that will automatically be captured by Dagster.
By default, logs generated using the python logging module are not captured into the Dagster ecosystem. This means that they are not stored in the Dagster event log, will not be associated with any Dagster metadata (such as step key, run id, etc.), and will not show up in the default view of Dagit.
For example, imagine you have the following code:
    import logging

    from dagster import op


    @op
    def ambitious_op():
        my_logger = logging.getLogger("my_logger")
        try:
            x = 1 / 0
            return x
        except ZeroDivisionError:
            my_logger.error("Couldn't divide by zero!")
            return None
With the default behavior, because this code uses a custom Python logger (instead of
context.log), this log statement will not be added as an event in the Dagster event log, and therefore won't show up in Dagit. However, it is sometimes desirable to change this default behavior and treat these log statements identically to how Dagster treats context.log calls.
This can be accomplished by setting the
managed_python_loggers key in your dagster.yaml file to a list of python logger names that you would like to capture:
    python_logs:
      managed_python_loggers:
        - my_logger
        - my_other_logger
Once this key is set, Dagster will treat any normal Python log call from one of the listed loggers in exactly the same way as a
context.log call, which means you should be able to see this log statement in Dagit.
Please note that you should generally be fairly selective about which logs you wish to capture, especially in a production context. It is possible to overload the event log storage with these events, which may cause certain pages in Dagit to take a long time to load. Consider only capturing the most critical logs, and avoid including debug information if you expect to maintain a large amount of run history.
Note: if a
python_log_level is set (see: Configuring A Python Log Level), then the loggers listed here will be set to the given level before a run is launched.
If you would like to create a logger that is captured by Dagster without modifying your
dagster.yaml file, you can use the provided
get_dagster_logger utility function. This pattern is useful when logging from inside nested functions, or in other cases where it would be inconvenient to thread the context parameter through to enable calls to context.log.
    from dagster import get_dagster_logger, op


    @op
    def ambitious_op():
        my_logger = get_dagster_logger()
        try:
            x = 1 / 0
            return x
        except ZeroDivisionError:
            my_logger.error("Couldn't divide by zero!")
            return None
Note: The logging module retains global state, meaning the logger returned by this function will be identical if this function is called multiple times with the same arguments in the same process. This means that there may be unpredictable or unintuitive results if you set the level of the returned Python logger to different values in different parts of your code.
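The shared-state behavior described in this note can be seen with the standard library alone. The sketch below uses logging.getLogger rather than get_dagster_logger, on the assumption that the same caching applies as the note states; the logger name is hypothetical.

```python
import logging

# "shared_state_demo" is a hypothetical logger name used only for illustration.
first = logging.getLogger("shared_state_demo")
second = logging.getLogger("shared_state_demo")

# Both variables refer to the same logger object within this process.
same_object = first is second

# Setting the level through one reference changes it for every other reference.
first.setLevel(logging.ERROR)
level_seen_by_second = second.level

print(same_object, logging.getLevelName(level_seen_by_second))
```

Because the level is shared, a module that lowers or raises it affects every other caller that requested the same logger.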
If you would like to capture logs from ALL python loggers, you can simply include
root in your list, as python loggers are arranged in a hierarchy, with
root as the parent of all other loggers:
    python_logs:
      managed_python_loggers:
        - root
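To illustrate why listing root reaches every logger, here is a minimal standard-library sketch of record propagation. It does not show how Dagster captures logs internally; ListHandler and the logger name are hypothetical helpers for this example only.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that stores messages in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.name}: {record.getMessage()}")


root_handler = ListHandler()
root = logging.getLogger()  # calling getLogger() with no name returns the root logger
root.addHandler(root_handler)

# A child logger with no handlers of its own; its records propagate up to root.
child = logging.getLogger("hierarchy_demo.submodule")
child.error("Couldn't divide by zero!")

# Detach the handler so the example leaves the root logger unchanged.
root.removeHandler(root_handler)

print(root_handler.messages)
```

The handler attached to root receives the record even though it was emitted by a differently named logger, which is the property the root entry relies on.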
If you want to set a global log level for your Dagster instance, you can do this by setting the
python_log_level in your dagster.yaml file. This will set the log level of all loggers managed by Dagster. By default, this will just be the
context.log logger. If there are custom python loggers that you wish to capture, see Capturing Python Logs.
This allows you to filter out logs below a given level. For example, setting a log level of
INFO will filter out all
DEBUG level logs.
    python_logs:
      python_log_level: INFO
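The filtering effect of a log level can be checked with the standard library. This sketch sets the level directly on a hypothetically named logger; it assumes python_log_level has the same effect on the loggers Dagster manages, as described above.

```python
import logging

# "level_demo" is a hypothetical logger name used only for illustration.
logger = logging.getLogger("level_demo")
logger.setLevel(logging.INFO)

# isEnabledFor reports whether a record at the given level would be processed.
debug_enabled = logger.isEnabledFor(logging.DEBUG)
info_enabled = logger.isEnabledFor(logging.INFO)
error_enabled = logger.isEnabledFor(logging.ERROR)

print(debug_enabled, info_enabled, error_enabled)
```

With the level at INFO, DEBUG records are dropped while INFO and anything more severe pass through.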
In your dagster.yaml file, you can configure handlers and formatters that will apply to the Dagster instance, so all pipeline runs will share the same logging configuration.
    python_logs:
      dagster_handler_config:
        handlers:
          myHandler:
            class: logging.StreamHandler
            level: INFO
            stream: ext://sys.stdout
            formatter: myFormatter
        formatters:
          myFormatter:
            format: "My formatted message: %(message)s"
Handler and formatter configuration follows the dictionary config schema format in the Python logging module. Only the
handlers and formatters dictionary keys will be accepted, as Dagster creates loggers internally.
From there, standard
context.log calls will output with your configured handlers and formatters.
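As a rough illustration of what the myFormatter format string produces, the sketch below applies the same string to a hand-built log record using the standard library. The record fields (name, pathname, message) are placeholders chosen for this example, not values Dagster would emit.

```python
import logging

# Same format string as the myFormatter entry in the YAML above.
formatter = logging.Formatter("My formatted message: %(message)s")

# Build a record by hand; the name, pathname, and message are placeholders.
record = logging.LogRecord(
    name="formatter_demo",
    level=logging.INFO,
    pathname="example.py",
    lineno=0,
    msg="Hello %s!",
    args=("world",),
    exc_info=None,
)

formatted = formatter.format(record)
print(formatted)
```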
Suppose we'd like to output all of our Dagster logs to a file. We can use the Python logging module's built-in
logging.FileHandler class to send log output to a disk file. We format our config YAML file by defining a new handler
myHandler to be a logging.FileHandler object.
Optionally, we can configure a formatter to apply a custom format to our logs. Since we'd like our logs to appear with a timestamp, we define a custom formatter named
timeFormatter, attaching it to myHandler.
    python_logs:
      dagster_handler_config:
        handlers:
          myHandler:
            class: logging.FileHandler
            level: INFO
            filename: "my_dagster_logs.log"
            mode: "a"
            formatter: timeFormatter
        formatters:
          timeFormatter:
            format: "%(asctime)s :: %(message)s"
Then, we execute the following pipeline:
    from dagster import pipeline, solid


    @solid
    def file_log_solid(context):
        # Emit a single INFO-level message through Dagster's logger.
        context.log.info("Hello world!")


    @pipeline
    def file_log_pipeline():
        file_log_solid()
After execution, we can read the output log file
my_dagster_logs.log. As expected, the log file contains the formatted log!
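To preview the file format without running Dagster, the same handler and formatter settings can be fed to the standard library's logging.config.dictConfig. This is only a sketch: the version and loggers keys, the file_log_demo logger, and the temporary directory are assumptions added for the example, since Dagster creates its loggers internally.

```python
import logging
import logging.config
import os
import tempfile

# Write to a temporary directory so the example does not touch the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "my_dagster_logs.log")

config = {
    "version": 1,  # required by dictConfig; not part of the dagster.yaml snippet
    "disable_existing_loggers": False,
    "handlers": {
        "myHandler": {
            "class": "logging.FileHandler",
            "level": "INFO",
            "filename": log_path,
            "mode": "a",
            "formatter": "timeFormatter",
        }
    },
    "formatters": {
        "timeFormatter": {"format": "%(asctime)s :: %(message)s"},
    },
    # Assumption: a hypothetical logger stands in for the loggers Dagster manages.
    "loggers": {
        "file_log_demo": {
            "handlers": ["myHandler"],
            "level": "INFO",
            "propagate": False,
        }
    },
}
logging.config.dictConfig(config)

demo_logger = logging.getLogger("file_log_demo")
demo_logger.info("Hello world!")

# Close the handler so the file is fully written before reading it back.
for handler in demo_logger.handlers:
    handler.close()

with open(log_path) as log_file:
    lines = log_file.read().splitlines()

print(lines)
```

Each line in the file starts with a timestamp produced by %(asctime)s, followed by " :: " and the message.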