Python Logging: An In-Depth Tutorial

As applications become more complex, having good logs can be very useful, not only when debugging but also to provide insight for application issues/performance.

The Python standard library comes with a logging module that provides most of the basic logging features. By setting it up correctly, a log message can bring a lot of useful information about when and where the log is fired as well as the log context, such as the running process/thread.

Despite these advantages, the logging module is often overlooked because it takes some time to set up properly. In my opinion, the official logging doc at https://docs.python.org/3/library/logging.html, although complete, does not really give logging best practices or highlight some logging surprises.

This Python logging tutorial is not meant to be a complete document on the logging module but rather a “getting started” guide that introduces some logging concepts as well as some “gotchas” to watch out for. The post will end with best practices and contain some pointers to more advanced logging topics.

Please note that all code snippets in the post assume that you have already imported the logging module:

import logging

Concepts for Python Logging

This section gives an overview of some concepts that are often encountered in the logging module.

Python Logging Levels

The log level corresponds to the “importance” a log is given: an “error” log should be more urgent than a “warning” log, whereas a “debug” log should be useful only when debugging the application.

There are six log levels in Python; each level is associated with an integer that indicates the log severity: NOTSET=0, DEBUG=10, INFO=20, WARNING=30 (also available under the older alias WARN), ERROR=40, and CRITICAL=50.

All the levels are rather straightforward (DEBUG < INFO < WARNING < ERROR < CRITICAL) except NOTSET, whose particularity will be addressed next.
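As a quick check, the level constants can be inspected directly. This is only a small sketch; the `levels` dictionary is a name introduced here for illustration.

```python
import logging

# Each level name maps to an integer; a higher number means a more severe log.
levels = {
    name: getattr(logging, name)
    for name in ("NOTSET", "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}
print(levels)

# WARN is kept as an alias of WARNING.
assert logging.WARN == logging.WARNING

# getLevelName() converts a numeric level back to its name.
print(logging.getLevelName(30))  # WARNING
```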

Python Logging Formatting

The log formatter basically enriches a log message by adding context information to it. It can be useful to know when the log is sent, where (Python file, line number, method, etc.), and additional context such as the thread and process (can be extremely useful when debugging a multithreaded application).

For example, when a log “hello world” is sent through a log formatter:

"%(asctime)s - %(name)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s"

it will become

2018-02-07 19:47:41,864 - a.b.c - WARNING - <module>:1 - hello world
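The transformation above can be reproduced with a few lines. This is a minimal sketch that writes to an in-memory stream (an assumption made here so the result can be inspected) and uses " - " as the separator; the logger name "a.b.c" is taken from the sample output.

```python
import io
import logging

# Write to an in-memory stream instead of the console so the text can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s"
))

demo_logger = logging.getLogger("a.b.c")
demo_logger.setLevel(logging.WARNING)
demo_logger.propagate = False  # keep the demo output out of any other handler
demo_logger.addHandler(handler)

demo_logger.warning("hello world")

formatted = stream.getvalue()
print(formatted, end="")
```

The timestamp, function name, and line number depend on when and where the call is made, so only the logger name, the level, and the message are stable across runs.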

Python Logging Handler

The log handler is the component that effectively writes/displays a log: it can display it in the console (via StreamHandler), write it to a file (via FileHandler), or even send you an email (via SMTPHandler), etc.

Each log handler has two important fields:

A formatter which adds context information to a log.
A log level that filters out logs whose levels are lower. So a log handler with the INFO level will not handle DEBUG logs.

The standard library provides a handful of handlers that should be enough for common use cases: https://docs.python.org/3/library/logging.handlers.html#module-logging.handlers. The most common ones are StreamHandler and FileHandler:

console_handler = logging.StreamHandler()
file_handler = logging.FileHandler("filename")
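To illustrate the two handler fields, here is a small sketch in which a handler set to INFO drops a DEBUG log. The logger name "handler_demo" and the in-memory stream are assumptions made for the example; loggers themselves are covered in the next section.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.INFO)  # the handler ignores anything below INFO
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("handler_demo")
logger.setLevel(logging.DEBUG)  # the logger itself lets everything through
logger.propagate = False
logger.addHandler(handler)

logger.debug("dropped by the handler")
logger.info("kept by the handler")

output = stream.getvalue()
print(output, end="")  # INFO:kept by the handler
```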

Python Logger

The logger is probably the component you will use most often directly in your code, and it is also the most complicated. A new logger can be obtained by:

toto_logger = logging.getLogger("toto")

A logger has three main fields:

Propagate: Decides whether a log should be propagated to the logger’s parent. By default, its value is True.
A level: Like the log handler level, the logger level is used to filter out “less important” logs. However, unlike the handler level, the logger level is only checked at the “child” logger that receives the log; once the log is propagated to its parents, their levels are not checked. This behavior is rather unintuitive.
Handlers: The list of handlers that a log will be sent to when it arrives at a logger. This allows flexible log handling: for example, you can have a file log handler that logs all DEBUG logs and an email log handler that will only be used for CRITICAL logs. In this regard, the logger-handler relationship is similar to a publisher-consumer one: a log will be broadcast to all handlers once it passes the logger level check.
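The level behavior described above can be observed with a parent and a child logger. This is a minimal sketch; the logger names "propagation_demo" and "propagation_demo.child" are hypothetical, and the parent writes to an in-memory stream so the result can be checked.

```python
import io
import logging

stream = io.StringIO()

parent = logging.getLogger("propagation_demo")
parent.setLevel(logging.ERROR)  # the parent itself only accepts ERROR and above
parent.propagate = False        # keep the demo isolated from the root logger
parent.addHandler(logging.StreamHandler(stream))

child = logging.getLogger("propagation_demo.child")
child.setLevel(logging.DEBUG)

# Passes the child's level check, then propagates to the parent's handler.
# The parent's ERROR level is not consulted for propagated logs.
child.debug("from child")

# Logged on the parent directly, so the parent's level check applies: dropped.
parent.debug("from parent")

output = stream.getvalue()
print(output, end="")  # from child
```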

A logger is unique by name, meaning that if a logger with the name “toto” has been created, subsequent calls of logging.getLogger("toto") will return the same object:

assert id(logging.getLogger("toto")) == id(logging.getLogger("toto"))

As you might have guessed, loggers have a hierarchy. At the top of the hierarchy is the root logger, which can be accessed via logging.root. This logger is called when functions like logging.debug() are used. By default, the root log level is WARNING, so every log with a lower level (for example via logging.info("info")) will be ignored. Another particularity of the root logger is that its default handler is created lazily: module-level functions such as logging.warning() call logging.basicConfig() the first time they are used while the root logger has no handler. Using the root logger directly or indirectly via functions like logging.debug() is generally not recommended.

By default, when a new logger is created, its parent will be set to the root logger:

lab = logging.getLogger("a.b")
assert lab.parent == logging.root # lab's parent is indeed the root logger

However, loggers use the “dot notation,” meaning that a logger with the name “a.b” will be the child of the logger “a.” This is only true if the logger “a” has been created; otherwise, the parent of “a.b” is still the root.

la = logging.getLogger("a")
assert lab.parent == la # lab's parent is now la instead of root

When a logger decides whether a log should pass the level check (i.e., a log whose level is lower than the logger level is ignored), it uses its “effective level” instead of its actual level. The effective level is the same as the logger level if that level is not NOTSET, i.e., any value from DEBUG up to CRITICAL; however, if the logger level is NOTSET, the effective level is the level of the first ancestor that has a non-NOTSET level.

By default, a new logger has the NOTSET level, and as the root logger has a WARNING level, the logger’s effective level will be WARNING. So even if a new logger has some handlers attached, these handlers will not be called unless the log level is WARNING or higher:

toto_logger = logging.getLogger("toto")
assert toto_logger.level == logging.NOTSET # new logger has NOTSET level
assert toto_logger.getEffectiveLevel() == logging.WARN # and its effective level is the root logger level, i.e. WARN

# attach a console handler to toto_logger
console_handler = logging.StreamHandler()
toto_logger.addHandler(console_handler)
toto_logger.debug("debug") # nothing is displayed as the log level DEBUG is smaller than toto effective level
toto_logger.setLevel(logging.DEBUG)
toto_logger.debug("debug message") # now you should see "debug message" on screen

In short, the logger’s effective level is used to decide whether a log passes: if the log level is lower than the effective level, the log will be ignored.
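The effective level is not necessarily inherited from the root logger; it comes from the nearest ancestor with an explicit level. A short sketch, using the hypothetical logger names "effective_demo" and "effective_demo.db.worker":

```python
import logging

app = logging.getLogger("effective_demo")
app.setLevel(logging.ERROR)

# "effective_demo.db" is never created, so the nearest existing ancestor of
# the worker logger is "effective_demo".
worker = logging.getLogger("effective_demo.db.worker")

assert worker.level == logging.NOTSET               # no level set on the logger itself
assert worker.getEffectiveLevel() == logging.ERROR  # inherited from "effective_demo"

# isEnabledFor() applies the same effective-level check a log call would use.
assert worker.isEnabledFor(logging.ERROR)
assert not worker.isEnabledFor(logging.WARNING)
```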

Python Logging Best Practices

The logging module is indeed very handy, but it contains some quirks that can cause long hours of headache for even the best Python developers. Here are the best practices for using this module in my opinion:

Configure the root logger but never use it in your code, e.g., never call a function like logging.info(), which actually calls the root logger behind the scenes. If you want to catch error messages from libraries you use, make sure to configure the root logger to write to a file, for example, to make debugging easier. By default, the root logger only outputs to stderr, so the log can get lost easily.
To use logging, make sure to create a new logger by calling logging.getLogger(logger_name). I usually use __name__ as the logger name, but anything can be used, as long as it is consistent. To add more handlers, I usually have a function that returns a logger (you can find the gist on https://gist.github.com/nguyenkims/e92df0f8bd49973f0c94bddf36ed7fd0).

import logging
import sys
from logging.handlers import TimedRotatingFileHandler

FORMATTER = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
LOG_FILE = "my_app.log"


def get_console_handler():
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setFormatter(FORMATTER)
    return console_handler


def get_file_handler():
    # rotate the log file every day at midnight
    file_handler = TimedRotatingFileHandler(LOG_FILE, when="midnight")
    file_handler.setFormatter(FORMATTER)
    return file_handler


def get_logger(logger_name):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)  # better to have too much log than not enough
    # getLogger() returns the same object for a given name, so attach the
    # handlers only once to avoid duplicated output on repeated calls
    if not logger.handlers:
        logger.addHandler(get_console_handler())
        logger.addHandler(get_file_handler())
    # with this pattern, it's rarely necessary to propagate the error up to parent
    logger.propagate = False
    return logger

Afterwards, you can create a new logger and use it:

my_logger = get_logger("my module name")
my_logger.debug("a debug message")

Use the rotating file handler classes, such as the TimedRotatingFileHandler used in the example, instead of FileHandler: they rotate the log file for you automatically, either when the file reaches a size limit (RotatingFileHandler) or at a fixed interval such as every day (TimedRotatingFileHandler).
Use tools like Sentry, Airbrake, Raygun, etc., to catch error logs automatically for you. This is especially useful in the context of a web app, where the log can be very verbose and error logs can get lost easily. Another advantage of using these tools is that you can get details about variable values in the error so you can know what URL triggers the error, which user is concerned, etc.
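For the size-based rotation mentioned above, a minimal sketch could look like the following. The temporary directory, the logger name "rotation_demo", and the deliberately tiny maxBytes value are assumptions chosen so that rotation happens quickly; a real application would use a much larger limit.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Use a throwaway directory so the demo does not touch real log files.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "my_app.log")

# Rotate once the file is about to exceed 200 bytes; keep at most 2 old files.
handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=2)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(handler)

for i in range(50):
    logger.info("message number %d", i)

handler.close()

files = sorted(os.listdir(log_dir))
print(files)  # ['my_app.log', 'my_app.log.1', 'my_app.log.2']
```

Older backups beyond backupCount are deleted, which keeps the disk usage of the logs bounded.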
