[issue46200] Discourage logging f-strings due to security considerations

Thu Dec 30 02:44:15 EST 2021

New submission from Arie Bovenberg <a.c.bovenberg at gmail.com>:

(I've already submitted this issue to security at python.org, who directed me to propose a documentation change here)

Logging f-strings is quite popular, because the `%` style of logging is often considered ugly.
However, logging preformatted strings comes with security risks which should explicitly be mentioned in the logging documentation.

The following example illustrates the problem:

    logger.info('look: %s', untrusted_string)                # OK
    logger.info('look: %(foo)s', {'foo', untrusted_string})  # OK
    logger.info(f'look: {untrusted_string}')                 # OK (security-wise)
    logger.info(f'look: {untrusted_string}', some_dict)      # DANGER!

The problem is that `untrusted_string` will be interpreted as a string template by the logger. 
If `untrusted_string` has the value `%(foo)999999999s` (where foo is present in `some_dict`), 
logging gets stuck trying to add over a gigabyte of whitespace to a string. 
In other words: a Denial-of-Service attack.

Of course, this combination of f-string and logger arguments is unlikely. But it is plausible that:
- After a refactoring to f-string, removing the dict argument is forgotten;
- A developer adding a variable to a message prefers using f-strings and thus creates a template mixing the two styles (e.g. `{foo} and %(bar)s`);
- The string is passed through a logging Filter or other function which adds a dict argument.

Open questions:
1. Do we want to simply highlight the risks, or actively discourage logging f-strings?

   My thoughts:
   I feel it's important enough to actively discourage logging preformatted strings (i.e. also .format and manual %-formatting).
   A strong recommendation will allow security linters to flag this.
   Additionally, there are other advantages to the formatting built into `logging` (e.g. performance).

2. Should we type annotate logger `msg` as `Literal[str]` following PEP675?

   My thoughts:
   Annotating `msg` as a literal string will greatly help enforce this best practice.
   The adoption of typecheckers exceeds that of security linters.

3. Where should this risk be documented?
  a. In the `logger.debug` function docs
  b. In the `logging` docs introduction (like xml.etree.elementtree)
  c. In a new "security" section of the `logging` docs
  d. An informational PEP on logging security considerations (similar to PEP672)

  My thoughts:
  I fear option (a) would not get the attention of most readers.
  Option (c) or (d) may be useful to also highlight general risks 
  of logging (i.e. log injection)

As soon as there is agreement on these questions, I would be happy to submit a PR.

----------
assignee: docs at python
components: Documentation
messages: 409352
nosy: ariebovenberg, docs at python
priority: normal
severity: normal
status: open
title: Discourage logging f-strings due to security considerations
type: security
versions: Python 3.10, Python 3.11, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue46200>
_______________________________________