[Python-Dev] configparser 1.1 - one last bug fix needed

Łukasz Langa lukasz at langa.pl
Mon Dec 13 23:22:29 CET 2010


Hi there.
 
There's one last thing that needs to be done with configparser for 3.2.
Raymond, Fred, Michael and Georg already expressed their approval on that so
unless anybody finds a flaw in the idea expressed below, I'm going to make
the change for 3.2b2:

- the ConfigParser class will be removed

- the SafeConfigParser class will be renamed to ConfigParser

- 2to3 will rename SafeConfigParser classes to ConfigParser

- 2to3 will warn on the subtle behaviour change when ConfigParser classes
  are found

What's the difference?
----------------------

Both ConfigParser and SafeConfigParser implement interpolation, i.e. option
values can contain special tokens similar to those implemented by Python's
string formatting: %(example)s. These tokens are replaced during get()
operations by values from the respective keys (either from the current
section or from the DEFAULT section).
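
To make the mechanism concrete, here is a minimal sketch. It assumes
a Python 3.2+ interpreter in which the rename proposed in this mail has
already landed, so configparser.ConfigParser has the SafeConfigParser
behaviour. The section and option names are made up for illustration.

```python
import configparser

# Hypothetical configuration: "home" lives in DEFAULT, "name" in the
# current section; both are referenced from "data_dir".
parser = configparser.ConfigParser()
parser.read_string("""
[DEFAULT]
home = /srv/app

[paths]
name = data
data_dir = %(home)s/%(name)s
""")

# get() replaces the %(...)s tokens with values from the current
# section, falling back to DEFAULT.
interpolated = parser.get("paths", "data_dir")

# raw=True returns the stored string untouched.
raw = parser.get("paths", "data_dir", raw=True)

print(interpolated)
print(raw)
```

The first print shows the expanded path, the second the value exactly as
written in the source.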

SafeConfigParser was originally introduced to fix a couple of ConfigParser
problems:

- when a token didn't match the %(name)s syntax, it was simply treated as
  a raw value. This caused configuration errors like %var or %(no_closing_s)
  to be missed.

- if someone actually wanted to store arbitrary strings in values, including
  Python formatting strings, there was no way to escape %(name)s in the
  configuration. The programmer had to know in advance that some value might
  hold %(name)s and only get() values from that option using

  get('section', 'option', raw=True)

  That option, however, could then no longer use interpolation.

- set() originally allowed storing non-string values in the parser. This
  was not meant to be a feature and caused trouble when the user tried to
  save the configuration to a file or get the stored values back using typed
  get() methods.

SafeConfigParser solves these problems by validating interpolation syntax
(only %(name)s or %% are allowed, the latter being an escaped percent sign)
and raising exceptions on syntax errors, and validating type on set()
operations so that no non-string values can be passed to the parser using
the API.
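
A rough sketch of the three checks follows, again assuming Python 3.2+
where configparser.ConfigParser carries the SafeConfigParser behaviour.
The option names are hypothetical, and the exception types are those the
standard library raises today, which this mail does not itself spell out.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[demo]
escaped = 100%% done
broken = %(no_closing_s)
""")

# "%%" is the escape for a literal percent sign.
escaped = parser.get("demo", "escaped")

errors = []

# Malformed tokens read from a file are reported on get(), not ignored.
try:
    parser.get("demo", "broken")
except configparser.InterpolationSyntaxError:
    errors.append("get: bad interpolation syntax")

# set() refuses values with invalid interpolation syntax...
try:
    parser.set("demo", "ratio", "%var")
except ValueError:
    errors.append("set: bad interpolation syntax")

# ...and refuses anything that is not a string.
try:
    parser.set("demo", "count", 42)
except TypeError:
    errors.append("set: non-string value")

print(escaped)
print(errors)
```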

Why change that?
----------------

When ConfigParser was left alone, it remained the default choice for most
end users, including our own distutils and logging libs. This was a very
weak choice, and most current ConfigParser users are not aware of the
interpolation quirks. I had to close a couple of issues related to people
trying to store non-string values internally in the parser.

The current situation needlessly complicates the documentation. Explaining
all the above quirks to each new user who only wants to parse an INI file is
weak at best. Moreover, users trust Python to do the right thing by default
and according to their intuition. In this case, going for the default
configparser.ConfigParser class without consulting the documentation is
clearly a suboptimal choice.

One last argument is that SafeConfigParser is an awkward name. It implies
the other parsers are somehow unsafe, or that this specific parser protects
users from something. This is generally considered a naming antipattern.

When?
-----

You might ask whether this can be done for 3.2 (i.e. is that a feature or
a bugfix). In Raymond's words, the beta process should be used to flesh out
the APIs, test whether they work as expected and fix suboptimal decisions
before we hit the release candidate stage. He considers this essentially
a bugfix and I agree.

You might ask why do that now and not for 3.3. We believe that 3.2 is the
last possible moment for introducing a change like that. The adoption rate is
currently still low and application authors porting projects from 2.x expect
incompatibilities. When they are non-issues, handled by 2to3, there's
nothing to be afraid of.

But isn't that... INCOMPATIBLE?!
--------------------------------

Yes, it is. Thanks to the low py3k adoption rate, now is the only moment when
the risk of introducing silent incompatibility is marginal (there are
hardly any py3k projects out there). Projects ported from Python 2.x will be
informed by 2to3 of the change. We believe that this will fix more bugs than
it introduces.

Support for bare % signs would be the single case where ConfigParser might
have appeared a more natural solution. In those cases we expect that users
will rather choose to turn off interpolation altogether.
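
For illustration, one way to do that is sketched below. It assumes the
interpolation=None constructor argument as shipped in Python 3.2; this
mail does not name the mechanism, and RawConfigParser would be another
option. The "zoom" option is made up.

```python
import configparser

SOURCE = """
[display]
zoom = 50%
"""

# With interpolation on, a bare % is a syntax error at get() time.
strict = configparser.ConfigParser()
strict.read_string(SOURCE)
try:
    strict.get("display", "zoom")
    bare_percent_rejected = False
except configparser.InterpolationSyntaxError:
    bare_percent_rejected = True

# With interpolation switched off, the value comes back verbatim.
plain = configparser.ConfigParser(interpolation=None)
plain.read_string(SOURCE)
zoom = plain.get("display", "zoom")

print(bare_percent_rejected, zoom)
```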

Summary
-------

If you have any strong, justified arguments against this bugfix, speak up.
Otherwise the change will be made on Thursday.

-- 
Interpolating regards,
Łukasz Langa
tel. +48 791 080 144
WWW http://lukasz.langa.pl/



