[Python-ideas] Python 3000 TIOBE -3%

Nick Coghlan ncoghlan at gmail.com
Wed Feb 15 04:22:02 CET 2012


On Wed, Feb 15, 2012 at 12:43 PM, Stephen J. Turnbull
<stephen at xemacs.org> wrote:
> It's arguable that most applications *should* want errors in these
> cases; I've made that argument myself.  But it's quite clearly not the
> user's intent.

However, from a correctness point of view, it's a big step up from
just saying "latin-1" (which effectively turns off *all* of the
additional encoding-related sanity checking Python 3 offers over
Python 2). For many "I don't care about Unicode" use cases, using
"ascii+surrogateescape" for your own I/O and setting
"backslashreplace" on sys.stdout should cover you (and any exceptions
you do get are warning you that your original assumption of not
needing to care about Unicode validity has been proven wrong).
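
As a rough illustration of that combination, the sketch below decodes a
byte string with the "ascii" codec and the "surrogateescape" error
handler, then renders it with "backslashreplace". The sample bytes and
the variable names are made up for the example, and it only assumes the
standard codecs shipped with Python 3.

```python
# Hypothetical input: mostly ASCII, with one stray Latin-1 byte (0xE9).
raw = b"caf\xe9 ok"

# "ascii" + surrogateescape: the undecodable byte is smuggled through
# as a lone surrogate (U+DCE9) instead of raising UnicodeDecodeError.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same codec and handler restores the original bytes.
roundtrip = text.encode("ascii", errors="surrogateescape")

# For display, backslashreplace turns the escaped byte into a visible
# escape sequence rather than failing.
shown = text.encode("ascii", errors="backslashreplace").decode("ascii")

# A strict encode still fails, which is the sanity check that a blanket
# "latin-1" would have hidden.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(shown)
```

The bytes survive a round trip unchanged, while any attempt to treat the
escaped data as valid text elsewhere still raises.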

If the logging module doesn't do it already, it should probably
default to backslashreplace when encoding messages, too (for the
same reason sys.stderr already defaults to that - you don't want your
error reporting system failing to encode corrupted Unicode data).
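
One way to get that behaviour explicitly today, sketched under the
assumption that the handler writes to a text stream you control: wrap
the underlying binary stream in an io.TextIOWrapper with
errors="backslashreplace" and hand it to logging.StreamHandler. The
logger name and the in-memory buffer are placeholders for the example,
not anything the logging module provides by default.

```python
import io
import logging

# Stand-in for a real binary log destination (file, pipe, socket...).
buffer = io.BytesIO()

# Text layer that never fails to encode: unencodable characters are
# written as backslash escapes. newline="\n" keeps output identical
# across platforms.
stream = io.TextIOWrapper(
    buffer, encoding="ascii", errors="backslashreplace", newline="\n"
)

logger = logging.getLogger("demo_backslashreplace")  # hypothetical name
logger.setLevel(logging.INFO)
logger.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)

# Corrupted data: a byte that is not ASCII, carried as a lone surrogate.
corrupted = b"bad byte: \xff".decode("ascii", errors="surrogateescape")
logger.info(corrupted)
handler.flush()

logged = buffer.getvalue()
```

With a strict error handler on the stream, the same call would hit a
UnicodeEncodeError inside the handler instead of recording the message.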

sys.stdin and sys.stdout are different due to the role they play in
pipeline processing - for those, locale.getpreferredencoding() with
the "strict" error handler is a more reasonable default (but we should
make it easy to replace them with something more specific for a given
application, hence http://bugs.python.org/issue14017).
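
Until something more convenient exists, a replacement stream can be
built by hand. The helper below is only a sketch: rewrap() is a
hypothetical name, not a standard library function, and it is exercised
against in-memory buffers rather than the real sys.stdout.

```python
import io
import locale


def rewrap(binary_stream, encoding=None, errors="strict"):
    """Return a new text wrapper around an existing binary stream.

    Falls back to the locale's preferred encoding when none is given,
    mirroring the default described above.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return io.TextIOWrapper(
        binary_stream, encoding=encoding, errors=errors, newline="\n"
    )


# An application that knows it wants UTF-8 output.
sink = io.BytesIO()
out = rewrap(sink, encoding="utf-8")
out.write("na\u00efve\n")
out.flush()
written = sink.getvalue()

# With "strict", data that does not fit the encoding is reported
# immediately instead of being silently altered.
strict_out = rewrap(io.BytesIO(), encoding="ascii")
try:
    strict_out.write("\u00e9")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

In a real program the equivalent would be along the lines of
sys.stdout = rewrap(sys.stdout.buffer, encoding="utf-8"), assuming
sys.stdout has not already been replaced by an object without a
.buffer attribute. Python 3.7 and later also offer
sys.stdout.reconfigure(encoding=..., errors=...) for the same purpose.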

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
