[Python-Dev] Python Language Summit EuroPython 2010

Mon Jul 26 04:20:03 CEST 2010

On Sun, Jul 25, 2010 at 9:46 PM, Guido van Rossum <guido at python.org> wrote:
..
>> Maybe self.__format__(..).encode('ascii')?  ...encode('utf-8') is a
>> tempting alternative as well.
>
> -1
>
> That would bring back the "it fails for some users but passes for the
> developer" problem. (True, if the developer calls .encode('ascii') it
> may also break, but then at least it is something the developer chose
> to do.)
>
> How hard would it be to recode the sprintf language but with the
> locale fixed to "C"? That would always be ASCII.

This is exactly what I proposed at
http://bugs.python.org/issue7584#msg110240 not so long ago.  Given
that the strftime language uses every English letter as one of its
codes (both caps and lower case), it would take some effort, but
coding it in Python should not be too hard.  A C implementation would
be harder, but there must be implementations available under a
suitable license that can be reused.

In short, definitely +1.
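
To make the idea concrete, here is a minimal sketch in Python of a
formatter that handles a small subset of the strftime codes with the
names fixed to English, as the "C" locale would produce them.  The
function name c_strftime and the lookup tables are hypothetical and
not an existing API; a real implementation would have to cover many
more codes.

```python
import datetime
import re

# English names, independent of the process locale (illustrative tables).
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def c_strftime(dt, fmt):
    """Format dt with a small subset of strftime codes, always in English."""
    codes = {
        "a": DAYS[dt.weekday()][:3],
        "A": DAYS[dt.weekday()],
        "b": MONTHS[dt.month - 1][:3],
        "B": MONTHS[dt.month - 1],
        "d": "%02d" % dt.day,
        "m": "%02d" % dt.month,
        "Y": "%04d" % dt.year,
        "H": "%02d" % dt.hour,
        "M": "%02d" % dt.minute,
        "S": "%02d" % dt.second,
        "%": "%",
    }

    def replace(match):
        code = match.group(1)
        if code not in codes:
            # Fail loudly instead of guessing at unsupported codes.
            raise ValueError("unsupported format code: %" + code)
        return codes[code]

    return re.sub(r"%(.)", replace, fmt)


stamp = datetime.datetime(2010, 7, 26, 4, 20, 3)
result = c_strftime(stamp, "%a, %d %b %Y %H:%M:%S")
print(result)                  # Mon, 26 Jul 2010 04:20:03
print(result.encode("ascii"))  # safe: every table entry is ASCII
```

Because every string in the tables is ASCII, the result can always be
encoded as ASCII no matter which locale the user runs under.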

> Otherwise,
> str(x).encode('ascii') might work, that's like the ISO format with the
> 'T' replaced by a space.

Before proposing format(x, ..).encode('ascii') above, I considered
str(x).encode('ascii'), but then realized that for user-defined
classes, str(x) is as likely to contain non-ASCII characters as
format(x, ..).
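
A small illustration of the point, as a sketch only: the class City
and the helper ascii_encodable below are hypothetical, invented for
this example.  Both str(x) and format(x, '') return whatever the
class chooses to return, so neither is guaranteed to be ASCII.

```python
class City:
    """Hypothetical user-defined class whose text may be non-ASCII."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __format__(self, spec):
        return format(self.name, spec)


def ascii_encodable(text):
    """Return True if text can be encoded as ASCII without error."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


x = City("Z\u00fcrich")  # contains U+00FC, which is outside ASCII

print(ascii_encodable(str(x)))             # False
print(ascii_encodable(format(x, "")))      # False
print(ascii_encodable(str(City("Oslo"))))  # True
```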

What about .encode('utf-8')?  I thought it was not supposed to fail
for any Unicode string.
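
A quick way to check that expectation, as a sketch assuming Python 3
str semantics.  The sample strings are arbitrary and the helper
utf8_encodable is hypothetical.  The last case, a string holding a
lone surrogate code point, is the one situation that raises under the
default error handler.

```python
def utf8_encodable(text):
    """Return True if text encodes to UTF-8 with the default error handler."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


samples = [
    "plain ASCII",
    "Z\u00fcrich",           # Latin-1 range
    "\u65e5\u672c\u8a9e",    # CJK
    "\U0001F40D",            # outside the Basic Multilingual Plane
]

for text in samples:
    # Each of these round-trips through UTF-8 unchanged.
    assert text.encode("utf-8").decode("utf-8") == text

print(all(utf8_encodable(text) for text in samples))  # True
print(utf8_encodable("\ud800"))                       # False: lone surrogate
```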