Python 3.2 has some deadly infection

Terry Reedy tjreedy at udel.edu
Mon Jun 2 17:34:35 EDT 2014


On 6/2/2014 7:10 AM, Robin Becker wrote:

> there seems to be an implicit assumption in python land that encoded
> strings are the norm.

I don't know why you say that. To have a stream of bytes interpreted as 
characters, open in text mode and give the encoding. Otherwise, open in 
binary mode and apply whatever encoding you want. Image libraries like 
PIL or Pillow assume that bytes have image encodings. Same idea.
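
To make the two choices concrete, here is a minimal sketch. The helper 
name read_both_ways and the temporary file are only illustrations, and 
UTF-8 is assumed as the encoding; nothing here is specific to any one 
program.

```python
import os
import tempfile


def read_both_ways(data, encoding="utf-8"):
    """Write raw bytes to a temp file, then read them back both ways.

    Hypothetical helper used only for illustration.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Text mode: Python decodes the bytes with the encoding we name.
        with open(path, "rt", encoding=encoding, newline="") as f:
            as_text = f.read()
        # Binary mode: the bytes come back untouched; decoding is up to us.
        with open(path, "rb") as f:
            as_bytes = f.read()
    finally:
        os.remove(path)
    return as_text, as_bytes


text, raw = read_both_ways("caf\u00e9".encode("utf-8"))
same = (text == raw.decode("utf-8"))  # both routes yield the same characters
```

Either way the bytes on disk are identical; the mode only decides who 
does the decoding.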

> On virtually every computer I encounter that assumption is wrong.

Except for the std streams (see below), it is also not part of Python.

I will just point out that bytes are given meaning by encoding meaning 
into them. Unicode attempts to reduce the hundreds of text encodings to 
just a few, and mostly to just one for external storage and transmission.

> In python I would have preferred for bytes to remain the default io

Do you really think that defaulting the open mode to 'rb' rather than 
'rt' would be a better choice for newbies?

> mechanism, at least that would allow me to decide if I need any decoding.

Assuming that 'rb' is actually needed more than 'rt' for you in 
particular, is it really such a burden to give a mode more often than not?

> As the cat example
> http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/
> showed these extra assumptions are sometimes really in the way.

This example is *only* about the *pre-opened* stdxyz streams. Python 
uses these to read characters from the keyboard and print characters to 
the screen in input, print, and the interactive interpreter. So they are 
open in text mode (which wraps binary read and write). The developers, 
knowing that people can and do write batch mode programs that avoid 
input and print, gave a documented way to convert the streams back to 
binary. (See the sys doc.)
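
As a rough sketch of what 'text mode wraps binary' means, the following 
uses an in-memory io.BytesIO as a stand-in for the real binary stream 
underneath something like sys.stdout. The variable names are made up 
for the example. The buffer attribute is the same one the sys doc points 
to for writing bytes to the standard streams.

```python
import io

# Stand-in for the binary stream that sits underneath a text stream.
raw = io.BytesIO()
text_stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

# Characters go in at the top and are encoded on the way down.
text_stream.write("caf\u00e9\n")
text_stream.flush()

# The binary layer is still reachable, so bytes can bypass the text layer.
text_stream.buffer.write(b"\xff\xfe raw bytes\n")
text_stream.buffer.flush()

written = raw.getvalue()
```

The second write would fail if it went through the text layer as a str 
decoded from UTF-8, since those bytes are not valid UTF-8; going through 
buffer sidesteps that.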

The issue Armin ran into is this. He wrote a library module that makes 
sure the streams are binary. Someone else does the same. A program 
imports both modules, in either order. The conversion method referenced 
above raises an exception if one attempts to convert an already converted 
stream. Much of the extra code Armin published detects whether the stream 
is already binary or needs conversion.

The obvious solution is to enhance the conversion method so that one may 
say 'convert if needed, otherwise just pass'.
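
Until something like that exists, a library could get the same effect 
with a small wrapper. This is only a sketch: ensure_binary is a 
hypothetical name, not a stdlib function, and it assumes the text stream 
exposes a buffer attribute (io.TextIOWrapper does; io.StringIO does not).

```python
import io


def ensure_binary(stream):
    """Return a binary stream for `stream`, converting only if needed.

    Hypothetical helper, not part of the standard library.
    """
    if isinstance(stream, io.TextIOBase):
        stream.flush()        # push pending text down before bypassing it
        return stream.buffer  # assumes the text stream has a buffer
    return stream             # already binary: just pass it through


# In-memory stand-ins for a real stream such as sys.stdout.
raw = io.BytesIO()
text_stream = io.TextIOWrapper(raw, encoding="utf-8")

once = ensure_binary(text_stream)   # first library converts
twice = ensure_binary(once)         # second library: no error, same stream
```

Two modules calling this in either order end up with the same binary 
stream, and neither has to know whether the other ran first.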

-- 
Terry Jan Reedy



