[Python-ideas] changing sys.stdout encoding

Stephen J. Turnbull stephen at xemacs.org
Wed Jun 6 10:26:21 CEST 2012


Rurpy writes:

 > I'm not sure why stateful matters.  When you change encoding
 > you discard whatever state exists

How do you know what *I* want to do?  Silently discarding buffer
contents would suck.

 > If there is a partially en/decoded character then wouldn't you do
 > the same thing you'd do if the same condition arose at EOF?

Again speaking for *myself*, almost certainly not.  On input, if it
happens *before* EOF it's incomplete input, and I should wait for it
to be completed.  If it happens on output, there's a bug somewhere,
and I probably want to do some kind of error recovery.
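
To make the input half of that distinction concrete, here is a minimal
sketch using the standard library's incremental decoders.  The choice
of UTF-8 and the helper name feed() are assumptions for illustration
only; nothing in this thread prescribes them.  Mid-stream, a partial
character is simply held until the rest arrives; only at true EOF
(final=True) does it become an error.

```python
import codecs


def feed(decoder, chunks):
    """Decode chunks as they arrive; incomplete trailing bytes stay buffered."""
    return [decoder.decode(chunk) for chunk in chunks]


# One two-byte UTF-8 character, split across two reads.
data = "é".encode("utf-8")

# Before EOF: the first byte alone yields no text and raises no error.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = feed(decoder, [data[:1], data[1:]])

# At EOF: the same dangling byte is an error once final=True is passed.
eof_decoder = codecs.getincrementaldecoder("utf-8")()
eof_decoder.decode(data[:1])
try:
    eof_decoder.decode(b"", final=True)
    eof_error = None
except UnicodeDecodeError as exc:
    eof_error = exc
```

Here pieces ends up as an empty string followed by the completed
character, while eof_error holds the UnicodeDecodeError raised at EOF.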

 > However, I have in the past encountered programs that output
 > mixed encodings in two contexts: generating test data (I think it
 > was for automatic detection and extraction of information), and
 > bundling multiple differently-encoded data sets in one package
 > that were pulled apart again downstream.
 > 
 > That both uses probably could have been designed better is irrelevant;
 > a hypothetical Python programmer's job would have been to produce
 > a Python program that would fit into the existing processes.

No, it's not irrelevant that it's bad design.  Python should not go
out of its way to cater to bad design, if bad design can be worked
around with existing facilities.  Here there are at least two ways to
do it: the method of changing sys.std*'s text encoding that you
posted, and switching sys.std* to binary and doing explicit encoding
and decoding of strings to be input or output.
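
The second approach might look roughly like the sketch below.  It
assumes Python 3, where the underlying byte stream of sys.stdout is
available as sys.stdout.buffer; the helper name write_encoded() and
the sample strings are hypothetical.  An io.BytesIO stands in for the
real stream so the result can be inspected.

```python
import io


def write_encoded(binary_stream, text, encoding, errors="strict"):
    """Encode text explicitly and write the bytes to a binary stream."""
    data = text.encode(encoding, errors)
    binary_stream.write(data)
    return len(data)


# In a real program the target would be sys.stdout.buffer (assumption:
# Python 3); a BytesIO is used here so the output can be examined.
out = io.BytesIO()
write_encoded(out, "привет\n", "koi8-r")
write_encoded(out, "日本語\n", "shift_jis")
out.flush()
mixed = out.getvalue()
```

Because every write names its own encoding, there is no hidden encoder
state to discard when the encoding changes, and an unencodable
character raises UnicodeEncodeError at the call that caused it.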

I have also encountered mixed encoding, in my students' filesystems
(it was not uncommon to see /home/j.r.exchangestudent/KOI8-R/SHIFT_JIS
and similar).  That doesn't mean it should be made easier to generate!


