[Python-ideas] changing sys.stdout encoding

Rurpy rurpy at yahoo.com
Wed Jun 6 08:05:35 CEST 2012


On 06/05/2012 01:37 PM, Stephen J. Turnbull wrote:
> Rurpy writes:
> 
>  > It is excessively complex for what is conceptually a simple and
>  > straight-forward operation.
> 
> The operation is not conceptually straightforward.  The problem is
> that you can't just change the encoding of an open stream, encodings
> are generally stateful.  The straightforward way to deal with this
> issue is to close the stream and reinitialize it.  Your proposed
> .set_encoding() method implies something completely different about
> what's going on.

I'm not sure why statefulness matters.  When you change the encoding,
you discard whatever state exists and start with the new encoder
in its initial state.  If there is a partially en/decoded
character, wouldn't you do the same thing you'd do if the same
condition arose at EOF?
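
A rough sketch of that idea, using the incremental decoders in the
standard codecs module.  The helper name switch_decoder and the choice
of errors="replace" are assumptions made for illustration only; they
are not part of any proposed API.  The old decoder is flushed exactly
as it would be at EOF, and the new one starts in its initial state.

```python
import codecs

def switch_decoder(old_decoder, new_encoding, errors="replace"):
    """Hypothetical helper: flush old_decoder as if at EOF, then start fresh.

    Returns (leftover_text, new_decoder).
    """
    # final=True tells the decoder no more input is coming, so a partial
    # character is handled by the error handler, just as it would be at EOF.
    leftover = old_decoder.decode(b"", final=True)
    new_decoder = codecs.getincrementaldecoder(new_encoding)(errors=errors)
    return leftover, new_decoder

utf8 = codecs.getincrementaldecoder("utf-8")(errors="replace")
# The last two bytes are an incomplete three-byte UTF-8 sequence,
# so they stay buffered inside the decoder.
text = utf8.decode(b"abc\xe2\x82")
leftover, latin1 = switch_decoder(utf8, "latin-1")
text2 = latin1.decode(b"\xe9")
```

With errors="strict" the flush would raise UnicodeDecodeError instead
of producing replacement characters, which is also what happens for a
truncated character at EOF.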

> I wouldn't object to a method with the semantics of reinitialization,
> but it should have a name implying reinitialization.  It probably
> should also error if the stream is open and has been written to.
> 
>  > Needing to change the encoding of a sys.std* stream is not an 
>  > uncommon need and a user should not have to go through the 
>  > codecs dance above to do so IMO.
> 
> I suspect needing to *change* the encoding of an open stream is
> generally quite rare.  Needing to *initialize* the std* streams with
> an appropriate codec is common.  That's why it doesn't so much matter
> that PYTHONIOENCODING can't be changed within a program.

You are correct that my current concern is reinitializing 
the encoding(s) of the sys.std* streams prior to doing any
operations with them.  I thought that changing the encoding
at any point would be a straight-forward generalization.
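
For the reinitialization case, a minimal sketch, assuming the stream
is an io.TextIOWrapper (as sys.stdout normally is in Python 3) and has
not been used yet.  The function name reinit_text_stream is
hypothetical.

```python
import io

def reinit_text_stream(stream, encoding, errors="strict"):
    """Hypothetical helper: rewrap stream's binary buffer with a new encoding."""
    stream.flush()
    line_buffering = stream.line_buffering
    # detach() separates the underlying binary buffer from the old wrapper;
    # the old wrapper is unusable afterwards.
    binary = stream.detach()
    return io.TextIOWrapper(binary, encoding=encoding, errors=errors,
                            line_buffering=line_buffering)

# Demonstrated on an in-memory stream rather than the real sys.stdout.
raw = io.BytesIO()
old = io.TextIOWrapper(raw, encoding="ascii")
new = reinit_text_stream(old, "utf-8")
new.write("\u00e9")
new.flush()
```

For the real stream the call would look like
sys.stdout = reinit_text_stream(sys.stdout, "utf-8"), done before
anything has been written.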
 
However, I have in the past encountered programs that output mixed
encodings in two contexts: generating test data (I think it was
for automatic detection and extraction of information), and
bundling multiple differently-encoded data sets in one package
that were pulled apart again downstream.
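
To illustrate what such mixed output looks like, here is a sketch that
bypasses the text layer and writes each piece to a binary stream in
its own encoding.  The name write_segments and the sample data are
made up; an io.BytesIO stands in for sys.stdout.buffer.

```python
import io

def write_segments(binary_stream, segments):
    """Write each (text, encoding) pair to binary_stream in its own encoding."""
    for text, encoding in segments:
        binary_stream.write(text.encode(encoding))

out = io.BytesIO()  # stand-in for sys.stdout.buffer
write_segments(out, [("caf\u00e9\n", "latin-1"), ("caf\u00e9\n", "utf-8")])
mixed = out.getvalue()
```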

That both uses probably could have been designed better is irrelevant;
a hypothetical Python programmer's job would have been to produce
a Python program that would fit into the existing processes.

However, I don't want to dwell on this because it is not my main
concern now; I thought I would just mention it for the record.

> I agree that use of PYTHONIOENCODING is pretty awkward.



