[Python-ideas] changing sys.stdout encoding

Rurpy rurpy at yahoo.com
Tue Jun 5 19:20:01 CEST 2012


In my first foray into Python3 I've encountered this problem:
I work in a multi-language environment.  I've written a number 
of tools, mostly command-line, that generate output on stdout.
Because these tools and their output are used by various people
in varying environments, the tools all have an --encoding option
to provide output that meets the needs and preferences of the
output's ultimate consumers. 

In converting them to Python3, I found that the best (if not very 
pleasant) way to do this was to put something like 
this near the top of each tool[*1]:

  import codecs
  import sys
  # Wrap stdout's binary buffer so that text written to it is encoded as opts.encoding.
  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)

What I want to be able to put there instead is:

  sys.stdout.set_encoding(opts.encoding)

The former I found on the internet -- there is zero probability
I could have figured that out from the Python docs.  It is obscure
to anyone who hasn't encountered it before or dealt much with the
codecs module (and who, like me, has generally only needed to deal
with .encode() and .decode()).  It is excessively complex 
for what is conceptually a simple and straightforward operation.  
It requires the import of the codecs module in programs that
otherwise don't need it [*2], and the reading of the codecs docs (not
a shining example of clarity themselves) to understand it.  In 
short it is butt ugly relative to what I generally get in Python.

Would it be feasible to provide something like .set_encoding() 
on textio streams?  (Or make .encoding a writable property?  It
seems to be intentionally read-only for some reason, but is that
reason really unavoidable?)  If doing this for textio in general is
too hard, then what about encapsulating the codecs stuff above in
a sys.set_encoding() function?  
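
For illustration, here is a minimal sketch of what such a helper might
look like, assuming it simply re-wraps the stream's binary buffer with
io.TextIOWrapper.  The name set_stream_encoding and its parameters are
hypothetical; this is not an existing Python API, just one way the
wrapping could be hidden behind a single call.

```python
import io


def set_stream_encoding(stream, encoding, errors="strict"):
    """Return a new text stream writing to stream's binary buffer in `encoding`.

    Hypothetical helper, not part of the standard library.  The caller
    must keep the old wrapper alive (sys.__stdout__ does this for stdout),
    because a garbage-collected TextIOWrapper closes its buffer.
    """
    # Push out anything already written in the old encoding first.
    stream.flush()
    return io.TextIOWrapper(
        stream.buffer,
        encoding=encoding,
        errors=errors,
        # Preserve line buffering if the old stream had it.
        line_buffering=getattr(stream, "line_buffering", False),
    )
```

A tool would then write sys.stdout = set_stream_encoding(sys.stdout,
opts.encoding) near the top, and every later print() would use the new
encoding.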

Changing the encoding of a sys.std* stream is not an 
uncommon need, and a user should not have to go through the 
codecs dance above to do so IMO.

----
[*1] There are other ways to change stdout's encoding but they
 all have problems AFAICT.  PYTHONIOENCODING can't easily be 
 changed dynamically within a program.  Reopening stdout as binary,
 or using the binary interface to text stdout, requires an explicit 
 encode call at each write site.  Overloading print() is obscure
 because it requires the reader to notice that print was overloaded.

[*2] I don't mean the actual import of the codecs module which
 occurs anyway; I mean the extra visual and cognitive noise 
 introduced by the presence of the import statement in the source.
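
To make the "explicit encode call at each write site" point in [*1]
concrete, here is a small sketch.  The BytesIO object is only a stand-in
for sys.stdout.buffer so the snippet has no side effects, and the
encoding variable stands in for opts.encoding.

```python
import io

encoding = "utf-8"   # stand-in for opts.encoding
out = io.BytesIO()   # stand-in for sys.stdout.buffer

# Every write site has to remember the encode step itself.
out.write("größe\n".encode(encoding))
out.write("日本語\n".encode(encoding))
```

Forgetting .encode() at any one of these sites raises a TypeError, which
is why this gets tedious in a tool with many output statements.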



