[Python-ideas] changing sys.stdout encoding

Rurpy rurpy at yahoo.com
Wed Jun 6 22:42:36 EDT 2012


On 06/06/2012 10:09 AM, MRAB wrote:
> On 06/06/2012 08:09, Rurpy wrote:
>> On 06/05/2012 05:56 PM, MRAB wrote:
>>>  On 06/06/2012 00:34, Victor Stinner wrote:
>>>>  2012/6/5 Rurpy<rurpy-/E1597aS9LQAvxtiuMwx3w at public.gmane.org>:
>>>>>   In my first foray into Python3 I've encountered this problem:
>>>>>   I work in a multi-language environment.  I've written a number
>>>>>   of tools, mostly command-line, that generate output on stdout.
>>>>>   Because these tools and their output are used by various people
>>>>>   in varying environments, the tools all have an --encoding option
>>>>>   to provide output that meets the needs and preferences of the
>>>>>   output's ultimate consumers.
[snip]
>>>>>   In converting them to Python3, I found the best (if not very
>>>>>   pleasant) way to do this in Python3 was to put something like
>>>>>   this near the top of each tool[*1]:
>>>>>
>>>>>     import codecs
>>>>>     sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
>>>>
>>>  In Python 3, you should use io.TextIOWrapper instead of
>>>  codecs.StreamWriter. It's more efficient and has fewer bugs.
>>>
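[Editorial sketch of the io.TextIOWrapper advice above. The helper name
rewrap is hypothetical and is not from the thread. It assumes the stream
is an io.TextIOWrapper and that detaching the old text layer is
acceptable; the old wrapper must not be used afterwards.]

```python
import io


def rewrap(text_stream, encoding):
    """Return a new TextIOWrapper over text_stream's binary buffer.

    Hypothetical helper: the old wrapper is flushed and detached, so it
    becomes unusable once this function returns.
    """
    text_stream.flush()
    # Read the settings first; they are not accessible after detach().
    errors = text_stream.errors
    line_buffering = text_stream.line_buffering
    raw = text_stream.detach()
    return io.TextIOWrapper(raw, encoding=encoding, errors=errors,
                            line_buffering=line_buffering)
```

[In a tool this might be called as sys.stdout = rewrap(sys.stdout,
opts.encoding). Note that sys.__stdout__ normally refers to the same
object as the original sys.stdout, so it would be detached as well.]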
>>>>>   What I want to be able to put there instead is:
>>>>>
>>>>>     sys.stdout.set_encoding (opts.encoding)
[snip]
>>>  And if you _do_ want multiple encodings in a file, it's clearer to open
>>>  the file as binary and then explicitly encode to bytes and write _that_
>>>  to the file.
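[Editorial sketch of the "open as binary and encode explicitly" approach
described above. The helper write_encoded is hypothetical, and an
io.BytesIO object stands in for sys.stdout.buffer or a file opened in
'wb' mode. The codec names are the canonical Python spellings of the
encodings used later in the thread.]

```python
import io


def write_encoded(binary_stream, text, encoding):
    """Encode text explicitly and write the resulting bytes."""
    binary_stream.write(text.encode(encoding))


text = 'This is %s text: 世界へ、こんにちは!\n'
buf = io.BytesIO()  # stand-in for sys.stdout.buffer or open(path, 'wb')

# Each line goes out in a different encoding; the stream only sees bytes.
for enc in ('shift_jis', 'euc_jp', 'iso2022_jp'):
    write_encoded(buf, text % enc, enc)
```

[The trailing newline has to be supplied by hand here because print()
is not involved.]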
>>
>> But is it really?
>>
>> The following is very simple and the level of python
>> expertise required is minimal.  It (would) work fine
>> with redirection.  One could substitute any other ordinary
>> open (for write) text file for sys.stdout.
>>
>>    [off the top of my head]
>>    text = 'This is %s text: 世界へ、こんにちは!'
>>    sys.stdout.set_encoding ('sjis')
>>    print (text % 'sjis')
>>    sys.stdout.set_encoding ('euc-jp')
>>    print (text % 'euc-jp')
>>    sys.stdout.set_encoding ('iso2022-jp')
>>    print (text % 'iso2022-jp')
>>
>> As for your suggestion, how do I reopen sys.stdout in
>> binary mode?  I don't need to do that often and don't
>> know off the top of my head.  (And it's too late for
>> me to look it up.)  And what happens to redirected output
>> when I close and reopen the stream?  I can open a regular
>> filename instead.  But remember to make the last two
>> opens with "a" rather than "w".  And don't forget the
>> "\n" at the end of the text line.
>>
>> Could you show me a code example of your suggestion
>> for comparison?
>>
>> Disclaimer: As I said before, I am not particularly
>> advocating for a set_encoding() method -- my
>> primary suggestion is a programmatic way to change the
>> sys.std* encodings prior to first use.  Here I am just
>> questioning the claim that a set_encoding() method
>> would not be clearer than existing alternatives.
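[Editorial note: no such call existed when this thread was written, but
Python 3.7 and later provide io.TextIOWrapper.reconfigure(), which
changes the encoding of an existing text stream in place. The sketch
below assumes Python 3.7+; the helper set_stream_encoding is
hypothetical, and an in-memory stream stands in for sys.stdout.]

```python
import io


def set_stream_encoding(text_stream, encoding):
    """Switch a TextIOWrapper's encoding in place (needs Python 3.7+)."""
    text_stream.reconfigure(encoding=encoding)


raw = io.BytesIO()                             # stand-in for the real output
out = io.TextIOWrapper(raw, encoding='utf-8')  # stand-in for sys.stdout

set_stream_encoding(out, 'shift_jis')
out.write('世界')
out.flush()
```

[On a real interpreter the equivalent call would be
sys.stdout.reconfigure(encoding=opts.encoding), made before or after
output has been written.]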
>>
> This example accesses the underlying binary output stream:
> 
> 
> # -*- coding: utf-8 -*-
> 
> import sys
> 
> class Writer:
>      def __init__(self, output):
>          self.output = output
>          self.encoding = output.encoding
>      def write(self, string):
>          self.output.buffer.write(string.encode(self.encoding))
>      def set_encoding(self, encoding):
>          self.output.buffer.flush()
>          self.encoding = encoding
> 
> sys.stdout = Writer(sys.stdout)
> 
> initial_encoding = sys.stdout.encoding
> 
> text = 'This is %s text: 世界へ、こんにちは!'
> sys.stdout.set_encoding('utf-8')
> print (text % 'utf-8')
> sys.stdout.set_encoding('sjis')
> print (text % 'sjis')
> sys.stdout.set_encoding('euc-jp')
> print (text % 'euc-jp')
> sys.stdout.set_encoding('iso2022-jp')
> print (text % 'iso2022-jp')
> 
> sys.stdout.set_encoding(initial_encoding)

OK, let's see if I've got this right...

You take a duplicate of my code, add a class with three
methods and some other statements and you claim the result 
is clearer and simpler than my code?

That is, union (A, B) is simpler than A?

Interesting definition of simpler you've got there :-)



