sys.stdout.write()'s bug or doc bug?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun Dec 28 09:31:11 EST 2008


On Sun, 28 Dec 2008 02:37:55 -0800, Qiangning Hong wrote:

>> > So, my question is, as sys.stdout IS a file object, why does it not
>> > use its encoding attribute to convert the given unicode?  An
>> > implementation bug? A documentation bug?
>>
>> hmm I always thought "sys.stdout" is a "file-like object" not that it
>> IS a file.
> 
> In my original post, I showed that sys.stdout IS a file, using the
> type() function.  The isinstance() function says the same:
> 
> Python 2.5.2 (r252:60911, Dec 18 2008, 12:39:19)
> [GCC 4.2.1 (Apple Inc. build 5564)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import sys
>>>> type(sys.stdout) is file
> True
>>>> isinstance(sys.stdout, file)
> True
> 
> So, sys.stdout SHOULD do what the doc says, otherwise there is a bug
> either in the implementation of sys.stdout, or in the documentation of
> file.

The documentation says:

file.encoding
The encoding that this file uses. When Unicode strings are written to a 
file, they will be converted to byte strings using this encoding. In 
addition, when the file is connected to a terminal, the attribute gives 
the encoding that the terminal is likely to use (that information might 
be incorrect if the user has misconfigured the terminal). The attribute 
is read-only and may not be present on all file-like objects. It may also 
be None, in which case the file uses the system default encoding for 
converting Unicode strings.
New in version 2.3.

http://docs.python.org/library/stdtypes.html#file.encoding


And I agree that sys.stdout is a file. Using Python 2.6:

>>> type(sys.stdout)
<type 'file'>


I can confirm the behaviour you report:

>>> sys.stdout.encoding
'UTF-8'
>>> u = u"\u554a"
>>> print u
啊
>>> sys.stdout.write(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u554a' in position 0: ordinal not in range(128)
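
The 'ascii' in that traceback suggests write() is encoding with the interpreter's default codec rather than with sys.stdout.encoding. As a minimal sketch of that assumption (not a look at the actual write() implementation), encoding the same string to ASCII by hand fails in the same way, at the same position:

```python
# Assumption: write() behaves as if it encoded the unicode string with
# the 'ascii' default codec. Doing that explicitly reproduces the error.
u = u"\u554a"

try:
    u.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    # Keep a reference so the exception can be inspected afterwards.
    error = exc

# error.encoding is the codec that failed; error.start is the offset of
# the first character that could not be encoded.
```

Here error.encoding and error.start correspond to the 'ascii' and "position 0" reported in the traceback.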

But if you explicitly convert the string, it works:

>>> sys.stdout.write(u.encode('utf-8'))
啊
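
To avoid calling encode() on every write, one workaround is to wrap the stream in a codec writer. The following is only a sketch: make_unicode_writer is a hypothetical helper name, and it assumes the target stream accepts byte strings (as sys.stdout does in Python 2). An in-memory io.BytesIO stands in for the real stream so the result can be inspected.

```python
import codecs
import io


def make_unicode_writer(byte_stream, encoding="utf-8"):
    """Wrap a byte-oriented stream so that write() accepts unicode.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # codecs.getwriter() returns the StreamWriter class for the codec.
    # Instantiating it with a stream gives a wrapper that encodes each
    # unicode string before passing the bytes on to the stream.
    return codecs.getwriter(encoding)(byte_stream)


# io.BytesIO stands in for a byte stream such as Python 2's sys.stdout.
buf = io.BytesIO()
writer = make_unicode_writer(buf)
writer.write(u"\u554a")
encoded = buf.getvalue()  # the UTF-8 bytes that reached the stream
```

In a Python 2 session the same helper could be applied to sys.stdout itself, passing sys.stdout.encoding as the encoding when that attribute is not None.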



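I agree that this appears to be a bug, either in the write() method or in 
the documentation: print honours sys.stdout.encoding, but write() 
apparently ignores it and falls back to the 'ascii' codec.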


-- 
Steven