[issue5419] urllib.request.open(someURL).read() returns a bytes object so writing it requires binary mode

Thu Apr 15 14:17:47 CEST 2010

Daniel Haertle <haertle at uni-bonn.de> added the comment:

I was bitten by the same behaviour. In addition, the examples in the docs are currently wrong: at http://docs.python.org/dev/py3k/library/urllib.request.html#examples the output of f.read() is shown as a string instead of bytes. There I propose changing the example from

>>> import urllib.request
>>> f = urllib.request.urlopen('http://www.python.org/')
>>> print(f.read(100))
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<?xml-stylesheet href="./css/ht2html

to

>>> import urllib.request
>>> f = urllib.request.urlopen('http://www.python.org/')
>>> print(f.read(100).decode('utf-8'))
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtm

The other examples need to be corrected in a similar way.
Even more importantly, the "HOWTO Fetch Internet Resources Using The urllib Package" needs to be corrected too.
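
As a rough illustration of what the corrected examples and the HOWTO would need to show, here is a minimal sketch that runs without network access. The io.BytesIO object is only a stand-in for the object returned by urllib.request.urlopen(); the sample content, the file name page.html and the assumption that the body is UTF-8 are made up for the example.

```python
import io
import os
import tempfile

# Stand-in for urllib.request.urlopen(url): any binary file-like object.
# The content below is invented sample data, assumed to be UTF-8.
response = io.BytesIO("<!DOCTYPE html>\n<title>caf\u00e9</title>\n".encode("utf-8"))

raw = response.read()        # bytes, not str
text = raw.decode("utf-8")   # str, suitable for print()
print(text.splitlines()[0])

# Hypothetical output file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "page.html")

# Writing the bytes to a file opened in text mode fails ...
text_mode_error = None
try:
    with open(path, "w") as out:
        out.write(raw)
except TypeError as exc:
    text_mode_error = exc

# ... while binary mode stores them unchanged.
with open(path, "wb") as out:
    out.write(raw)

with open(path, "rb") as saved:
    round_trip = saved.read()
```

Note that decoding only part of a body, as in read(100).decode('utf-8'), can fail if the cut falls in the middle of a multi-byte character, so the doc examples may want to mention that as well.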

In the documentation of urllib.request.urlopen I propose to add a sentence (after the paragraph "This function returns a file-like object...") explaining that reading the object returns bytes that need to be decoded to a string:
"Note that the method read() returns bytes that need to be decoded to a string using decode()."

----------
nosy: +Danh
versions: +Python 3.2, Python 3.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5419>
_______________________________________