[issue4631] urlopen returns extra, spurious bytes

Daniel Diniz report at bugs.python.org
Sun Dec 14 11:08:12 CET 2008


Daniel Diniz <ajaksu at gmail.com> added the comment:

Jeremy: no, it doesn't.

Python 2.6.1+ (release26-maint:67716M, Dec 13 2008, 10:30:52)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2

~/release26-maint$ ./python -c "import urllib; print urllib.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').readlines()[0]"
From mechanix at lucretia.debian.net Tue Dec 11 11:32:47 2001

~/release26-maint$ ./python -c "from __future__ import unicode_literals; import urllib; print urllib.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').readlines()[0]"
From mechanix at lucretia.debian.net Tue Dec 11 11:32:47 2001


FWIW, there are trailing spurious bytes too (note that in 3.0 read() gives
bytes, while readlines() gives both bytes and strings):

Python 3.1a0 (py3k:67702, Dec 11 2008, 11:09:14)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request

>>> content = urllib.request.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').read()

>>> content[-30:]
b'PGP SIGNATURE-----\n\n\n\n\n\r\n0\r\n\r\n'

>>> content[:10]
b'f65\r\nFrom '

While in 2.6:
>>> import urllib
>>> content = urllib.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').read()
>>> content[-30:]
'---END PGP SIGNATURE-----\n\n\n\n\n'
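
The extra bytes in the 3.1a0 output look like HTTP/1.1 chunked
transfer-coding framing that was not stripped: 'f65\r\n' reads as a hex
chunk-size line (0xf65 = 3941 bytes) and '\r\n0\r\n\r\n' as the end of the
last data chunk followed by the zero-length terminating chunk. A minimal
sketch of what removing that framing involves, assuming that reading is
right. decode_chunked and the sample bytes are hypothetical, for
illustration only; they are not part of urllib and not the fix for this
issue:

```python
def decode_chunked(raw: bytes) -> bytes:
    """Strip chunked transfer-coding framing from raw and return the payload.

    Hypothetical helper: assumes raw holds a complete, well-formed chunked
    body and ignores any trailer headers after the last chunk.
    """
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a hex size line terminated by CRLF.
        eol = raw.index(b"\r\n", pos)
        size_line = raw[pos:eol]
        # Drop optional chunk extensions after ';' before parsing the size.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        pos = eol + 2
        if size == 0:
            break  # zero-length chunk marks the end of the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(body)


# Made-up sample with the same shape as the response above:
# a size line, the data, CRLF, then the terminating "0" chunk.
sample = b"5\r\nFrom \r\n6\r\nhello\n\r\n0\r\n\r\n"
payload = decode_chunked(sample)
print(payload)
```

With the framing removed, the payload starts with 'From ' and carries no
trailing '\r\n0\r\n\r\n', which matches what 2.6 returns.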

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4631>
_______________________________________

