[ python-Feature Requests-1599329 ] urllib(2) should allow automatic decoding by charset
SourceForge.net
noreply at sourceforge.net
Wed Nov 22 07:57:13 CET 2006
Feature Requests item #1599329, was opened at 2006-11-19 20:47
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1599329&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Erik Demaine (edemaine)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib(2) should allow automatic decoding by charset
Initial Comment:
Currently, urllib.urlopen(...).read() returns a string, not a unicode object. Ditto for urllib2. No attempt is made to decode the data using the charset encoding specified in the Content-Type header (...info()['Content-Type']).
Is it fair to assume that, in Python 3K, urllib....read() will return (Unicode) strings instead of bytes, automatically decoding according to the charset?
Do you think we could expose this futuristic functionality in Python 2? I doubt we could change read() without breaking a lot of existing code that already does this decoding (e.g., http://zesty.ca/python/scrape.py), but perhaps a 'uread()' method could return a unicode object instead of a string.
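For illustration, the decoding step that a 'uread()'-style method would perform can be sketched roughly as follows. This is only a sketch and not anything urllib provides: it is written for Python 3 so that it runs today, the names charset_from_content_type and decode_body are hypothetical, and the ISO-8859-1 fallback for a missing charset is an assumption rather than something this request specifies.

```python
from email.message import Message


def charset_from_content_type(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, or `default`.

    The fallback charset is an assumption made for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset, or None if absent.
    return msg.get_content_charset() or default


def decode_body(raw, content_type, default="iso-8859-1"):
    """Decode the raw bytes of a response body using the header's charset."""
    return raw.decode(charset_from_content_type(content_type, default))


if __name__ == "__main__":
    body = "caf\u00e9".encode("utf-8")
    print(decode_body(body, "text/html; charset=UTF-8"))
```

A charset name that Python does not recognise raises LookupError, and bytes that do not match the declared charset raise UnicodeDecodeError, so a real implementation would have to decide how to report both cases.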
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2006-11-22 07:57
Message:
Logged In: YES
user_id=21627
Originator: NO
I don't think urlopen(...).read() should return strings in Py3k, but
instead it should return bytes - in general, resources retrieved are byte
sequences (many are application/octet-stream).
Making the return type depend on the resource being fetched is also
unintuitive.
It might be reasonable to have the user specify "binary" or "text" on
urlopen() (just like regular open()).
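A rough sketch of what such a mode switch might look like, written as a wrapper around Python 3's urllib.request.urlopen rather than as a change to urlopen() itself. The names read_body and open_url, the mode values, and the ISO-8859-1 fallback are all assumptions made for illustration; no such parameter exists on urlopen().

```python
from urllib.request import urlopen


def read_body(response, mode="binary", default_charset="iso-8859-1"):
    """Read a response as bytes ("binary") or as decoded text ("text").

    `response` needs a read() method and a `headers` attribute that is an
    email.message.Message, as the objects returned by urlopen() have.
    """
    if mode not in ("binary", "text"):
        raise ValueError("mode must be 'binary' or 'text', not %r" % (mode,))
    raw = response.read()
    if mode == "binary":
        return raw
    # Text mode: the caller asked for decoding, so the return type depends
    # on the requested mode and never on the resource being fetched.
    charset = response.headers.get_content_charset() or default_charset
    return raw.decode(charset)


def open_url(url, mode="binary"):
    """Hypothetical urlopen() wrapper that takes an explicit mode."""
    with urlopen(url) as response:
        return read_body(response, mode)
```

With this shape, open_url(url) always returns bytes and open_url(url, "text") always returns a string, which keeps the return type predictable in the way the comment above asks for.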
----------------------------------------------------------------------