[Python-checkins] r80092 - python/branches/py3k/Doc/library/urllib.request.rst
Senthil Kumaran
orsenthil at gmail.com
Mon Apr 19 10:12:52 CEST 2010
On Sat, Apr 17, 2010 at 12:05:00PM -0400, R. David Murray wrote:
>
> Senthil, I think that we are in general considering Python 3 a "clean
> start", and avoiding mentioning how things were done in Python 2 except
> where it is important for compatibility (eg: pickle). I think the
> mention of how Python 2 did it actually muddies the explanation of how
> one should do it. I would either drop the mention of Python 2, or
> move it to a footnote (I favor just dropping it).
>
> How about this:
>
> Note that urlopen returns a bytes object. This is because there is no way
> for urlopen to automatically determine the encoding of the byte stream
> it receives from the HTTP server. In general, a program will decode
> the returned bytes object to a string once it determines or guesses
> the appropriate encoding.
Yes, I get your point, David. I wrote my version with the specific bug
in mind, where the request was to be explicit and helpful to
newcomers. Perhaps the urllib2 how-to tutorial can provide the specific
details, and this note can be written along the lines that you
have suggested.
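
To make the proposed note concrete, here is a minimal sketch of decoding
the bytes that urlopen returns. The helper names decode_response and
fetch_text, and the UTF-8 fallback, are my own assumptions for
illustration; they are not part of urllib.request. The only library
behaviour relied on is that the response headers expose
get_content_charset(), which returns None when the server declares no
charset.

```python
from urllib.request import urlopen


def decode_response(raw, declared_charset=None, fallback="utf-8"):
    """Decode raw bytes with the declared charset, else the fallback.

    The fallback is a guess (an assumption of this sketch), not
    something urlopen can determine on its own.
    """
    encoding = declared_charset or fallback
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name, or bytes that do not match the declared
        # charset: decode with the fallback and mark undecodable bytes.
        return raw.decode(fallback, errors="replace")


def fetch_text(url):
    """Fetch a URL and return its body as str (hypothetical helper)."""
    with urlopen(url) as response:
        raw = response.read()  # always a bytes object
        # Charset from the Content-Type header, or None if absent.
        charset = response.headers.get_content_charset()
    return decode_response(raw, charset)
```

A caller would use fetch_text(url) and get a str back. When the server
declares no charset, the result is only as good as the fallback guess,
which is the limitation the note describes.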
>
> Aside: I was curious how one went about determining the encoding, and
> found this fascinating document that seems to show just how non-trivial
> doing so is:
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html
>
> And I thought email was a pain to parse. Little did I know.
This is interesting, as it shows the strategies other clients are
adopting for guessing the correct encoding.
--
Senthil