python3 urlopen(...).read() returns bytes

Glenn G. Chappell glenn.chappell at gmail.com
Mon Dec 22 21:22:47 EST 2008


Okay, so I guess I didn't really *get* the whole Unicode/text/binary
thing. Maybe I still don't, but I think I'm getting closer. Thanks to
everyone who replied.

On Dec 22, 1:41 pm, ajaksu <aja... at gmail.com> wrote:
> On Dec 22, 8:25 pm, Christian Heimes <li... at cheimes.de> wrote:
> That said, a "decode to declared HTTP header encoding" version of
> urlopen could be useful to give some users the output they want (text
> from network io) or to make it clear why bytes is the safe way.

Sounds like a great idea. More to the point, it sounds like it's
pretty much a necessary idea.

Consider: reading a web page is an easy one-liner. Now, no one is
going to write that one-liner, and then spend 20 lines trying to get
the Content-Type and encoding figured out. Instead we're all going to
do it the short, easy, *wrong* way. So every program in the world that
uses urlopen gets to have the same bug. Not good. The *right* way
needs to be the *easy* way.
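
For what it's worth, here is a rough sketch of what such an "easy right
way" might look like. The names read_text and decode_body are made up
for illustration and are not part of urllib, and falling back to UTF-8
when the server declares no charset is my assumption, not something
the library does. A charset declared only in an HTML <meta> tag is not
handled at all.

```python
from email.message import Message
from urllib.request import urlopen


def decode_body(body, content_type, default="utf-8"):
    """Decode raw bytes using the charset named in a Content-Type value.

    Hypothetical helper. `default` is an assumed fallback for when the
    header carries no charset parameter.
    """
    # Let the stdlib parse the header parameters rather than splitting
    # the string by hand.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or default
    # errors="replace" avoids an exception on bytes that do not match
    # the declared charset; an unknown charset name still raises
    # LookupError.
    return body.decode(charset, errors="replace")


def read_text(url, default="utf-8"):
    """Fetch a URL and return its body as str (hypothetical helper)."""
    with urlopen(url) as response:
        # response.headers is an email.message.Message subclass, so the
        # same charset lookup is available on it directly.
        charset = response.headers.get_content_charset() or default
        return response.read().decode(charset, errors="replace")
```

With something like this, the one-liner becomes read_text(url) and
still honours the declared encoding. decode_body can be tried without
a network connection, e.g. on b"caf\xe9" with the header value
"text/html; charset=ISO-8859-1".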

-GGC-


