[Tutor] urllib2, read data with specific encoding
Kent Johnson
kent37 at tds.net
Wed Sep 23 00:04:14 CEST 2009
On Tue, Sep 22, 2009 at 5:04 PM, Sander Sweers <sander.sweers at gmail.com> wrote:
> Hello Tutors. Because a website was giving me issues with Unicode
> characters, I created a function to force the encoding. I am not sure it
> is the correct way to handle these things.
>
> import codecs
>
> def reader(fobject, encoding='UTF-8'):
>     '''Read a file object with the specified encoding (default UTF-8).'''
>     r = codecs.getreader(encoding)
>     data = r(fobject)
>     return data
>
> I would call it like reader(urllib2.urlopen(someurl), 'someencoding').
> Now I am looking for advice on whether this is the proper way of dealing
> with this type of issue. Is there a better practice?
That seems OK if you want a file-like object that decodes as you read
from it. If you just want the whole response as a single decoded string,
it would be simpler to use
urllib2.urlopen(someurl).read().decode('someencoding')
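
As a rough sketch of how the two approaches compare, the snippet below runs
both on the same bytes. It uses io.BytesIO as a stand-in for the object
returned by urllib2.urlopen(someurl), so it runs without a network
connection. The sample text and the latin-1 encoding are assumptions chosen
only for illustration.

```python
import codecs
import io


def reader(fobject, encoding='UTF-8'):
    '''Wrap a byte-oriented file object so that reads return decoded text.'''
    return codecs.getreader(encoding)(fobject)


# Hypothetical payload: stands in for the bytes a web server might send.
raw = u'caf\xe9 na\xefve'.encode('latin-1')

# Approach 1: file-like wrapper, decoding happens as data is read.
stream_text = reader(io.BytesIO(raw), 'latin-1').read()

# Approach 2: read all the bytes first, then decode them in one step.
string_text = io.BytesIO(raw).read().decode('latin-1')

# Both approaches should produce the same decoded text.
print(stream_text == string_text)
```

The wrapper is mainly useful when you want to iterate over the response or
hand it to code that expects a file-like object; for a one-off read, the
one-line decode is probably enough.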
Kent