[Tutor] urllib2, read data with specific encoding

Kent Johnson kent37 at tds.net
Wed Sep 23 00:04:14 CEST 2009


On Tue, Sep 22, 2009 at 5:04 PM, Sander Sweers <sander.sweers at gmail.com> wrote:
> Hello Tutors, because a website was giving me issues with Unicode
> characters, I created a function to force the encoding. I am not sure it
> is the correct way to handle these things.
>
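> import codecs
>
> def reader(fobject, encoding='UTF-8'):
>     '''Read a file object with the specified encoding, defaults to UTF-8.'''
>     # codecs.getreader() returns a StreamReader class for the encoding;
>     # calling it wraps fobject so that reads return decoded text.
>     r = codecs.getreader(encoding)
>     data = r(fobject)
>     return data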
>
> I would call it like reader(urllib2.urlopen(someurl), 'someencoding').
> Is this the proper way of dealing with these types of issues, or is
> there a better practice?

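That seems OK if you want a file-like object that decodes the data as
you read from it. If you just want the whole response as a single
unicode string, it would be simpler to read the bytes and decode them
in one step:
urllib2.urlopen(someurl).read().decode('someencoding')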
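
To make the difference between the two approaches concrete, here is a
minimal sketch. It is written for Python 3 and, so that it runs without
a network connection, uses an in-memory io.BytesIO as a stand-in for the
object returned by urlopen(). The sample bytes, the Latin-1 encoding and
the fetch_text() helper are assumptions made for illustration; they are
not part of the original question.

```python
import codecs
import io

# Hypothetical stand-in for the raw bytes a server might send.
# The text is encoded as Latin-1, so decoding it as UTF-8 would fail.
RAW = "café\nnaïve\n".encode("latin-1")


def reader(fobject, encoding="UTF-8"):
    """Wrap a binary file-like object so that reads return decoded text."""
    return codecs.getreader(encoding)(fobject)


def fetch_text(fobject, encoding="UTF-8"):
    """Read all bytes at once and decode them into a single string."""
    return fobject.read().decode(encoding)


# Approach 1: a file-like object that decodes as it is read, line by line.
lines = [line.rstrip("\n") for line in reader(io.BytesIO(RAW), "latin-1")]

# Approach 2: read everything, then decode in one step.
text = fetch_text(io.BytesIO(RAW), "latin-1")

print(lines)
print(repr(text))
```

The wrapper is handy when the response should be processed line by
line or passed to code that expects a file; the one-step decode is
shorter when the whole body fits comfortably in memory.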

Kent

