Where to locate existing standard encodings in python

Tim Chase python.list at tim.thechases.com
Tue Nov 11 19:06:03 EST 2008


>>   Content-Type: text/html; charset=utf-8lias
>>
>> For Python to parse this, I had to use Python's list of known encodings 
>> in order to determine whether I could even parse the site (for passing 
>> it to a string's .encode() method). 
> 
> You haven't said why you think you need a list of known encodings!
> 
> I would have thought that just trying it on some dummy data will let you 
> determine very quickly whether the alleged encoding is supported by the 
> Python version etc that you are using.
> 
> E.g.
> 
> | >>> alleged_encoding = "utf-8lias"
> | >>> "any old ascii".decode(alleged_encoding)
> | Traceback (most recent call last):
> |  File "<stdin>", line 1, in <module>
> | LookupError: unknown encoding: utf-8lias

When that lookup fails, I remap the bogus encoding to the known 
one it most resembles (in this case, utf-8) and retry.  Having a 
list of encodings lets me either eyeball the closest match or 
define a heuristic that says "this is the closest match...try 
this one instead".  I can then record the result in a mapping 
file so I don't have to think about it the next time I encounter 
the same bogus encoding.
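
A minimal sketch of that kind of remapping, written for Python 3 and 
not taken from the original code.  It assumes the standard library's 
encodings.aliases table is an acceptable stand-in for "the list of 
known encodings" and that difflib's similarity ratio is a good enough 
heuristic.  The names known_encodings, closest_encoding, and the 
overrides dict (standing in for the mapping file) are hypothetical.

```python
import codecs
import difflib
from encodings.aliases import aliases


def known_encodings():
    """Return alias and codec names from the stdlib alias table.

    Assumption: this table is close enough to "all known encodings".
    It also contains a few non-text codecs (e.g. zlib_codec).
    """
    return sorted(set(aliases.keys()) | set(aliases.values()))


def closest_encoding(alleged, overrides=None, cutoff=0.6):
    """Map a possibly bogus charset name to a codec Python knows.

    overrides plays the role of the hand-maintained mapping file:
    bogus name -> encoding chosen earlier.  Returns None when
    nothing is similar enough to guess.
    """
    overrides = overrides or {}
    if alleged in overrides:
        return overrides[alleged]
    try:
        # The name is fine as given; report its canonical form.
        return codecs.lookup(alleged).name
    except LookupError:
        pass
    # The alias table spells names in lowercase with underscores.
    normalized = alleged.lower().replace("-", "_")
    matches = difflib.get_close_matches(
        normalized, known_encodings(), n=1, cutoff=cutoff)
    if not matches:
        return None
    return codecs.lookup(matches[0]).name


if __name__ == "__main__":
    guess = closest_encoding("utf-8lias")
    print("utf-8lias ->", guess)
```

A guess that turns out to be right can be stored in overrides (and 
written back to the mapping file) so the heuristic is skipped the 
next time the same bogus name shows up.  The cutoff of 0.6 is 
difflib's default and is only a starting point.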

-tkc





