How to get an encoding of a value?

Alex Martelli aleaxit at yahoo.com
Fri Oct 22 13:10:54 EDT 2004


Diez B. Roggisch <deets.nospaaam at web.de> wrote:

> A common approach to guessing the encoding of said string is to try
> something like this:
> 
> s = <some string with unknown encoding>
> encodings = ['ascii', 'latin1', 'utf-8', ....] # list of encodings you expect
> for e in encodings:
>     try:
>         if s == s.decode(e).encode(e):
>             break
>     except UnicodeError:
>         pass

Yeah, but it doesn't work.  The round-trip test would succeed, and so hit
the break, with iso-8859-x for any value of x; you can't tell this way
whether it was latin-1 or any of the others...
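
A minimal sketch of the problem, written for Python 3 (so `bytes` stands in
for the byte string `s` above). The helper name `round_trip_matches`, the
sample bytes, and the candidate list are illustrative assumptions, not part
of the quoted code. Instead of stopping at the first match, it collects every
encoding that passes the round-trip test:

```python
def round_trip_matches(data, encodings):
    """Return every encoding in which data survives decode -> encode unchanged."""
    matches = []
    for name in encodings:
        try:
            if data.decode(name).encode(name) == data:
                matches.append(name)
        except UnicodeError:
            # The bytes are not valid in this encoding; try the next one.
            pass
    return matches


# Illustrative input: 0xE9 is a defined character in each 8-bit charset below.
sample = b"caf\xe9"
candidates = ["ascii", "utf-8", "latin-1", "iso-8859-2", "iso-8859-15"]
matches = round_trip_matches(sample, candidates)
print(matches)
```

For this input, ascii and utf-8 are rejected but all three iso-8859 variants
pass, so the test only rules encodings out; it cannot say which one was used.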


Alex


