[Baypiggies] How to verify if a string is properly encoded in utf-8 ?

Tung Wai Yip tungwaiyip at yahoo.com
Fri Feb 10 05:01:54 CET 2006


try:
     unicodetext = bytestring.decode('utf-8')
     # decoding succeeded, so bytestring is very likely UTF-8 encoded
except UnicodeDecodeError:
     # bytestring is not valid UTF-8
     unicodetext = None

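UTF-8 is designed with redundancy: every multi-byte sequence must follow a
strict lead-byte/continuation-byte pattern, so non-ASCII text in another
encoding rarely decodes cleanly. If you can decode it, it is very likely
that the text stream is UTF-8 encoded. Note that pure ASCII always decodes
as UTF-8, so the check says nothing for input with no non-ASCII bytes.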
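
Here is a minimal sketch of the same check wrapped in a helper. It assumes
Python 3, where the submitted form data arrives as a bytes object; the name
is_valid_utf8 is only illustrative and not part of any library.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        # At least one byte sequence violates the UTF-8 encoding rules.
        return False
    return True


# The same text encoded two ways: only the UTF-8 bytes pass the check.
utf8_bytes = 'caf\u00e9'.encode('utf-8')      # b'caf\xc3\xa9'
latin1_bytes = 'caf\u00e9'.encode('latin-1')  # b'caf\xe9'

print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(b'plain ascii'))  # ASCII is a subset of UTF-8
```

A successful decode only shows that the bytes are well-formed UTF-8. It
cannot prove that the sender intended UTF-8, which is why the result is
"very likely" rather than certain.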

Wai Yip

>
> Greetings.
>
> A quick question. How can I verify if a given
> string is properly encoded in utf-8 ?
>
> The use case is this - I have a web form that
> I send with the charset set to utf-8. But it
> is possible that the user might change the
> encoding and hence when the form gets submitted,
> a different character might get back to me.
>
> Thanks,
> Krishna.



