[Python-Dev] Encoding detection in the standard library?

David Wolever wolever at cs.toronto.edu
Mon Apr 21 19:14:07 CEST 2008


On 21-Apr-08, at 12:44 PM, skip at pobox.com wrote:
>
>     David> Is there some sort of text encoding detection module in the
>     David> standard library?  And, if not, is there any reason not to
>     David> add one?
>
> No, there's not.  I suspect the fact that you can't correctly determine
> the encoding of a chunk of text 100% of the time militates against it.

Sorry, I wasn't very clear about what I was asking.

I was thinking of a module that makes an educated guess -- just like
chardet (http://chardet.feedparser.org/).

This is useful when you get a hunk of data which _should_ be some  
sort of intelligible text from the Big Scary Internet (say, a posted  
web form or email message), and you want to do something useful with  
it (say, search the content).
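
To make "educated guess" concrete, here is a rough sketch of the idea
using only the standard library. It is much cruder than chardet and is
not how chardet works: it only checks for a byte order mark and then
tries a short list of candidate encodings in order. The name
guess_encoding and the fallback list are my own assumptions for
illustration, and the code is written for Python 3.

```python
import codecs

# Byte order marks and the codec that understands each one.
# UTF-32 must come before UTF-16 because the UTF-32 LE BOM starts
# with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, fallbacks=("utf-8", "cp1252", "latin-1")):
    """Hypothetical helper: return (encoding_name, decoded_text).

    This is a guess, not a guarantee. The first candidate that decodes
    without error wins, which may still be the wrong encoding.
    """
    # Encodings suggested by a BOM are tried first, then the fallbacks.
    candidates = [name for bom, name in _BOMS if data.startswith(bom)]
    candidates.extend(fallbacks)
    for name in candidates:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


if __name__ == "__main__":
    # Bytes as they might arrive from a posted web form.
    name, text = guess_encoding(b"caf\xc3\xa9")
    print(name, text)
```

Note that latin-1 accepts any byte sequence, so with the default
fallbacks the function always returns something. That is exactly why
the result is only a guess: a successful decode does not mean the text
is intelligible.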

