[Python-Dev] Encoding detection in the standard library?
Guido van Rossum
guido at python.org
Mon Apr 21 19:38:03 CEST 2008
To the contrary, an encoding-guessing module is often needed, and
guessing can be done with a pretty high success rate. Other Unicode
libraries (e.g. ICU) contain guessing modules. I suppose the API could
return two values: the guessed encoding and a confidence indicator.
Note that the locale settings might figure in the guess.
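
To make the two-value idea concrete, here is a minimal sketch of what
such an API might look like. The function name guess_encoding, the
locale_encoding parameter, and the confidence numbers are all
hypothetical, made up for illustration; nothing like this exists in the
standard library, and a real guesser (such as the one in ICU) would use
statistical models rather than these few checks.

```python
import codecs
import locale

# Longer BOMs come first so a UTF-32 LE BOM is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, locale_encoding=None):
    """Return (encoding, confidence) for a bytes object.

    The confidence is a float in [0.0, 1.0]; the specific values are
    arbitrary placeholders, not calibrated probabilities.
    """
    # An explicit BOM is the only case that is not really a guess.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, 1.0

    # Non-ASCII bytes that happen to form valid UTF-8 are a strong hint.
    try:
        data.decode("utf-8")
        return "utf-8", 0.9
    except UnicodeDecodeError:
        pass

    # Let the locale settings figure in the guess.
    candidate = locale_encoding or locale.getpreferredencoding(False)
    try:
        data.decode(candidate)
        return candidate, 0.5
    except (UnicodeDecodeError, LookupError):
        pass

    # latin-1 decodes any byte string, so it is only a last resort.
    return "latin-1", 0.1
```

A caller could then decide for itself how much confidence is enough,
for example by decoding only when the second value is above some
threshold and asking the user otherwise.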
On Mon, Apr 21, 2008 at 10:28 AM, Georg Brandl <g.brandl at gmx.net> wrote:
> Christian Heimes schrieb:
>
> > David Wolever schrieb:
> >> Is there some sort of text encoding detection module in the standard
> >> library?
> >> And, if not, is there any reason not to add one?
> >
> > You cannot detect the encoding unless it's explicitly defined through a
> > header (e.g. the UTF BOM). It's technically impossible. The best you can
> > do is an educated guess.
>
> Exactly, and in light of that, I'm -1 for such a standard module.
> We've enough issues with modules implementing (apparently) fully
> specified standards. :)
>
> Georg
--
--Guido van Rossum (home page: http://www.python.org/~guido/)