[Python-Dev] Encoding detection in the standard library?

M.-A. Lemburg mal at egenix.com
Wed Apr 23 12:01:18 CEST 2008


On 2008-04-23 07:26, Terry Reedy wrote:
> "Martin v. Löwis" <martin at v.loewis.de> wrote in message 
> news:480EC376.8070406 at v.loewis.de...
> |> I certainly agree that if the target set of documents is small enough it
> |
> | Ok. What advantage would you (or somebody working on a similar project)
> | gain if chardet was part of the standard library? What if it was not
> | chardet, but some other algorithm?
> 
> It seems to me that since there is not a 'correct' algorithm but only 
> competing heuristics, encoding detection modules should be made available 
> via PyPI and only be considered for stdlib after a best of breed emerges 
> with community support. 

+1

Though in practice, determining the "best of breed" often becomes a
problem (see e.g. the JSON implementation discussion).
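
[For readers unfamiliar with the problem: encoding detection is inherently
heuristic, which is why there is no single "correct" algorithm to bless.
The sketch below is purely illustrative -- the function name and the
particular heuristics are invented for this example, and chardet (on PyPI)
uses a far more sophisticated statistical model:

import codecs

def guess_encoding(data: bytes) -> str:
    """Guess the encoding of raw bytes using a few cheap heuristics."""
    # 1. Byte-order marks are unambiguous when present.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 2. Strict UTF-8 decoding rarely succeeds by accident on non-UTF-8 text.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # 3. Fall back to Latin-1, which never fails but may well be wrong --
    #    this is exactly the gap real detectors fill with statistical models.
    return "latin-1"

Step 3 is where competing implementations diverge, hence the difficulty
of picking a best of breed.]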

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 23 2008)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


    eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
     D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
            Registered at Amtsgericht Duesseldorf: HRB 46611
