An attempt at guessing the encoding of a (non-unicode) string
Christos TZOTZIOY Georgiou
tzot at sil-tec.gr
Mon Apr 5 05:46:58 EDT 2004
On Sat, 3 Apr 2004 12:22:05 -0800, rumours say that "Roger Binns"
<rogerb at rogerbinns.com> might have written:
>Christos TZOTZIOY Georgiou wrote:
>> This could be implemented as a function in codecs.py (let's call it
>> "wild_guess"), that is based on some pre-calculated data.
>Windows already has a related function:
>
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81np.asp
As far as I understand, this function tests whether its argument is
valid Unicode text, so it has little to do with the issue I brought up:
take a Python string (8-bit bytes) and try to guess its encoding (e.g.,
iso8859-1, iso8859-7, etc.).
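To make the idea concrete, here is a minimal sketch of what such a
"wild_guess" could look like. It is only an illustration under stated
assumptions: the function is hypothetical (nothing like it exists in
codecs.py), the FREQUENT_CHARS table stands in for real pre-calculated
data and was written by hand rather than derived from a corpus, and the
scoring rule (the share of non-ASCII characters that are "expected" for
an encoding) is just one simple choice among many. It is written for a
bytes object.

```python
# Hypothetical pre-calculated data: for each candidate encoding, a set of
# non-ASCII characters assumed to be frequent in text written with it.
# A real table would be derived from a corpus; these sets are illustrative.
FREQUENT_CHARS = {
    "iso8859-1": set("éèêàâçùûôîïüöäßñáíóú"),
    "iso8859-7": set("αβγδεζηθικλμνξοπρστυφχψωάέήίόύώς"),
}


def wild_guess(data, candidates=FREQUENT_CHARS):
    """Return the candidate encoding that best matches data, or None.

    The score of an encoding is the fraction of decoded non-ASCII
    characters that appear in that encoding's frequent-character set.
    """
    best, best_score = None, 0.0
    for encoding, frequent in candidates.items():
        try:
            text = data.decode(encoding)
        except UnicodeDecodeError:
            # Bytes that are undefined in this encoding rule it out.
            continue
        high = [ch for ch in text if ord(ch) > 127]
        if not high:
            # Pure ASCII gives no evidence for any 8-bit encoding.
            continue
        score = sum(ch in frequent for ch in high) / len(high)
        if score > best_score:
            best, best_score = encoding, score
    return best
```

For example, wild_guess("Καλημέρα κόσμε".encode("iso8859-7")) picks
"iso8859-7" with this table, while a pure-ASCII input returns None. With
such a tiny table the guess is easily fooled by short inputs; the point
is only the shape of the approach.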
There must be a similar function behind the "auto guess encoding"
feature of MS Internet Explorer; however:
1. even if it is exported and usable under Windows, it is not platform
independent
2. its guessing success rate (at least up to IE 5.5, which I happen to
use) is not very high
<snip>
Thanks for your reply, anyway.
--
TZOTZIOY, I speak England very best,
Ils sont fous ces Redmontains! --Harddix