Translation table to map Latin-1 to ASCII?
Rene Pijlman
reageer.in at de.nieuwsgroep
Sun Jan 26 18:48:49 EST 2003
John Machin:
>Rene Pijlman:
>> accentstable = string.join(map(chr, range(192)), "") +
>> "AAAAAAACEEEEIIIIDNOOOOOxOUUUUYpBaaaaaaaceeeeiiiidnooooo/ouuuuypy"
>>
>
>You have mapped 0xF0 (small eth) to o, instead of d. Is this
>deliberate?
No, just plain old ignorance.
>You have mapped 0xDE (capital thorn) to P and 0xFE (small thorn) to p.
>On a sound-alike basis rather than a look-alike basis, it may be more
>appropriate to map these to T & t (or TH and th).
>
>Likewise 0xDF (small sharp s) may be better mapped to s (or ss) than
>to B.
Wow! You people sure are paying attention :-)
>You may wish to contemplate the notion that a single mapping of one
>byte to one byte may not be the best way to go, but of course this
>depends on what you are trying to achieve.
This is just for search log analysis. It's not really a problem
when some Icelandic searches are 1% skewed statistically :-)
But here is the new and improved Latin2AsciiLossyMapping (t)
version 1.0 rc1:
import string
accentstable = (string.join(map(chr, range(192)), "") +
    "AAAAAAACEEEEIIIIDNOOOOOxOUUUUYTsaaaaaaaceeeeiiiidnooooo/ouuuuyty")
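
A minimal sketch of how the same table might be applied, assuming Python 3, where string.join no longer exists and str.translate takes a mapping keyed by code point. The names ACCENTS, ONE_TO_ONE, ONE_TO_MANY, to_ascii and to_ascii_multi are hypothetical and chosen only for illustration. The multi-character variant follows John's TH/th/ss suggestion and is one possible reading of it, not a tested replacement for the table above.

```python
# Look-alike letters for code points 0xC0-0xFF, copied from the table above.
ACCENTS = (
    "AAAAAAACEEEEIIIIDNOOOOOxOUUUUYTs"   # 0xC0-0xDF
    "aaaaaaaceeeeiiiidnooooo/ouuuuyty"   # 0xE0-0xFF
)
assert len(ACCENTS) == 64

# Code points below 0xC0 are left alone, so only the upper 64 need entries.
ONE_TO_ONE = {0xC0 + i: ch for i, ch in enumerate(ACCENTS)}

# Assumption: thorn and sharp s expand to two letters (sound-alike mapping).
ONE_TO_MANY = dict(ONE_TO_ONE)
ONE_TO_MANY.update({0xDE: "TH", 0xFE: "th", 0xDF: "ss"})


def to_ascii(text):
    """Replace each accented Latin-1 character with a single stand-in."""
    return text.translate(ONE_TO_ONE)


def to_ascii_multi(text):
    """Same, but thorn and sharp s become two characters each."""
    return text.translate(ONE_TO_MANY)


if __name__ == "__main__":
    print(to_ascii("René Pijlman"))    # Rene Pijlman
    print(to_ascii_multi("Straße"))    # Strasse
```

The one-to-one form keeps the string length unchanged, which can matter if character offsets are used elsewhere in the analysis; the one-to-many form gives up that property in exchange for a closer transliteration.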
Thanks everyone for your help.
--
René Pijlman
What do you want to learn? http://www.leren.nl
More information about the Python-list mailing list