Translation table to map Latin-1 to ASCII?

John Machin sjmachin at lexicon.net
Sun Jan 26 17:57:19 EST 2003


Rene Pijlman <reageer.in at de.nieuwsgroep> wrote in message news:<t4t73vc63vo2kuf5p5t4ao4cdmng7p2387 at 4ax.com>...

> accentstable = ("".join(map(chr, range(192))) +
>     "AAAAAAACEEEEIIIIDNOOOOOxOUUUUYpBaaaaaaaceeeeiiiionooooo/ouuuuypy")
> 

You have mapped 0xF0 (small eth) to o, instead of d. Is this
deliberate?
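
One way to check what a given position in such a table maps to is to
index it by byte value. The following is only a sketch: it rebuilds the
quoted table in a form that runs on current Python, and HIGH is a
made-up name for the 64 replacement characters covering 0xC0-0xFF.

```python
# Rebuild the quoted table: identity for 0x00-0xBF, then 64 replacement
# characters for 0xC0-0xFF. HIGH is a hypothetical name for this sketch.
HIGH = "AAAAAAACEEEEIIIIDNOOOOOxOUUUUYpBaaaaaaaceeeeiiiionooooo/ouuuuypy"
accentstable = "".join(map(chr, range(192))) + HIGH

# A full byte-to-byte table must have exactly 256 entries.
assert len(accentstable) == 256

# Index by byte value to see what each Latin-1 character maps to.
print(hex(0xD0), accentstable[0xD0])  # capital eth
print(hex(0xF0), accentstable[0xF0])  # small eth
```

Capital eth comes out as D while small eth comes out as o, which is the
inconsistency asked about above.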

You have mapped 0xDE (capital thorn) to P and 0xFE (small thorn) to p.
On a sound-alike basis rather than a look-alike basis, it may be more
appropriate to map these to T and t (or TH and th). I can't imagine
non-Icelandic people searching for 'Porstein' unless they are reading
directly off some Icelandic text, in which case they would probably
have enough of a clue to realise that 'Porstein' wouldn't retrieve
anything of much relevance. 'Thorstein' seems much more likely.

Likewise 0xDF (small sharp s) may be better mapped to s (or ss) than
to B.

You may wish to contemplate the notion that a single mapping of one
byte to one byte may not be the best way to go, but of course this
depends on what you are trying to achieve.
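
As a rough illustration of a mapping that is not one byte to one byte,
here is a minimal sketch using a dictionary from code point to
replacement string, which str.translate accepts in current Python. The
names MULTI and search_key are invented for this example, and the
upper-casing step assumes the result is only used as a search key; none
of this comes from the original code.

```python
# Hypothetical overrides: Latin-1 code point -> ASCII replacement string.
# Only the characters discussed in this message are listed.
MULTI = {
    0xDE: "TH",  # capital thorn
    0xFE: "th",  # small thorn
    0xDF: "ss",  # small sharp s
    0xF0: "d",   # small eth
}

def search_key(text):
    """Apply the multi-character replacements, then fold to upper case.

    Characters not listed in MULTI are left unchanged.
    """
    return text.translate(MULTI).upper()

# "\xde" is capital thorn and "\xdf" is small sharp s.
print(search_key("\xdeorsteinn"))  # THORSTEINN
print(search_key("Stra\xdfe"))     # STRASSE
```

A dictionary like this can sit alongside a 256-character table: the
table handles the simple look-alike cases, and the dictionary handles
the few characters that need more than one ASCII letter.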



