Handle foreign character web input

Chris Angelico rosuav at gmail.com
Wed Jul 3 18:47:06 EDT 2019


On Thu, Jul 4, 2019 at 8:12 AM Igor Korot <ikorot01 at gmail.com> wrote:
>
> Hi, Chris,
>
> On Wed, Jul 3, 2019 at 4:41 PM Chris Angelico <rosuav at gmail.com> wrote:
> >
> > On Thu, Jul 4, 2019 at 7:08 AM Igor Korot <ikorot01 at gmail.com> wrote:
> > >
> > > Hi, Thomas,
> > >
> > > On Sat, Jun 29, 2019 at 11:06 AM Thomas Jollans <tjol at tjol.eu> wrote:
> > > >
> > > > On 28/06/2019 22:25, Tobiah wrote:
> > > > > A guy comes in and enters his last name as RÖnngren.
> > > > With a capital Ö in the middle? That's unusual.
> > > > >
> > > > > So what did the browser really give me; is it encoded
> > > > > in some way, like latin-1?  Does it depend on whether
> > > > > the name was cut and pasted from a Word doc. etc?
> > > > > Should I handle these internally as unicode?  Right
> > > > > now my database tables are latin-1 and things seem
> > > > > to usually work, but not always.
> > > >
> > > >
> > > > If your database is using latin-1, German and French names will work,
> > > > but Croatian and Polish names often won't. Not to mention people using
> > > > other writing systems.
> > > >
> > > > So Günther and François are ok, but Bolesław turns into Boles?aw and
> > > > don't even think about anybody called Владимир or محمد.
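
The "Boles?aw" effect quoted above is easy to reproduce in Python 3. This is only a sketch: the helper name `latin1_roundtrip` is made up for illustration, and replacing unrepresentable characters with "?" is an assumption about how a lossy latin-1 storage layer behaves (a real database or driver may raise an error instead).

```python
# Names taken from the quoted message above.
names = ["Günther", "François", "Bolesław", "Владимир", "محمد"]


def latin1_roundtrip(name: str) -> str:
    """Encode to latin-1, substituting '?' for anything latin-1 cannot
    represent, then decode again. Hypothetical stand-in for a lossy store."""
    return name.encode("latin-1", errors="replace").decode("latin-1")


def utf8_roundtrip(name: str) -> str:
    """Same round trip through UTF-8, which can represent every name here."""
    return name.encode("utf-8").decode("utf-8")


for name in names:
    stored = latin1_roundtrip(name)
    status = "ok" if stored == name else "damaged"
    print(f"{name!r} -> {stored!r} ({status})")
```

Under those assumptions, the German and French names survive, while every character outside latin-1 is replaced. The same names pass through UTF-8 unchanged.
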
> > >
> > > As others pointed out - it is very easy to do transliteration especially if
> > > it's not a user registration that will be done.
> > >
> > > But I would simply not do that at all - create your forms in English and
> > > accept English spellings only.
> > > Most people that do computers these days can enter phonetic spelling
> > > of their first/last names (even in Chinese/Japanese/Hebrew).
> > >
> > > And all European names can be transliterated to English.
> > >
> > > Besides, as the OP said, someone may come to him and
> > > try to enter a non-English name. The OP might not even have the appropriate
> > > keyboard layout to input such a name. And if this is a (time-consuming) event,
> > > all (s)he can do is ask for a phonetic spelling.
> > >
> > > Thank you.
> > >
> > What you basically just said was "I wish all those ugly foreign names
> > would just go away". Honestly, that's not really an acceptable
> > solution; you assume that you can transliterate any name into
> > "English" in some perfect way, which is acceptable to everyone in the
> > world. And you also assume that this transformation will be completely
> > consistent, so you can ask someone his/her name and always get back
> > the same thing.
> >
> > If you want to do a Latinization and accent strip for the sake of a
> > search, that's fine; but make sure you retain the name as people want
> > it to be retained. Don't be bigoted.
> >
> > https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
>
> I'm not opposing this, in fact I'm all for keeping the native name somewhere in
> the DB.
>
> But as I said, imagine the following situation:
>
> You are somewhere in Germany and you have a German version of an OS
> (any OS).
> You also have a German keyboard (hardware) with German keys.
>
> Now you are assigned to go to some international events where people
> all over the world will be coming to your presentation and they will be
> registering on your machine.
>
> Also imagine that the company policy prohibits you from modifying the
> system settings.
>
> My solution:
> I would probably grab a lot of registering paper and ask people to enter
> an English transliteration of their names on the machine, so when you come
> back to the office you can properly enter their names using all those
> different keyboards (maybe virtual ones) to associate them with
> their English counterparts.
>
> Just curious - what would you do?
>

I would use a Compose key (if available) or a software input method
(always available, as long as you have an internet connection and
browser, and often available locally too). With your method, how would
you enter it back at the office, if all you have is an English
transliteration? How do you transform it back into the original
characters?
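
To make the one-way problem concrete, here is a minimal Python 3 sketch of the "accent strip for the sake of a search" idea mentioned earlier in the thread. The function name `search_key` and the record layout are hypothetical, and the stripping rule (NFKD decomposition, drop combining marks, drop anything still non-ASCII) is just one naive choice, not a proper transliteration scheme.

```python
import unicodedata


def search_key(name: str) -> str:
    """Return a lossy ASCII key for searching only. Never use it for display."""
    # Split accented letters into base letter + combining mark (ö -> o + U+0308).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Discard whatever is still outside ASCII, then normalise case.
    return stripped.encode("ascii", errors="ignore").decode("ascii").casefold()


# Store the name exactly as entered, plus the derived key (hypothetical layout).
records = [
    {"name": name, "key": search_key(name)}
    for name in ["Rönngren", "Ronngren", "François", "Bolesław"]
]


def find(query: str) -> list[str]:
    """Return the original names whose search key matches the query's key."""
    wanted = search_key(query)
    return [r["name"] for r in records if r["key"] == wanted]


print(find("ronngren"))
```

Both "Rönngren" and "Ronngren" collapse to the same key, so nothing can map the key back to the original spelling. The "ł" in "Bolesław" has no Unicode decomposition, so this naive rule simply drops it, and a name written entirely in Cyrillic or Arabic ends up with an empty key. That is fine for a search index kept next to the real name, and useless as the only stored form.
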

ChrisA


