Handle foreign character web input

Tobiah toby at tobiah.org
Fri Jun 28 16:58:32 EDT 2019


On 6/28/19 1:33 PM, Chris Angelico wrote:
> On Sat, Jun 29, 2019 at 6:31 AM Tobiah <toby at tobiah.org> wrote:
>>
>> A guy comes in and enters his last name as RÖnngren.
>>
>> So what did the browser really give me; is it encoded
>> in some way, like latin-1?  Does it depend on whether
>> the name was cut and pasted from a Word doc. etc?
>> Should I handle these internally as unicode?  Right
>> now my database tables are latin-1 and things seem
>> to usually work, but not always.
> 
> Definitely handle them as Unicode. You'll receive them in some
> encoding, probably UTF-8, and it depends on the browser. Ideally, your
> back-end library (eg Flask) will deal with that for you.

It varies by browser?
So these records are coming in from all over the world.  How
do people handle possibly assorted encodings that may come in?

I'm using Web2py.  Does the request come in with an encoding
built in?  Is that how people get the proper unicode object?
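
For what it's worth, here is a minimal sketch of the usual fallback
approach, assuming the framework hands over raw bytes rather than an
already-decoded string (web2py may well decode for you, in which case
none of this is needed). Browsers generally submit form data in the
encoding of the page that served the form, so serving pages as UTF-8
makes UTF-8 the likely answer. The helper name decode_form_value and
its declared_charset parameter are made up for illustration.

```python
def decode_form_value(raw, declared_charset=None):
    """Turn submitted bytes into a str (hypothetical helper, not a web2py API)."""
    # Try the charset the request declared first, if there was one.
    candidates = [declared_charset] if declared_charset else []
    # Then UTF-8, then Windows-1252 (what "latin-1" text pasted from Word often is).
    candidates += ["utf-8", "cp1252"]
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or an unknown charset name: try the next candidate.
            continue
    # latin-1 maps every byte to a code point, so this cannot raise.
    return raw.decode("latin-1")


print(decode_form_value("RÖnngren".encode("utf-8")))
print(decode_form_value("RÖnngren".encode("latin-1")))
```

Since latin-1 assigns a character to every byte, the last step never
fails, but it can still produce the wrong characters. It is a last
resort, not a way of detecting the encoding.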

>> Also, what do people do when searching for a record.
>> Is there some way to get 'Ronngren' to match the other
>> possible foreign spellings?
> 
> Ehh....... probably not. That's a human problem, not a programming
> one. Best of luck.

Well, so I'm at an event.  A guy comes up to me at the kiosk
and says his name is RÖnngren.  I can't find him by typing in "ron",
so I ask him how to spell his last name.  What does he say, and
what do I type?
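
One common way to make "ron" find RÖnngren is to compare on an
accent-stripped, case-folded key instead of on the stored name. The
sketch below uses only the standard unicodedata module. search_key is
a hypothetical helper name, and the approach is an assumption about
what "match" should mean here, not something a framework does for you.

```python
import unicodedata


def search_key(name):
    """Return a lowercase, accent-free key for loose name matching (hypothetical helper)."""
    # NFKD splits "Ö" into "O" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() meant for caseless comparison.
    return stripped.casefold()


names = ["RÖnngren", "Smith", "Rönnberg"]
typed = "ron"
matches = [n for n in names if search_key(n).startswith(search_key(typed))]
print(matches)
```

Storing the key in its own column next to the real name would let the
database do the prefix search. Letters such as "ø" have no Unicode
decomposition, so they pass through unchanged and would need an
explicit mapping.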


