Handle foreign character web input

Terry Reedy tjreedy at udel.edu
Fri Jun 28 19:19:59 EDT 2019


On 6/28/2019 4:25 PM, Tobiah wrote:
> A guy comes in and enters his last name as RÖnngren.
> 
> So what did the browser really give me; is it encoded
> in some way, like latin-1?  Does it depend on whether
> the name was cut and pasted from a Word doc. etc?
> Should I handle these internally as unicode?  Right
> now my database tables are latin-1 and things seem
> to usually work, but not always.

Unless you want to restrict your app to people whose names are in, or
convertible to, latin-1 (Western European), you should use utf-8 or let
the database encode for you.
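
As a rough illustration of why latin-1 "usually works, but not always",
here is a minimal sketch. The helper name encode_name and the sample
names are mine, purely for illustration; the point is only that some
names cannot be encoded as latin-1 at all, while utf-8 handles any of
them.

```python
def encode_name(name, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent the name."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


# 'ö' exists in latin-1, so this name fits in a latin-1 column.
print(encode_name("Rönngren", "latin-1"))

# 'ř' is not in latin-1, so this name cannot be stored without loss.
print(encode_name("Dvořák", "latin-1"))

# utf-8 can represent every Unicode character.
print(encode_name("Dvořák", "utf-8"))
```

Inside the program I would keep names as str (unicode) and encode only
at the boundaries, such as the database connection or the HTTP response.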

> Also, what do people do when searching for a record.
> Is there some way to get 'Ronngren' to match the other
> possible foreign spellings?

I have seen a program that converts all latin-1 chars to ascii for matching.
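
I do not know how that program did it, but one common way to get a
similar effect in Python is to decompose each character with
unicodedata and drop the combining accents. The following is only a
sketch under that assumption; fold_to_ascii is a hypothetical helper
name, not something from the standard library.

```python
import unicodedata


def fold_to_ascii(text):
    """Return a lowercase, accent-free ascii key for loose name matching."""
    # casefold() first, so that e.g. German sharp s becomes 'ss'.
    folded = text.casefold()
    # NFKD splits accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", folded)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ascii (letters with no decomposition) is dropped.
    return stripped.encode("ascii", "ignore").decode("ascii")


# All three spellings produce the same search key.
print(fold_to_ascii("RÖnngren"))
print(fold_to_ascii("Rönngren"))
print(fold_to_ascii("Ronngren"))
```

You would store the folded key in its own indexed column next to the
real name and search on the key. Note that letters with no
decomposition, such as 'ø', are simply dropped by this sketch, so a
small hand-written replacement table may be needed for those.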

-- 
Terry Jan Reedy




