UTF-8 and latin1

Jon Ribbens jon+usenet at unequivocal.eu
Thu Aug 18 14:56:58 EDT 2022


On 2022-08-18, Tobiah <toby at tobiah.org> wrote:
>> You configure the web server to send:
>> 
>>      Content-Type: text/html; charset=...
>> 
>> in the HTTP header when it serves HTML files.
>
> So how does this break down?  When a person enters
> Montréal, Quebéc into a form field, what are they
> doing on the keyboard to make that happen?

It depends on what keyboard they have. Using a standard UK or US
("qwerty") keyboard and Windows you should be able to type "é" by
holding down the 'Alt' key to the right of the spacebar, and typing
'e'.  If they're using a French ("azerty") keyboard then I think they
can enter it by pressing the '2' key without holding 'shift'.

> As the string sits there in the text box, is it latin1, or utf-8
> or something else?

That depends on which browser you're using. I think it's quite likely
it will use UTF-16 (i.e. 16-bit code units, which is how JavaScript
and the DOM expose strings).
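
To make the difference concrete, here is a minimal Python sketch (my
own illustration, not anything a browser actually runs; the helper
name encoded_sizes is made up) that encodes the same string as latin1,
UTF-8, UTF-16 and UTF-32 and counts the resulting bytes:

```python
# The same text, written with an escape so the source file's own
# encoding cannot affect the result.
text = "Montr\u00e9al"


def encoded_sizes(s):
    """Return a dict mapping encoding name to the byte length of s."""
    encodings = ("latin-1", "utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(s.encode(enc)) for enc in encodings}


sizes = encoded_sizes(text)
for name, size in sizes.items():
    print(f"{name:10} {size:2} bytes")

# The single character U+00E9 is one byte in latin1, two in UTF-8.
e_latin1 = "\u00e9".encode("latin-1")   # b'\xe9'
e_utf8 = "\u00e9".encode("utf-8")       # b'\xc3\xa9'
print(e_latin1, e_utf8)
```

The text itself is the same eight characters in every case; only the
bytes used to store or transmit it differ.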

> How does the browser know what sort of data it has in that text box?

It's a text box, so it knows it's text.

