String encoding in Py2.7

ftg at lutix.org
Tue May 29 03:55:51 EDT 2018


Hello,
I am using Python 2.7 (I will switch to Py3 soon, but first I'd like to understand how string encoding works there).
Could you please tell me whether I have understood correctly what goes on in Python's mind?
In a .py file:
if I write s="héhéhé" and my file is declared as using a unicode coding, Python will store s='h\x82h\x82h\x82' in memory.
However, for the Python interpreter this is not yet unicode, it is just raw bytes. Right?
By the way, why is 'h' not turned into a hex value? Is it because it is already in the ASCII table?
If I want the Python interpreter to recognize my string as unicode, I have to declare it as unicode with s=u'héhéhé', and Python will magically look up those
hex values '\x82' in the Unicode table. Still OK?
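
To make the two cases concrete, here is a minimal sketch of what I think is going on. The code page cp850 is only my assumption (it is one of the code pages where 'é' is the single byte 0x82), and the names text, raw_cp850 and raw_utf8 are just for illustration. I use escapes instead of typing 'é' so that the result does not depend on how the file itself is saved.

```python
# -*- coding: utf-8 -*-
# Sketch only: cp850 is an assumed code page, chosen because it maps
# 'é' to the single byte 0x82 seen in the byte string above.

# A unicode string: six code points, 'h' (U+0068) and 'é' (U+00E9) three times.
text = u'h\xe9h\xe9h\xe9'

# Encoding turns the code points into raw bytes; the bytes depend on the codec.
raw_cp850 = text.encode('cp850')   # one byte per 'é'
raw_utf8 = text.encode('utf-8')    # two bytes per 'é'

print(repr(raw_cp850))
print(repr(raw_utf8))

# 'h' is byte 0x68, which is printable ASCII, so repr() shows it as 'h'
# rather than as an escape. The stored byte is the same either way.
print(b'\x68' == b'h')

# Decoding with the matching codec gives the unicode string back.
print(raw_cp850.decode('cp850') == text)
```

If this is right, the byte string and the unicode string have the same length under cp850 (6 and 6), but not under UTF-8 (9 bytes for 6 characters).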
Now: how come that when I declare s='héhéhé', print(s) correctly displays 'héhéhé'? Is it because my shell window deals well with unicode? Or is it
because the print function is magic?
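
One experiment I could try for the shell hypothesis, as a rough sketch: take the same raw bytes and decode them with two different code pages, the way two differently configured terminals might. Both code pages (cp850 and cp1252) are assumptions on my part, picked only because they give different characters for byte 0x82.

```python
# -*- coding: utf-8 -*-
# Sketch only: cp850 and cp1252 stand in for two hypothetical terminal
# settings; the real shell encoding is not known here.

raw = b'h\x82h\x82h\x82'   # the bytes a plain (non-unicode) string would hold

as_cp850 = raw.decode('cp850')     # byte 0x82 read as U+00E9 (e with acute)
as_cp1252 = raw.decode('cp1252')   # byte 0x82 read as U+201A (low quotation mark)

# Same bytes, different characters, depending only on the decoder.
print(as_cp850 == u'h\xe9h\xe9h\xe9')
print(as_cp1252 == u'h\u201ah\u201ah\u201a')
print(as_cp850 == as_cp1252)
```

If the display really depends on the decoder like this, then print would just be passing the bytes through and the shell would be doing the interpreting.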

Thanks


