Unicode chr(150) en dash

s0suk3 at gmail.com
Thu Apr 17 11:32:49 EDT 2008


On Apr 17, 10:10 am, marexpo... at googlemail.com wrote:
> Thank you Martin and John, for your excellent explanations.
>
> I think I understand the basic principles of Unicode; what confuses me is how different applications make use of it.
>
> For example, I got that EN DASH out of a web page which states <?xml version="1.0" encoding="ISO-8859-1"?> at the beginning. That's why I went with that encoding. But if the browser can properly decode that character using that encoding, how come other applications can't?
>
> I might need to use Python's htmllib to avoid this, not sure. But if I don't, and I only want to copy and paste the text of some web pages into a tkinter Text widget, what should I do to make every single character get from the widget, out of tkinter, and into a Python string variable successfully? How did my browser know it should render an EN DASH instead of a circumflexed lowercase u?
>
> This is the webpage, in case you are interested; the EN DASH is on the 4th line of the first paragraph: http://www.pagina12.com.ar/diario/elmundo/subnotas/102453-32303-2008-...
>
> Thanks a lot.
>

Just write in English. Like this, see? No encodings mess.
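
On the quoted question of how a browser ends up showing an EN DASH while another program shows a circumflexed u: a likely explanation, offered here as an assumption and not something confirmed in this thread, is that the page's byte 150 (0x96) is being decoded with different codecs. Browsers commonly treat a page labelled ISO-8859-1 as Windows-1252, where 0x96 is the en dash, whereas in strict ISO-8859-1 it is an unnamed control character and in the DOS code page cp437 it is a u with circumflex. A minimal Python 3 sketch (the helper name `describe` is made up for illustration):

```python
import unicodedata

# The single byte 150 (0x96) from the subject line of this thread.
RAW = bytes([150])

def describe(raw, encoding):
    """Decode raw with the given codec; return (char, code point, Unicode name)."""
    ch = raw.decode(encoding)
    # Control characters have no Unicode name, so supply a fallback label.
    name = unicodedata.name(ch, "<unnamed control>")
    return ch, "U+%04X" % ord(ch), name

# Compare what the same byte means under three different codecs.
for enc in ("iso-8859-1", "cp1252", "cp437"):
    char, code_point, name = describe(RAW, enc)
    print(enc, code_point, name)
```

If that assumption holds for the page in question, decoding its bytes as cp1252 instead of iso-8859-1 should give U+2013 directly, and the û would only appear where the raw byte is shown through a cp437-style console.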


