a simple unicode question

Diez B. Roggisch deets at nospam.web.de
Mon Oct 19 15:35:42 EDT 2009


George Trojan wrote:
> A trivial one, this is the first time I have to deal with Unicode. I am 
> trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is 
> "iso-8859-1". To get the degrees I did
>  >>> encoding='iso-8859-1'
>  >>> q=s.decode(encoding)
>  >>> q.split()
> [u'48\xc2\xb0', u"13'", u'16.80"', u'N']
>  >>> r=q.split()[0]
>  >>> int(r[:r.find(unichr(ord('\xc2')))])
> 48
> 
> Is there a better way of getting the degrees?

Instead of this rather convoluted way of specifying the degree sign, it is better to write

  # -*- coding: utf-8 -*-
  ...
  int(r[:r.find(u"°")])


Please note that the utf-8 encoding declared here has *nothing* to do with 
your string; it is only the encoding of the source file. Of course, your 
editor must actually save the file as utf-8. You can also use any other 
encoding you like, as long as the declaration matches what the editor writes.
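
As a rough, self-contained sketch of the same idea (not the only way to do 
it): the helper name degrees_of and the constant DEGREE_SIGN are made up for 
illustration, and the degree sign is written as the escape u"\u00b0" so the 
snippet does not depend on how the file is saved. It also assumes the raw 
bytes are really utf-8, which the \xc2\xb0 pair in the quoted output 
suggests, since that is how utf-8 encodes the degree sign.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper for illustration; not part of the original thread.
DEGREE_SIGN = u"\u00b0"  # same character as u"°", spelled as an escape


def degrees_of(text):
    """Return the integer that precedes the degree sign in a unicode string."""
    end = text.find(DEGREE_SIGN)
    if end == -1:
        raise ValueError("no degree sign in %r" % (text,))
    return int(text[:end])


# Assumption: the input bytes are utf-8 (\xc2\xb0 is the utf-8 form of the degree sign).
raw = b"48\xc2\xb0 13' 16.80\" N"
q = raw.decode("utf-8")
r = q.split()[0]
print(degrees_of(r))
```

If the bytes really were iso-8859-1, the degree sign would be the single 
byte \xb0 and only the argument to decode() would change.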

Diez


