[Tutor] encoding question
Alex Kleider
akleider at sonic.net
Sun Jan 5 08:57:20 CET 2014
On 2014-01-04 21:20, Danny Yoo wrote:
> Oh! That's unfortunate! That looks like a bug on the hostip.info
> side. Check with them about it.
>
>
> I can't get the source code to whatever is implementing the JSON
> response, so I can not say why the city is not being properly included
> there.
>
>
> [... XML rant about to start. I am not disinterested, so my apologies
> in advance.]
>
> ... In that case... I suppose trying the XML output is a possible
> approach.
Well, I've tried the XML approach, which seems promising, but I still get
an encoding-related error.
Is there a bug in the xml.etree module (not very likely, methinks), or
am I doing something wrong?
There's no denying that the whole encoding issue is still not completely
clear to me, in spite of my having devoted a lot of time to trying to grasp
all that's involved.
Here's what I've got:
alex@x301:~/Python/Parse$ cat ip_xml.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# file: 'ip_xml.py'
import urllib2
import xml.etree.ElementTree as ET

url_format_str = \
    u'http://api.hostip.info/?ip=%s&position=true'

def ip_info(ip_address):
    response = urllib2.urlopen(url_format_str %\
                               (ip_address, ))
    encoding = response.headers.getparam('charset')
    print "'encoding' is '%s'." % (encoding, )
    info = unicode(response.read().decode(encoding))
    n = info.find('\n')
    print "location of first newline is %s." % (n, )
    xml = info[n+1:]
    print "'xml' is '%s'." % (xml, )
    tree = ET.fromstring(xml)
    root = tree.getroot()  # Here's where it blows up!!!
    print "'root' is '%s', with the following children:" % (root, )
    for child in root:
        print child.tag, child.attrib
    print "END of CHILDREN"
    return info

if __name__ == "__main__":
    info = ip_info("201.234.178.62")
alex@x301:~/Python/Parse$ ./ip_xml.py
'encoding' is 'iso-8859-1'.
location of first newline is 44.
'xml' is '<HostipLookupResultSet version="1.0.1"
xmlns:gml="http://www.opengis.net/gml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://www.hostip.info/api/hostip-1.0.1.xsd">
<gml:description>This is the Hostip Lookup Service</gml:description>
<gml:name>hostip</gml:name>
<gml:boundedBy>
<gml:Null>inapplicable</gml:Null>
</gml:boundedBy>
<gml:featureMember>
<Hostip>
<ip>201.234.178.62</ip>
<gml:name>Bogotá</gml:name>
<countryName>COLOMBIA</countryName>
<countryAbbrev>CO</countryAbbrev>
<!-- Co-ordinates are available as lng,lat -->
<ipLocation>
<gml:pointProperty>
<gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
<gml:coordinates>-75.2833,10.4</gml:coordinates>
</gml:Point>
</gml:pointProperty>
</ipLocation>
</Hostip>
</gml:featureMember>
</HostipLookupResultSet>
'.
Traceback (most recent call last):
  File "./ip_xml.py", line 33, in <module>
    info = ip_info("201.234.178.62")
  File "./ip_xml.py", line 23, in ip_info
    tree = ET.fromstring(xml)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1301, in XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1641, in feed
    self._parser.Parse(data, 0)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 456: ordinal not in range(128)
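
For what it's worth, the traceback suggests the parser is being handed a
unicode string and is trying to turn it back into bytes with the ascii codec,
which fails on the 'á' in 'Bogotá'. A minimal sketch of one possible way
around that is to give ET.fromstring() the undecoded bytes and let the parser
work out the encoding from the XML declaration. The sample data below is a
hypothetical, cut-down stand-in for response.read(), and it assumes that the
first line of the real response (the one stripped off above) is a declaration
naming ISO-8859-1. Note also that fromstring() returns the root element
itself, so no getroot() call is needed.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET

# Hypothetical stand-in for response.read(): raw ISO-8859-1 bytes with the
# XML declaration still attached (an assumption about the real response).
raw = (u'<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
       u'<HostipLookupResultSet xmlns:gml="http://www.opengis.net/gml">\n'
       u'<gml:name>hostip</gml:name>\n'
       u'<gml:featureMember><Hostip>\n'
       u'<gml:name>Bogot\xe1</gml:name>\n'
       u'</Hostip></gml:featureMember>\n'
       u'</HostipLookupResultSet>\n').encode('iso-8859-1')

GML = '{http://www.opengis.net/gml}'  # namespace prefix in ElementTree notation

# Option 1: pass the bytes through untouched; the parser honours the
# encoding named in the declaration.
root = ET.fromstring(raw)  # already the root Element
city = root.find('.//Hostip/%sname' % GML).text

# Option 2: if the text has already been decoded and the declaration cut
# off, encode it to UTF-8 before parsing (UTF-8 is the parser's default
# when no declaration is present).
text = raw.decode('iso-8859-1')
body = text[text.find('\n') + 1:]
root2 = ET.fromstring(body.encode('utf-8'))
city2 = root2.find('.//Hostip/%sname' % GML).text

print(repr(city))
print(repr(city2))
```

In both cases the parser only ever sees bytes, so no implicit ascii encoding
step should be involved.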