From lpd at major2nd.com Sat Apr 3 15:45:23 2010
From: lpd at major2nd.com (L Peter Deutsch)
Date: Sat, 3 Apr 2010 06:45:23 -0700 (PDT)
Subject: [XML-SIG] Python DTD parser?
Message-ID: <20100403134523.5655A19C71@theremin.major2nd.com>

Dear XML-SIG,

I am trying to find a working, maintained DTD parser written in Python. The main Python XML distribution does not include one. I was using the PyXML parser, but (1) (at least on SourceForge) it hasn't been maintained for years, and (2) (as of the last version I could find, 0.8.4) it has a bug that sometimes causes it to not process the very last line of a DTD -- which in a well-modularized DTD is often an entity reference that pulls in the main content of the DTD!

I've written my own DTD parser, but it omits some features, I really don't know how well it conforms to the spec (which is important because I'm also writing some DTDs of my own), and I'd much rather use a well-tested one written by others.

Any advice will be appreciated.

Sincerely,
L Peter Deutsch

From fdrake at acm.org Sat Apr 3 18:23:17 2010
From: fdrake at acm.org (Fred Drake)
Date: Sat, 3 Apr 2010 12:23:17 -0400
Subject: [XML-SIG] Python DTD parser?
In-Reply-To: <20100403134523.5655A19C71@theremin.major2nd.com>
References: <20100403134523.5655A19C71@theremin.major2nd.com>
Message-ID:

On Sat, Apr 3, 2010 at 9:45 AM, L Peter Deutsch wrote:
> I was using the PyXML parser, but (1) (at least on SourceForge)
> it hasn't been maintained for years,

Nor anywhere else; the sourceforge site is "current", as it goes.

> and (2) (as of the last version I could find, 0.8.4) it has a bug
> that sometimes causes it to not process the very last line of a DTD -- which
> in a well-modularized DTD is often an entity reference that pulls in the
> main content of the DTD!

Are you attempting to only parse the DTD itself, or validate a document according to the DTD?

If the former, xml.parsers.expat can be made to serve this purpose.
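A minimal sketch of the expat approach mentioned above, not code from the thread. expat parses documents rather than bare DTD files, so this wraps a made-up DTD in the internal subset of a stub document and collects the declarations through the ElementDeclHandler and AttlistDeclHandler callbacks. It assumes Python 3; DTD_TEXT, DOCUMENT and collect_declarations are hypothetical names.

```python
import xml.parsers.expat as expat

# Hypothetical DTD text, used only for illustration.
DTD_TEXT = """
<!ELEMENT note (to, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ATTLIST note lang CDATA #IMPLIED>
"""

# expat wants a document, so the DTD goes into the internal subset
# of a stub document (an assumption of this sketch).
DOCUMENT = "<!DOCTYPE note [%s]><note><to/><body/></note>" % DTD_TEXT


def collect_declarations(document):
    """Return the element names and attribute declarations expat reports."""
    elements = []
    attributes = []

    def on_element_decl(name, model):
        # model is a nested tuple describing the content model; ignored here.
        elements.append(name)

    def on_attlist_decl(elname, attname, type_, default, required):
        attributes.append((elname, attname, type_))

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.AttlistDeclHandler = on_attlist_decl
    parser.Parse(document, True)  # True marks this as the final chunk
    return elements, attributes


elements, attributes = collect_declarations(DOCUMENT)
print(elements)
print(attributes)
```

A DTD that pulls in content through parameter entities would additionally need SetParamEntityParsing, which this sketch does not exercise.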
I'm not sure what people are using for DTD-based document validation in Python these days.

-Fred

--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller

From fdrake at acm.org Sun Apr 4 10:36:40 2010
From: fdrake at acm.org (Fred Drake)
Date: Sun, 4 Apr 2010 04:36:40 -0400
Subject: [XML-SIG] Python DTD parser?
In-Reply-To: <20100403235857.8DE8519C71@theremin.major2nd.com>
References: <20100403134523.5655A19C71@theremin.major2nd.com>
 <20100403235857.8DE8519C71@theremin.major2nd.com>
Message-ID:

On Sat, Apr 3, 2010 at 7:58 PM, L Peter Deutsch wrote:
> Sorry to have taken
> your time for this.

Not a problem; I'm glad I could help!

-Fred

--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller

From lpd at major2nd.com Sun Apr 4 01:58:57 2010
From: lpd at major2nd.com (L Peter Deutsch)
Date: Sat, 3 Apr 2010 16:58:57 -0700 (PDT)
Subject: [XML-SIG] Python DTD parser?
In-Reply-To: (message from Fred Drake on Sat, 3 Apr 2010 12:23:17 -0400)
References: <20100403134523.5655A19C71@theremin.major2nd.com>
Message-ID: <20100403235857.8DE8519C71@theremin.major2nd.com>

> Are you attempting to only parse the DTD itself, or validate a document
> according to the DTD?

Only the former.

> If the former, xml.parsers.expat can be made to serve this purpose.

Thanks very much. I just now read the xml.parsers.expat documentation, and I see the callbacks that correspond to the callbacks from the PyXML DTD parser. I think I was using the PyXML parser because that's what Stefan Behnel's dtd2py used, which was the starting point for my current project; but in any case, I see no reason to use it any longer. Sorry to have taken your time for this.

L Peter Deutsch

From lpd at major2nd.com Mon Apr 5 04:48:18 2010
From: lpd at major2nd.com (lpd at major2nd.com)
Date: Sun, 4 Apr 2010 19:48:18 -0700 (PDT)
Subject: [XML-SIG] More on Python DTD parser?
In-Reply-To: (message from Fred Drake on Sun, 4 Apr 2010 04:36:40 -0400)
References: <20100403134523.5655A19C71@theremin.major2nd.com>
 <20100403235857.8DE8519C71@theremin.major2nd.com>
Message-ID: <20100405024818.896A719C71@theremin.major2nd.com>

Dear XML-SIG,

I'm sorry to impose on you, but I've had a very frustrating afternoon trying to report a couple of Python expat bugs on the PSF bug tracker. SourceForge appears to have lost all of my account data (for the second time), and when I tried to register separately on the bug tracker site, the registration process said "An unexpected error occurred during the processing of your message" and failed to complete.

I'm using the Ubuntu Linux 8.04 distribution, which includes Python 2.5.2. The libexpat1 version is 2.0.1-0ubuntu1.1 (hardy-updates), but I don't know whether Python uses this or includes its own copy of expat.

The smaller problem -- but one that still led me to waste a fair bit of time -- is that the SetParamEntityParsing method of xmlparser objects is simply missing from the documentation of xml.parsers.expat. Unfortunately, the default is to not parse parameter entities, even when reading external DTDs, so calling this method is required for DTDs that use parameter entities. I finally discovered this method by going to the expat Web site and looking at the C API. I checked the Python doc for 2.6.5, and this method is still missing.

The larger problem is that often (but not always) when the Parse() method of xmlparser returns after completely parsing a file, something happens at the implementation level that results in a completely bogus Python error "TypeError: An integer is required." The error may occur a few Python statements later, which suggests to me that it is a memory bookkeeping problem of some kind, but I have no idea how to track it down.
However, it is totally repeatable, and I can provide a very simple example (a 220-line Python driver, most of which isn't executed, and a 7-line DTD) that triggers the problem. I checked the PSF bug tracker, and I thought that this might be the same bug as #6676, but my test case doesn't call ParseFile more than once on the same parser instance. I would really prefer not to upgrade to a later Python version, especially not to 2.6 or later, but if this bug has been fixed, I'm willing to consider it.

As soon as the registration issue gets cleared up, I'll report these issues properly, but meanwhile, I was wondering if either of them (especially the execution error, which has me stalled right now) rings a bell with anyone.

Thanks -

L Peter Deutsch

From rachelbrw at gmail.com Wed Apr 14 20:26:01 2010
From: rachelbrw at gmail.com (Rachel Brown)
Date: Wed, 14 Apr 2010 14:26:01 -0400
Subject: [XML-SIG] Tags
Message-ID:

Hi,

Does XBEL support a TAGS attribute? I see that in the element , "TAG" does not exist as an attribute. If it's not there, can we define our own custom attribute to support tags?

Thanks,
Rachel
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From flahertyk1 at hotmail.com Mon Apr 26 00:24:53 2010
From: flahertyk1 at hotmail.com (kimmyaf)
Date: Sun, 25 Apr 2010 15:24:53 -0700 (PDT)
Subject: [XML-SIG] parsing XML with minidom
Message-ID: <28359328.post@talk.nabble.com>

Hello. I've only done a little bit of parsing with minidom before but I'm having trouble getting my values out of this xml. I need the latitude and longitude values in bold. I've tried several things. I think that I am getting into the location tag but maybe the getAttribute function is not correct for this example?
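
<GeocodeResponse>
 <status>OK</status>
 <result>
  <type>street_address</type>
  <formatted_address>50 Oakland St, Wellesley, MA 02481, USA</formatted_address>
  <address_component>
   <long_name>50</long_name>
   <short_name>50</short_name>
   <type>street_number</type>
  </address_component>
  <address_component>
   <long_name>Oakland St</long_name>
   <short_name>Oakland St</short_name>
   <type>route</type>
  </address_component>
  <address_component>
   <long_name>Wellesley</long_name>
   <short_name>Wellesley</short_name>
   <type>locality</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Wellesley</long_name>
   <short_name>Wellesley</short_name>
   <type>administrative_area_level_3</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Norfolk</long_name>
   <short_name>Norfolk</short_name>
   <type>administrative_area_level_2</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Massachusetts</long_name>
   <short_name>MA</short_name>
   <type>administrative_area_level_1</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>United States</long_name>
   <short_name>US</short_name>
   <type>country</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>02481</long_name>
   <short_name>02481</short_name>
   <type>postal_code</type>
  </address_component>
  <geometry>
   <location>
    <lat>42.3118520</lat>
    <lng>-71.2632680</lng>
   </location>
   <location_type>ROOFTOP</location_type>
   <viewport>
    <southwest>
     <lat>42.3093524</lat>
     <lng>-71.2665476</lng>
    </southwest>
    <northeast>
     <lat>42.3156476</lat>
     <lng>-71.2602524</lng>
    </northeast>
   </viewport>
  </geometry>
 </result>
</GeocodeResponse>

Code:

    body = dom.getElementsByTagName('GeocodeResponse')[0]

    for item in body.getElementsByTagName('location'):
        lat = item.getAttribute('lat')
        lng = item.getAttribute('lng')

--
View this message in context: http://old.nabble.com/parsing-XML-with-minidom-tp28359328p28359328.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.

From stefan_ml at behnel.de Mon Apr 26 06:44:49 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 26 Apr 2010 06:44:49 +0200
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <28359328.post@talk.nabble.com>
References: <28359328.post@talk.nabble.com>
Message-ID: <4BD51A41.400@behnel.de>

kimmyaf, 26.04.2010 00:24:
> Hello. I've only done a little bit of parsing with minidom before but I'm
> having trouble getting my values out of this xml. I need the latitude and
> longitude values in bold.

I don't see anything 'bold' in your mail, but your example tells me what data you mean.

Here is some untested code using xml.etree.cElementTree:

    import xml.etree.cElementTree as ET
    tree = ET.parse("thefile.xml")
    for tag in tree.getiterator("location"):
        print tag.findtext("lat"), tag.findtext("lng")

Note that cElementTree is both faster and simpler than minidom.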

Stefan

From flahertyk1 at hotmail.com Mon Apr 26 23:14:57 2010
From: flahertyk1 at hotmail.com (kimmyaf)
Date: Mon, 26 Apr 2010 14:14:57 -0700 (PDT)
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <4BD51A41.400@behnel.de>
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
Message-ID: <28370309.post@talk.nabble.com>

Thanks Stefan. I tried this but it's not getting into the for block for some reason. I'll keep trying!

From stefan_ml at behnel.de Tue Apr 27 13:35:32 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 27 Apr 2010 13:35:32 +0200
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <28370309.post@talk.nabble.com>
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com>
Message-ID: <4BD6CC04.7030302@behnel.de>

kimmyaf, 26.04.2010 23:14:
> Thanks Stefan. I tried this but it's not getting into the for block for some
> reason.

Maybe the document uses namespace declarations that you forgot to show us?

Stefan

From flahertyk1 at hotmail.com Tue Apr 27 23:32:39 2010
From: flahertyk1 at hotmail.com (kimmyaf)
Date: Tue, 27 Apr 2010 14:32:39 -0700 (PDT)
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <28382321.post@talk.nabble.com>
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com> <4BD6CC04.7030302@behnel.de>
 <28382321.post@talk.nabble.com>
Message-ID: <28382343.post@talk.nabble.com>

Now that I look at my file it does not look well formed. Do I have to use a file? I tried to do

    tree = ET.parse(xml_response)

but i got a file IO error...
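
One way to avoid the intermediate file, sketched under the assumption of a current Python 3 (where xml.etree.ElementTree replaces cElementTree and iter() replaces getiterator()): parse() expects a filename or file object, so handing it the response text leads to the file IO error described above, while fromstring() takes the text directly. The sample XML is an abbreviated stand-in for the real response, and extract_locations is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the text returned by handler.read();
# the real response carries many more elements.
xml_response = """<GeocodeResponse>
 <status>OK</status>
 <result>
  <geometry>
   <location>
    <lat>42.3118520</lat>
    <lng>-71.2632680</lng>
   </location>
  </geometry>
 </result>
</GeocodeResponse>"""


def extract_locations(xml_text):
    """Parse XML held in a string and return (lat, lng) text pairs."""
    root = ET.fromstring(xml_text)  # fromstring() parses text, not a file
    return [(loc.findtext("lat"), loc.findtext("lng"))
            for loc in root.iter("location")]


locations = extract_locations(xml_response)
print(locations)
```

If the data does need to come from a file-like object, wrapping the text in io.StringIO and passing that to ET.parse() should work as well.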

From flahertyk1 at hotmail.com Tue Apr 27 23:30:39 2010
From: flahertyk1 at hotmail.com (kimmyaf)
Date: Tue, 27 Apr 2010 14:30:39 -0700 (PDT)
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <4BD6CC04.7030302@behnel.de>
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com> <4BD6CC04.7030302@behnel.de>
Message-ID: <28382321.post@talk.nabble.com>

I don't really know... Here's the whole story.

I am retrieving the xml by calling this link.

http://maps.google.com/maps/api/geocode/xml?address=50+Oakland+St,Wellesley,MA,02481&sensor=true

Here's the entire function:

    addr = '50+Oakland+St,Wellesley,MA,02481'

    def geocode_addr(addr):
        hostname = 'http://maps.google.com/maps/api/geocode/xml?'
        prefix = 'address='
        sensor = '&sensor=true'
        url = hostname + prefix + addr + sensor

        print url

        handler = urllib2.urlopen(url)

        xml_response = handler.read()
        print xml_response
        #dom = minidom.parseString(xml_response)
        handler.close()

        tree = ET.parse("GeocodeResponse.xml")
        print 'here'
        for tag in tree.getiterator("location"):
            print 'here1'
            print tag.findtext("lat")
            tag.findtext("lng")

*** I actually just pasted the xml from the shell where i printed xml_response and saved it into an xml file in my folder called GeocodeResponse.xml to test this... before going through the work of saving the xml into a file. I got the "here" but not the "here1"

I'm attaching my actual file..

Sorry! I appreciate the help! this is the last piece of functionality i need to get working for my programming assignment!

http://old.nabble.com/file/p28382321/GeocodeResponse.xml GeocodeResponse.xml

From morillas at gmail.com Wed Apr 28 00:34:54 2010
From: morillas at gmail.com (Luis Miguel Morillas)
Date: Wed, 28 Apr 2010 00:34:54 +0200
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <28382321.post@talk.nabble.com>
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com> <4BD6CC04.7030302@behnel.de>
 <28382321.post@talk.nabble.com>
Message-ID:

2010/4/27 kimmyaf :
> def geocode_addr(addr):
>     hostname = 'http://maps.google.com/maps/api/geocode/xml?'
>     prefix = 'address='
>     sensor = '&sensor=true'
>     url = hostname + prefix + addr + sensor
>

I prefer amara:

    >>> from amara import bindery
    >>> doc = bindery.parse("http://maps.google.com/maps/api/geocode/xml?address=50+Oakland+St,Wellesley,MA,02481&sensor=true")
    >>> locations = doc.xml_select(u'//location')
    >>> for loc in locations:
    ...     print loc.lat, loc.lng
    ...
    42.3118520 -71.2632680

;)

--lm

From bigotp at acm.org Wed Apr 28 01:37:33 2010
From: bigotp at acm.org (Peter Bigot)
Date: Tue, 27 Apr 2010 18:37:33 -0500
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To:
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com> <4BD6CC04.7030302@behnel.de>
 <28382321.post@talk.nabble.com>
Message-ID:

I'd have to concur with that recommendation. Google is uninterested in defining schema for their APIs, so you need to process the XML manually and hope they don't change their interface.

BTW: The lat and lon components of the location are elements, not attributes. For minidom, use:

    lat = item.getElementsByTagName('lat')[0]
    lat.normalize()
    print lat.firstChild.data

Much easier when you can generate a proper binding from a schema, or use something like Amara that does so without a schema.

Peter

On Tue, Apr 27, 2010 at 5:34 PM, Luis Miguel Morillas wrote:
> I prefer amara:
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fdrake at acm.org Wed Apr 28 07:09:59 2010
From: fdrake at acm.org (Fred Drake)
Date: Wed, 28 Apr 2010 01:09:59 -0400
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To:
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com> <4BD6CC04.7030302@behnel.de>
 <28382321.post@talk.nabble.com>
Message-ID:

On Tue, Apr 27, 2010 at 7:37 PM, Peter Bigot wrote:
> Google is uninterested in defining schema for their APIs, so you need to
> process the XML manually and hope they don't change their interface.

And indeed, they do change their schemas without real concern for backward compatibility. The sitemaps are in the middle of changing even now.

-Fred

--
Fred L. Drake, Jr.
"Chaos is the score upon which reality is written." --Henry Miller

From stefan_ml at behnel.de Wed Apr 28 07:43:32 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 28 Apr 2010 07:43:32 +0200
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <28382343.post@talk.nabble.com>
References: <28359328.post@talk.nabble.com> <4BD51A41.400@behnel.de>
 <28370309.post@talk.nabble.com> <4BD6CC04.7030302@behnel.de>
 <28382321.post@talk.nabble.com> <28382343.post@talk.nabble.com>
Message-ID: <4BD7CB04.8070806@behnel.de>

kimmyaf, 27.04.2010 23:32:
>> handler = urllib2.urlopen(url)
>> xml_response = handler.read()
>> handler.close()
>>
>> tree = ET.parse("GeocodeResponse.xml")
>> Do I have to use a file? I tried to do
>>
>> tree = ET.parse(xml_response)

parse() is meant for parsing files. Use fromstring() to parse from a string.

This works for me:

    >>> import xml.etree.cElementTree as ET
    >>> tree = ET.parse('gmap.xml')
    >>> print [ (el.findtext('lat'), el.findtext('lng'))
    ...         for el in tree.getiterator('location') ]
    [('42.3118520', '-71.2632680')]

Stefan

From flahertyk1 at hotmail.com Wed Apr 28 23:37:13 2010
From: flahertyk1 at hotmail.com (kimmyaf)
Date: Wed, 28 Apr 2010 14:37:13 -0700 (PDT)
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <28359328.post@talk.nabble.com>
References: <28359328.post@talk.nabble.com>
Message-ID: <28394291.post@talk.nabble.com>

Thanks all for the help. This gives me a lot of good options and I have a few working.... I learned a lot!
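
For completeness, a minidom sketch of the element-versus-attribute point made earlier in the thread, assuming Python 3 and an abbreviated stand-in for the response. lat and lng are child elements of location, so their values have to be read from each element's text node; getAttribute() simply returns an empty string for an attribute that is not there. first_text is a hypothetical helper name.

```python
from xml.dom import minidom

# Abbreviated stand-in for the geocoding response discussed in the thread.
xml_response = """<GeocodeResponse>
 <status>OK</status>
 <result>
  <geometry>
   <location>
    <lat>42.3118520</lat>
    <lng>-71.2632680</lng>
   </location>
  </geometry>
 </result>
</GeocodeResponse>"""


def first_text(parent, tag_name):
    """Return the text of the first descendant element named tag_name."""
    node = parent.getElementsByTagName(tag_name)[0]
    node.normalize()  # merge adjacent text nodes before reading
    return node.firstChild.data


dom = minidom.parseString(xml_response)
coordinates = [(first_text(loc, "lat"), first_text(loc, "lng"))
               for loc in dom.getElementsByTagName("location")]
print(coordinates)

# The original attempt looked for attributes, which do not exist here.
location = dom.getElementsByTagName("location")[0]
print(repr(location.getAttribute("lat")))
```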