From cjw at ncf.ca Mon Nov 3 20:58:12 2008
From: cjw at ncf.ca (Colin J. Williams)
Date: Mon, 03 Nov 2008 14:58:12 -0500
Subject: [XML-SIG] Messages when installing xml
Message-ID: <490F57D4.8020608@ncf.ca>

An HTML attachment was scrubbed...
URL:

From stefan_ml at behnel.de Mon Nov 3 21:09:18 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 03 Nov 2008 21:09:18 +0100
Subject: [XML-SIG] Messages when installing xml
In-Reply-To: <490F57D4.8020608@ncf.ca>
References: <490F57D4.8020608@ncf.ca>
Message-ID: <490F5A6E.5080907@behnel.de>

Hi,

first of all: is there any reason you are not using the latest binary
packages of lxml 2.1?

Also: the mailing list of lxml is a better place to ask these questions than
the more general XML-SIG list.

Colin J. Williams wrote:
> The log of my install is below:
>
> Building lxml version 2.2.alpha1-59220.
> Building with Cython 0.9.8.1.1.
> ERROR: 'xslt-config' is not recognized as an internal or external command,
> operable program or batch file.
>
> ** make sure the development packages of libxml2 and libxslt are installed **

You need to install libxml2 and libxslt and add the xslt-config script that
ships with libxslt to your program search path when building lxml.

> [...]
> building 'lxml.etree' extension
> C:\MinGW\bin\gcc.exe -mno-cygwin -mdll -O -Wall -IC:\python25\include
> -IC:\python25\PC -c src/lxml\lxml.etree.c -o
> build\temp.win32-2.5\Release\src\lxml\lxml.etree.o -w

It's generally recommended to build lxml statically against libxml2/libxslt
on Windows. There are some build instructions on the web page.
Stefan

From paul at semsfamily.com Thu Nov 6 17:58:40 2008
From: paul at semsfamily.com (psems)
Date: Thu, 6 Nov 2008 08:58:40 -0800 (PST)
Subject: [XML-SIG] Installing PyXML problems
In-Reply-To: <8AB41BD516C9BA43B679016D08318F1F2B949A3758@MAIL01.mpimp-golm.mpg.de>
References: <8AB41BD516C9BA43B679016D08318F1F2B949A3758@MAIL01.mpimp-golm.mpg.de>
Message-ID: <20365133.post@talk.nabble.com>

I'm new to Python too, ran into this problem as well, and couldn't find a
direct answer. Try this, it worked for me:

1. Visit: http://sourceforge.net/project/showfiles.php?group_id=2435
2. Download & Run: Automated MinGW Installer (make sure you check the base,
   g++ and Make)
3. Update your PATH: Add the C:\mingw\bin directory to the system PATH
4. Create (or edit): C:\Python25\Lib\distutils\distutils.cfg and add the
   following 2 lines:

   [build]
   compiler=mingw32

That should do it for you! It worked for me :)

I found this solution at:
http://livingpyxml.python-hosting.com/wiki/AmaraWindowsInstallTips

Liam Childs wrote:
>
> error: Python was built with Visual Studio 2003;
> extensions must be built with a compiler than can generate compatible
> binaries.
> Visual Studio 2003 was not found on this system. If you have Cygwin
> installed,
> you can try compiling with MingW32, by passing "-c mingw32" to setup.py.
>

--
View this message in context: http://www.nabble.com/Installing-PyXML-problems-tp12405796p20365133.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.

From brennan.ron at gmail.com Tue Nov 25 21:29:39 2008
From: brennan.ron at gmail.com (Rbrennan)
Date: Tue, 25 Nov 2008 12:29:39 -0800 (PST)
Subject: [XML-SIG] XML Parsing Newbie
Message-ID: <20689034.post@talk.nabble.com>

Hello,

I am a xml parsing newbie and I am having a hard time because I am so new.
I am trying to parse:

<server>
  <config>
    <threads>100</threads>
    <runs>1</runs>
    <duration>200</duration>
    <process>400</process>
    <rampup>500</rampup>
  </config>
</server>

I want to grab the value of threads which is 100, runs which is 1, duration
which is 200, process which is 400, and rampup which is 500.

Here is my feeble attempt at getting the values between the tags:

import xml.dom.minidom
import sys

class textHandler:

    def parseFile(self):
        tags = xml.dom.minidom.parse("/home/grinder/grinder-3.0.1/data/reportsInfo.txt")
        serverelements = tags.getElementsByTagName ( 'server' )
        for server in serverelements:
            print server.childNodes[0].data

x = textHandler()
x.parseFile()

Can anyone help?

Thanks,
Ron

--
View this message in context: http://www.nabble.com/XML-Parsing-Newbie-tp20689034p20689034.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.

From metolone+gmane at gmail.com Wed Nov 26 02:44:44 2008
From: metolone+gmane at gmail.com (Mark Tolonen)
Date: Tue, 25 Nov 2008 17:44:44 -0800
Subject: [XML-SIG] XML Parsing Newbie
References: <20689034.post@talk.nabble.com>
Message-ID:

"Rbrennan" wrote in message news:20689034.post at talk.nabble.com...
>
> Hello,
>
> I am a xml parsing newbie and I am having a hard time because I am so new.
> I am trying to parse:
>
> <server>
>   <config>
>     <threads>100</threads>
>     <runs>1</runs>
>     <duration>200</duration>
>     <process>400</process>
>     <rampup>500</rampup>
>   </config>
> </server>
>
> I want to grab the value of threads which is 100, runs which is 1,
> duration
> which is 200, process which is 400, and rampup which is 500.
>
> Here is my feeble attempt at getting the values between the tags:
>
> import xml.dom.minidom
> import sys
>
> class textHandler:
>
>     def parseFile(self):
>         tags =
> xml.dom.minidom.parse("/home/grinder/grinder-3.0.1/data/reportsInfo.txt")
>         serverelements = tags.getElementsByTagName ( 'server' )
>         for server in serverelements:
>             print server.childNodes[0].data
>
> x = textHandler()
> x.parseFile()
>
>
> Can anyone help?

Under the DOM model, the first child node in <server> is a text node
containing the whitespace between <server> and <config>.
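
The point about the leading text node can be checked with a minimal sketch like the one below (Python 3 syntax). The sample document is an assumption: the element names come from the question, and the server/config nesting is inferred from the replies in this thread. The helper `child_text` is a hypothetical name introduced only for this illustration.

```python
import xml.dom.minidom

# Assumed sample document: element names are taken from the question,
# the nesting (server > config > threads ...) is inferred from the replies.
SAMPLE = """<server>
  <config>
    <threads>100</threads>
    <runs>1</runs>
    <duration>200</duration>
    <process>400</process>
    <rampup>500</rampup>
  </config>
</server>"""

doc = xml.dom.minidom.parseString(SAMPLE)
server = doc.getElementsByTagName("server")[0]

# The first child of <server> is a text node that holds only the newline
# and indentation in front of <config>, not the <config> element itself.
first = server.childNodes[0]
print(first.nodeType == first.TEXT_NODE, repr(first.data))


def child_text(parent, name):
    """Return the text of the first descendant element called `name`."""
    element = parent.getElementsByTagName(name)[0]
    return element.childNodes[0].data


names = ("threads", "runs", "duration", "process", "rampup")
values = {name: child_text(server, name) for name in names}
print(values)
```

Looking the elements up by tag name sidesteps the whitespace nodes entirely, at the cost of assuming each name occurs only once under the server element.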
If you know there is only one threads element under each server you can use:

    for server in serverelements:
        print server.getElementsByTagName('threads')[0].childNodes[0].data

But you will probably find ElementTree a much more useful xml library:

    from xml.etree import ElementTree
    tree = ElementTree.parse("/home/grinder/grinder-3.0.1/data/reportsInfo.txt")
    print tree.find('config/threads').text

-Mark

From stefan_ml at behnel.de Wed Nov 26 10:23:54 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 26 Nov 2008 10:23:54 +0100
Subject: [XML-SIG] XML Parsing Newbie
In-Reply-To:
References: <20689034.post@talk.nabble.com>
Message-ID: <492D15AA.3070705@behnel.de>

Mark Tolonen wrote:
> "Rbrennan" wrote
>> I am trying to parse:
>>
>> <server>
>>   <config>
>>     <threads>100</threads>
>>     <runs>1</runs>
>>     <duration>200</duration>
>>     <process>400</process>
>>     <rampup>500</rampup>
>>   </config>
>> </server>
>>
>> I want to grab the value of threads which is 100, runs which is 1,
>> duration which is 200, process which is 400, and rampup which is 500.
>
> you will probably find ElementTree a much more useful xml library:
>
> from xml.etree import ElementTree
> tree =
> ElementTree.parse("/home/grinder/grinder-3.0.1/data/reportsInfo.txt")
> print tree.find('config/threads').text

There's also lxml.objectify, which seems well adapted to your data.

    >>> from lxml import objectify
    >>> server = objectify.parse('thefile.xml').getroot()
    >>> server.config.threads
    100

Note how it shows 100, not '100'.

http://codespeak.net/lxml/

Stefan

From benkokakao at gmail.com Sun Nov 30 20:25:37 2008
From: benkokakao at gmail.com (Christian Benke)
Date: Sun, 30 Nov 2008 20:25:37 +0100
Subject: [XML-SIG] element-value in multiple namespace
Message-ID: <20081130202537.c56fec35.benkokakao@gmail.com>

Hello!

I'm currently struggling to extract some information from a gpx-file
(geodata in xml-format).
You can see the xml-content here:
http://benko.login.cx/2008-11-21.xml

So far i've managed to get out most of the values with this function:

    tree = etree.parse(gpxfile)
    gpx_namespace = "http://www.topografix.com/GPX/1/1"
    root = tree.getroot()
    trackSegments = root.getiterator("{%s}trkseg"%gpx_namespace)
    for trackSegment in trackSegments:
        for trackPoint in trackSegment:
            lat=trackPoint.attrib['lat']
            lon=trackPoint.attrib['lon']
            altitude=trackPoint.find('{%s}ele'% gpx_namespace).text
            time=trackPoint.find('{%s}time'% gpx_namespace).text

However, i have little idea what the syntax to pick out the values of the
Temperature- and Pressure-elements has to be, which have an additional
namespace, plus they have a "gpxx:"-prefix.

I've struggled yesterday night with this but didn't succeed. Can someone
give me a kick-off how this would work?

Cheers
Christian

From stefan_ml at behnel.de Sun Nov 30 21:24:53 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 30 Nov 2008 21:24:53 +0100
Subject: [XML-SIG] element-value in multiple namespace
In-Reply-To: <20081130202537.c56fec35.benkokakao@gmail.com>
References: <20081130202537.c56fec35.benkokakao@gmail.com>
Message-ID: <4932F695.6000406@behnel.de>

Hi,

Christian Benke wrote:
> I'm currently struggling to extract some information from a gpx-file
> (geodata in xml-format).
> You can see the xml-content here:
> http://benko.login.cx/2008-11-21.xml
>
> So far i've managed to get out most of the values with this function:
>
> tree = etree.parse(gpxfile)
> gpx_namespace = "http://www.topografix.com/GPX/1/1"
> root = tree.getroot()
> trackSegments = root.getiterator("{%s}trkseg"%gpx_namespace)
> for trackSegment in trackSegments:
>     for trackPoint in trackSegment:
>         lat=trackPoint.attrib['lat']
>         lon=trackPoint.attrib['lon']
>         altitude=trackPoint.find('{%s}ele'% gpx_namespace).text
>         time=trackPoint.find('{%s}time'% gpx_namespace).text
>
> However, i have little idea what the syntax to pick out the
> values of the Temperature- and Pressure-elements has to be, which have
> an additional namespace

Then use that namespace to ask for them in exactly the same way as you did
above with the gpx_namespace.

http://effbot.org/zone/element-namespaces.htm
http://effbot.org/zone/element-xpath.htm

> plus they have a "gpxx:"-prefix.

Prefixes are not relevant. What counts is the namespace URI.
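
As a small illustration of why the prefix does not matter, here is a sketch (Python 3 syntax) with two documents that bind different prefixes to the same URI. The URI `http://example.com/ext`, the prefix `foo`, and the helper `temperature` are made up for this example and do not come from the GPX file discussed here.

```python
from xml.etree import ElementTree

# Hypothetical namespace URI, used only for illustration.
EXT_NS = "http://example.com/ext"

# Two documents that differ only in the prefix bound to EXT_NS.
DOC_A = ('<root xmlns:gpxx="%s"><gpxx:Temperature>26</gpxx:Temperature></root>'
         % EXT_NS)
DOC_B = ('<root xmlns:foo="%s"><foo:Temperature>26</foo:Temperature></root>'
         % EXT_NS)


def temperature(xml_text):
    """Look the element up by '{namespace-uri}localname'; no prefix involved."""
    root = ElementTree.fromstring(xml_text)
    return root.find("{%s}Temperature" % EXT_NS).text


print(temperature(DOC_A), temperature(DOC_B))

# After parsing, the tag carries the URI in braces and the prefix is gone.
tag_a = ElementTree.fromstring(DOC_A)[0].tag
tag_b = ElementTree.fromstring(DOC_B)[0].tag
print(tag_a, tag_a == tag_b)
```

Both documents give the same parsed tag, which is why the lookup string is built from the URI alone.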
Stefan

From benkokakao at gmail.com Sun Nov 30 22:22:57 2008
From: benkokakao at gmail.com (Christian Benke)
Date: Sun, 30 Nov 2008 22:22:57 +0100
Subject: [XML-SIG] element-value in multiple namespace
In-Reply-To: <4932F695.6000406@behnel.de>
References: <20081130202537.c56fec35.benkokakao@gmail.com> <4932F695.6000406@behnel.de>
Message-ID: <20081130222257.8afefa62.benkokakao@gmail.com>

On Sun, 30 Nov 2008 21:24:53 +0100 Stefan Behnel wrote:

> > tree = etree.parse(gpxfile)
> > gpx_namespace = "http://www.topografix.com/GPX/1/1"
> > root = tree.getroot()
> > trackSegments = root.getiterator("{%s}trkseg"%gpx_namespace)
> > for trackSegment in trackSegments:
> >     for trackPoint in trackSegment:
> >         lat=trackPoint.attrib['lat']
> >         lon=trackPoint.attrib['lon']
> >         altitude=trackPoint.find('{%s}ele'% gpx_namespace).text
> >         time=trackPoint.find('{%s}time'% gpx_namespace).text

> Then use that namespace to ask for them in exactly the same way as
> you did above with the gpx_namespace.
>
> http://effbot.org/zone/element-namespaces.htm
> http://effbot.org/zone/element-xpath.htm

I'm afraid that's what i'm trying all the time. Using the script above gives
me a trackPoint-element. Running

    >>> etree.tostring(trackPoint.find('{%s}extensions'% gpx_namespace))
    ' \n \n26 \n1020\n \n\n'

returns the subtree i'm looking for. I'd assume i can grab the contents of
this with:

    ext_namespace = "http://gps.wintec.tw/xsd"
    trackPoint.find('{%s}extensions'% gpx_namespace).find('{%s} Temperature'% ext_namespace)

But this is not correct, the result is 'None'.

Imho this is exactly the same as further above where i extract
lon,lat,altitude,..., besides that i don't use gpx_namespace but
ext_namespace in the find().

Should the gpx_namespace be applied additionally to the ext_namespace in the
extensions-subelements as it is a subelement of the gpx_namespace?

Puzzled
Christian
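
One way to narrow this down is sketched below (Python 3 syntax). It rests on assumptions: the serialized output above seems to show one more level of nesting between extensions and the temperature value, so the sample uses a wrapper element whose name, `TrackPointExtension`, is a guess, and the extension URI is simply the one quoted in the message, unverified against the real file. Listing the parsed tags shows the exact `{uri}name` strings to search for. `find()` with a bare tag only looks at direct children, and the tag has to be written without whitespace between the closing brace and the local name.

```python
from xml.etree import ElementTree

GPX_NS = "http://www.topografix.com/GPX/1/1"
EXT_NS = "http://gps.wintec.tw/xsd"  # URI as quoted in the message, unverified

# Hypothetical track point: the wrapper name TrackPointExtension is an
# assumption, chosen only to reproduce one extra level of nesting.
SAMPLE = """<trkpt xmlns="%s" xmlns:gpxx="%s">
  <extensions>
    <gpxx:TrackPointExtension>
      <gpxx:Temperature>26</gpxx:Temperature>
      <gpxx:Pressure>1020</gpxx:Pressure>
    </gpxx:TrackPointExtension>
  </extensions>
</trkpt>""" % (GPX_NS, EXT_NS)

track_point = ElementTree.fromstring(SAMPLE)
extensions = track_point.find("{%s}extensions" % GPX_NS)

# List the fully qualified tags that are really present in the subtree.
tags = [element.tag for element in extensions.iter()]
for tag in tags:
    print(tag)

# A bare tag only matches direct children, so this lookup returns None here.
direct = extensions.find("{%s}Temperature" % EXT_NS)

# ".//" searches all descendants and finds the nested element.
nested = extensions.find(".//{%s}Temperature" % EXT_NS)
print(direct, nested.text)
```

If the printed tags show a different URI or wrapper name in the real file, those strings are the ones to use in `find()`; the GPX namespace does not need to be combined with the extension namespace.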