From stefan_ml at behnel.de Sun Jul 1 17:38:36 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 01 Jul 2007 17:38:36 +0200
Subject: [XML-SIG] lxml 1.3 released
In-Reply-To: <4687BAFA.1080201@comcast.net>
References: <467E51B2.4020207@behnel.de> <4687BAFA.1080201@comcast.net>
Message-ID: <4687CA7C.9010409@behnel.de>

Hi,

Gloria W wrote:
> There's no chance of getting an extension to this module which supports
> DOM2, is there? I cannot work with the current PyXML DOM2 support. It is
> inflexible (does not allow subtree construction/insertion), is buggy,
> and bloated.

Well, "bloat" is a word I would use for any DOM implementation.

lxml is actually quite the opposite of the three: extremely flexible, safe and
simple.

> I wrote my own, but I don't have time to implement the
> range() functionality. Let me know if there are plans to extend this. It
> would be great.

No. lxml will not support the DOM API. It already has a (mostly?) equivalent
API that is much simpler in spirit (and thus much easier to use), so there is
no reason for us to take the step back to the impressively un-pythonic DOM API.

If you really want a W3C-DOM compatible API and want to use libxml2, there is
a project that implements DOM on top of them: libxml2dom.

http://www.boddie.org.uk/python/libxml2dom.html

I assume this is for porting existing code? But even then, you may consider
rewriting the XML parts in lxml. We had a couple of comments on the list that
make me believe that this is a) not that hard (depending on your code
size/architecture) and b) worth it, at least in the cases I heard about.
Oh, and for the really hard-to-port stuff, you can still use Python's DOM
support:

http://codespeak.net/lxml/sax.html

Stefan

From stefan_ml at behnel.de Sun Jul 1 18:02:33 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 01 Jul 2007 18:02:33 +0200
Subject: [XML-SIG] lxml 1.3 released
In-Reply-To: <4687BAFA.1080201@comcast.net>
References: <467E51B2.4020207@behnel.de> <4687BAFA.1080201@comcast.net>
Message-ID: <4687D019.1030109@behnel.de>

Gloria W wrote:
> There's no chance of getting an extension to this module which supports
> DOM2, is there? I cannot work with the current PyXML DOM2 support. It is
> inflexible (does not allow subtree construction/insertion), is buggy,
> and bloated. I wrote my own, but I don't have time to implement the
> range() functionality. Let me know if there are plans to extend this. It
> would be great.

Ah, I forgot to say that lxml.etree is obviously flexible enough to support a
DOM compatible implementation on top of itself. It's just that no-one has done
it and it is unlikely that someone takes the time to actually do it. It would
not add any functionality that isn't there already, just with a less pythonic
API.

In case you consider starting such a thing, here's how to do it:

http://codespeak.net/lxml/element_classes.html

Stefan

From robert.rawlins at thinkbluemedia.co.uk Mon Jul 2 13:20:21 2007
From: robert.rawlins at thinkbluemedia.co.uk (Robert Rawlins - Think Blue)
Date: Mon, 2 Jul 2007 12:20:21 +0100
Subject: [XML-SIG] Help Needed (Will pay if someone is interested)
Message-ID: <002701c7bc9a$fa7e5fc0$ef7b1f40$@rawlins@thinkbluemedia.co.uk>

Hello Chaps,

I'm looking for some help with XML parsing. I've been playing around with this
over the past few days, and the only solution I can come up with seems to be a
little slow and also leaves what I think is a memory leak in my application,
which causes all kinds of problems.

I have a very simple XML file whose elements I need to loop over to extract
the attribute information, but the loop must be conditional, as the attributes
must meet certain criteria.

My current solution is using minidom, which I've read isn't one of the better
parsers. If anyone knows of any that are better for the task I would love to
hear it. The content is extracted regularly, so whatever we choose needs to be
quick, and validation isn't so important. Take a look at this brief example of
the XML we're dealing with:

Now this file details events which are possibly going to occur over the next
couple of weeks. What I need is a function called 'getCurrentEvent()' which
will return any events that should be occurring at this point in time, or
now(). The 'Type' attribute details how often the event is likely to recur,
1 being daily, 2 being weekly and so on. If no elements are found which are
occurring at this time and date, then I would like it to return the default
event, which is defined in the attributes of the 'schedules' tag.

The current solution I have put together uses minidom to loop over the
elements from the XML and then does a conditional against a Python module
called 'period.py'. This works OK, but it's very slow and also contains a
memory leak. I need something better, and I have no real idea or experience of
how to achieve it, which is why I'm here with you good gentlemen to try and
find a solution.

I appreciate this could be quite a challenging task, so I would be happy to
pay someone for their time to solve this for me. You may want to contact me
off list to talk about that, though, and we'd be hoping to get this sorted
ASAP.

Thanks guys,

Rob
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20070702/41cd5b1e/attachment.htm

From robert.rawlins at thinkbluemedia.co.uk Mon Jul 2 16:21:19 2007
From: robert.rawlins at thinkbluemedia.co.uk (Robert Rawlins - Think Blue)
Date: Mon, 2 Jul 2007 15:21:19 +0100
Subject: [XML-SIG] Help Needed (Will pay if someone is interested)
Message-ID: <005501c7bcb4$42646780$c72d3680$@rawlins@thinkbluemedia.co.uk>

Hello Chaps,

I'm looking for some help with XML parsing. I've been playing around with this
over the past few days, and the only solution I can come up with seems to be a
little slow and also leaves what I think is a memory leak in my application,
which causes all kinds of problems.

I have a very simple XML file whose elements I need to loop over to extract
the attribute information, but the loop must be conditional, as the attributes
must meet certain criteria.

My current solution is using minidom, which I've read isn't one of the better
parsers. If anyone knows of any that are better for the task I would love to
hear it. The content is extracted regularly, so whatever we choose needs to be
quick, and validation isn't so important. Take a look at this brief example of
the XML we're dealing with:

Now this file details events which are possibly going to occur over the next
couple of weeks. What I need is a function called 'getCurrentEvent()' which
will return any events that should be occurring at this point in time, or
now(). The 'Type' attribute details how often the event is likely to recur,
1 being daily, 2 being weekly and so on. If no elements are found which are
occurring at this time and date, then I would like it to return the default
event, which is defined in the attributes of the 'schedules' tag.

The current solution I have put together uses minidom to loop over the
elements from the XML and then does a conditional against a Python module
called 'period.py'. This works OK, but it's very slow and also contains a
memory leak. I need something better, and I have no real idea or experience of
how to achieve it, which is why I'm here with you good gentlemen to try and
find a solution.

I appreciate this could be quite a challenging task, so I would be happy to
pay someone for their time to solve this for me. You may want to contact me
off list to talk about that, though, and we'd be hoping to get this sorted
ASAP.

I'm thinking maybe some form of xquery instead of the iteration? I really
don't know, it's up to you.

Thanks guys,

Rob
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20070702/6d1f7614/attachment.html

From stefan_ml at behnel.de Mon Jul 2 17:12:31 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 02 Jul 2007 17:12:31 +0200
Subject: [XML-SIG] Help Needed (Will pay if someone is interested)
In-Reply-To: <002701c7bc9a$fa7e5fc0$ef7b1f40$@rawlins@thinkbluemedia.co.uk>
References: <002701c7bc9a$fa7e5fc0$ef7b1f40$@rawlins@thinkbluemedia.co.uk>
Message-ID: <468915DF.8060701@behnel.de>

Robert Rawlins - Think Blue wrote:
> I'm looking for some help with XML parsing, I've been playing around
> with this over the past few days and the only solution I can come up
> with seems to be a little slow and also leaves what I think is a memory
> leak in my application, which causes all kinds of problems.
>
> I have a very simple XML file which I need to loop over the elements and
> extract the attribute information from, but the loop must be conditional
> as the attributes must meet certain criteria.
>
> My current solution is using minidom,

That's not the solution, that's the problem. Use cElementTree.

> which I've read isn't one of the
> better parsers, if anyone knows of any that are better for the task I
> would love to hear it, the content is extracted regularly so whatever we
> choose needs to be quick, and validation isn't so important. Take a look
> at this brief example of the XML we're dealing with:
>
> type="1" start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
>
> start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
>
> type="1" start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
>
> type="3" start="2007-01-01 00:00:00" end="2007-01-01 00:00:00" />
>
> Now this file details events which are possibly going to occur over the
> next couple of weeks. Now what I need to do is have a function which is
> called 'getCurrentEvent()' which will return any events that should be
> occurring at this point in time, or now().

    from xml.etree import cElementTree as et  # Python 2.5

    # untested
    # Timestamps in "YYYY-MM-DD HH:MM:SS" form sort correctly as plain strings.
    search_date = "2007-02-03 00:00:00"
    for _, element in et.iterparse("event-file.xml"):
        if element.tag == "event":
            start = element.get("start")
            end = element.get("end")
            if start > search_date:
                continue  # event has not started yet
            if end != start and end < search_date:
                continue  # event is already over
            print et.tostring(element)

or something like that. You'll love the performance.

> The 'Type' attribute details
> how often the event is likely to reoccur, 1 being daily, 2 being weekly
> and so on, if no elements are found which are occurring in this time and
> date then I would like it to return the default event which is defined
> in the attributes of the 'schedules' tag.

That's much harder, as it requires real date calculation in general. Are you
sure you want an XML tree as a database? Why not read the file into a more
suitable in-memory data structure and search from there?
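
A minimal sketch of what such an in-memory structure could look like, under
stated assumptions: it is written for Python 3 rather than the Python 2.5 used
in this thread, the element name "event" and the attributes "name" and
"default" are guesses (the posted XML was scrubbed down to type/start/end),
and only the two recurrence types that were described (1 = daily, 2 = weekly)
are handled. Anything else is treated as a one-off event.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
Event = namedtuple("Event", "start end type attrib")

# Assumed recurrence steps: the thread only says 1 = daily, 2 = weekly.
STEP = {"1": timedelta(days=1), "2": timedelta(weeks=1)}

# Hypothetical sample document; "name" and "default" are illustrative only.
SAMPLE = """<schedules default="idle">
  <event name="a" type="1" start="2007-01-01 09:00:00" end="2007-01-01 10:00:00"/>
  <event name="b" type="2" start="2007-01-02 12:00:00" end="2007-01-02 13:00:00"/>
</schedules>"""


def load_events(xml_text):
    """Parse once; return (default attribute of the root, list of Event)."""
    root = ET.fromstring(xml_text)
    events = [
        Event(
            datetime.strptime(e.get("start"), FMT),
            datetime.strptime(e.get("end"), FMT),
            e.get("type"),
            dict(e.attrib),
        )
        for e in root.iter("event")
    ]
    return root.get("default"), events


def is_active(event, now):
    """True if `now` falls inside the event window or one of its repeats."""
    if now < event.start:
        return False
    step = STEP.get(event.type)
    if step is None:
        # Unknown or missing type: treat as a single, non-repeating window.
        return now <= event.end
    # Fold `now` back into the first occurrence by dropping whole steps.
    offset = (now - event.start) % step
    return offset <= (event.end - event.start)


def current_events(events, now):
    """Return every event active at `now`; the caller falls back to the default."""
    return [e for e in events if is_active(e, now)]
```

The file is parsed once and each lookup afterwards is plain datetime
arithmetic, e.g. `current_events(events, datetime.now()) or [default]`.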
Stefan

From robert.rawlins at thinkbluemedia.co.uk Mon Jul 2 17:19:02 2007
From: robert.rawlins at thinkbluemedia.co.uk (Robert Rawlins - Think Blue)
Date: Mon, 2 Jul 2007 16:19:02 +0100
Subject: [XML-SIG] Help Needed (Will pay if someone is interested)
In-Reply-To: <468915DF.8060701@behnel.de>
References: <002701c7bc9a$fa7e5fc0$ef7b1f40$@rawlins@thinkbluemedia.co.uk> <468915DF.8060701@behnel.de>
Message-ID: <006901c7bcbc$551d5e10$ff581a30$@rawlins@thinkbluemedia.co.uk>

Hi Stefan.

Thanks for getting back to me so quickly, I've been tearing my hair out on
this one :-)

>> Why not read the file into a more suitable in-memory data structure and
>> search from there?

I'd be more than happy to do something like this, I just have no idea how.
What type of data structure are you thinking would be simple?

Thanks for the cElementTree example. The code already looks a lot cleaner than
the minidom stuff I was working on; jeez, that stuff was messy lol.

Thanks. You'll have to excuse any naivety on my part, as I'm relatively new to
both XML and Python, and mixing the two is making my head spin :-D

Rob

From dkuhlman at rexx.com Mon Jul 2 20:21:46 2007
From: dkuhlman at rexx.com (Dave Kuhlman)
Date: Mon, 2 Jul 2007 11:21:46 -0700
Subject: [XML-SIG] lxml 1.3 released
In-Reply-To: <4687CA7C.9010409@behnel.de>
References: <467E51B2.4020207@behnel.de> <4687BAFA.1080201@comcast.net> <4687CA7C.9010409@behnel.de>
Message-ID: <20070702182146.GA10229@cutter.rexx.com>

On Sun, Jul 01, 2007 at 05:38:36PM +0200, Stefan Behnel wrote:
> Hi,
>
> Gloria W wrote:
> > There's no chance of getting an extension to this module which supports
> > DOM2, is there? I cannot work with the current PyXML DOM2 support. It is
> > inflexible (does not allow subtree construction/insertion), is buggy,
> > and bloated.
>
> Well, "bloat" is a word I would use for any DOM implementation.
>
> lxml is actually quite the opposite of the three: extremely flexible, safe and
> simple.

Stefan -

Just so you don't take silence as thank-less-ness, thank you for lxml. I use
it frequently. My rst2odt writer for Docutils (converts reStructuredText to
.odt files for OpenOffice oowriter) is built on it.

It's great. Thanks much.

Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman
From robert.rawlins at thinkbluemedia.co.uk Tue Jul 10 16:32:48 2007
From: robert.rawlins at thinkbluemedia.co.uk (Robert Rawlins - Think Blue)
Date: Tue, 10 Jul 2007 15:32:48 +0100
Subject: [XML-SIG] Parsing Help
Message-ID: <022501c7c2ff$30c4ee90$924ecbb0$@rawlins@thinkbluemedia.co.uk>

Hello Guys,

I'm looking for some help building a function which can parse some XML for me
using ElementTree. The document is of a very consistent format and I've copied
an example of the document below.

Now, the piece of information I'm looking to retrieve is inside the element
and is, in this example , however I want the function to return the standard
integer value and not the uint8 encoded version, so instead of my function
returning '0x05' it just needs to return '5', which is the standard integer
version.

I will be passing this XML into the function as a string, so the function will
be formed something like this:

    def myFunction(XmlAsString):
        # parse the XML and extract my value ...
        # return the value as an integer ...

I'm not sure on the best method to do this, I just want something nice and
quick, lightweight and that's not resource hungry.

Can anyone offer some advice on this?

Thanks guys,

Rob
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20070710/f454b2d2/attachment.htm

From dkuhlman at rexx.com Tue Jul 10 17:53:22 2007
From: dkuhlman at rexx.com (Dave Kuhlman)
Date: Tue, 10 Jul 2007 08:53:22 -0700
Subject: [XML-SIG] Parsing Help
In-Reply-To: <022501c7c2ff$30c4ee90$924ecbb0$@rawlins@thinkbluemedia.co.uk>
References: <022501c7c2ff$30c4ee90$924ecbb0$@rawlins@thinkbluemedia.co.uk>
Message-ID: <20070710155322.GA21020@cutter.rexx.com>

On Tue, Jul 10, 2007 at 03:32:48PM +0100, Robert Rawlins - Think Blue wrote:
> Hello Guys,
>
> I'm looking for some help building a function which can parse some XML for
> me using ElementTree. The document is of a very consistent format and I've
> copied an example of the document below.
>

Here are some suggestions.

Import ElementTree or Lxml:

    from xml.etree import ElementTree as etree

Or:

    from lxml import etree

Parse the string:

    root = etree.fromstring(xmlstring)

Iterate over the nodes in the tree:

    for node in root.getiterator():

Check for the "attribute" tag:

    if node.tag == 'attribute':
    # But, use something like the following if there is a namespace.
    #if node.tag == '{%s}attribute' % (node.nsmap['mynamespace'], ):

Get the "id" attribute (or None if there isn't one):

    charid = node.get('id', None)

Enough to get you started?

Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman

From tosh54 at gmail.com Mon Jul 16 04:39:15 2007
From: tosh54 at gmail.com (Peter Hoffmann)
Date: Sun, 15 Jul 2007 19:39:15 -0700
Subject: [XML-SIG] add Namespace Defaulting to ElementTree.write()
Message-ID: <1184553555.324929.149040@n2g2000hse.googlegroups.com>

Hi!

As I sometimes have to look at or even edit XML markup generated by
ElementTree, it would be a lot easier if ElementTree could use Namespace
Defaulting as described in http://www.w3.org/TR/REC-xml-names/#defaulting

Here is an example of what I mean:

### raw input
Cheaper by the Dozen
1568491379

### normal ElementTree output
Cheaper by the Dozen
1568491379

### with Patch/set default_namespace="urn:loc.gov:books"
Cheaper by the Dozen
1568491379

I wrote a small patch against ElementTree.py (Python 2.5.1 (r251:54863, May 2
2007, 16:56:35)) so that one can set a namespace to be used as the default
namespace when serialising an XML tree. You can find it at
http://user.cs.tu-berlin.de/~tosh/elementtree/

Some basic tests are provided in selftest.py and some examples in test.py.

Any chance that the patch, or functionality like this, gets added to
ElementTree?
Regards
Peter

From stefan_ml at behnel.de Mon Jul 16 08:29:20 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 16 Jul 2007 08:29:20 +0200
Subject: [XML-SIG] add Namespace Defaulting to ElementTree.write()
In-Reply-To: <1184553555.324929.149040@n2g2000hse.googlegroups.com>
References: <1184553555.324929.149040@n2g2000hse.googlegroups.com>
Message-ID: <469B1040.10903@behnel.de>

Peter Hoffmann wrote:
> As I sometimes have to look at or even edit xml markup generated by
> ElementTree, it would be a lot easier if ElementTree could use
> Namespace Defaulting as described in http://www.w3.org/TR/REC-xml-names/#defaulting
>
> Here is an example what I mean:
> ### raw input
> xmlns:isbn='urn:ISBN:0-395-36341-6'>
> Cheaper by the Dozen
> 1568491379
>
> ### normal ElementTree output
>
> Cheaper by the Dozen
> 1568491379 ns1:number>
>
> ### with Patch/set default_namespace="urn:loc.gov:books"
>
> Cheaper by the Dozen
> 1568491379 ns1:number>
>

lxml.etree has been using a property called "nsmap" since the beginning, which
is already more generic than just a default namespace. If ElementTree wants to
adopt such a feature, I'd be happy if it could keep up compatibility from its
own side here.

Stefan

From dantrevino at gmail.com Mon Jul 30 19:53:05 2007
From: dantrevino at gmail.com (Dan Trevino)
Date: Mon, 30 Jul 2007 13:53:05 -0400
Subject: [XML-SIG] problem parsing msproject xml
Message-ID:

I'm trying to parse project xml. The main thing I'm trying to get at is the
task name, which is basically in this structure:

1
do step 1    <-- i want the text from here
...

John Doe
...

I'm having difficulty figuring out which methods to use to access the data. I
can't get to "Name" directly because it is also used for project resources, so
I need the task name specifically. Where do I go from here:

==============================
>>> prjdoc = minidom.parse('prj.xml')
>>> tasklist = prjdoc.getElementsByTagName("Task")
>>> for task in tasklist:
...     taskname = task.getElementsByTagName('Name')
...     print taskname
...
[]
[]
[]
[]
[]
[]
================================

TIA,
dan

From billk at sunflower.com Tue Jul 31 19:24:47 2007
From: billk at sunflower.com (Bill Kinnersley)
Date: Tue, 31 Jul 2007 12:24:47 -0500
Subject: [XML-SIG] problem parsing msproject xml
In-Reply-To:
References:
Message-ID: <46AF705F.308@sunflower.com>

Never used DOM and never written a line of Python, but maybe even I know the
answer to this one!

The minidom documentation suggests that instead of

    print taskname

you should be saying

    print taskname.firstChild.data

Bill K

From dantrevino at gmail.com Tue Jul 31 20:02:26 2007
From: dantrevino at gmail.com (Dan Trevino)
Date: Tue, 31 Jul 2007 14:02:26 -0400
Subject: [XML-SIG] Fwd: problem parsing msproject xml
In-Reply-To:
References: <46AF705F.308@sunflower.com>
Message-ID:

Thanks

I tried this, but:

>>> prjdoc = xml.dom.minidom.parse('prj.xml')
>>> tasklist = prjdoc.getElementsByTagName('Task')
>>> for task in tasklist:
...     taskname = task.getElementsByTagName('Name')
...     print taskname.firstChild.data
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
AttributeError: 'NodeList' object has no attribute 'firstChild'
>>>

On 7/31/07, Bill Kinnersley wrote:
> Never used DOM and never written a line of Python, but maybe even I know
> the answer to this one!
>
> The minidom documentation suggests that instead of
>
> print taskname
>
> you should be saying
>
> print taskname.firstChild.data
>
> Bill K

From stefan_ml at behnel.de Tue Jul 31 20:28:19 2007
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 31 Jul 2007 20:28:19 +0200
Subject: [XML-SIG] problem parsing msproject xml
In-Reply-To:
References:
Message-ID: <46AF7F43.9050608@behnel.de>

Dan Trevino wrote:
> I'm trying to parse project xml. The main thing i'm trying to get at
> is the task name, which is basically in this structure:
>
> 1
> do step 1 <-- i want the text from here
> ...
>
> John Doe
> ...
>
> I'm having difficulty figuring out which methods to use to access the
> data. I cant get to "Name" directly because it is used also for
> project resources....so I need the task name specifically.

Try lxml.etree:

    >>> # untested
    >>> from lxml import etree
    >>> tree = etree.parse("project.xml")
    >>> print tree.xpath("//Task/Name/text()")
    ["do step 1", ...]

or if you don't like XPath:

    >>> # untested
    >>> from lxml import etree
    >>> tree = etree.parse("project.xml")
    >>> for task in tree.getiterator("Task"):
    ...     for name in task.findall("Name"):
    ...         print name.text
    do step 1
    ...

http://codespeak.net/lxml

Stefan
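
For comparison, the same lookup can be sketched with the standard library
only. This is an illustration under assumptions, not a tested answer for real
MS Project files: it targets Python 3 (the thread used Python 2.5), the sample
document below is hypothetical (the posted structure lost its tags, so the
element names Project, Tasks, Resources and UID are guesses), and it assumes
no XML namespace is declared. The minidom variant also shows why the
AttributeError above occurs: getElementsByTagName() returns a NodeList, so
each element in it has to be visited before firstChild is available.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample; a real project file is larger and may use a namespace.
SAMPLE = """<Project>
  <Tasks>
    <Task><UID>1</UID><Name>do step 1</Name></Task>
    <Task><UID>2</UID><Name>do step 2</Name></Task>
  </Tasks>
  <Resources>
    <Resource><UID>1</UID><Name>John Doe</Name></Resource>
  </Resources>
</Project>"""


def task_names_etree(xml_text):
    """Return the text of every Name that is a direct child of a Task."""
    root = ET.fromstring(xml_text)
    return [
        name.text
        for task in root.iter("Task")
        for name in task.findall("Name")
    ]


def task_names_minidom(xml_text):
    """Same result with minidom, iterating over each NodeList explicitly."""
    doc = minidom.parseString(xml_text)
    names = []
    for task in doc.getElementsByTagName("Task"):
        for name in task.getElementsByTagName("Name"):
            # Skip empty <Name/> elements, which have no text child.
            if name.firstChild is not None:
                names.append(name.firstChild.data)
    return names
```

Both functions skip the Name under Resource because they only look below Task
elements.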