[XML-SIG] I am confused...

Uche Ogbuji uche.ogbuji@fourthought.com
Sun, 28 Jan 2001 08:46:34 -0700


> (for some reason I have not received replies
> from the list in my mailbox - but I'll try
> to answer on reading Martin's reply from Web-page)
> 
> On Sun, 28 Jan 2001, Roman Suzi wrote:
> 
> >- are Python XML tools (and which of them?) up to the task of facilitating
> >site-generation with bearable speed?
> 
> I remember I was doing queries in the form
> "/article/author/name"
> - and it was so slow... (0.5 - 1 sec per query on Celeron 400)

What size was the file?  The time you mentioned is in line for using 4XPath on 
a 640KB file, as you can see in this demo:

[uogbuji@borgia uogbuji]$ python
Python 2.0 (#6, Oct 26 2000, 12:04:19) 
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> f = open("bigxml", "w")
>>> f.write("<article>\n")
>>> for i in range(10000):
...     f.write("<author><name>Uche Ogbuji</name><name>Roman Suzi</name>
</author>")
... 
>>> f.write("</article>\n")
>>> f.close()              
>>> from Ft.Lib.cDomlette import RawExpatReader
>>> reader = RawExpatReader()
>>> doc = reader.fromUri("bigxml")
>>> from xml.xpath import Evaluate
>>> import time
>>> start = time.time(); result = Evaluate("/article/author/name", 
contextNode=doc); end = time.time()
>>> print end - start
1.24777603149
>>> len(result)
20000
>>> 

bigxml is 640K once generated.  I don't think it's unreasonable for processing 
of that file that navigates through and extracts 20,000 nodes according to a 
path expression.

If you cut the loop to generate only 100 author elements (6.4K file), the 
XPath only takes 0.018 seconds to execute.

I'm curious to learn more about your data and the Python app you're using.  
You say not 4Suite so I assume you mean the old PyPath that used to come in 
PyXML.

> In my application I need many such queries to fill
> the template - that is why speed was unbearable.
> 
> Please, tell me if I did it wrong:
> 
> - parsed xml-file
> - quered each variable in a template-file from the xml-file
> - filled template with values found to produce web-page
>   (some variables go to other pages, for example, content page)
> 
> I am trying to learn XML for 2 years already but am
> still a newbie in practice.
> 
> Anyway, before claiming XML tools for Python slow I need to recheck
> with new versions - if there are no objections to the
> above scheme. (And what is preferrable tool for queries?
> XPath?)

It depends on the nature of the queries.

> Is there any on-line tutorial (?) or just example code
> to learn how to work efficiently with XML from Python?
> (Python is my favorite language while Java is not)
> I read code from xml.* but it doesn't give me clues
> for real usage.

If you get 4Suite there are some examples in the demo directories.  And you 
can always get help here.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python