[Tutor] man pages parsing (still)

Kent Johnson kent37 at tds.net
Mon Sep 11 17:59:03 CEST 2006


Tiago Saboga wrote:
> On Monday 11 September 2006 12:24, Kent Johnson wrote:
>> Tiago Saboga wrote:
>>> On Monday 11 September 2006 11:15, Kent Johnson wrote:
>>>> Tiago Saboga wrote:
>>>> How big is the XML? 25 seconds is a long time...I would look at
>>>> cElementTree (implementation of ElementTree in C), it is pretty fast.
>>>> http://effbot.org/zone/celementtree.htm
>>> It's about 10k. Hey, it seems easy, but I'd like not to start over again.
>>> Of course, if it's the only solution... 25 (28, in fact, for the cp man
>>> page) isn't really acceptable.
>> That's tiny! No way it should take 25 seconds to parse a 10k file.
>>
>> Have you tried saving the file separately and parsing from disk? That
>> would help determine if the interprocess pipe is the problem.
> 
> Just tried, and - incredible - it took even longer: 46s. But on the second run
> it came back to 25s. I really don't understand what's going on. I did some
> other tests, and I found that all the code before "parser.parse(stout)" runs
> almost instantly; nearly all of the running time is then spent somewhere
> between this call and the first event; and the rest is almost instant again.
> Any ideas?

What did you try, buffering or reading from a file? If parsing from a 
file takes 25 secs, I am amazed...
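
One way to run the file test is to time a SAX parse of the saved file and
note when the first event arrives. This is only a sketch: it assumes the
parser in question is the standard xml.sax one (the thread only shows a
parser.parse(...) call), and CountingHandler, time_parse and the file name
cp.xml are made-up names for illustration.

```python
import time
import xml.sax
from xml.sax.handler import ContentHandler


class CountingHandler(ContentHandler):
    """Counts start-element events and records when the first one arrives."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.first_event_at = None

    def startElement(self, name, attrs):
        if self.first_event_at is None:
            self.first_event_at = time.perf_counter()
        self.count += 1


def time_parse(source):
    """Parse `source` (a file name or an open binary file object).

    Returns (seconds until the first element, total seconds, element count).
    """
    handler = CountingHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)

    start = time.perf_counter()
    parser.parse(source)
    end = time.perf_counter()

    # Fall back to the end time if no element event was ever seen.
    first = handler.first_event_at if handler.first_event_at is not None else end
    return first - start, end - start, handler.count


if __name__ == "__main__":
    import os

    # "cp.xml" is a hypothetical name for the saved XML output.
    if os.path.exists("cp.xml"):
        to_first, total, elements = time_parse("cp.xml")
        print("first event after %.3fs, total %.3fs, %d elements"
              % (to_first, total, elements))
```

If the time to the first event is large for the saved file too, the pipe is
probably not the culprit and the delay is somewhere in parser start-up.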
> 
> By the way, I've read the pages you indicated at effbot, but I don't see where 
> to begin. Do you know of a gentler introduction to this module 
> (cElementTree)? 

The main ElementTree page is here, but try parsing from a file first;
your file is so small...
http://effbot.org/zone/element.htm
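
As a gentler starting point, here is a minimal sketch of reading a small XML
document with ElementTree. It assumes a current Python, where the module
ships in the standard library as xml.etree.ElementTree (the C accelerator is
used automatically when available). The sample markup and the section_titles
helper are invented for illustration; the thread does not show the real XML.

```python
import io
import xml.etree.ElementTree as ET

# Invented sample data standing in for the converted man page.
SAMPLE = b"""<refentry>
  <refnamediv>
    <refname>cp</refname>
    <refpurpose>copy files and directories</refpurpose>
  </refnamediv>
  <refsect1>
    <title>OPTIONS</title>
    <para>-a, --archive</para>
  </refsect1>
</refentry>"""


def section_titles(source):
    """Return the text of every <title> element in the document.

    `source` may be a file name or an open binary file object.
    """
    tree = ET.parse(source)   # reads and parses the whole document at once
    root = tree.getroot()
    # iter() walks the tree in document order, filtering by tag name.
    return [title.text for title in root.iter("title")]


if __name__ == "__main__":
    for text in section_titles(io.BytesIO(SAMPLE)):
        print(text)
```

The whole tree is built in memory, which is fine for a file of about 10k and
avoids writing a handler class at all.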

Kent


