SAX/Python : read an xml from the end to the top

Diez B. Roggisch deets at nospam.web.de
Tue Mar 7 07:17:07 EST 2006


> We don't want to create new output files for every entry (each entry
> is an event, and we have approximately 5 events per minute). So I
> have to stick with this xml input file.

Well, the overall amount of data won't change. But I can understand that
decision. However, you might consider using a file per day/week.

> I guess I will parse it till I find the last reported event and update
> the output xml from there, reporting only the events I am interested
> in... I hope SAX won't take too much time to do all this... (let's say 1
> event = 10 tags, 5 events/minute, xml file running for 1 month -->
> 5400 000 opening tags)...

Use my suggested approach 2 - that boils down to using "seek" and some
hand-written parsing/buffering. A little bit nasty, but better than
consuming all of that file through sax.
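
A rough sketch of what the seek-based approach could look like. This is an
illustration only, not code from this thread. It assumes each event is a
non-nested element with a known tag (the name "event" is made up here), that
the file is in an ASCII-compatible encoding such as UTF-8, and that the tag
text never shows up inside comments or CDATA. The function name and the
chunk size are arbitrary.

```python
import os


def _rfind_open(buf, open_tag, end):
    """Find the last real opening tag before `end`, skipping longer names."""
    i = buf.rfind(open_tag, 0, end)
    while i != -1:
        # The byte after "<event" must end the tag name, otherwise we hit
        # something like "<eventdata" and have to keep looking backwards.
        nxt = buf[i + len(open_tag): i + len(open_tag) + 1]
        if nxt in (b">", b" ", b"\t", b"\r", b"\n", b"/"):
            return i
        i = buf.rfind(open_tag, 0, i)
    return -1


def read_last_element(path, tag="event", chunk_size=4096):
    """Return the raw bytes of the last <tag>...</tag> element, or None.

    Only the tail of the file is read: we seek to the end and pull in
    chunks backwards until a complete element is in the buffer.
    """
    open_tag = b"<" + tag.encode("ascii")
    close_tag = b"</" + tag.encode("ascii") + b">"
    buf = b""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            buf = f.read(step) + buf  # prepend the earlier chunk
            end = buf.rfind(close_tag)
            if end == -1:
                continue  # no closing tag in the buffer yet
            start = _rfind_open(buf, open_tag, end)
            if start != -1:
                return buf[start:end + len(close_tag)]
    return None
```

The returned fragment is small enough to hand to any XML parser to pull out
the id or timestamp of the last reported event. Rescanning the growing buffer
on every chunk is wasteful in general, but the last event normally sits within
the first chunk or two, so it should not matter in practice.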

Diez


