lisp is winner in DOM parsing contest! 8-]

John Lenton jlenton at gmail.com
Mon Jul 12 06:42:06 EDT 2004


On Mon, 12 Jul 2004 03:19:03 +0300, Alex Mizrahi <udodenko at hotmail.com> wrote:
> Hello, All!
> 
> i have 3mb long XML document with about 150000 lines (i think it has about
> 200000 elements there) which i want to parse to DOM to work with.
> first i thought there will be no problems, but there were..
> first i tried Python.. there's special interest group that wants to "make
> Python become the premier language for XML processing" so i thought there
> will be no problems with this document..
> i used xml.dom.minidom to parse it.. after it ate 400 meg of RAM i killed
> it - i don't want such processing.. i think this is because of that fat
> class implementation - possibly Python had some significant overhead for each
> object instance, or something like this..

In my experience xml.dom.minidom is a hog. There are several other DOM
and tree-building (i.e., DOM-like) parsers out there, and most of them
perform better. My own personal favourite is libxml2, but google for
the issue and you'll come across people who have compared the
different parsers.
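
If you want to check the memory cost on your own machine, here is a
minimal sketch that parses the same synthetic document with
xml.dom.minidom and with another tree builder, and reports the peak
traced allocation for each. Assumptions on my part: it uses
xml.etree.ElementTree only because that ships with current Pythons (it
is not the libxml2 binding mentioned above), and make_document,
peak_bytes, count_minidom and count_etree are hypothetical helper names
with made-up test data. Numbers will vary by Python version and
document shape.

```python
import tracemalloc
import xml.dom.minidom
import xml.etree.ElementTree as ET


def make_document(n_items):
    """Build a synthetic XML string with n_items <item> elements (made-up data)."""
    rows = ['<item id="%d"><name>n%d</name></item>' % (i, i) for i in range(n_items)]
    return "<root>" + "".join(rows) + "</root>"


def peak_bytes(parse, text):
    """Run parse(text) and return (its result, peak bytes traced while it ran)."""
    tracemalloc.start()
    try:
        result = parse(text)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def count_minidom(text):
    """Parse into a full DOM with minidom and count the <item> elements."""
    dom = xml.dom.minidom.parseString(text)
    return len(dom.getElementsByTagName("item"))


def count_etree(text):
    """Parse into an ElementTree and count the <item> children of the root."""
    root = ET.fromstring(text)
    return len(root.findall("item"))


if __name__ == "__main__":
    doc = make_document(2000)
    n_dom, peak_dom = peak_bytes(count_minidom, doc)
    n_et, peak_et = peak_bytes(count_etree, doc)
    print("items:", n_dom, n_et)
    print("minidom peak bytes:    ", peak_dom)
    print("ElementTree peak bytes:", peak_et)
```

Scaling n_items up towards the 200000 elements of the original
document should give a rough idea of how each parser behaves on a file
of that size.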

-- 
John Lenton (jlenton at gmail.com) -- Random fortune:
bash: fortune: command not found


