splitting an XML file on the basis of XML tags

bijeshn bijeshn at gmail.com
Fri Apr 4 02:07:48 EDT 2008


On Apr 3, 11:28 pm, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
> > I abuse it because I can (and because I don't generally work with XML
> > files larger than 20-30meg) :)
> > And the OP never said the XML file was 1TB in size, which makes things
> > different.
>
> Even with small xml-files your advice was not very sound. Yes, it's
> tempting to use regexes to process xml. But usually one falls flat on
> his face soon - because of whitespace or attribute order or <foo></foo>
> versus <foo/> or .. or .. or.
>
> Use an XML-parser. That's what they are for. And especially with the
> pythonic ones like element-tree (and the compatible lxml), it's even more
> straightforward than using regexes.
>
> Diez

Yeah, I plan to use SAX, but the thing is: how do you actually do the
split with it?

Forget 1 TB for now. How do you split an XML file of something like
70-80 GB according to my requirement (the one described in the original
post)?
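
For reference, one possible shape for a SAX-based splitter is sketched
below. It is only a sketch under assumptions that are not stated in this
thread: that "splitting on tags" means writing every occurrence of one
chosen element to its own file, that the element is called "record"
(a made-up name), and that the output files are named chunk_NNNNNN.xml.
The names TagSplitter and split_xml are hypothetical too. Since SAX
streams the input, memory use should not grow with the size of the file.

```python
import os
import xml.sax
from xml.sax.saxutils import XMLGenerator


class TagSplitter(xml.sax.ContentHandler):
    """Write each outermost <tag>...</tag> subtree to its own file."""

    def __init__(self, tag, out_dir):
        super().__init__()
        self.tag = tag          # element name to split on (assumed)
        self.out_dir = out_dir  # directory receiving the chunk files
        self.count = 0          # number of chunks started so far
        self.depth = 0          # nesting depth inside the current chunk
        self.out = None         # open output file, None between chunks
        self.gen = None         # XMLGenerator serialising the chunk

    def startElement(self, name, attrs):
        # Start a new chunk only when we are not already inside one.
        if self.gen is None and name == self.tag:
            self.count += 1
            path = os.path.join(self.out_dir, "chunk_%06d.xml" % self.count)
            self.out = open(path, "w", encoding="utf-8")
            self.gen = XMLGenerator(self.out, encoding="utf-8")
            self.gen.startDocument()
        if self.gen is not None:
            self.depth += 1
            self.gen.startElement(name, attrs)

    def characters(self, content):
        # Text outside a chunk is dropped; text inside is copied.
        if self.gen is not None:
            self.gen.characters(content)

    def endElement(self, name):
        if self.gen is None:
            return
        self.gen.endElement(name)
        self.depth -= 1
        if self.depth == 0:
            # The outermost matching element just closed: finish the file.
            self.gen.endDocument()
            self.out.close()
            self.out = self.gen = None


def split_xml(source, tag, out_dir):
    """Parse source (a filename or byte stream) and return the chunk count."""
    os.makedirs(out_dir, exist_ok=True)
    handler = TagSplitter(tag, out_dir)
    xml.sax.parse(source, handler)
    return handler.count
```

Something like split_xml("big.xml", "record", "chunks") would then be
the call, with "big.xml" standing in for the real file. Everything
outside the chosen element (the root wrapper, sibling elements) is
dropped, and comments and processing instructions are not copied; both
would need extra handling if they matter.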


