XML parser that sorts elements?

Diez B. Roggisch deets at nospam.web.de
Fri Sep 22 11:47:02 EDT 2006


jmike at alum.mit.edu wrote:

> Hi everyone,
> 
> I am a total newbie to XML parsing.  I've written a couple of toy
> examples under the instruction of tutorials available on the web.
> 
> The problem I want to solve is this.  I have an XML snippet (in a
> string) that looks like this:
> 
> <booga foo="1" bar="2">
>   <well>hello</well>
>   <blah>goodbye</blah>
> </booga>
> 
> and I want to alphabetize not only the attributes of an element, but I
> also want to alphabetize the elements in the same scope:
> 
> <booga bar="2" foo="1">
>   <blah>goodbye</blah>
>   <well>hello</well>
> </booga>
> 
> I've found a "Canonizer" class, that subclasses saxlib.HandlerBase, and
> played around with it and vaguely understand what it's doing.  But what
> I get out of it is
> 
> <booga bar="2" foo="1">
>   <well>hello</well>
>   <blah>goodbye</blah>
> </booga>
> 
> in other words it sorts the attributes of each element, but doesn't
> touch the order of the elements.
> 
> How can I sort the elements?  I think I want to subclass the parser, to
> present the elements to the content handler in different order, but I
> couldn't immediately find any examples of the parser being subclassed.

You can sort them by parsing the document into a tree of nodes, e.g. using
ElementTree or minidom, and then reordering the children of each node.
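
A minimal sketch of the tree approach, assuming the standard library's
xml.etree.ElementTree. The helper name sort_tree is made up for this
example, and sorting children by tag name is only one possible reading of
"alphabetize". It also assumes the text between elements is just
indentation, which it keeps in place by position:

```python
import xml.etree.ElementTree as ET


def sort_tree(elem):
    """Recursively sort attributes and child elements by name, in place."""
    # Re-insert the attributes in alphabetical order. Python 3.8+ writes
    # attributes in insertion order; older versions sort them on output.
    attrs = sorted(elem.attrib.items())
    elem.attrib.clear()
    elem.attrib.update(attrs)

    for child in elem:
        sort_tree(child)

    # Remember the whitespace that followed each child position so the
    # indentation stays where it was (assumes no real text between elements).
    tails = [child.tail for child in elem]

    # sorted() is stable, so siblings with the same tag keep their order.
    children = sorted(elem, key=lambda e: e.tag)
    for child, tail in zip(children, tails):
        child.tail = tail
    elem[:] = children


snippet = (
    '<booga foo="1" bar="2">\n'
    '  <well>hello</well>\n'
    '  <blah>goodbye</blah>\n'
    '</booga>'
)

root = ET.fromstring(snippet)
sort_tree(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

For the snippet from the question this should print booga with bar before
foo and blah before well. If the document has real text between elements
(mixed content), moving the tails by position would move that text too, so
the sketch is not suitable there.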

But you should be aware that this changes the structure of your document,
and that isn't always desirable: element order is significant in XML, so
e.g. HTML pages would look funny, to say the least, if sorted that way.

Diez


