Finding all instances of a string in an XML file

dieter dieter at handshake.de
Fri Jun 21 02:17:24 EDT 2013


Jason Friedman <jsf80238 at gmail.com> writes:

> I have XML which looks like:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE KMART SYSTEM "my.dtd">
> <LEVEL_1>
>   <LEVEL_2 ATTR="hello">
>     <ATTRIBUTE NAME="Property X" VALUE ="2"/>
>   </LEVEL_2>
>   <LEVEL_2 ATTR="goodbye">
>     <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
>     <LEVEL_3 ATTR="aloha">
>       <ATTRIBUTE NAME="Property X" VALUE ="3"/>
>     </LEVEL_3>
>     <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
>   </LEVEL_2>
> </LEVEL_1>
>
> The "Property X" string appears twice and I want to output the "path"
> that leads to all such appearances.

You could use "lxml" and its "xpath" support.
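
A minimal sketch of that idea, assuming the third-party "lxml" package
is installed. The helper name "find_paths" is only illustrative, and the
"path" here is whatever lxml's getpath() reports, which may not be the
exact format you have in mind:

```python
from lxml import etree  # third-party package: pip install lxml

# The sample document from the question, as bytes because it carries
# an encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""


def find_paths(xml_bytes, name):
    """Return an XPath-style location for each ATTRIBUTE whose NAME matches."""
    root = etree.fromstring(xml_bytes)
    tree = root.getroottree()
    # $name is an XPath variable, so the value needs no manual quoting.
    matches = root.xpath("//ATTRIBUTE[@NAME=$name]", name=name)
    return [tree.getpath(element) for element in matches]


if __name__ == "__main__":
    for path in find_paths(XML, "Property X"):
        print(path)
```

With its default parser settings lxml does not try to load "my.dtd",
so the DOCTYPE line should not get in the way.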

This is a heavyweight approach: you would pull in a powerful (and
large) infrastructure, though one that could also be useful for other
XML applications. There are more elementary approaches as well
(e.g. parse the XML into a DOM and write your own visitor
to find the elements you are interested in).
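
The elementary approach could look roughly like the following sketch,
which uses only "xml.dom.minidom" from the standard library. The names
"visit" and "find_paths" and the way each path step is labelled (tag
name plus the ATTR value, if any) are my own assumptions about what
a useful "path" looks like:

```python
from xml.dom import minidom

# The sample document from the question.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""


def visit(node, trail, name, found):
    """Walk the DOM depth-first, recording the trail to each match."""
    if node.nodeType != node.ELEMENT_NODE:
        return  # skip whitespace text nodes, comments, etc.
    label = node.tagName
    if node.hasAttribute("ATTR"):
        label += '[@ATTR="%s"]' % node.getAttribute("ATTR")
    trail = trail + [label]  # new list, so sibling branches stay independent
    if node.tagName == "ATTRIBUTE" and node.getAttribute("NAME") == name:
        found.append("/" + "/".join(trail))
    for child in node.childNodes:
        visit(child, trail, name, found)


def find_paths(xml_bytes, name):
    """Return one path string per ATTRIBUTE element whose NAME matches."""
    document = minidom.parseString(xml_bytes)
    found = []
    visit(document.documentElement, [], name, found)
    return found


if __name__ == "__main__":
    for path in find_paths(XML, "Property X"):
        print(path)
```

For very deeply nested documents the recursion could be replaced by an
explicit stack, but for input like the sample this should be sufficient.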



