xml parsing with lxml

Doug OLeary dkoleary at olearycomputers.com
Fri Oct 7 15:35:09 EDT 2016


Hey;

I'm trying to gather information from a number of WebLogic configuration XML files using lxml.  I've found any number of tutorials on the web, but they all seem to assume knowledge that I apparently don't have... that, or I'm just being rock stupid today; that's a distinct possibility too.

The XML looks like:

<?xml version='1.0' encoding='UTF-8'?>
<domain xmlns="[[irrelevant urls snipped]]">
  <name>Domain1</name>
  <domain-version>10.3.5.0</domain-version>
  <security-configuration>
    <name>[[snipp]]</name>
    <realm>
        [[realm children snipped]]
    </realm>
    <default-realm>myrealm</default-realm>
  </security-configuration>
  <log>[[snip]]</log>
  <server>  
    <name>[[snip]]</name>
    <ssl>
       [[snip]]
    </ssl>
    <log>
      <name>[[snip]]</name>
      <file-name> [[snip]]</file-name>
      <rotation-type>byTime</rotation-type>
      <file-count>14</file-count>
      <rotation-time>02:00</rotation-time>
      <log-file-severity>Info</log-file-severity>
    </log>
    <machine>[[snip]]</machine>
    <listen-port>40024</listen-port>
    <listen-port-enabled>true</listen-port-enabled>
    <cluster>[[snip]]</cluster>
    <web-server>
      [[children snipped]]
    </web-server>
    <listen-address>${hostname}</listen-address>
    <interface-address>${hostname}</interface-address>
    <administration-port>40022</administration-port>
    <java-compiler>javac</java-compiler>
    <server-start>
       [[children snipped]]
    </server-start>
   [[rest snipped]]
  </server>
</domain>

The tutorials all start out well enough with:

$ python 
Python 3.5.2 (default, Aug 22 2016, 09:04:07) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> doc = etree.parse('config.xml')

Now what?  For instance, how do I list the top-level children of the <domain> element?  In that partial listing, they'd be name, domain-version, security-configuration, log, and server.
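
For reference, here is a minimal sketch of what listing those children might look like. It assumes the root returned by the parser is the <domain> element, and that its children carry the default namespace, so each tag comes back as "{namespace}name" and the prefix has to be stripped. The sample document and its namespace URL are placeholders, since the real URLs were snipped above, and the helper name top_level_children is made up for illustration. The fallback import is only there so the sketch also runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # the standard library offers a largely compatible API
    import xml.etree.ElementTree as etree

# Hypothetical stand-in for config.xml; the namespace URL is a placeholder.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<domain xmlns="http://example.invalid/domain">
  <name>Domain1</name>
  <domain-version>10.3.5.0</domain-version>
  <security-configuration/>
  <log/>
  <server/>
</domain>
"""


def top_level_children(root):
    """Return the namespace-free tag names of root's direct element children."""
    names = []
    for child in root:
        # In lxml, comments and processing instructions have a non-string tag.
        if not isinstance(child.tag, str):
            continue
        # "{namespace}name" -> "name"; tags without a namespace pass through.
        names.append(child.tag.rpartition('}')[2])
    return names


# With a file on disk this would be: root = etree.parse('config.xml').getroot()
root = etree.fromstring(SAMPLE)
print(top_level_children(root))
```

Iterating over an element yields its direct children only, which is why no explicit search is needed at this level.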

For some reason, I'm not able to make the conceptual leap to get to the first step of those tutorials.

The end goal of this exercise is to programmatically identify WebLogic clusters and their hosts.
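
A rough sketch of how that end goal might be approached, under some loud assumptions: that each <server> names its cluster in a <cluster> child and its host in a <machine> child (it could just as well be <listen-address>), and that servers without a <cluster> are not clustered. The sample data, the namespace URL, and the function name clusters_to_hosts are all invented for illustration.

```python
from collections import defaultdict

try:
    from lxml import etree
except ImportError:  # the standard library offers a largely compatible API
    import xml.etree.ElementTree as etree

# Hypothetical sample; names, hosts, and the namespace URL are made up.
SAMPLE = b"""<domain xmlns="http://example.invalid/domain">
  <name>Domain1</name>
  <server>
    <name>srv1</name>
    <machine>hostA</machine>
    <cluster>clusterX</cluster>
  </server>
  <server>
    <name>srv2</name>
    <machine>hostB</machine>
    <cluster>clusterX</cluster>
  </server>
  <server>
    <name>admin</name>
    <machine>hostA</machine>
  </server>
</domain>
"""


def clusters_to_hosts(root):
    """Map each cluster name to a sorted list of the machines of its servers."""
    # Reuse whatever namespace the root element carries, if any.
    ns_uri = root.tag[1:].partition('}')[0] if root.tag.startswith('{') else ''

    def q(name):
        # Build the fully qualified tag name that find/findall expect.
        return '{%s}%s' % (ns_uri, name) if ns_uri else name

    hosts = defaultdict(set)
    for server in root.findall(q('server')):
        cluster = server.findtext(q('cluster'))
        machine = server.findtext(q('machine'))  # assumption: machine == host
        if cluster and machine:
            hosts[cluster.strip()].add(machine.strip())
    return {cluster: sorted(names) for cluster, names in hosts.items()}


root = etree.fromstring(SAMPLE)
print(clusters_to_hosts(root))
```

Running the same function over each config.xml in turn and merging the resulting dictionaries would cover the "number of files" case.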

thanks

Doug O'Leary


