String parsing

Paul Boddie paul at boddie.org.uk
Wed May 9 11:41:52 EDT 2007


On 9 May, 06:42, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
>

[HTMLParser-based solution]

Here's another approach using libxml2dom [1] in HTML parsing mode:

import libxml2dom

# The text, courtesy of Dennis.
sample = """<input type="hidden" name="RFP" value="-1"/>
<!--<input type="hidden" name="EnteredBy" value="johnxxxx"/>-->
<input type="hidden" name="EnteredBy" value="john"/>
<input type="hidden" name="ServiceIndex" value="1"/>
<input type="hidden" name="LastUpdated" value="1178658863"/>
<input type="hidden" name="NextPage" value="../active/active.php"/>
<input type="hidden" name="ExistingStatus" value="10" />
<table width="98%" cellpadding="0" cellspacing="0" border="0"
align="center" >"""

# Parse the string in HTML mode.
d = libxml2dom.parseString(sample, html=1)

# For all input fields having the name 'LastUpdated',
# get the value attribute.
last_updated_fields = d.xpath("//input[@name='LastUpdated']/@value")

# If we find one, print the contents of the value attribute.
# (An empty result list means no matching input field was present.)
if last_updated_fields:
    print(last_updated_fields[0].nodeValue)
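
For comparison, roughly the same lookup can be sketched with only the standard library's event-driven parser. This is not Dennis's code, just a minimal illustration. It assumes Python 3, where the module is called html.parser (it was HTMLParser in Python 2). The class name InputValueFinder and the helper find_input_values are made up for this example, and the sample is a shortened copy of the text above.

```python
from html.parser import HTMLParser

# Shortened copy of the sample text (an assumption made for brevity).
sample = """<input type="hidden" name="RFP" value="-1"/>
<!--<input type="hidden" name="EnteredBy" value="johnxxxx"/>-->
<input type="hidden" name="EnteredBy" value="john"/>
<input type="hidden" name="LastUpdated" value="1178658863"/>
<input type="hidden" name="ExistingStatus" value="10" />"""


class InputValueFinder(HTMLParser):
    """Hypothetical helper: collect value attributes of named input fields."""

    def __init__(self, wanted_name):
        HTMLParser.__init__(self)
        self.wanted_name = wanted_name
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <input .../>, because
        # the default handle_startendtag delegates to handle_starttag.
        if tag != "input":
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.wanted_name:
            self.values.append(attributes.get("value"))


def find_input_values(html_text, name):
    """Return the value attributes of all input fields with the given name."""
    finder = InputValueFinder(name)
    finder.feed(html_text)
    finder.close()
    return finder.values


values = find_input_values(sample, "LastUpdated")
if values:
    print(values[0])
```

Input fields inside comments are reported through handle_comment rather than handle_starttag, so the commented-out EnteredBy field is skipped here as well. The XPath version is still shorter, since the query replaces the whole handler class.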

Paul

[1] http://www.python.org/pypi/libxml2dom



