Newbie question: matching

Tobiah toby at rcsreg.com
Thu Apr 15 19:47:41 EDT 2004


This should really be done with the XML parsing
libraries.  I don't remember exactly which ones, but
I watched a co-worker translate HTML into XML
and then use minidom, sax, or some other lib
to parse the XML.  It is very convenient once
you see how to do it.  You either get an event
triggered for each tag or piece of text, or get handed
an entire object tree representing your HTML, which you
can traverse and examine at a much higher level than
you can by trying to match tags with regular expressions.
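
A rough sketch of the object-tree style, using xml.dom.minidom from the
standard library.  It assumes Python 3 and assumes the markup is already
well-formed XML once wrapped in a single root element (the "root" name is
made up for this example); real-world HTML usually needs cleaning up first.

```python
from xml.dom import minidom

line = "<tr>eat at joe's</tr><tr>or else</tr><tr>you'll starve</tr>"

# Assumption: wrapping the fragment in one root element makes it valid XML.
# "root" is an arbitrary name chosen for this sketch.
doc = minidom.parseString("<root>" + line + "</root>")

# Walk the parsed tree instead of matching tags by hand.
tr_nodes = doc.getElementsByTagName("tr")

# Each row serialized back to markup, tags included.
rows = [node.toxml() for node in tr_nodes]

# Just the text inside each row (each <tr> here holds a single text node).
cells = [node.firstChild.data for node in tr_nodes]

for row in rows:
    print(row)
```

Here rows holds the three <tr>...</tr> strings and cells holds only the
text between the tags.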

Toby

josh R wrote:
> Hi all,
> 
> I am trying to write some python to parse html code.  I find it easier
> to do in perl, but would like to learn some basic python.  My code
> looks like this:
> 
> line = "<tr>eat at joe's</tr><tr>or else</tr><tr>you'll starve</tr>"
> so = re.compile("(\<tr\>.*?\<\\tr\>)")
> foo=so.match(line)
> print foo.groups()
> 
> I'd like to get an array of elements that looks like this:
> 
> array(0)= <tr>eat at joe's</tr>
> array(1)= <tr>or else</tr>
> array(2)= <tr>you'll starve</tr>
> 
> Could you please tell me the correct way to do the matching?
> 
> Also, is there something similar to perl's s/foo/bar/g?
> 
> Thanks!!!
> Josh
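
For the literal question quoted above, a minimal sketch with the re module,
assuming Python 3.  The closing tag in the pattern should be </tr> with a
forward slash, and match() only tries the start of the string and returns
one result, so findall() is the call that returns every row.  re.sub()
replaces all occurrences by default, which is roughly what s/foo/bar/g does.
The sample string passed to re.sub() is made up for illustration.

```python
import re

line = "<tr>eat at joe's</tr><tr>or else</tr><tr>you'll starve</tr>"

# findall() returns a list of all non-overlapping matches.
# The non-greedy .*? stops at the first </tr> it reaches.
rows = re.findall(r"<tr>.*?</tr>", line)

for index, row in enumerate(rows):
    print(index, row)

# re.sub() replaces every occurrence unless a count is given.
replaced = re.sub(r"foo", "bar", "foo and more foo")
print(replaced)
```

This is fine for a tidy one-line string like the example; nested or
irregular markup is where a real parser pays off.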




More information about the Python-list mailing list