htmllib question

Greg Jorgensen greg at pdxperts.com
Sun May 20 23:23:55 EDT 2001


On 20 May 2001, John Hunter wrote:

> I have some html that I need to parse.  I want to call some function
> on all of the html unless it is in PRE tag.  Then I just want to
> output it verbatim.

This may be more simplistic than you want. Then again it may be a workable
solution.

---
import re

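f = open('something.html', 'r')
# Non-greedy .*? so each <pre>...</pre> block is matched separately;
# a greedy .* would run from the first <pre> to the last </pre>.
rgx = re.compile(r'(<pre>.*?</pre>)', re.DOTALL | re.IGNORECASE)
chunks = rgx.split(f.read())
f.close()

for chunk in chunks:
    if chunk[0:5].lower() == '<pre>':
        pass  # do something with preformatted chunk
    else:
        pass  # do something with non-pre chunk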
---
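
If it helps, here is the same idea wrapped up as a function, as a rough sketch rather than anything definitive. The names transform_outside_pre and PRE_BLOCK are my own inventions for illustration, and str.upper just stands in for whatever function you want to call. I'm assuming PRE blocks are not nested and that an opening tag may carry attributes, which the pattern allows for with [^>]*.

```python
import re

# One capturing group around the whole block, so split() keeps the
# <pre>...</pre> text in its result. The non-greedy .*? stops at the
# first closing tag; [^>]* permits attributes on the opening tag.
PRE_BLOCK = re.compile(r'(<pre\b[^>]*>.*?</pre>)', re.DOTALL | re.IGNORECASE)


def transform_outside_pre(html, func):
    """Apply func to text outside PRE blocks; copy PRE blocks verbatim."""
    out = []
    for i, chunk in enumerate(PRE_BLOCK.split(html)):
        # With a single capturing group, the matched blocks land at the
        # odd indexes of the list and the text between them at the even ones.
        out.append(chunk if i % 2 else func(chunk))
    return ''.join(out)


sample = 'a <b>x</b>\n<PRE class="c">keep  me</PRE> mid <pre>two</pre> end'
result = transform_outside_pre(sample, str.upper)
print(result)
```

Going by list position instead of checking chunk[0:5] means a tag like <pre class="c"> is still recognized as preformatted. A regex will still be fooled by things like a PRE tag inside a comment, so for messy input a real parser is the safer route.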

Greg Jorgensen
PDXperts LLC
Portland, Oregon, USA
gregj at pobox.com






