Question regarding HTMLParser module.

Adonis adonisv at earthlink.net
Mon Jul 28 00:05:27 EDT 2003


When parsing my HTML files, I use handle_pi to capture some embedded Python
code. I have noticed that if the embedded code itself contains HTML,
HTMLParser parses that as well, so the code I pass to exec is cut short and
raises an EOL error. My workaround is to write tags with a different set of
characters, something like (tag) rather than <tag>, and then revert them to
<tag> in another function. Is there a way to tell HTMLParser to ignore the
embedded tags, or is there another alternative?
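
For reference, the revert step of that workaround might look roughly like the
sketch below. The name restore_tags and the regular expression are made up for
illustration (they are not the actual function from my script), and the sketch
is written for Python 3.

```python
import re

# Hypothetical pattern: "(tag)" or "(/tag)", where tag is a bare word.
# An empty "()" is left alone because the name must start with a letter.
_PAREN_TAG = re.compile(r"\((/?[A-Za-z][A-Za-z0-9]*)\)")

def restore_tags(code):
    """Turn the stand-in form (tag) back into <tag> before exec."""
    return _PAREN_TAG.sub(r"<\1>", code)

embedded = "print('(tt)testing!()(/tt)')"
restored = restore_tags(embedded)
```

The obvious weakness is that any ordinary parenthesized word in the code, such
as the call f(x), would also be rewritten, which is part of why I am looking
for something better.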

Any help would be greatly appreciated.
On another note, I am well aware of Zope, Webware, CherryPy, etc. as options
for embedding Python in HTML, but I want this to be a learning experience.

HTML processing instruction:
<?
import time
print time.strftime('%b-%d-%Y')
print '<tt>testing!()</tt>'
>

error:
Traceback (most recent call last):
  File "C:\home\Adonis\python\t.py", line 40, in -toplevel-
    x.feed(z)
  File "C:\Python23\lib\HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "C:\Python23\lib\HTMLParser.py", line 154, in goahead
    k = self.parse_pi(i)
  File "C:\Python23\lib\HTMLParser.py", line 232, in parse_pi
    self.handle_pi(rawdata[i+2: j])
  File "C:\home\Adonis\python\t.py", line 33, in handle_pi
    exec(data)
  File "<string>", line 4
    print '<tt
             ^
SyntaxError: EOL while scanning single-quoted string
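
The traceback suggests that the data handed to handle_pi stops at the first
">" inside the processing instruction. A minimal sketch that reproduces this is
below. It assumes Python 3's html.parser module rather than the Python 2.3
HTMLParser module shown in the traceback, and the class name PICollector is
made up for illustration.

```python
from html.parser import HTMLParser

class PICollector(HTMLParser):
    """Record the raw text passed to handle_pi instead of exec'ing it."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def handle_pi(self, data):
        self.instructions.append(data)

# Same shape as the processing instruction above, in Python 3 syntax.
page = "<?\nimport time\nprint('<tt>testing!()</tt>')\n>"

collector = PICollector()
collector.feed(page)
collector.close()

# The captured text ends just before the ">" of "<tt>".
captured = collector.instructions[0]
```

Here captured holds only the text up to "print('<tt", so line 4 of the string
given to exec contains an unterminated quote, which matches the SyntaxError.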





