HTMLParser tag contents
Grant Griffin
g2 at seebelow.org
Wed May 10 16:31:29 EDT 2000
Paul Prescod wrote:
>
> Grant Griffin wrote:
> >
> > Therefore, for Python 1.6, I would like to recommend that SGMLParser be
> > modified to provide a method called "get_tag_contents" (or whatever)
> > which can be called at the point of any "end_xxx" to convey the tag's
> > contents (which would include not only text but contained tags and their
> > text.) (The reason SGMLParser has to be modified is that its index into
> > its "rawdata" array is local to its parser routine.)
>
> You could be parsing a 100MB HTML/SGML document 1 K at a time. I don't
> think you want SGMLLIB to keep around the entire 100MB "just in case"
> you ask for the contents of the BODY tag.
>
I guess some sort of size limit could be put on what the parser keeps
around, which would cover all but extreme cases.
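
Here is a rough sketch of what a size-limited version might look like. It is
only an illustration under some assumptions: it uses html.parser.HTMLParser
from current Python as a stand-in for SGMLParser (sgmllib was removed in
Python 3), and the names CappedContentsParser, max_chars and contents are
made up for this example, not part of any library. End tags are rebuilt from
the tag name, so the captured markup is an approximation of the raw input.

```python
from html.parser import HTMLParser


class CappedContentsParser(HTMLParser):
    """Hypothetical sketch: keep each open tag's contents, up to a size cap."""

    def __init__(self, max_chars=65536):
        # convert_charrefs=False so entities arrive via their own handlers.
        super().__init__(convert_charrefs=False)
        self.max_chars = max_chars  # assumed limit, pick whatever suits you
        self._open = []             # stack of [tag, pieces, size, truncated]
        self.contents = []          # (tag, contents or None) per closed tag

    def _record(self, raw):
        # Add a piece of markup to every element that is currently open.
        for entry in self._open:
            if entry[3]:
                continue            # already over the cap, keep nothing
            if entry[2] + len(raw) > self.max_chars:
                entry[1].clear()    # drop what was buffered so far
                entry[3] = True
            else:
                entry[1].append(raw)
                entry[2] += len(raw)

    def handle_starttag(self, tag, attrs):
        self._record(self.get_starttag_text())
        self._open.append([tag, [], 0, False])

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags have no contents of their own.
        self._record(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            _, pieces, _, truncated = self._open.pop()
            # None means "too big, contents were not kept".
            self.contents.append((tag, None if truncated else "".join(pieces)))
        self._record("</%s>" % tag)  # rebuilt, not the original raw text

    def handle_data(self, data):
        self._record(data)

    def handle_entityref(self, name):
        self._record("&%s;" % name)

    def handle_charref(self, name):
        self._record("&#%s;" % name)


parser = CappedContentsParser(max_chars=20)
parser.feed("<body><p>Hi <b>there</b></p></body>")
parser.close()
for tag, text in parser.contents:
    print(tag, repr(text))
```

With the cap set to 20 characters, the B and P contents fit and are kept,
while BODY goes over the limit and is reported as None. Memory per open tag
stays bounded by the cap, which is the point of the exercise.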
Then again, maybe you're right: maybe the solution I had posted was
best. ;-)
when-the-exception-doesn't-prove-the-rule-it-must-prove-the-exception-ly y'rs,
=g2
p.s. BTW, how long do you 'spose it would take for somebody to _read_ a
web page containing 100MB HTML?! (Lemme see...carry the six...that
would take...waitaminute...about... ;-)
--
_____________________________________________________________________
Grant R. Griffin g2 at dspguru.com
Publisher of dspGuru http://www.dspguru.com
Iowegian International Corporation http://www.iowegian.com