[Tutor] Custom (non-standard) SGML Parser and OOP Problem
Adam Kessel
adam@bostoncoop.net
Sun May 18 22:09:50 2003
I'm writing a custom SGML parser. I started to do it from scratch, but
then it occurred to me that I ought not to reinvent the wheel (the wheel
being sgmllib).
I want to use sgmllib-like parsing, but I only want to recognize tags
matching:
<$tag$>
sgmllib is, of course, looking for <[A-Za-z] to start tags, so these
sorts of tags are not detected. I'd also like to not have to deal with
any tags other than <$tag$>.
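
To make the target syntax concrete, here is a minimal sketch that uses only
the re module rather than sgmllib. The pattern and the name find_dollar_tags
are illustrative assumptions, not anything sgmllib provides, and the sketch
assumes a tag name starts with a letter and continues with letters, digits,
hyphens, underscores or dots.

```python
import re

# Hypothetical pattern: "<$", a tag name, then "$>".
# Ordinary tags such as <b> do not match because the "$" is required.
DOLLAR_TAG = re.compile(r'<\$([A-Za-z][-A-Za-z0-9_.]*)\$>')

def find_dollar_tags(text):
    """Return the names of all <$tag$> markers in text, in order."""
    return DOLLAR_TAG.findall(text)

names = find_dollar_tags('Dear <b>customer</b> <$name$>, today is <$date$>.')
```

Here names holds only the two dollar-delimited tag names, and the ordinary
<b> tag is ignored.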
I discovered I could violate a central tenet of OOP and get this to work,
by putting the following in my code:
sgmllib.starttagopen = re.compile('<[>\$]')
(etc.)
That is, overwriting the regexps used in sgmllib to locate tags. But this
seems like a dangerous way to do things. Putting the following in my
parser __init__ doesn't work:
self.starttagopen = re.compile('<[>\$]')
Because starttagopen is not defined in sgmllib.SGMLParser, but in sgmllib
itself.
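
The difference between the two assignments can be reproduced without sgmllib.
The sketch below builds a small stand-in module in memory; fake_lib, Parser,
finds_tag and MyParser are all hypothetical names, and the stand-in only
assumes that the library's methods refer to a module-level regexp by its bare
name, as described above.

```python
import re
import types

# Stand-in for a library module (hypothetical, built in memory so the
# example is self-contained).
fake_lib = types.ModuleType("fake_lib")
exec('''
import re
starttagopen = re.compile('<[A-Za-z]')

class Parser:
    def finds_tag(self, data):
        # Bare name: resolved in the module's globals, never on self.
        return starttagopen.search(data) is not None
''', fake_lib.__dict__)

class MyParser(fake_lib.Parser):
    def __init__(self):
        # Creates an unrelated instance attribute; finds_tag never reads it.
        self.starttagopen = re.compile(r'<\$')

p = MyParser()
instance_result = p.finds_tag('<$tag$>')   # still uses the module pattern

# Rebinding the module global does change the behaviour, but it changes it
# for every other user of the module as well.
fake_lib.starttagopen = re.compile(r'<\$')
patched_result = p.finds_tag('<$tag$>')
```

In this sketch instance_result is False and patched_result is True, which
matches what I see with the real module: only the module-level assignment
has any effect.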
What's the right way to do this? My question, boiled down, is: how to
use sgmllib functionality but overwrite sgmllib internals without being a
bad programmer?
--Adam Kessel