sgmllib parser keeps old tag data?

Berend van Berkum berend at dotmpe.com
Fri Feb 13 09:51:01 EST 2009


On Fri, Feb 13, 2009 at 02:31:40PM +0000, MRAB wrote:
> Berend van Berkum wrote:
> >
> >import sgmllib
> >
> >
> >class MyParser(sgmllib.SGMLParser):
> >
> >	content = ''		
> >	markup = []
> >	span_stack = []
> >
> These are in the _class_ itself, so they will be shared by all its
> instances. You should do something like this instead:
> 
> 	def __init__(self):
> 		self.content = ''
> 		self.markup = []
> 		self.span_stack = []
> 
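
To illustrate the point about class-level attributes, here is a minimal
sketch. The class names SharedParser and IsolatedParser are made up for the
example and it does not use sgmllib at all; it only shows how a list defined
in the class body is shared, while one created in __init__ is not.

```python
class SharedParser:
    markup = []  # class attribute: one list object shared by every instance

    def add(self, tag):
        self.markup.append(tag)  # mutates the shared list in place


class IsolatedParser:
    def __init__(self):
        self.markup = []  # instance attribute: a fresh list per instance

    def add(self, tag):
        self.markup.append(tag)


a, b = SharedParser(), SharedParser()
a.add('span')
shared_leak = b.markup  # b sees the tag that was added through a

c, d = IsolatedParser(), IsolatedParser()
c.add('span')
isolated = d.markup  # d is unaffected by c
```

A string such as content = '' is less of a problem, because
self.content += data rebinds the name on the instance instead of mutating a
shared object; the lists are what carry old tag data between parsers.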

Yes, I tested that, but SGMLParser won't let me override __init__
(the SGMLParser vars are uninitialized even with a sgmllib.SGMLParser(self) call).
I had tried a few things, but not the following, which works:
with a differently named init function and one boolean class var 'initialized',
each handler can check 'if self.initialized' first. That does the trick.
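
For reference, a likely reason the override appeared not to work:
sgmllib.SGMLParser(self) constructs a separate parser object rather than
initializing the current one, so the base class state on self is never set
up. The usual form is to call the base __init__ through the class. Below is
a hedged sketch of that pattern. It assumes Python 3, where sgmllib no longer
exists, so html.parser.HTMLParser stands in for SGMLParser; the span
bookkeeping in the handlers is hypothetical and only there to exercise the
three attributes.

```python
from html.parser import HTMLParser  # Python 3 stand-in for sgmllib.SGMLParser


class MyParser(HTMLParser):
    def __init__(self):
        # Call the base initializer through the class, passing self
        # explicitly, so the base class can set up its internal state.
        HTMLParser.__init__(self)
        self.content = ''
        self.markup = []
        self.span_stack = []

    def handle_starttag(self, tag, attrs):
        if tag == 'span':
            # Remember where this span starts in the collected text.
            self.span_stack.append(len(self.content))

    def handle_endtag(self, tag):
        if tag == 'span' and self.span_stack:
            start = self.span_stack.pop()
            self.markup.append((start, len(self.content)))

    def handle_data(self, data):
        self.content += data


first = MyParser()
first.feed('a <span>b</span>')
first.close()

second = MyParser()
second.feed('<span>xyz</span>')
second.close()
```

With this form each parser starts from empty state, so the second instance
does not inherit spans recorded by the first, and no 'initialized' flag is
needed in the handlers.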

Confusion dissolved :)
thanks.

- -- 
 web, http://dotmpe.com                      ()    ASCII Ribbon
 email, berend.van.berkum at gmail.com          /\
 icq, 26727647;  irc, berend/mpe at irc.oftc.net



