[XML-SIG] UTF-8 and ISO-8859-1 problems again

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Thu, 11 Jan 2001 12:29:59 +0100


> > > Having a closer inspection of PyXML 0.6.3, the original memory leak
> > > from the parser doing its parsing thing has gone, but there is one
> > > that exists for just purely making a parser.

I found the problem: While I updated the SAX2 driver, I had not
changed the SAX1 driver. With the patch below, I don't get any memory
leak for your example.

There were two problems: drv_pyexpat used the Python core's pyexpat
module, if available, instead of our own, and it did not attempt to
break reference cycles at the end of parsing.
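
To illustrate the second problem, here is a minimal sketch of the cycle,
written against the expat module of a current CPython rather than the
PyXML tree. The Driver class and the is_freed_without_gc helper are
hypothetical stand-ins, not the actual drv_pyexpat code; only the
close() logic mirrors the patch. The driver holds the parser, and the
parser holds bound methods of the driver, so reference counting alone
never frees either one unless close() drops the parser.

```python
import gc
import weakref
from xml.parsers import expat


class Driver:
    """Hypothetical stand-in for the SAX1 driver, not the PyXML class."""

    def __init__(self):
        self.parser = None
        self.reset()

    def reset(self):
        self.parser = expat.ParserCreate()
        # A bound method references self, which gives the cycle
        # self -> parser -> bound method -> self.
        self.parser.StartElementHandler = self.start_element

    def start_element(self, name, attrs):
        pass

    def parse_bytes(self, data):
        self.parser.Parse(data, True)

    def close(self):
        if self.parser is None:
            # make sure close is idempotent
            return
        # Dropping the parser breaks the cycle.
        self.parser = None


def is_freed_without_gc(call_close):
    """Report whether a driver is freed by reference counting alone."""
    gc.disable()  # keep the cycle collector from hiding the problem
    try:
        drv = Driver()
        drv.parse_bytes(b"<doc/>")
        if call_close:
            drv.close()
        ref = weakref.ref(drv)
        del drv
        return ref() is None
    finally:
        gc.enable()
        gc.collect()


if __name__ == "__main__":
    print("freed without close():", is_freed_without_gc(False))
    print("freed with close():   ", is_freed_without_gc(True))
```

On CPython I would expect the driver to survive when close() is skipped
and to be freed at once when it is called. Other implementations manage
memory differently, so the weak reference check may not behave the same
way there.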

Regards,
Martin

Index: drv_pyexpat.py
===================================================================
RCS file: /cvsroot/pyxml/xml/xml/sax/drivers/drv_pyexpat.py,v
retrieving revision 1.11
diff -u -r1.11 drv_pyexpat.py
--- drv_pyexpat.py	2000/10/05 19:32:52	1.11
+++ drv_pyexpat.py	2001/01/11 11:25:28
@@ -14,10 +14,9 @@
 from xml.sax import saxlib,saxutils
 
 try:
-    import pyexpat
+    from xml.parsers import expat
 except ImportError:
-    # pyexpat not built in core installation, use our own
-    from xml.parsers import pyexpat
+    raise SAXReaderNotAvailable("expat not supported",None)
 
 import urllib,types
 
@@ -57,7 +56,7 @@
 
     def parse(self,sysID):
         self.parseFile(urllib.urlopen(sysID),sysID)
-        
+
     def parseFile(self,fileobj,sysID=None):
         self.reset()
         self.sysID=sysID
@@ -71,6 +70,7 @@
         self.parser.Parse("", 1)
             
         self.doc_handler.endDocument()
+        self.close()
 
     # --- Locator methods. Only usable after errors.
 
@@ -90,7 +90,7 @@
 
     def __report_error(self):
         errc=self.parser.ErrorCode
-        msg=pyexpat.ErrorString(errc)
+        msg=expat.ErrorString(errc)
         exc=saxlib.SAXParseException(msg,None,self)
         self.err_handler.fatalError(exc)
 
@@ -113,7 +113,7 @@
 
     def reset(self):
         self.sysID=None
-        self.parser=pyexpat.ParserCreate()
+        self.parser=expat.ParserCreate()
         self.parser.StartElementHandler = self.startElement
         self.parser.EndElementHandler = self.endElement
         self.parser.CharacterDataHandler = self.characters
@@ -125,8 +125,12 @@
             self.__report_error()
 
     def close(self):
+        if self.parser is None:
+            # make sure close is idempotent
+            return
         if self.parser.Parse("", 0) != 1:
             self.__report_error()
+        self.parser = None
         
 # --- An expat driver that uses the lazy map