sgmllib too slow
Stuart D. Gathman
stuart at bmsi.com
Mon May 6 23:16:10 EDT 2002
I've run into my very first situation where Python is not "fast enough". I
am using the sgmllib module to parse HTML attachments in a milter. Processor
idle time drops from 80% to 30% when HTML parsing is turned on (the machine
is also a web server, so this is bad). It takes 5 minutes to parse a 150K
attachment (on a 100 MHz 604 PPC).
1. Rewriting the whole thing in C is out of the question. Rewriting in
Java is a possibility, and easier than C - but not nearly as easy as
Python.
2. Since sgmllib.SGMLParser is callback based, I could write a flex or
bison grammar in C whose actions call back into the SGMLParser methods
whenever an element is recognized. That may or may not speed things up.
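
For readers unfamiliar with the callback style mentioned in option 2, here
is a minimal sketch of a callback-based parser. It is not the milter code
from this post. It assumes a current Python, where sgmllib no longer ships,
so it uses the standard html.parser.HTMLParser as a stand-in with the same
general shape. The names TagCounter and scan are hypothetical.

```python
from html.parser import HTMLParser  # stand-in for sgmllib.SGMLParser (assumption)


class TagCounter(HTMLParser):
    """Hypothetical handler: counts start tags and collects text data."""

    def __init__(self):
        super().__init__()
        self.tags = {}   # tag name -> number of start tags seen
        self.text = []   # pieces of character data, in document order

    def handle_starttag(self, tag, attrs):
        # Called by the parser each time it recognizes a start tag.
        self.tags[tag] = self.tags.get(tag, 0) + 1

    def handle_data(self, data):
        # Called by the parser for runs of text between tags.
        self.text.append(data)


def scan(html):
    """Feed one document through the parser and return (tag counts, text)."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()  # flush any buffered input
    return parser.tags, "".join(parser.text)
```

The tokenizer drives the handler methods, so in principle a faster
tokenizer could be swapped in underneath while the handlers stay in Python.
Whether that helps depends on where the time is actually spent.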
Any suggestions?
--
Stuart D. Gathman <stuart at bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.