Code that ought to run fast, but can't due to Python limitations.

Stefan Behnel stefan_ml at behnel.de
Sun Jul 5 05:41:18 EDT 2009


Stefan Behnel wrote:
> John Nagle wrote:
>>    Here's some actual code, from "tokenizer.py".  This is called once
>> for each character in an HTML document, when in "data" state (outside
>> a tag).  It's straightforward code, but look at all those
>> dictionary lookups.
>>
>>     def dataState(self):
>>         data = self.stream.char()
>>
>>         # Keep a charbuffer to handle the escapeFlag
>>         if self.contentModelFlag in\
>>           (contentModelFlags["CDATA"], contentModelFlags["RCDATA"]):
> 
> Is the tuple
> 
> 	(contentModelFlags["CDATA"], contentModelFlags["RCDATA"])
> 
> constant? If that is the case, I'd cut it out into a class member (or
> module-local variable) first thing in the morning.

Ah, and there's also this little trick to make it a (fast) local variable
in that method. Default argument values are evaluated only once, when the
function is defined:

	def some_method(self, some_const=(1,2,3,4)):
	    ...
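
To spell that out a bit, here is a minimal sketch of how it might look for
the check quoted above. The Tokenizer class, the flag values and the names
slow_check, fast_check and _cdata_like are all made up for illustration. I
haven't checked them against the real tokenizer.py, so take it as an
assumption-laden example rather than a patch:

```python
# Hypothetical stand-in for the real flag table in tokenizer.py.
contentModelFlags = {"PCDATA": 0, "RCDATA": 1, "CDATA": 2, "PLAINTEXT": 3}


class Tokenizer:
    def __init__(self, flag):
        self.contentModelFlag = flag

    def slow_check(self):
        # Every call does a global lookup, two dict lookups and a tuple build.
        return self.contentModelFlag in (
            contentModelFlags["CDATA"], contentModelFlags["RCDATA"])

    def fast_check(self, _cdata_like=(contentModelFlags["CDATA"],
                                      contentModelFlags["RCDATA"])):
        # The default tuple is built once, when the def statement runs.
        # Inside the method it is read like any other local variable.
        return self.contentModelFlag in _cdata_like


tok = Tokenizer(contentModelFlags["RCDATA"])
print(tok.slow_check(), tok.fast_check())
```

Two caveats: the tuple is frozen at definition time, so later changes to
contentModelFlags won't be picked up, and a caller can override it by
passing the extra argument. Whether it buys anything measurable here is
something I'd check with timeit first.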

Stefan


