Python example: possible speedup?

Wed Sep 8 17:05:55 EDT 1999

I do not have access to the test files, but I would guess the following would
be a speedup (untested pseudo-code).  At the very least, it makes the code
easier to follow ;) .  It keeps memory overhead low, does away with the
counters entirely, uses list.index() to find the end of the package, and
keeps the last name and value in an explicit local variable.  (I could have
reused the name and value variables, since they always hold the most recent
pair, but that wouldn't be very readable.)  In short, it rewrites the whole
algorithm.

	def prime( self, heuristic ):
		'''
		prime buffer with roughly "heuristic" bytes' worth of lines
		(readlines() takes a size hint in bytes, not a line count).
		Should be ~2X larger than the largest possible
		entry in file.
		'''
		self.lines[len(self.lines):] = self.file.readlines(heuristic)

	def next_package(self, heuristic=1000, split=string.split, strip=string.strip):
		# initial priming of the buffer, in theory we might run past,
		# so use a large heuristic...
		if len( self.lines) < heuristic *2:
			self.prime( heuristic )
		package = {}
		try:
			packageindex = self.lines.index( '\n' )
		except ValueError:# could do more checks here...
			packageindex = len ( self.lines)
		last = None
		# the package is everything before the blank separator line
		for line in self.lines[:packageindex]:
			# is this a continuation
			if line[0] in (' ' ,"\t"):
				if last:
					value = package[ last[0] ] = last[1]+line
					last = (last[0], value)
				else:
					print '''Continued line follows null line!!!'''
			# must be an opening/definition
			else:
				# get "cleaned up" versions
				name, value = map( strip, split( line, ':', 1))
				# store
				package[name] = value
				# save for continuation and deletion
				last = (name, value)
		# drop the consumed package and its blank separator from the buffer
		del self.lines[:packageindex+1]
		return package # will be an empty dictionary if there is no next package
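
For anyone who wants something they can actually run, here is a minimal
self-contained sketch of the same idea in present-day Python 3.  It is not
the original module: the PackageReader class name, its constructor and the
sample data are all made up for illustration, and I have assumed that
packages are separated by a single blank line and that no package is larger
than the buffer.

```python
import io


class PackageReader:
    """Hypothetical wrapper class; only the two methods come from the post."""

    def __init__(self, file):
        self.file = file
        self.lines = []  # buffer of lines read but not yet consumed

    def prime(self, heuristic):
        # readlines(hint) stops once roughly `hint` bytes/characters have
        # been read, so this tops up the buffer without slurping the file.
        self.lines.extend(self.file.readlines(heuristic))

    def next_package(self, heuristic=1000):
        if len(self.lines) < heuristic * 2:
            self.prime(heuristic)
        # A blank line marks the end of the package (assumed separator).
        try:
            end = self.lines.index('\n')
        except ValueError:
            end = len(self.lines)
        package = {}
        last = None  # name of the field a continuation line belongs to
        for line in self.lines[:end]:
            if line[0] in ' \t':
                if last is None:
                    raise ValueError('continuation line with no field before it')
                # Same as the post: append the raw continuation line.
                package[last] += line
            else:
                name, _, value = line.partition(':')
                last = name.strip()
                package[last] = value.strip()
        # Drop the consumed package plus its blank separator.
        del self.lines[:end + 1]
        return package  # empty dict when nothing is left


sample = (
    "Package: foo\n"
    "Description: first line\n"
    " second line\n"
    "\n"
    "Package: bar\n"
)
reader = PackageReader(io.StringIO(sample))
first = reader.next_package()
second = reader.next_package()
third = reader.next_package()
print(first, second, third)
```

Note that a continuation line is appended as-is, so the value keeps the
line's leading whitespace and trailing newline; strip or join differently if
that is not what you want.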

Enjoy yourselves,
Mike