speedy Python strings?

Mon Jan 19 19:52:19 EST 2004

On Tue, Jan 20, 2004 at 01:43:24AM +0100, Uwe Mayer wrote:
> Hi,
> 
> thanks to previous help my wrapper for an abstract file type with variable
> record length is working now.
> I did some test runs and it's awfully slow:
> 
> I didn't want to read in the whole file at once and I didn't want to read it
> step by step (it contains 32-bit length-prefixed strings (Delphi) and 32-bit
> integers), so I read in e.g. 2MB, work on that buffer, and if the buffer
> runs empty I load another 2MB, etc.
> For this buffer I use ordinary strings:
> 
> class myFile(file):
>         def read(self, *args):
>                 ...
>                 self.buffer += file.read(self, *args)
>                 ...

  Doing the above in a loop has quadratic complexity: each += copies
everything already in the buffer.

> 
> and after reading information from the buffer I remove the read part from
> it:
>         
>         text = struct.unpack("L", self.buffer[:4])
>         self.buffer = self.buffer[4:]

  The same is true for the last line: each slice copies everything that
remains in the buffer.

  Your program will be much faster if you make fewer string copies. 
"self.buffer[4:]" is a string copy, as is "self.buffer += anotherString".
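
  To put a rough number on it, here is a small sketch that counts the bytes
copied when a buffer is consumed by repeatedly chopping a fixed-size piece
off the front. The function name is made up for illustration, and it only
models the slicing, assuming each slice copies everything that remains.

```python
def bytes_copied_by_chopping(total, step=4):
    """Count bytes copied when consuming `total` bytes, `step` at a time,
    via buffer = buffer[step:] (hypothetical helper, models slicing only)."""
    copied = 0
    remaining = total
    while remaining >= step:
        remaining -= step
        # buffer[step:] builds a new string holding everything that is left
        copied += remaining
    return copied


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        print(size, bytes_copied_by_chopping(size))
```

If every read took only 4 bytes, a 2MB buffer would cost roughly 550 GB of
copying, and doubling the buffer size roughly quadruples that figure.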

  The buffer() builtin is one way to avoid copying, but another way is to
simply keep track of how far into the string you are and use both low and
high indexes when slicing, instead of repeatedly chopping the front off the
string.
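
  A minimal sketch of the offset-tracking approach, written for a current
Python where the buffer is a bytes object. The function name is made up, and
the "<I" format (little-endian unsigned 32-bit length prefix) is my
assumption about the Delphi file layout, so adjust it to the real format.
struct.unpack_from reads at an offset, so no slice is needed for the integer.

```python
import struct


def iter_records(data):
    """Yield length-prefixed records from `data` (hypothetical helper).

    Assumes each record is a little-endian 32-bit length followed by that
    many bytes. The buffer itself is never re-sliced from the front.
    """
    offset = 0
    end = len(data)
    while offset + 4 <= end:
        # Read the length in place instead of unpacking data[:4]
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
        if offset + length > end:
            raise ValueError("truncated record at offset %d" % offset)
        # Low and high indexes: only the record itself is copied, once
        yield data[offset:offset + length]
        offset += length


if __name__ == "__main__":
    sample = struct.pack("<I", 3) + b"abc" + struct.pack("<I", 2) + b"hi"
    for record in iter_records(sample):
        print(record)
```

When the buffer runs out, the same offset tells you which unread tail to
carry over before loading the next chunk, so that copy happens once per
chunk rather than once per record.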

  Jp