python newbie - slicing a big memory chunk without GC penalties

Sun Feb 2 14:01:01 EST 2003

"Giovanni Bajo" <noway at sorry.com> writes:

> Hello,
>
> Sorry if the question is trivial, but I am a newbie with Python. I have a
> file read into memory within a 'sequence' (or whatever is returned by

It is a kind of sequence: a string (str).

> file.read()), and I need to process it 512 bytes a time. Now, I was doing

If the file is large, you may consider reading it in small pieces
rather than all at once:

f = file('/path/filename')
while 1:
    buf = f.read(512)
    if buf == '': break                 # or `if len(buf) < 512'
    Process(buf)

> something like:
>
> for i in range(0, len(buf)/512):
>     Process(buf[i*512 : (i+1)*512])
>
There is a built-in buffer() which gives you a view of a portion without
slicing (try help(buffer) in the interactive shell):

for i in range(0, len(buf)/512):
    Process(buffer(buf, i*512, 512))
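
A minimal sketch of the same idea that also covers a trailing partial
block. It assumes a Python version that provides memoryview (the later
replacement for buffer); iter_blocks and its size argument are
hypothetical names used only for illustration:

```python
def iter_blocks(data, size=512):
    """Yield consecutive views of `data`, each at most `size` bytes long."""
    view = memoryview(data)             # wraps the bytes object, no copy made
    # Stepping by `size` also yields the final block when it is shorter.
    for start in range(0, len(view), size):
        yield view[start:start + size]  # slicing a memoryview copies nothing


data = bytes(range(256)) * 5            # 1280 bytes of sample data
blocks = list(iter_blocks(data))
sizes = [len(b) for b in blocks]
```

Each block here is a view onto the original bytes, so Process would
receive the data without any per-block copy being made.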


> But it seems like a lot of time is wasted in the sequence slicing (before I
> was processing everything in a shot, and it was much faster - and Process is
> O(n) so it should not really matter that much). I tried also other
> approacches like:
>
> while len(buf):
>     Process(buf[0:512])
>     buf = buf[512:]
>
> but it seems even worse. What's the best way to do this?
>

Yes, every

	buf = buf[512:]

builds a new, slightly shorter string by copying the rest of the old one,
and then drops the old one. For a very long string that is a lot of copying
in each iteration, O(n**2) in total.
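
To put rough numbers on that, the sketch below only counts the bytes each
approach would copy, without copying anything itself. The function names
and the 1 MB example size are assumptions made for illustration:

```python
def bytes_copied_shrinking(n, size=512):
    """Bytes copied by `Process(buf[0:size]); buf = buf[size:]` over n bytes."""
    copied = 0
    remaining = n
    while remaining > 0:
        chunk = min(size, remaining)
        copied += chunk                 # the buf[0:size] slice
        remaining -= chunk
        copied += remaining             # buf = buf[size:] copies all the rest
    return copied


def bytes_copied_indexed(n, size=512):
    """Bytes copied by slicing out each block of the buffer exactly once."""
    return n                            # every byte is copied exactly once


n = 1024 * 1024                         # a 1 MB buffer
shrinking = bytes_copied_shrinking(n)
indexed = bytes_copied_indexed(n)
```

For the 1 MB case the shrinking loop accounts for about 1 GB of copying
in total, roughly a thousand times more than the indexed loop.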

> Thanks
> Giovanni
>
>

-- 

=*= Lukasz Pankowski =*=

t o t f s  h i m o f  p h t s s w
h n h i a  o s o f o  r o o a o i
e e a r i  p   t   o  o p   y m s
    t s d  e   h   l  b e     e e
      t        e   s  a d     t
               r      b       h
                      l       i
                      y       n
                              g