modifying small chunks from long string

Steven D'Aprano steve at REMOVEMEcyber.com.au
Mon Nov 14 22:02:04 EST 2005


Tony Nelson wrote:

> >Can I get over this performance problem without reimplementing the
> >whole thing using a barebones list object? I thought I was being "smart"
> >by avoiding editing the long list, but then it struck me that I am
> >creating a second object of the same size when I put the modified
> >shorter string in place...
> 
> 
> A couple of minutes experimenting with array.array at the python command 
> line indicates that it will work fine for you.  Quite snappy on a 16 MB 
> array, including a slice assignment of 1 KB near the beginning.  
> array.array is probably better than lists for speed, and uses less 
> memory.  It is the way to go if you are going to be randomly editing all 
> over the place but don't need to convert to string often.

I have no major objections to using array, but a minor 
one: ordinary lists may very well be more than snappy 
enough, and they have the advantage of being more 
familiar than the array module to many Python programmers.
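
To make the comparison concrete, here is a minimal 
sketch of the same in-place edit done both ways. The 
function names are my own, purely for illustration, and 
the array version assumes the data can be treated as 
bytes (typecode 'B'); neither is taken from anyone's 
actual code in this thread.

```python
from array import array


def edit_with_array(data, start, replacement):
    """Overwrite part of a byte string using array.array (hypothetical helper)."""
    buf = array('B', data)  # mutable copy of the bytes
    # Slice assignment on an array needs another array of the same typecode.
    buf[start:start + len(replacement)] = array('B', replacement)
    return buf.tobytes()


def edit_with_list(text, start, replacement):
    """Overwrite part of a string using an ordinary list (hypothetical helper)."""
    chars = list(text)  # one list item per character
    # A list slice accepts any iterable, so a plain string works here.
    chars[start:start + len(replacement)] = replacement
    return ''.join(chars)


print(edit_with_array(b"hello world", 6, b"WORLD"))
print(edit_with_list("hello world", 6, "WORLD"))
```

The list version needs no import and no typecode, which 
is the familiarity advantage I mean; the array version 
stores one byte per item rather than one object 
reference per item, which is where the memory saving 
comes from.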

The time it takes to process a 20MB string will depend 
on the details of the processing, but my 
back-of-the-envelope test, using one large input string 
and an intermediate list of strings, was *extremely* 
fast: less than half a second for a 20MB input. (See my 
earlier post for details.)
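
For anyone who wants to try something similar, the 
following is only a rough sketch of the general idea 
(split into a list of strings, edit some of them, join 
once at the end). It is not the code from my earlier 
post: the helper name, the fixed chunk size and the 
edits dictionary are all assumptions made up for this 
example, and the timing you see will depend on your 
machine.

```python
import time


def edit_chunks(text, chunk_size, edits):
    """Apply {chunk_index: new_chunk} edits via a list of strings.

    Hypothetical helper for illustration only.
    """
    # Break the long string into fixed-size pieces.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Replacing a list item only touches that one small string.
    for index, new_chunk in edits.items():
        chunks[index] = new_chunk
    # Build the full-size result exactly once.
    return ''.join(chunks)


if __name__ == "__main__":
    big = "a" * (20 * 1024 * 1024)  # roughly 20MB of input
    start = time.perf_counter()
    result = edit_chunks(big, 1024, {3: "b" * 1024})
    elapsed = time.perf_counter() - start
    print("edited %d characters in %.3fs" % (len(result), elapsed))
```

The point of the design is that the big string is 
rebuilt only once, in the final join, no matter how 
many small chunks are modified along the way.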

Given that sort of speed, shifting to the less familiar 
array module just to shave the time from, say, 0.49s to 
0.45s is premature optimization. In fairness, though, 
if you could cut the time to 0.04s for 20MB, then it 
would be worth the extra work to use the array module.


-- 
Steven.



