modifying small chunks from long string
Steven D'Aprano
steve at REMOVEMEcyber.com.au
Mon Nov 14 22:02:04 EST 2005
Tony Nelson wrote:
> >Can I get over this performance problem without reimplementing the
> >whole thing using a barebones list object? I thought I was being "smart"
> >by avoiding editing the long list, but then it struck me that I am
> >creating a second object of the same size when I put the modified
> >shorter string in place...
>
>
> A couple of minutes experimenting with array.array at the python command
> line indicates that it will work fine for you. Quite snappy on a 16 MB
> array, including a slice assignment of 1 KB near the beginning.
> array.array is probably better than lists for speed, and uses less
> memory. It is the way to go if you are going to be randomly editing all
> over the place but don't need to convert to string often.
I have no major objections to using array, but a minor
one: ordinary lists may very well be more than snappy
enough, and they have the advantage of being more
familiar than the array module to many Python programmers.
The time it takes to process a 20MB string will depend
on the details of the processing, but my
back-of-the-envelope test using one large input string
and an intermediate list of strings was *extremely*
fast: less than half a second for a 20MB input. (See my
earlier post for details.)
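
A rough sketch of the list-of-strings idea, for illustration only. The
earlier post is not reproduced here, so the fixed chunk size, the edits
mapping and the name edit_chunks are all assumptions rather than the
code that was actually timed:

```python
def edit_chunks(text, chunk_size, edits):
    """Split text into chunks, replace some of them, and rejoin.

    edits maps a chunk index to its replacement string (hypothetical
    interface, chosen only to keep the example short).
    """
    # Intermediate list of strings: one entry per chunk.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    for index, replacement in edits.items():
        # Rebinding one list slot does not copy the rest of the data.
        chunks[index] = replacement
    # A single join at the end builds the final string once.
    return "".join(chunks)


original = "abcdefghijkl"
print(edit_chunks(original, 4, {1: "XY"}))  # abcdXYijkl
```

The point of the design is that each edit touches only one list slot,
and the large string is rebuilt exactly once, at the join.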
Given that sort of speed, shifting to the less familiar
array module just to shave the time from 0.49s to 0.45s
is premature optimization. Although, in fairness, if
you could cut the time to 0.04s for 20MB then it would
be worth the extra work to use the array module.
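
One way to check which case applies is to time both approaches on your
own data. The following is only a sketch under assumed sizes (a 1 MB
buffer, 1 KB patches); the helper names patch_array and patch_chunks are
made up for the example, and the numbers it prints will vary by machine:

```python
import array
import timeit

SIZE = 1 << 20            # 1 MB here; scale up to match the real input
PATCH = b"\x01" * 1024    # 1 KB replacement


def patch_array(buf, start, patch):
    """Overwrite len(patch) items of buf in place, starting at start."""
    # Slice assignment on an array needs another array of the same typecode.
    buf[start:start + len(patch)] = array.array(buf.typecode, patch)


def patch_chunks(chunks, index, patch):
    """Replace one chunk in a list of chunks."""
    chunks[index] = patch


buf = array.array("B", bytes(SIZE))
chunks = [bytes(1024) for _ in range(SIZE // 1024)]

array_time = timeit.timeit(lambda: patch_array(buf, 1024, PATCH), number=1000)
list_time = timeit.timeit(lambda: patch_chunks(chunks, 1, PATCH), number=1000)
print("array.array: %.6fs  list of chunks: %.6fs" % (array_time, list_time))
```

This times only the edits themselves. If the result has to be converted
back to a single string often, that conversion should be included in the
measurement as well.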
--
Steven.