[Python-Dev] Slice as a copy... by design?

Isaac Morland ijmorlan at cs.uwaterloo.ca
Thu May 22 18:26:51 CEST 2008


On Thu, 22 May 2008, Gary Herron wrote:

> In fact, a slice is *not* always a copy!  In at least some (simple) cases, a 
> slice references the original string:
>
> >>> s = 'abc'
> >>> t = s[:]
> >>> s is t
> True
> >>> id(s)
> 3081872000L
> >>> id(t)
> 3081872000L

I think the more interesting case is where the string objects are not the 
same object but use (portions of) the same underlying array in memory.  If 
I understand correctly, Python does not do this, and I thought I read 
something about why not, but I can't remember the details.

Sharing contents is an obvious optimization which in some circumstances 
can dramatically reduce the amount of copying that goes on, but without a 
reasonably clever algorithm to decide when to let the underlying storage 
go (copying any part still in use), extremely bad behaviour can result - 
imagine reading in lots of long lines, then keeping just a short piece of 
each one.
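
To make the failure mode concrete, here is a minimal sketch.  It assumes a 
Python that has the built-in memoryview type, which gives storage-sharing 
slices over bytes objects (not over ordinary strings).  The helper name 
tail_view is hypothetical, used only for illustration.

```python
def tail_view(line, n):
    """Return the last n bytes of line as a view that shares its storage."""
    return memoryview(line)[len(line) - n:]

long_line = b"x" * 1000000 + b"key"
view = tail_view(long_line, 3)

# No bytes were copied, but the 3-byte view pins the whole buffer:
# long_line cannot be freed while view is alive.
print(view.obj is long_line)    # True

# Only an explicit copy lets the big buffer go.
short = bytes(view)
view.release()
```

Keeping many such views, one per long line read, would hold every full 
line in memory even though only a few bytes of each are wanted.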

By contrast, the worst that can happen with no sharing is that performance 
and memory use are what you expect - the only "bad" is the apparent missed 
opportunity for optimization.

I wonder if a "shared slice" object would be useful?  That is, an object 
which behaves like a string obtained from a slicing operation except that 
it shares storage.  It could have a .release method to go ahead and copy 
the underlying storage.  One complexity comes to mind immediately - what 
happens if one takes a shared slice of a shared slice?  Presumably it 
shares the original string's storage, but if the first shared slice is 
.released, what happens to the second shared slice?  It would be nice if it
shared with the first shared slice, but keeping track of everything could 
get tricky.
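
A rough sketch of what such an object might look like, purely as an 
assumption for discussion: the class name SharedSlice and its attributes 
are hypothetical, "sharing" here just means holding a reference to the 
original string plus offsets, and only non-negative indices are handled.  
In this version a slice of a slice points straight at the original string.

```python
class SharedSlice:
    """Hypothetical string slice that shares storage until release()."""

    def __init__(self, source, start, stop):
        length = len(source)
        # Clamp to the source bounds; negative indices are not handled.
        start = max(0, min(start, length))
        stop = max(start, min(stop, length))
        if isinstance(source, SharedSlice):
            # Slice of a slice: reuse whatever storage the parent uses now.
            self._store = source._store
            self._start = source._start + start
            self._stop = source._start + stop
        else:
            self._store = source
            self._start = start
            self._stop = stop

    def __len__(self):
        return self._stop - self._start

    def __str__(self):
        return self._store[self._start:self._stop]

    def release(self):
        """Copy out the slice and drop the reference to shared storage."""
        self._store = self._store[self._start:self._stop]
        self._start, self._stop = 0, len(self._store)


line = "x" * 1000 + "key=value"
outer = SharedSlice(line, 1000, 1009)
inner = SharedSlice(outer, 0, 3)
outer.release()

print(str(outer))              # key=value
print(str(inner))              # key
print(inner._store is line)    # True
```

With this choice, releasing the first slice does nothing for the second, 
which still pins the whole original string.  Having the second follow the 
first onto its smaller copy would need the extra bookkeeping mentioned 
above.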

I'd be interested in pointers to any existing discussion on this issue.

Trivia - right now there are *no* Google hits for 'python shared slice', 
although there are lots for 'python shared slices'.  From a non-exhaustive 
look, however, they don't appear to be talking about the same thing.

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

