Why don't strings share data in Python?

John Machin sjmachin at lexicon.net
Wed Apr 24 22:06:44 EDT 2002


Mike Coleman <mkc+dated+1021521407.f909ec at mathdogs.com> wrote in message news:<87d6x09trz.fsf at mathdogs.com>...
> Does anyone know why strings (i.e., those of length >1) don't share their data
> in Python?  Since they're immutable, it seems like this would be the obvious
> thing to do.  So, for example, the space behavior of this code could be linear
> rather than quadratic/horrific:
> 
> d = {}
> for i in xrange(100000000):
>     d[mybigstring[i:]] = mybigstring[i:]

You might like to:

0. Try this:

d = {}
for i in xrange(100000000):
    # slice once, then use the same string object as both key and value
    suffix = mybigstring[i:]
    d[suffix] = suffix

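My guess is that this will create only one copy of each slice, instead
of two, because each mybigstring[i:] expression builds a new string
object. Total space is still quadratic, just with a smaller constant.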

1. Explain the utility of having a dict d such that d[foo] == foo, for
*all* foo in d. If all you want is logically a set, not a mapping, do
d[suffix] = 1.

2. google("suffix tree"); google("suffix array")
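
For what it's worth, here is a minimal sketch of the suffix-array idea from item 2. It is written for Python 3 (range, not xrange), and the names naive_suffix_array and contains are made up for illustration, not taken from any library. The point is only that you store one integer offset per suffix instead of one string copy per suffix. The construction is deliberately naive (it builds temporary slices while sorting), so treat it as a toy and not as a proper linear-time algorithm.

```python
from functools import cmp_to_key


def naive_suffix_array(text):
    """Return the start offsets of all suffixes of text, in sorted order.

    Only integers are kept in the result. The comparison builds two
    temporary slices at a time, which are discarded after each comparison.
    """
    def compare(i, j):
        a, b = text[i:], text[j:]  # temporary slices, not stored anywhere
        return (a > b) - (a < b)

    return sorted(range(len(text)), key=cmp_to_key(compare))


def contains(text, sa, pattern):
    """Binary-search the suffix array sa for a suffix starting with pattern."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        start = sa[mid]
        prefix = text[start:start + len(pattern)]
        if prefix < pattern:
            lo = mid + 1
        else:
            hi = mid
    if lo == len(sa):
        return False
    start = sa[lo]
    return text[start:start + len(pattern)] == pattern


if __name__ == "__main__":
    text = "banana"
    sa = naive_suffix_array(text)
    print(sa)
    print(contains(text, sa, "nan"), contains(text, sa, "nab"))
```

Each entry of sa stands for the suffix text[sa[k]:], so any suffix can be recovered on demand by slicing, without keeping all of them in memory at once.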


