zip() function troubles

Terry Reedy tjreedy at udel.edu
Fri Jul 27 02:16:22 EDT 2007


"Istvan Albert" <istvan.albert at gmail.com> wrote in message 
news:1185502539.598349.159330 at l70g2000hse.googlegroups.com...
| if you try it yourself you'll see that it is very easy to generate 10
| million tuples,

No, it is not on most machines.

| on my system it takes 3 (!!!) seconds to do the following:
|
| size = 10**7
| data = []
| for i in range(10):
|    x = [ (0,1) ] * size

x has 10**7 references (4 bytes each) to the same tuple.  Use id() to 
check.  40 megs is manageable.
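
A minimal sketch of that id() check, assuming CPython and using a smaller
size than the 10**7 in the post so it runs quickly.  The names distinct_ids
and list_payload_bytes are only illustrative.  The pointer size is measured
rather than assumed, since the 4 bytes quoted above applies to 32-bit builds
and a 64-bit build will report more.

```python
import struct

size = 10**5  # smaller than the post's 10**7, purely to keep the demo fast

x = [(0, 1)] * size

# List repetition copies the reference, not the tuple, so every slot
# points at one and the same object.
distinct_ids = set(id(item) for item in x)
print("distinct tuple objects:", len(distinct_ids))

# Size of one reference (a C pointer) on this build of the interpreter.
pointer_bytes = struct.calcsize("P")

# Rough size of the list's reference array: one pointer per slot.
list_payload_bytes = size * pointer_bytes
print("approx bytes of references:", list_payload_bytes)
```

The list grows with size, but the number of tuple objects stays at one.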

|    data.append( x )
|
| Now it takes over two minutes to do this:
|
| size = 10**7
| a = [ 0 ] * size
| b = zip(a,a)

b has 40 megs of references to 10 million *different* tuples.  Each tuple is
20 to 40 bytes, so that is 200-400 megs more.  Try
[(i,i) for i in xrange(5000000)]
for comparison (it also makes 10000000 objects plus a large list).
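
The difference can be seen directly with a rough sketch like the one below.
It assumes CPython and a reduced size.  list() is wrapped around zip()
because zip is lazy in Python 3, whereas in the 2.x of this thread it
returned a list.  The per-tuple size comes from sys.getsizeof, so the
estimate reflects whatever build runs it and will typically exceed the
20 to 40 bytes quoted above on a 64-bit interpreter.  The variable names
are illustrative only.

```python
import sys

size = 10**5  # reduced from the post's 10**7 so the demo stays quick

a = [0] * size
b = list(zip(a, a))        # one new tuple object per element
shared = [(0, 1)] * size   # many references to a single tuple

# Count distinct tuple objects behind each list.
distinct_in_b = len(set(id(t) for t in b))
distinct_in_shared = len(set(id(t) for t in shared))
print("distinct tuples from zip:", distinct_in_b)
print("distinct tuples from repetition:", distinct_in_shared)

# Rough extra memory held by the zip result beyond the list itself:
# every tuple is a separate allocation.
per_tuple_bytes = sys.getsizeof(b[0])
estimated_extra_bytes = distinct_in_b * per_tuple_bytes
print("approx bytes in tuple objects:", estimated_extra_bytes)
```

Both lists have the same length, yet only the zip result pays for an
object per slot.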

| the only explanation I can come up with is that the internal
| implementation of zip must have some flaws

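References are not objects.  The first test makes one tuple and many
references to it; the zip call makes 10 million separate tuples.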


Terry Jan Reedy





