references again

Niels Diepeveen niels at endea.demon.nl
Tue Jul 11 12:36:03 EDT 2000


Thomas Thiele wrote:
> class X:
>         def __init__(self, s):
>                 self.s = s
> 
> x1 = X("jana")
> print sys.getrefcount(x1.s)
> 
> #Problem 1:
> # that prints 5 why? I expected 2!
> # (only one ref from x1.s plus temp. ref from
> # sys.getrefcount)

Because of interning of name-like string literals, and because the
string is also a constant in the script's code object. Try this:
>>> sys.getrefcount('jana') # 1 from arglist, 2 from interning, 1 from code
4
>>> sys.getrefcount('$jana') # 1 from arglist, 1 from code object
2
>>> sys.getrefcount('ja' + 'na') # just the 1 from the arglist
1
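
For anyone who wants to see the sharing itself rather than the counts, here is a small sketch. It assumes a current CPython 3, where the built-in intern() has moved to sys.intern(); the variable names are made up for the illustration.

```python
import sys

# Two equal strings built at run time are separate objects.
first = "".join(["ja", "na"])
second = "".join(["ja", "na"])
print(first == second, first is second)

# sys.intern() maps equal strings onto one shared object
# (assumption: Python 3 spelling of the old intern() builtin).
shared_a = sys.intern(first)
shared_b = sys.intern(second)
print(shared_a is shared_b)
```

Every name that ends up pointing at the shared object counts as one more reference to it, which is why the numbers for a name-like literal come out higher than expected.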

> 
> S = x1.s
> print sys.getrefcount(x1.s)
> 
> #prints 6, ok one reference more
> 
> str = pickle.dumps(x1)
> x2 = pickle.loads(str)
> print sys.getrefcount(x1.s)
> 
> #Problem 2:
> #prints 7, why? Is "jana" still used?

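x2.s now refers to the same string object as x1.s, so that object has
gained one more reference. (Unpickling goes through eval(), which hands
back the already interned 'jana' rather than a fresh copy.)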
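
The counting rule can be checked in isolation. The sketch below assumes CPython 3 and shares the string by hand instead of through pickle, because whether unpickling returns the very same object depends on the pickle implementation in use.

```python
import sys

class X:
    def __init__(self, s):
        self.s = s

# Build the string at run time so that no literal also refers to it.
x1 = X("".join(["ja", "na"]))
before = sys.getrefcount(x1.s)

# A second instance whose attribute is the very same string object.
x2 = X(x1.s)
after = sys.getrefcount(x1.s)

print(after - before)
```

The absolute values vary between Python versions; only the difference between the two readings is meaningful here.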

> 
> x1 = x2
> 
> print sys.getrefcount(x1.s)
> 
> #prints 6, ok one reference less
> 
> #But now the most difficult problem 3:
> 
> one , two, three , four, five = 65,65,65,65,65
> while(1):
>         #creating different string of constant size
>         str = "%c%c%c%c%c" % (one, two, three , four, five)
>         one = one + 1
>         if one > 90:
>                 one = 65
>                 two = two + 1
>                 if two > 90:
>                         two = 65
>                         three = three + 1
>                         if three > 90:
>                                 three = 65
>                                 four = four + 1
>                                 if four > 90:
>                                         four = 65
>                                         five = five + 1
>                                         if five > 65:
>                                                 one , two, three , four, five = 65,65,65,65,65
>         x = X(str)
>         str = pickle.dumps(x)
>         x1 = pickle.loads(str)
>         print str, sys.getrefcount(x1.s),
> 
> #sys.getrefcount() returns 4. I don't know why, but that is not the real
> problem.
> The Problem is that the program eats memory!!!!!
> It seems that all older variants of str will be stored. But there is no
> reference more to the older ones!

Yes, there is, but it's invisible. pickle.Unpickler.load_string() uses
eval() to unpack the string. This means that every name-like string you
unpickle will be put in the interned strings dictionary, so it will be
there forever (and will be reused if the same string ever comes up
again).
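
The effect of eval() on a name-like literal can be observed directly. This is a sketch that assumes CPython 3 (sys.intern() instead of the old intern() builtin); other implementations may behave differently.

```python
import sys

# eval() compiles the literal, and CPython interns name-like string constants.
from_eval = eval("'ABCDE'")

# An equal string built at run time is a separate, non-interned object.
built = "".join(["ABC", "DE"])
print(built is from_eval)

# sys.intern() returns whichever equal string is already in the interned table.
print(sys.intern(built) is from_eval)
```

If the second test reports identity, the string produced by eval() was already sitting in the interned table.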

To verify that this is indeed your problem, you could change
'%c%c%c%c%c' to '$%c%c%c%c%c'.

It would not be too hard to change this behaviour, but I am not
qualified to say whether that would have any horrible side effects. I
think something like
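    def load_string(self):
        s = self.readline()[:-1]   # quoted string literal, newline stripped
        t = s[1:-1]                # contents between the quotes
        if '\\' not in t:
            self.append(t)         # no escapes: use the text as-is, no eval()
        else:
            # Let's be careful: evaluate with no builtins available
            self.append(eval(s, {'__builtins__': {}}))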
might do the trick.
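
The same idea can be tried outside the Unpickler as a standalone function. This is only a sketch: unquote_pickled_string is a hypothetical helper name, and ast.literal_eval is a modern Python 3 stand-in for the restricted eval(), not something pickle itself is known to use here.

```python
import ast

def unquote_pickled_string(line):
    """Decode one quoted string line of a text pickle (hypothetical helper)."""
    s = line[:-1]          # drop the trailing newline
    t = s[1:-1]            # contents between the surrounding quotes
    if "\\" not in t:
        return t           # no escapes, so no evaluation is needed
    # Escapes present: parse as a literal only (assumption of this sketch).
    return ast.literal_eval(s)

print(repr(unquote_pickled_string("'jana'\n")))
print(repr(unquote_pickled_string("'ja\\nna'\n")))
```

The fast path returns a plain slice, so nothing is compiled and nothing lands in the interned table.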

A simpler solution might be to use binary pickles.
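
A minimal sketch of that, assuming Python 3 syntax: the second argument to pickle.dumps() selects the format, with 0 being the text format and 1 the original binary one. A plain dictionary stands in for the X instance to keep the example self-contained.

```python
import pickle

# Stand-in for the instance's attributes (assumption for this sketch).
record = {"s": "ABCDE"}

# Protocol 0 is the text format; protocol 1 is the original binary format.
text_data = pickle.dumps(record, 0)
binary_data = pickle.dumps(record, 1)

restored = pickle.loads(binary_data)
print(restored["s"])
```

The binary formats store a string as a length followed by the raw data, so loading it should not need to evaluate a quoted literal at all.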


-- 
Niels Diepeveen
Endea automatisering



