[Tutor] Python cashes low integers? How? Where?

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Mon Aug 9 19:30:24 CEST 2004



On Mon, 9 Aug 2004, Dick Moores wrote:

> Kent Johnson wrote at 05:19 8/9/2004:
> >Since 3456 is not cached, you are calling getrefcount on a different
> >3456 than the one you assigned to a, b and c!
> >
> > >>> import sys
> > >>> a=b=c=3456
> > >>> sys.getrefcount(3456)
> >2
> > >>> sys.getrefcount(a)
> >4
> > >>> del c
> > >>> sys.getrefcount(a)
> >3
>
> So "qwerty" and "Dick" are cached, but 3456 is not? (Sorry to persist
> with this.)

Hi Dick,


Not a problem, but just as a warning: none of these details are really
part of Python as a language; they belong to its current C implementation
(CPython).



The implementors of Python added some efficiency tricks to the system, and
some of these tricks are not duplicated in other implementations of Python.
We had a small discussion about this a few months ago:

    http://mail.python.org/pipermail/tutor/2004-May/029625.html

It turns out that Python does try to internalize ("intern") small
name-like strings in a cache.  The rationale is that these names are
likely to recur in a program, so it's probably worthwhile to keep one
shared copy around rather than churn out a fresh string object each time.

So yes, 'qwerty' and 'Dick', being name-like string literals, will get
cached by CPython 2.3.3.  But you should not really need to worry about
this.  *grin* The caching strategy that the implementors chose might
change; it's not set in stone that Python should do this kind of caching.
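
If you want to see interning without relying on what the compiler happens
to cache, here is a minimal sketch.  It assumes a modern Python 3, where
the function is sys.intern(); in the 2.3 series discussed in this thread
it was the built-in intern().  The helper name build() is made up for the
example.

```python
import sys

def build(parts):
    # Join at runtime so the result is not a literal the compiler has seen.
    return "".join(parts)

a = build(["qwe", "rty"])
b = build(["qwe", "rty"])

# Equality is a language guarantee: both strings hold the same characters.
equal = (a == b)

# Identity before interning is an implementation detail, so we only
# print it and do not rely on the answer.
print("a is b before interning:", a is b)

# sys.intern() returns the one canonical copy for a given string value.
ia = sys.intern(a)
ib = sys.intern(b)
print("equal:", equal)
print("same object after interning:", ia is ib)
```

The practical upshot: compare strings with ==, and treat "is" results on
strings as something the implementation is free to change.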



> And why the high refcount here? (I restarted Python for this.)
>  >>> import sys
>  >>> g = h = i = 4
>  >>> sys.getrefcount(4)
> 93


Not sure about this one.  It might depend on your runtime environment
(like things in 'sitecustomize.py').  I get a much-reduced refcount for
'4' from a clean startup on my interpreter, from a plain Unix xterm:

###
[dyoo at shoebox idlelib]$ python
Python 2.3.3 (#1, Aug  9 2004, 10:11:39)
[GCC 3.3.3 20040412 (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> g = h = i = 4
>>> import sys
>>> sys.getrefcount(4)
16
>>> j = 4
>>> sys.getrefcount(4)
17
>>> del j
>>> sys.getrefcount(4)
16
###

The dynamic nature of the runtime makes this hard to predict well.
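
Since the count for a shared small integer depends on whatever else the
interpreter has loaded, the plus-one / minus-one effect is easier to watch
on an object that nothing else refers to.  A minimal sketch, assuming
CPython (sys.getrefcount() is implementation-specific, and the absolute
numbers vary between versions); the class name Marker is made up for the
example.

```python
import sys

class Marker(object):
    """A throwaway class; only our own names refer to its instance."""
    pass

obj = Marker()

# The absolute value includes temporary references made by the call
# itself, so only the differences between readings are meaningful.
base = sys.getrefcount(obj)

alias = obj                      # bind a second name to the same object
after_alias = sys.getrefcount(obj)

del alias                        # drop that name again
after_del = sys.getrefcount(obj)

print("change after binding alias:", after_alias - base)
print("change after deleting alias:", after_del - base)
```

Only the differences are worth reading here; the starting count is as
environment-dependent as the 16 versus 93 above.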




In fact, when we run from IDLE, then yes, the refcount goes up:

### From IDLE's interactive interpreter
Python 2.3.3 (#1, Aug  9 2004, 10:11:39)
[GCC 3.3.3 20040412 (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6)] on linux2
Type "copyright", "credits" or "license()" for more information.

    ****************************************************************
    Personal firewall software may warn about the connection IDLE
    makes to its subprocess using this computer's internal loopback
    interface.  This connection is not visible on any external
    interface and no data is sent to or received from the Internet.
    ****************************************************************

IDLE 1.0.2
>>> import sys
>>> sys.getrefcount(4)
105
>>>
###

and that's probably because '4' is used quite a bit by the IDLE internals.
Remember, IDLE is running on Python, so it too may hold references to
numeric constants.


Anyway, hope this helps!


