How Python Implements "long integer"?

Mark Dickinson dickinsm at gmail.com
Mon Jul 6 09:46:05 EDT 2009


On Jul 6, 1:24 pm, Pedram <pm567... at gmail.com> wrote:
> OK, fine, I read longobject.c at last! :)
> I found that longobject is a structure like this:
>
> struct _longobject {
>     struct _object *_ob_next;
>     struct _object *_ob_prev;

For current CPython, these two fields are only present in debug
builds;  for a normal build they won't exist.

>     Py_ssize_t ob_refcnt;
>     struct _typeobject *ob_type;

You're missing an important field here (see the definition of
PyObject_VAR_HEAD):

    Py_ssize_t ob_size; /* Number of items in variable part */

For the current implementation of Python longs, the absolute value of
this field gives the number of digits in the long;  the sign gives the
sign of the long (0L is represented with zero digits).
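
As a rough illustration of that convention, here is a small Python model of
how a signed digit count could be derived from a value. It is only a sketch,
not CPython's code: the name signed_size is hypothetical, and SHIFT = 15
assumes the 15-bit digits discussed in this post.

```python
SHIFT = 15  # assumed bits per digit (the 15-bit layout described in this post)


def signed_size(n):
    """Model of ob_size: digit count of abs(n), negated when n is negative."""
    ndigits = 0
    magnitude = abs(n)
    while magnitude:
        magnitude >>= SHIFT  # drop one base-2**15 digit
        ndigits += 1
    return -ndigits if n < 0 else ndigits
```

Under this model, zero needs no digits at all, and a value and its negation
differ only in the sign of the count.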

>     digit ob_digit[1];

Right.  This is an example of the so-called 'struct hack' in C; it
looks as though there's just a single digit, but what's intended here
is that there's an array of digits tacked onto the end of the struct;
for any given PyLongObject, the size of this array is determined at
runtime.  (C99 allows you to write this as simply ob_digit[], but not
all compilers support this yet.)

> }

> And a digit is a 15-item array of C's unsigned short integers.

No: a digit is a single unsigned short, which is used to store 15 bits
of the Python long.  Python longs are stored in sign-magnitude format,
in base 2**15.  So each of the base 2**15 'digits' is an integer in
the range [0, 32768).  The unsigned short type is used to store those
digits.
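
To make the sign-magnitude, base 2**15 description concrete, the following
Python sketch splits an integer into a sign and a list of 15-bit digits
(least significant digit first, matching the order of ob_digit) and then
rebuilds it. The helper names to_digits and from_digits are made up for
this example and are not part of any Python API.

```python
SHIFT = 15                 # assumed bits per digit
MASK = (1 << SHIFT) - 1    # 32767, the largest possible digit


def to_digits(n):
    """Return (sign, digits) with digits least significant first."""
    sign = -1 if n < 0 else 1
    magnitude = abs(n)
    digits = []
    while magnitude:
        digits.append(magnitude & MASK)  # low 15 bits become one digit
        magnitude >>= SHIFT
    return sign, digits


def from_digits(sign, digits):
    """Inverse of to_digits: rebuild the integer from sign and digits."""
    total = 0
    for d in reversed(digits):
        total = (total << SHIFT) | d
    return sign * total
```

Every digit produced this way fits in 15 bits, so it can be held in a
16-bit unsigned short with one bit to spare.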

Exception: for Python 2.7+ or Python 3.1+, on 64-bit machines, Python
longs are stored in base 2**30 instead of base 2**15, using a 32-bit
unsigned integer type in place of unsigned short.
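
On Python 3.1 and later, the digit size that a given build uses can be
checked from Python itself through sys.int_info, without reading the C
struct. A minimal sketch (the function name digit_layout is just for
illustration):

```python
import sys


def digit_layout():
    """Return (bits stored per digit, size in bytes of the C digit type)."""
    info = sys.int_info
    return info.bits_per_digit, info.sizeof_digit


bits, size = digit_layout()
print("bits per digit:", bits)
print("bytes per digit:", size)
```

Which pair of values is printed depends on how the interpreter was built.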

> Is this structure is constant in
> all environments (Linux, Windows, Mobiles, etc.)?

I think it would be dangerous to rely on this struct staying constant,
even just for CPython.  It's entirely possible that the representation
of Python longs could change in Python 2.8 or 3.2.  You should use the
public, documented C-API whenever possible.

Mark


