[Python-3000] [Python-Dev] Integer representation (Was: ssize_t question: longs in header files)

"Martin v. Löwis" martin at v.loewis.de
Tue May 30 07:26:10 CEST 2006


Guido van Rossum wrote:
>> struct PyInt{
>>   struct PyObject ob;
>>   Py_ssize_t value_or_size;
>>   char is_long;
>>   digit ob_digit[1];
>> };
>>
> 
> Nice. I guess if we store the long value in big-endian order we could
> drop is_long, since the first digit of the long would always be
> nonzero. This would save a byte (on average) for the longs, but it
> would do nothing for the wasted space for short ints.

Right; alternatively, the top-most bit of ob_digit[0] could be used
as the flag, since longs currently have 15-bit digits.
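
To make the spare-bit idea concrete, here is a rough Python sketch. It
assumes each 15-bit digit sits in a 16-bit unsigned slot, so bit 15 is
free to act as the flag. The names DIGIT_BITS, LONG_FLAG, mark_long,
is_long and digit_value are made up for illustration and are not part
of any existing API.

```python
# Sketch only: models a 15-bit digit stored in a 16-bit slot.
DIGIT_BITS = 15
DIGIT_MASK = (1 << DIGIT_BITS) - 1   # 0x7FFF, the payload bits
LONG_FLAG = 1 << DIGIT_BITS          # 0x8000, the otherwise unused top bit


def mark_long(first_digit):
    """Return the first digit with the (hypothetical) long flag set."""
    return (first_digit & DIGIT_MASK) | LONG_FLAG


def is_long(first_digit):
    """True if the flag bit is set in the stored first digit."""
    return bool(first_digit & LONG_FLAG)


def digit_value(first_digit):
    """Strip the flag and return the 15-bit payload."""
    return first_digit & DIGIT_MASK


stored = mark_long(0x1234)
assert stored < (1 << 16)            # still fits in 16 bits
assert is_long(stored)
assert digit_value(stored) == 0x1234
```

The cost is that every read of ob_digit[0] has to mask the flag off
first.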

> Why do we need to keep the PyLong_* APIs at all? Even at the Python
> level we're not planning any backward compatibility features; at the C
> level I like even more freedom to break things.

Indeed, they should get dropped.

> I worry about all the wasted space for alignment caused by the extra
> flag byte though. That would be 4 byte per integer on 32-bit machines
> (where they are currently 12 bytes) and 8 bytes on 64-bit machines
> (where they are currently 24 bytes).

I think ints should be managed by PyMalloc in Py3k. With my proposal,
an int takes 16 bytes on a 32-bit machine, so nothing would be wasted
under PyMalloc (which allocates 16 bytes for 12-byte objects
anyway). On a 64-bit machine, it would indeed waste 8 bytes per
int.
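
The arithmetic behind those numbers can be checked with a small
Python sketch. It assumes natural alignment, a header of two
word-sized fields (reference count and type pointer), a word-sized
value field, a 2-byte digit, and an allocator that rounds requests up
to a multiple of 8 bytes. struct_size, pymalloc_block and int_sizes
are hypothetical helpers, not real APIs.

```python
def struct_size(fields):
    """Size of a C struct given (size, alignment) pairs in declaration order."""
    offset = 0
    max_align = 1
    for size, align in fields:
        offset = (offset + align - 1) // align * align  # pad up to alignment
        offset += size
        max_align = max(max_align, align)
    # Trailing padding so that arrays of the struct stay aligned.
    return (offset + max_align - 1) // max_align * max_align


def pymalloc_block(nbytes, granularity=8):
    """Round a request up to the allocator's block size (assumed 8-byte steps)."""
    return (nbytes + granularity - 1) // granularity * granularity


def int_sizes(word):
    """Return (current, proposed) int object sizes for a given word size."""
    header = [(word, word), (word, word)]        # refcount, type pointer
    current = header + [(word, word)]            # plain value
    proposed = header + [(word, word),           # value_or_size
                         (1, 1),                 # is_long
                         (2, 2)]                 # ob_digit[1], assumed 2 bytes
    return struct_size(current), struct_size(proposed)


for word in (4, 8):
    cur, new = int_sizes(word)
    print(word * 8, "bit:", cur, "->", new,
          "allocated:", pymalloc_block(cur), "->", pymalloc_block(new))
```

Under these assumptions the 32-bit case goes from 12 to 16 bytes,
which is what a 12-byte request is rounded up to anyway, while the
64-bit case goes from 24 to 32 bytes.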

> That's why I'd like my alternative proposal (int as ABC and two
> subclasses that may remain anonymous to the Python user); it'll save
> the alignment waste for short ints and will let us use a smaller int
> type for the size for long ints (if we care about the latter).

I doubt they can remain anonymous. People often dispatch by type
(e.g. pickle, xmlrpclib, ...), and need to put the type into a
dictionary. If the type is anonymous, they will do

   dispatch[type(0)] = marshal_int
   dispatch[type(sys.maxint+1)] = marshal_int

Plus, their current code has

   dispatch[int] = marshal_int

which will silently break (although it won't be silent if they also
have dispatch[long] = marshal_long).

Regards,
Martin


