64-bit port of Python

Tim Peters tim_one at email.msn.com
Tue Feb 8 23:26:13 EST 2000


[posted & mailed]

[Trent and Tim talk about 64-bit porting issues in the Python code.
 Trent also sneakily changes email addresses.]

[Tim]
> AFAIK, core Python code *never* casts a pointer to any sort of
> int, or vice versa, either explicitly or implicitly.

[Trent Mick [mailto:trentm at ActiveState.com]]
> A couple of examples where I think the Python core does just that:

Good eye, Trent!  Thank you.  I'm also sending this to Mark Hammond, since
ActiveState now pays him to worry about Python's Windows story -- and
perhaps pays you too <wink>.

> "Modules/arraymodule.c::728":
>
>   static PyObject *
>   array_buffer_info(self, args)
>   	arrayobject *self;
>   	PyObject *args;
>   {
>   	return Py_BuildValue("ll",
>   			     (long)(self->ob_item), (long)(self->ob_size));
>   }
>
> where 'ob_item' is a pointer.

Yes, the author of the new buffer interface code is being shot for many
reasons <wink>.

> "Python/bltinmodule.c::899":
>
>   static PyObject *
>   builtin_id(self, args)
>   	PyObject *self;
>   	PyObject *args;
>   {
>   	PyObject *v;
>
>   	if (!PyArg_ParseTuple(args, "O:id", &v))
>   		return NULL;
>   	return PyInt_FromLong((long)v);
>   }

Oh yes.  Been there forever, and won't work at all (while nothing promises
that id returns an address, it's crucial that "id(x) == id(y)" iff "x is y"
in Python).
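
To make the failure concrete, here is a minimal sketch (not the CPython
source) of what the (long)v cast does to identity when long is 32 bits and
pointers are 64.  The helper names and the sample addresses are made up for
illustration, and addresses are modeled as uint64_t (assuming a C99
<stdint.h>) so the example behaves the same on any platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `(long)v` on a platform where long is 32 bits
   but addresses are 64 bits: only the low 32 bits survive the cast. */
static uint32_t truncated_id(uint64_t address)
{
    return (uint32_t)(address & 0xFFFFFFFFu);
}

/* Returns 1 when two distinct addresses yield the same truncated id,
   i.e. "id(x) == id(y)" would hold although "x is y" does not. */
static int ids_collide(uint64_t a, uint64_t b)
{
    return a != b && truncated_id(a) == truncated_id(b);
}

/* Self-check with two made-up addresses that differ only above bit 31. */
static void demo_collision(void)
{
    assert(ids_collide(UINT64_C(0x0000000100001000),
                       UINT64_C(0x0000000200001000)));
}
```

Any two objects whose addresses differ only in the upper 32 bits would be
reported as identical by such an id.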

> Python sort of relies on C's 'long' to be the largest native integer.

Don't forget that Python was written pre-ANSI, and this was a common
(universal?) assumption in the fuzzier K&R flavor of C.  ANSI C went on to
guarantee the existence of *some* integral type such that a pointer could be
cast to that type and back again without loss of info -- but one committee
member told me that at least he was surprised as all heck when it was
pointed out that the std neglected to say that must be a *standard* integral
type.  The notion that "long isn't long enough" is a loophole in the std,
and I'm not sure it was an intentional one.  Nevertheless, it's an official
one now, so that's that.
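
C99 gives such a type a name, uintptr_t.  The sketch below assumes a
compiler whose <stdint.h> provides it (the type is optional in the
standard); the helper names are invented for illustration and are not part
of Python's C API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pointer -> integer without losing bits. */
static uintptr_t pointer_to_integer(void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: the inverse conversion. */
static void *integer_to_pointer(uintptr_t bits)
{
    return (void *)bits;
}

/* Returns 1 if a plain long is at least as wide as a pointer on this
   platform, 0 otherwise (the Win64 case discussed in this thread). */
static int long_holds_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}

/* The round trip through uintptr_t must give back an equal pointer. */
static void check_round_trip(void *p)
{
    assert(integer_to_pointer(pointer_to_integer(p)) == p);
}
```

Code that wants to keep using long can consult long_holds_pointer() (or an
equivalent configure-time check) before trusting a (long) cast.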

> This is evidenced by all the use of PyInt_FromLong() above. There are
> no format specifiers in the PyArg_Parse*() and Py_BuildValue()
> functions for converting a pointer. This was fine when 'long' would
> do. On Win64 sizeof(long)==4 and sizeof(void*)==8.
>
> I think this also brings up some wider issues in the Python source.
> For instance, the python integer type uses the C 'long' type. Was it
> the implicit intention that this be the system's largest native
> integral type, i.e. 32 bits on a 32-bit system and 64 bits on a 64-bit
> system?

More the explicit intention that it be the longest standard integral type,
back in the days that was believed to "mean something non-trivial".  It's
been a darned good bet for a decade <wink>.  The advertised semantics at the
Python level promise only that it's at least 32 bits.

> If so, then the representation of the Python integer type will have
> to change (i.e. the use of 'long' cannot be relied upon). One should
> then carry through and change (or obsolete) the *_AsLong(),
> *_FromLong() Python/C API functions to become something like
> *_AsLargestNativeInt(), *_FromLargestNativeInt() (or some
> less bulky name).
>
> Alternatively, if the python integer type continues to use the C
> 'long' type for 64-bit systems then the following ugly thing
> happens:
>  - A Python integer on a 64-bit Intel chip compiled with MSVC is
>    32 bits wide.
>  - A Python integer on a 64-bit Intel chip compiled with gcc is
>    64 bits wide.
> That cannot be good.

Two things work against all that:

1. In 1.5.2, and more in the current CVS tree, there's already grudging
support for "longer than long" via the config LONG_LONG macro (e.g., under
MS Windows that's already #defined as __int64).  That may spread more,
although it's ugly so will be resisted (Guido hates #ifdef'ing code, and
platform #ifdef'ed macros aren't exactly liked -- each one is that much more
for new ports to wrestle with, and everyone to trip over forever after).

2. It's already not good that int size can matter across platforms with
grosser differences than the above.  For that & other reasons, the sharp
Python-level distinction between (bounded) ints and (unbounded) longs is
slated for (backward compatible) death.  Andrew Kuchling already has much of
the work for that in hand, but it's unclear whether it will make it into 1.6
(it's not a high priority now, although I expect you just boosted it a bit
...).  Once it's in, "id" can return million-bit ints as easily as it
returns C longs now.
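
Point 1 can be sketched roughly as follows.  This illustrates the kind of
platform #ifdef being described; it is not Python's actual configuration
code.  SKETCH_LONG_LONG, sketch_wide_int and the helper functions are
hypothetical names, and testing _MSC_VER is an assumption about how one
might detect the Microsoft compiler.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a config macro naming a "longer than long"
   type: __int64 under the Microsoft compiler, long long elsewhere. */
#ifdef _MSC_VER
#define SKETCH_LONG_LONG __int64
#else
#define SKETCH_LONG_LONG long long
#endif

typedef SKETCH_LONG_LONG sketch_wide_int;

/* Width of the wide type in bits. */
static int wide_int_bits(void)
{
    return (int)(sizeof(sketch_wide_int) * CHAR_BIT);
}

/* Returns 1 if the wide type is at least as large as a pointer. */
static int wide_int_holds_pointer(void)
{
    return sizeof(sketch_wide_int) >= sizeof(void *);
}

/* Both spellings above are expected to be at least 64 bits wide. */
static void check_wide_int(void)
{
    assert(wide_int_bits() >= 64);
}
```

Each platform that spells its wide type differently adds another branch to
the #ifdef, which is the maintenance cost point 1 complains about.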

or-if-activestate-solves-this-for-perl-first-we'll-just-rewrite-python-
    in-that<wink>-ly y'rs  - tim





