64-bit port of Python (was: Circular references and python)

Tim Peters tim_one at email.msn.com
Tue Feb 8 01:42:48 EST 2000


[posted & mailed]

[Tim, on the early 90's port of Python to KSR's 64-bit OSF-based
 Unix]

> IIRC, it turned up two spots where the code implicitly assumed
> sizeof(int) == sizeof(long), and that was it (some subtler stuff
> turned up later, but it was by far the easiest port of *any*
> large C program to the Kendall Square architecture).

[ptmick at mail.com]
> Tim, I presume that this was a port to the UNIX 64-bit data model
> LP64

Beats me:  KSR did a 64-bit Unix before Official Buzzwords were invented to
classify the approaches.  char, short, int & long were 1, 2, 4 & 8 bytes
respectively; pointers were 8 bytes.

> and not the Windows LLP64 data model where you can no longer implicitly
> assume that sizeof(long) == sizeof(void *).

Right, definitely not that.  However, the vast bulk of our porting problems
were due to the even lamer <0.1 wink> assumption that sizeof(int) ==
sizeof(void *).  Python never assumed that.
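
As an illustration only (this is not code from the KSR port or from
Python), the size relationships discussed above can be checked with a small
C function.  The name data_model_name and the labels it returns are
assumptions made for this sketch; the labels are the later buzzwords, applied
after the fact.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the conventional label for the data model
   the compiler is using, based only on sizeof(int), sizeof(long) and
   sizeof(void *). */
const char *data_model_name(void)
{
    size_t i = sizeof(int);
    size_t l = sizeof(long);
    size_t p = sizeof(void *);

    assert(sizeof(char) == 1);      /* true by definition in C */

    if (i == 4 && l == 8 && p == 8)
        return "LP64";              /* the 4/8/8 layout described for KSR */
    if (i == 4 && l == 4 && p == 8)
        return "LLP64";             /* long is narrower than a pointer */
    if (i == 4 && l == 4 && p == 4)
        return "ILP32";             /* int, long and pointer all 4 bytes */
    return "other";
}
```

On a 4/8/8 layout like the one described for KSR this returns "LP64"; code
that stores a pointer in an int breaks there, while code that stores one in a
long only breaks under "LLP64".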

> Do you know of anyone, other than myself, that is interested in
> looking at these issues in the Python source? As well, I would be
> interested to know what the "subtler stuff" was.

It's not what you suspect <wink>.  Almost everything boiled down to mistaken
and unintended assumptions that sizeof(int) == sizeof(long), in and around
the implementations of (unbounded) long arithmetic, and overflow-checking of
int arithmetic.  All that stuff was fixed then.  AFAIK, core Python code
*never* casts a pointer to any sort of int, or vice versa, either explicitly
or implicitly.
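
To make the overflow-checking point concrete, here is a minimal sketch of an
addition on C longs that checks for overflow before adding.  It is not
Python's actual code; checked_add_long is a hypothetical name.  The point is
that the bounds must be LONG_MAX and LONG_MIN: using the int limits, or
holding the sum in an int, is exactly the kind of unintended
sizeof(int) == sizeof(long) assumption described above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: adds a and b.  Returns 1 and stores the sum in
   *result on success; returns 0 and leaves *result untouched if the sum
   would not fit in a long. */
int checked_add_long(long a, long b, long *result)
{
    assert(result != NULL);

    /* Test against the limits of long *before* adding, so the check
       itself never overflows and never depends on the width of int. */
    if ((b > 0 && a > LONG_MAX - b) ||
        (b < 0 && a < LONG_MIN - b))
        return 0;

    *result = a + b;
    return 1;
}
```

A caller would fall back to unbounded long arithmetic, or raise an error,
when the function returns 0.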

Since then, the biggest "sizeof" problems have centered around specific OS
interfaces, especially "large file" support on systems with 64-bit flavors
of seek etc.  This is still a bit of an #ifdef'ed mess, as vendors still
tend to do this in different ways (when they do it at all).

I don't expect MS's variant of 64-bit C to cause any significant problems in
the Python core -- the only areas worth worrying about a priori are the
interfaces to MS's variant of libc.

There's one subtlety that's always been theoretically broken:  Python uses
(C) ints to hold refcounts, but never checks refcounts for overflow.  The
justification for the latter is that you can't possibly have more references
than there are pointers in the address space, so, in effect, this is a very
subtle assumption that sizeof(void*) <= sizeof(int).  You would need to
accumulate gigareferences to a single object before this could break,
though.
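
For the curious, a sketch of what a checked increment could look like.  This
is purely illustrative: toy_object and toy_incref are made-up names, not
Python's object header or its refcounting macros, and the paragraph above
says Python does no such check.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical object header with a C int refcount. */
typedef struct {
    int refcnt;
} toy_object;

/* Hypothetical checked increment.  Returns 1 on success, 0 if the count
   is already INT_MAX and one more reference would overflow the int. */
int toy_incref(toy_object *op)
{
    assert(op != NULL);

    if (op->refcnt == INT_MAX)
        return 0;
    op->refcnt++;
    return 1;
}
```

When pointers are no wider than int, the failing branch is unreachable in
practice, which is the justification given above.  With 4-byte ints and
8-byte pointers it becomes reachable in principle, but only after INT_MAX
references to one object.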

And *that's* the level of the bugs that remain <wink>.

sleep-easy-this-won't-be-hard-ly y'rs  - tim





