[Python-Dev] pthreads question: typedef ??? pthread_t and hacky return statements

Tim Peters tim_one@email.msn.com
Wed, 16 Aug 2000 23:34:12 -0400


[Trent Mick]
> I am porting Python to Monterey (64-bit AIX) and have a small
> (hopefully) question about POSIX threads.

POSIX threads. "small question".  HAHAHAHAHAHA.  Thanks, that felt good
<wink>.

> I have Monterey building and passing the threads test suite using
> Python/thread_pthread.h with just one issue:
>
>
> -------------- snipped from current thread_pthread.h ---------------
> long
> PyThread_get_thread_ident(void)
> {
>     volatile pthread_t threadid;
>     if (!initialized)
>         PyThread_init_thread();
>     /* Jump through some hoops for Alpha OSF/1 */
>     threadid = pthread_self();
>     return (long) *(long *) &threadid;
> }
> -------------------------------------------------------------------
>
> Does the POSIX threads spec specify a C type or minimum size for
> pthread_t?

Which POSIX threads spec?  There are so very many (it went thru many
incompatible changes).  But, to answer your question, I don't know but doubt
it.  In practice, some implementations return pointers into kernel space,
others pointers into user space, others small integer indices into kernel-
or user-space arrays of structs.  So I think it's *safe* to assume it will
always fit in an integral type large enough to hold a pointer, but not
guaranteed.  Plain "long" certainly isn't safe in theory.
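
A minimal sketch of how one might check those size assumptions on a given
platform.  The helper names are hypothetical (nothing like them exists in
thread_pthread.h), and uintptr_t comes from C99's <stdint.h>, which may not
be available on every system discussed here:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only.
   sizeof works whether pthread_t is an integer, a pointer, or a struct. */
size_t
pthread_t_size(void)
{
    return sizeof(pthread_t);
}

/* Nonzero if a pthread_t is no wider than a pointer-sized integer. */
int
pthread_t_fits_in_pointer(void)
{
    return sizeof(pthread_t) <= sizeof(uintptr_t);
}

/* Nonzero if a pthread_t is no wider than a plain long. */
int
pthread_t_fits_in_long(void)
{
    return sizeof(pthread_t) <= sizeof(long);
}
```

These only compare widths.  They say nothing about whether the bits of a
pthread_t make a meaningful or unique integer.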

> Or can someone point me to the appropriate resource to look
> this up? On Linux (mine at least):
>   /usr/include/bits/pthreadtypes.h:120:typedef unsigned long int pthread_t;

And this is a 32- or 64-bit Linux?

> On Monterey:
>   typedef unsigned int pthread_t;
>
> That is fine, both are 32 bits wide; however, Monterey is an LP64 platform
> (sizeof(long)==8, sizeof(int)==4), which brings up the question:
>
> WHAT IS UP WITH THAT return STATEMENT?
>   return (long) *(long *) &threadid;

Heh heh.  Thanks for the excuse!  I contributed the pthreads implementation
originally, and that eyesore sure as hell wasn't in it when I passed it on.
That's easy for me to be sure of, because that entire function was added by
somebody after me <wink>.  I've been meaning to track down where that crap
line came from for *years*, but never had a good reason before.

So, here's the scoop:

+ The function was added in revision 2.3, more than 6 years ago.  At that
time, the return had a direct cast to long.

+ The "Alpha OSF/1" horror was the sole change made to get revision 2.5.

Back in those days, the "patches list" was Guido's mailbox, and *all* CVS
commits were done by him.  So he checked in everything everyone could
convince him they needed, sometimes without knowing exactly why.  So I
strongly doubt he'll even remember this change, and am certain it's not his
code.

> My *guess* is that this is an attempt to just cast 'threadid' (a
> pthread_t) to a long and jump through hoops to avoid compiler warnings. I
> don't know what else it could be.

Me neither.

> Is that what the "Alpha OSF/1" comment is about?

That comment was introduced by the commit that added the convoluted casting,
so yes, that's what the comment is talking about.

> Anybody have an Alpha OSF/1 hanging around? The problem is that when
> sizeof(pthread_t) != sizeof(long) this line is just broken.
>
> Could this be changed to
>   return threadid;
> safely?

Well, that would return it to exactly the state it was in at revision 2.3,
except with the cast to long left implicit.  Apparently that "didn't work"!
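
For what it's worth, here is a sketch of a conversion that at least never
reads past the end of a pthread_t narrower than long, which is what the
`*(long *) &threadid` line does on an LP64 box with a 32-bit pthread_t.
The function name is hypothetical and this is not the code in
thread_pthread.h.  It assumes pthread_t has no padding bytes:

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical replacement for the pointer-cast trick.
   Copies at most sizeof(long) bytes of the pthread_t into a zeroed
   long, so a narrower pthread_t is never over-read. */
long
thread_ident_from_pthread(pthread_t tid)
{
    long ident = 0;
    size_t n = sizeof(tid) < sizeof(ident) ? sizeof(tid) : sizeof(ident);

    memcpy(&ident, &tid, n);
    return ident;
}
```

This still guarantees neither uniqueness nor a nonzero result: a pthread_t
wider than long gets truncated, and on a big-endian machine a narrower one
lands in the high-order bytes of the long.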

Something else is broken here, too, and has been forever:  the thread docs
claim that thread.get_ident() returns "a nonzero integer".  But across all
the thread implementations, there's nothing that guarantees that!  It's a
goof, based on the first thread implementation in which it just happened to
be true for that platform.

So thread.get_ident() is plain braindead:  if Python wants to return a
unique non-zero long across platforms, the current code doesn't guarantee
any of that.

So one of two things can be done:

1. Bite the bullet and do it correctly.  For example, maintain a static
   dict mapping the native pthread_self() return value to Python ints,
   and return the latter as Python's thread.get_ident() value.  Much
   better would be to implement an x-platform thread-local storage
   abstraction, and use that to hold a Python-int ident value.

2. Continue in the tradition already established <wink>, and #ifdef the
   snot out of it for Monterey.
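
A rough sketch of the thread-local flavor of option 1, written directly
against pthreads rather than a cross-platform abstraction, and holding a C
long instead of a Python int.  All names here are hypothetical.  It assumes
a small integer survives a round trip through void * via intptr_t, which is
common but implementation-defined:

```c
#include <pthread.h>
#include <stdint.h>

static pthread_key_t ident_key;
static pthread_once_t ident_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t ident_lock = PTHREAD_MUTEX_INITIALIZER;
static long next_ident = 1;     /* starts at 1, so 0 is never handed out */

/* Runs exactly once; error handling omitted for brevity. */
static void
make_ident_key(void)
{
    (void) pthread_key_create(&ident_key, NULL);
}

/* Hypothetical stand-in for PyThread_get_thread_ident(). */
long
get_thread_ident(void)
{
    long ident;

    pthread_once(&ident_once, make_ident_key);
    /* A fresh thread sees NULL here, i.e. 0: no ident assigned yet. */
    ident = (long) (intptr_t) pthread_getspecific(ident_key);
    if (ident == 0) {
        pthread_mutex_lock(&ident_lock);
        ident = next_ident++;
        pthread_mutex_unlock(&ident_lock);
        (void) pthread_setspecific(ident_key, (void *) (intptr_t) ident);
    }
    return ident;
}
```

Each thread gets its ident on first call and sees the same value on every
later call.  Idents are never reused and counter overflow is ignored, so
this is a sketch, not a finished design.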

In favor of #2, the code is already so hosed that making it hosier won't be
a significant relative increase in its inherent hosiness.

spoken-like-a-true-hoser-ly y'rs  - tim