[Python-Dev] PEP: Adding data-type objects to Python

Tue Oct 31 16:32:39 CET 2006

Michael Chermside wrote:
> In this email I'm responding to a series of emails from Travis
> pretty much in the order I read them:
> 
>>
>> In the mean-time, how are other packages supposed to communicate  
>> binary information about data with each other?
> 
> Here we disagree.
> 
> I haven't used C-types. I have no idea whether it is well-designed or
> horribly unusable. So if someone wanted to argue that C-types is a
> mistake and should be thrown out, I'd be willing to listen. Until
> someone tries to make that argument, I'm presuming it's good enough to
> be part of the standard library for Python.

My problem with this argument is twofold:

1) I'm not sure you really know what you're talking about since you 
apparently haven't used either ctypes or NumPy (I've used both and so 
forgive me if I claim to understand the strengths of the data-format 
representations that each uses a bit better).  Therefore, it's hard for 
me to take your opinion seriously.  I will try though. I understand you 
have a preference for not wildly expanding the ways to do similar 
things.  I share that preference with you.

2) You are assuming that because ctypes is good enough for the standard 
library, the way it describes data-formats (using a separate 
Python type for each one) is the *one true way*.  When was this 
discussed?   Frankly it's a weak argument because the struct module has 
been around for a lot longer.  Why didn't the ctypes module follow that 
standard?  Or the standard that's in the array module for describing 
data-types.  That's been there for a long time too.  Why wasn't ctypes 
forced to use that approach?

The reason it wasn't is because it made sense for ctypes to use a 
separate type for each data-format object so that you could call 
C-functions as if they were Python functions.  If this is your goal, 
then it seems like a good idea (though not strictly necessary) to use a 
separate Python type for each data-format.
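
As a rough illustration of the difference, here is a minimal sketch of the 
same data described three ways with the standard library.  The record 
layout (one C int followed by one C double) and the names Record, count, 
and value are assumptions made up for this example only.

```python
import array
import ctypes
import struct

# Hypothetical record for illustration: a C int followed by a C double.

# struct module: the data format is a plain string ("@" = native alignment).
fmt = "@id"
struct_size = struct.calcsize(fmt)

# array module: a one-character typecode describes each (homogeneous) item.
doubles = array.array("d", [1.0, 2.0])

# ctypes: every data format is itself a new Python type (a class).
class Record(ctypes.Structure):
    _fields_ = [("count", ctypes.c_int), ("value", ctypes.c_double)]

print(type(fmt).__name__, struct_size)         # description is a str instance
print(doubles.typecode, doubles.itemsize)      # description is a typecode
print(isinstance(Record, type), ctypes.sizeof(Record))  # description is a type
```

In the struct and array cases the description is an ordinary instance (a 
string); in the ctypes case the description is a type object, which is 
what lets ctypes call C functions as if they were Python functions.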

But, there are distinct disadvantages to this approach compared to what 
I'm trying to allow.   Martin claims that the ctypes approach is 
*basically* equivalent but this is just not true.  It could be made more 
true if the ctypes objects inherited from a "meta-type" and if Python 
allowed meta-types to expand their C-structures.  But, last I checked 
this is not possible.

A Python type object is a very particular kind of Python-type.  As far 
as I can tell, it's not as flexible in terms of the kinds of things you 
can do with the "instances" of a type object (i.e. what ctypes types 
are) on the C-level.

The other disadvantage of what you are describing is: Who is going to 
write the code?

I'm happy to have the data-format object live separate from ctypes and 
leave it to the ctypes author(s) to support it if desired.  But the 
demand that the extended buffer protocol jump through all kinds of hoops 
to conform to the "ctypes standard" when that "standard" was designed 
with a different idea in mind is not acceptable.

Ctypes has only been in Python since 2.5 and the array interface was 
around before that.   Numeric has been around longer than ctypes.  The 
array module and the struct module in Python have both been around 
longer than ctypes.

Where is the discussion that crowned the ctypes way of doing things as 
"the one true way"?

> 
> In a different message, he writes:
>> It also bothers me that so many ways to describe binary data are  
>> being used out there.  This is a problem that deserves being solved.  
>>  And, no, ctypes hasn't solved it (we can't directly use the ctypes  
>> solution).
> 
> Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

You can't grow C-function pointers on to an existing type object.   You 
are also carrying around a lot of weight in the Python type object that 
is unnecessary if all you are doing is describing data.
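
A crude way to see the weight from Python, offered only as a sketch: 
compare the shallow size of a ctypes structure type with that of a 
struct-style format string for the same layout.  The name Pair and its 
fields are hypothetical, sys.getsizeof is shallow and CPython-specific, 
and the exact numbers vary by version and platform.

```python
import ctypes
import sys

# Hypothetical layout for illustration: a C int and a C double.
class Pair(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_double)]

fmt = "id"  # the same layout as a struct-module format string

# Shallow sizes only: the type's dict, method cache, etc. are not counted.
type_bytes = sys.getsizeof(Pair)
str_bytes = sys.getsizeof(fmt)
print(type_bytes, str_bytes)
```

On CPython the type object comes out many times larger than the string, 
since every type carries the full set of type slots whether or not a 
pure data description needs them.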

> 
> I just disagree. (1) I *DO* think we should "just use ctypes because it's
> there". After all, the problem we're trying to solve is one of
> COMPATIBILITY - you don't solve those by introducing competing standards.
> (2) From what I understand of it, I think ctypes is quite capable of
> describing data to be accessed via the buffer protocol.

Capable, but it does not support all the things I'm talking about.  The ctypes 
objects don't have any of the methods or attributes (or C function 
pointers) that I've described.  Nor should they necessarily grow them.

> 
> Why? Who cares? Seriously, if we were proposing to describe the layouts
> with a collection of rubber bands and potato chips, I'd say it was a
> crazy idea. But we're proposing using data structures in a computer
> memory. Why does it matter whether those data structures are of the same
> "python type" or different "python types"? I care whether the structure
> can be created, passed around, and interrogated. I don't care what
> Python type they are.

Sure, but the flexibility you have with an instance of a Python type is 
different than when that instance must itself also be a Python type.  It 
*is* different.  This is quite noticeable in C especially.

> 
>> I'm saying that I don't like the idea of forcing this approach on  
>> everybody else who wants to describe arbitrary binary data just  
>> because ctypes is included.
> 
> And I'm saying that I *do*. Hey, if someone proposed getting rid of
> the current syntax for the array module (for Py3K) and replacing it with
> use of ctypes, I'd give it serious consideration. There should be only
> one way to describe binary structures. It should be powerful enough to
> describe almost any structure, easy-to-use, and most of all it should be
> used consistently everywhere.

I'm not opposed to convergence, but ctypes must be willing to come to us 
too.  Its development of a "standard" was not done with the array 
interface in mind, so why should it be surprising that it does not fill 
the need for us?

-Travis