Improved struct module

Robin Boerdijk robin.boerdijk at nl.origin-it.com
Tue Oct 12 16:08:49 EDT 1999


Tim Peters <tim_one at email.msn.com> wrote in message
news:000801bf146b$43e99160$11a2143f at tim...
> [Robin Boerdijk]
> > I would like to make a case for replacing the current Python
> > struct module with an improved version of it. I think the
> > struct module contains very useful functionality, but the
> > interface functions (pack and unpack) are inconvenient
> > for a number of situations.
> >
> > To fix the pack/unpack inconvenience, I have written an xstruct
> > module that extends the struct module with a new structdef
> > function to *define* packed binary data structures (similar to
> > C's struct keyword).
> >
> > [and code & docs are at
> >      http://www.sis.nl/python/xstruct/xstruct.html
> > ]
>
> The xstruct interface is nice.  Why is it implemented in C?  That is, it
> seems to combine two things:  random improvements to the std struct
> module's internals, and a new interface.  The latter is much easier to do and
> maintain if written in Python, and would also be sub-classable then.  You
> should try to get the "random improvements" into the std struct module
> regardless; the new interface would likely have a better chance as 100 new
> lines of Python than as 800-900 new lines of C.

There are a number of reasons for implementing the xstruct module in C as
opposed to implementing it as a Python wrapper around the current struct
module.

1. Implementing it as a Python wrapper would be horribly inefficient and not
as simple as you seem to think. To randomly change the value of a field
somewhere in the middle of a packed binary data buffer, I would have to do
something like:

def __setattr__(self, FieldName, Value):

    # Rebuild the whole buffer around the newly packed field.
    Field = self.fields[FieldName]
    BufferBeforeField = self.buffer[:Field.offset]
    NewFieldBuffer = struct.pack(Field.struct_format, Value)
    BufferAfterField = self.buffer[Field.offset + Field.size:]
    # Store via __dict__; assigning self.buffer would re-enter __setattr__.
    self.__dict__['buffer'] = BufferBeforeField + NewFieldBuffer + BufferAfterField

This takes at least 5 memory allocations, whereas xstruct's
"s.FieldName = Value" requires none. The __getattr__ would have similar
problems. Think of what happens when we have a struct with 20 fields and we
need to set these randomly... If there is any proper use for C, then it is
low-level stuff like this.
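
For comparison, here is a minimal sketch of the in-place update described
above, written for later Python versions. It relies on bytearray and
struct.pack_into, neither of which existed in Python 1.5.2, and it is not
xstruct's API. The field table, the field names and the helper functions are
hypothetical and only serve the illustration.

```python
import struct

# Hypothetical record layout, for illustration only.
# Each entry maps a field name to (offset, struct format).
FIELDS = {
    "msg_id": (0, "<I"),   # 4-byte unsigned int, little-endian
    "flags": (4, "<H"),    # 2-byte unsigned int
    "length": (6, "<H"),   # 2-byte unsigned int
}

# One mutable buffer, allocated once for the lifetime of the record.
buffer = bytearray(8)


def set_field(buf, name, value):
    """Overwrite a single field in place; buf is not sliced or rebuilt."""
    offset, fmt = FIELDS[name]
    struct.pack_into(fmt, buf, offset, value)


def get_field(buf, name):
    """Read a single field straight out of buf at its offset."""
    offset, fmt = FIELDS[name]
    return struct.unpack_from(fmt, buf, offset)[0]


# Fields can be set in any order without touching their neighbours.
set_field(buffer, "flags", 0x0102)
set_field(buffer, "msg_id", 7)
```

Unlike the slice-and-concatenate wrapper, the buffer object here stays the
same throughout; only the bytes belonging to the field are rewritten.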

I also feel that the xstruct's interface is the more natural interface for
low-level access to packed binary data structures. I'm pretty sure that most
people currently using the pack/unpack interface would not have done so if
they had a more structured interface like xstruct provides (see another
follow-up to your reply). If there is any C code that is redundant, it is
the pack/unpack code <0.9 wink>

2. The xstruct objects support the new buffer interface of Python 1.5.2.
This makes it possible to read and write data directly from a file or socket
into and out of a packed memory buffer. All we have to do is to add support
for the buffer interface to other low-level modules like cStringIO and
socket (files already do!) and we would have a perfect set of seamlessly
fitting, complementary packages. How could I achieve this in pure Python?
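
As a rough illustration of reading straight into a packed buffer, the sketch
below uses facilities from later Python versions (bytearray, readinto and
struct.Struct), not xstruct and not anything available in 1.5.2. An
io.BytesIO object stands in for the file or socket, and the record layout is
a made-up example. For real sockets, socket.recv_into plays the same role as
readinto.

```python
import io
import struct

# Hypothetical record layout, for illustration only:
# unsigned 32-bit id, unsigned 16-bit flags, unsigned 16-bit length.
RECORD = struct.Struct("<IHH")

# Stand-in for a file or socket that delivers one packed record.
source = io.BytesIO(RECORD.pack(7, 1, 42))

# Preallocated buffer that the stream fills directly.
buf = bytearray(RECORD.size)
bytes_read = source.readinto(buf)

# Decode the fields from the buffer that was just filled.
msg_id, flags, length = RECORD.unpack_from(buf)

# Writing works the other way round: the stream accepts the buffer as is.
sink = io.BytesIO()
sink.write(buf)
```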

Another reason for providing the buffer interface is to facilitate writing
extension modules for C APIs that make heavy use of C structs. The most
notorious example I know is the MQI interface of IBM's MQSeries. I think it
defines more than 20 C structs, all of which I can define in Python now and
still interact nicely with the Python MQI extension module written in C.
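
To give an idea of what defining a C struct in Python can look like, here is
a small sketch using the ctypes module from later Python versions. It is not
xstruct's structdef and the Header layout is invented for the example; it is
not one of the MQI structs.

```python
import ctypes


class Header(ctypes.LittleEndianStructure):
    """Hypothetical C-style struct, for illustration only."""
    _fields_ = [
        ("msg_id", ctypes.c_uint32),
        ("flags", ctypes.c_uint16),
        ("length", ctypes.c_uint16),
    ]


# Raw memory that could equally be handed to code expecting plain bytes.
raw = bytearray(ctypes.sizeof(Header))

# from_buffer shares memory with raw instead of copying it.
hdr = Header.from_buffer(raw)

# Attribute assignment writes straight into the shared buffer.
hdr.flags = 0x0102
hdr.length = 42
```

Because hdr and raw share the same memory, code that only understands raw
bytes sees every field assignment immediately.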

> guido-gets-so-c-sick-i-tell-you-he-vomits<wink>-ly y'rs  - tim

I feel for Guido <wink>, but in this (exceptional) case, I think C beats
Python. By implementing this in C, we can actually reduce the amount of C code
that otherwise would need to be written (see MQSeries example).

Robin.





