struct: type registration?

Thu Jun 1 17:48:29 EDT 2006

On 2/06/2006 3:44 AM, Giovanni Bajo wrote:
> John Machin wrote:
> 
>>> Looks like you totally misread my message.
>> Not at all.
>>
>> Your function:
>>
>> def mystring_pack(s):
>>      if len(s) > 20:
>>          raise ValueError, "a mystring can be at max 20 chars"
>>      s = (s + "\0"*20)[:20]
>>      s = struct.pack("20s", s)
>>      return s
>>
>> can be even better replaced (after reading the manual "For packing,
>> the string is truncated or padded with null bytes as appropriate to
>> make it fit.") by:
>>
>> def mystring_pack(s):
>>      if len(s) > 20:
>>          raise ValueError, "a mystring can be at max 20 chars"
>>      return s
>>      # return s = (s + "\0"*20)[:20] # not needed, according to the manual
>>      # s = struct.pack("20s", s)
>>      # As I said, this particular instance of using struct.pack is a
>>      # big fat no-op.
> 
> John, the point of the example was to show that one could write custom
> packer/unpacker which calls struct.pack/unpack and, after that,
> post-processes the results to obtain some custom data type.

What you appear to be doing is proposing an API for extending struct by 
registering custom type-codes (ASCII alphabetic?) each requiring three 
call-back functions (mypacker, myunpacker, mylength).

Example registration for an "S" string (fixed storage length, true 
length determined on unpacking by first occurrence of '\0' (if any)).

     struct.register("S", packerS, unpackerS, lengthS)

You give no prescription for what those functions should do. You provide 
"examples" which require reverse engineering to deduce what they are 
intended to exemplify.

Simple-minded folk like myself might expect that the functions would 
work something like this:

Packing: when struct.pack reaches a custom code in the format, it does 
this (pseudocode):
     obj = _get_next_arg()
     itemstrg = mypacker(obj)
     _append_to_output_string(itemstrg)

Unpacking: when struct.unpack reaches a custom code in the format, it 
does this (pseudocode):
     n = mylength()
     # exception if < n bytes remain
     obj = myunpacker(remaining_bytes[:n])
     _append_to_output_tuple(obj)

Thus, in a simple case like the NUL-terminated string:

def lengthS():
     return 20
def packerS(s):
     assert len(s) <= 20
     return s.ljust(20, '\0')
     # alternatively, return struct.pack("20s", s)
def unpackerS(bytes):
     assert len(bytes) == 20
     i = bytes.find('\0')
     if i >= 0:
         return bytes[:i]
     return bytes
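
To make the above concrete, here is a minimal sketch of the dispatch in
present-day Python 3 (so bytes objects rather than the str values used in
the example above). Everything in it is an assumption made for
illustration: register(), pack() and unpack() are module-level stand-ins,
not part of the struct module, and the toy format handling takes exactly
one item per format character, with no repeat counts, no alignment, and
little-endian standard sizes for the built-in codes.

```python
import struct

# code -> (packer, unpacker, length); hypothetical registry, not a struct API
_registry = {}

def register(code, packer, unpacker, length):
    """Stand-in for the proposed struct.register()."""
    _registry[code] = (packer, unpacker, length)

def pack(fmt, *args):
    """Toy pack: one argument per format character."""
    if len(fmt) != len(args):
        raise struct.error("expected %d items, got %d" % (len(fmt), len(args)))
    out = []
    for code, obj in zip(fmt, args):
        if code in _registry:
            packer = _registry[code][0]
            out.append(packer(obj))
        else:
            # Built-in codes are delegated to the real struct module.
            out.append(struct.pack("<" + code, obj))
    return b"".join(out)

def unpack(fmt, data):
    """Toy unpack: ask the custom length function how many bytes to take."""
    items = []
    pos = 0
    for code in fmt:
        custom = _registry.get(code)
        n = custom[2]() if custom else struct.calcsize("<" + code)
        chunk = data[pos:pos + n]
        if len(chunk) < n:
            raise struct.error("fewer than %d bytes remain for %r" % (n, code))
        if custom:
            items.append(custom[1](chunk))
        else:
            items.append(struct.unpack("<" + code, chunk)[0])
        pos += n
    return tuple(items)

# The NUL-terminated string example, restated for bytes.
def length_s():
    return 20

def packer_s(s):
    if len(s) > 20:
        raise ValueError("at most 20 bytes")
    return s.ljust(20, b"\0")

def unpacker_s(raw):
    i = raw.find(b"\0")
    return raw[:i] if i >= 0 else raw

register("S", packer_s, unpacker_s, length_s)
```

With that in place, pack("HS", 7, b"spam") produces 22 bytes (2 for the
"H", 20 for the "S"), and feeding those bytes to unpack("HS", ...) gives
back (7, b"spam").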

In more complicated cases, it may be useful for either or both of the 
custom packer/unpacker functions to call struct.pack/unpack to assist in 
the assembly/disassembly exercise. This should be (1) possible without 
perturbing the state of the outer struct.pack/unpack invocation and (2) 
sufficiently obvious to warrant little more than a passing mention.
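
As a sketch of such a more complicated case, the following trio handles a
calendar date stored as three integers. The "<HBB" layout (little-endian
year, month, day) and the function names are assumptions chosen for
illustration, not anything taken from the proposal; the code is written
for Python 3.

```python
import datetime
import struct

# Assumed wire layout: unsigned 16-bit year, 8-bit month, 8-bit day.
_DATE_FORMAT = "<HBB"

def length_date():
    # Let struct work out the storage size of the inner format.
    return struct.calcsize(_DATE_FORMAT)

def packer_date(d):
    # Adjust data (split the date into fields), then call struct.pack.
    return struct.pack(_DATE_FORMAT, d.year, d.month, d.day)

def unpacker_date(raw):
    # Call struct.unpack, then adjust data (rebuild the date object).
    year, month, day = struct.unpack(_DATE_FORMAT, raw)
    return datetime.date(year, month, day)
```

Here the inner calls do real work: the packer turns one object into
several fields, and the unpacker turns them back into one object.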

> Now, I apologize
> if my example wasn't exactly the shortest, most compact, most pythonic piece
> of code. It was not meant to be. It was meant to be very easy to read and
> very clear in what it is being done. You are nitpicking that part of my code
> is a no-op. Fine.

Scarcely a nitpick. It was very clear that parts of it were doing 
absolutely nothing in a rather byzantine & baroque fashion. What was 
unclear was whether this was by accident or design. You say (*after* the 
examples) that "As shown, the custom packer/unpacker can call the 
original pack/unpack as a basis for their work. ... when called 
recursively ...". What basis for what work? As for recursion, I see no 
"19s", "18s", etc here :-)

> Sorry if this confused you.

It didn't. As a self-confessed idiot, I am resolutely and irredeemably 
unconfused.

> I was just trying to show a
> simple pattern:
> 
> custom packer: adjust data, call struct.pack(), return
> custom unpacker: call struct.unpack(), adjust data, return
> 
> I should have chosen a more complex example probably, but I did not want to
> confuse readers. It seems I have confused them by choosing too simple an
> example.

The problem was that you chose an example that had minimal justification 
(i.e. only the length check) for a custom packer at all (struct.pack 
pads the "s" format with NUL bytes) and no use at all for a call to 
struct.unpack inside the custom unpacker.
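
The behaviour of the "s" format is easy to check. A short sketch in
Python 3 (hence bytes literals); the variable names are only for
illustration:

```python
import struct

# Short input is padded with NUL bytes up to the field width.
packed = struct.pack("20s", b"spam")

# Over-long input is silently truncated, which is why the explicit
# length check is the one thing a custom packer adds here.
oversize = struct.pack("20s", b"x" * 25)

# Unpacking hands back all 20 bytes, padding included, so calling
# struct.unpack inside the custom unpacker changes nothing.
(raw,) = struct.unpack("20s", packed)

# Finding the true length is a separate step on the raw bytes.
trimmed = raw.split(b"\0", 1)[0]
```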

Cheers,
John