struct: type registration?

Giovanni Bajo noway at sorry.com
Thu Jun 1 07:52:56 EDT 2006


John Machin wrote:

>> given the ongoing work on struct (which I thought was a dead
>> module), I was wondering if it would be possible to add an API to
>> register custom parsing codes for struct. Whenever I use it for
>> non-trivial tasks, I always happen to write small wrapper functions
>> to adjust the values returned by struct.
>>
>> An example API would be the following:
>>
>> ============================================
>> def mystring_len():
>>     return 20
>>
>> def mystring_pack(s):
>>     if len(s) > 20:
>>         raise ValueError, "a mystring can be at max 20 chars"
>>     s = (s + "\0"*20)[:20]
>
> Have you considered s.ljust(20, "\0") ?

Right. This happened to be an example...

>>     s = struct.pack("20s", s)
>>     return s
>
> I am an idiot, so please be gentle with me: I don't understand why you
> are using struct.pack at all:

Because I want to be able to parse large chunks of binary data with custom
formatting. Did you miss the whole point of my message? It is to be able to write:

struct.unpack("3liiSiiShh", data)

You need struct.unpack() to parse this data, and you need a custom
packer/unpacker to avoid post-processing the output of unpack() just because it
only knows about basic Python types. In binary structs, there happen to be *types*
which do not map 1:1 to Python types, nor are they just basic C types (like the
ones struct supports). Using a custom formatter is a way to better represent
these types (instead of mapping them to the "most similar" type and then
post-processing it).

In my example, "S" is a basic type meaning "a 0-terminated string stored in a
20-byte field", and expressing it in the struct format with the single letter
"S" is more meaningful in my code than using "20s" and then post-processing the
resulting string each and every time this happens.
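
To make the idea concrete, here is a rough sketch of how such a registration could be emulated on top of the existing module. Everything in it is an assumption made for illustration: struct has no registration API, and the names CUSTOM_CODES and unpack_custom are hypothetical. It is written for Python 3, so the fields are bytes rather than the str objects used elsewhere in this thread, and it does not handle repeat counts on custom codes.

```python
import struct

# Hypothetical registry: custom code -> (real struct format, decoder).
# Nothing like this exists in the struct module; it is only an illustration.
CUSTOM_CODES = {
    # "S": 20-byte field holding a string terminated by its first NUL.
    "S": ("20s", lambda raw: raw.split(b"\0", 1)[0]),
}

def unpack_custom(fmt, data):
    """Unpack data, expanding the codes registered in CUSTOM_CODES."""
    real_fmt = []   # pieces of the format actually passed to struct
    decoders = []   # one decoder (or None) per value struct returns
    count = ""      # pending repeat count, e.g. the "3" in "3l"
    for ch in fmt:
        if ch.isdigit():
            count += ch
        elif ch in CUSTOM_CODES:
            if count:
                raise ValueError("repeat counts on custom codes are not supported")
            real, decode = CUSTOM_CODES[ch]
            real_fmt.append(real)
            decoders.append(decode)
        else:
            real_fmt.append(count + ch)
            if ch in "x@=<>!":
                n = 0                # pad bytes and byte-order marks yield no value
            elif ch in "sp":
                n = 1                # "20s" is one value, not twenty
            else:
                n = int(count or 1)  # "3l" yields three values
            decoders.extend([None] * n)
            count = ""
    values = struct.unpack("".join(real_fmt), data)
    return tuple(v if d is None else d(v) for v, d in zip(values, decoders))

data = struct.pack("<i20sh", 7, b"abcde\0junk", -1)
print(unpack_custom("<iSh", data))   # (7, b'abcde', -1)
```

With a wrapper like this the format string carries the meaning of the field, and the decoding happens in one place instead of after every call to unpack().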


> >>> import struct
> >>> x = ("abcde" + "\0" * 20)[:20]
> >>> x
> 'abcde\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
> >>> len(x)
> 20
> >>> y = struct.pack("20s", x)
> >>> y == x
> True
> >>>
>
> Looks like a big fat no-op to me; you've done all the heavy lifting
> yourself.

Looks like you totally misread my message. Your string "x" is what I find in
the binary data, and I need to *unpack* it into a regular Python string, which
would be "abcde".
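
A small illustration of the unpacking direction, assuming Python 3 (so bytes literals instead of the str values in the quoted session): struct returns the 20-byte field as it is, and the application still has to cut it down to the value it wants.

```python
import struct

# What the binary data contains: "abcde" plus NUL padding up to 20 bytes.
x = b"abcde".ljust(20, b"\0")

# struct hands the field back untouched, padding included.
(raw,) = struct.unpack("20s", x)
print(len(raw))   # 20

# The value the application wants still needs an extra step.
value = raw.split(b"\0", 1)[0]
print(value)      # b'abcde'
```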


>
>>     idx = s.find("\0")
>>     if idx >= 0:
>>         s = s[:idx]
>>     return s
>
> Have you considered this:
>
> >>> z.rstrip("\0")
> 'abcde'


This would not work because, in the actual binary data I have to parse, only
the first \0 is meaningful and terminates the string (like in C). There is
absolutely no guarantee that the rest of the padding is made of \0s as well.
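
The difference shows up as soon as the padding is not all zeros. The following is only a sketch, assuming Python 3 bytes; the helper name c_string and the sample field contents are made up for the example.

```python
def c_string(raw):
    """Return the bytes before the first NUL, like a C string."""
    return raw.split(b"\0", 1)[0]

# A 20-byte field: the string, its terminator, then arbitrary leftovers.
field = b"abcde\0" + b"junk after nul"

# rstrip() only removes trailing NULs, so the leftovers survive.
stripped = field.rstrip(b"\0")

# Cutting at the first NUL gives the intended value.
text = c_string(field)
print(text)   # b'abcde'
```

If the field contains no NUL at all, split() returns the whole field unchanged, which matches the find()-based code quoted above.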
-- 
Giovanni Bajo




