[Python-ideas] Ideas for improving the struct module

Steven D'Aprano steve at pearwood.info
Wed Jan 18 20:27:20 EST 2017


On Wed, Jan 18, 2017 at 04:24:39AM -0600, Elizabeth Myers wrote:
> Hello,
> 
> I've noticed a lot of binary protocols require variable length
> bytestrings (with or without a null terminator), but it is not easy to
> unpack these in Python without first reading the desired length, or
> reading bytes until a null terminator is reached.

This sounds like a fairly straightforward feature request for the 
struct module, which probably could go straight to the bug tracker. 
Unfortunately I can't *quite* work out what the feature request is :-)

If you're asking for struct to support Pascal strings, with a single 
byte (0...255) for the length, it already does with format code "p".
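
For reference, here is a small sketch of how "p" behaves today. The count 
in front of "p" is the total field width including the length byte, so 
the field is fixed-size on the wire rather than truly variable-length. 
The value b"abc" is only an illustration.

```python
import struct

# "6p" is a 6-byte field: 1 length byte plus up to 5 bytes of data.
packed = struct.pack("6p", b"abc")
print(packed)   # b'\x03abc\x00\x00'  (length 3, data, null padding)

# Unpacking honours the length byte and drops the padding.
(value,) = struct.unpack("6p", packed)
print(value)    # b'abc'
```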

I was going to suggest P for "large" Pascal string, with the length
given by *two* bytes rather than one (0...65535), but P is already in
use. Are you proposing the "$" format code from netstruct? That would be 
interesting, as it would allow format codes:

    B$    standard Pascal string, like p
    H$    Pascal string with a two-byte length
    L$    Pascal string with a four-byte length

4294967295 bytes should be enough for anyone :-)
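
For comparison, this is roughly what has to be written by hand today to 
read a length-prefixed string with a wider prefix. It is only a sketch: 
the helper name unpack_prefixed is made up for illustration, and I am 
assuming a little-endian two-byte prefix ("<H") in the example data.

```python
import struct

def unpack_prefixed(prefix_fmt, data, offset=0):
    """Read a length prefix, then that many bytes.

    Hypothetical helper, not part of struct. Returns (payload, next_offset).
    """
    (length,) = struct.unpack_from(prefix_fmt, data, offset)
    start = offset + struct.calcsize(prefix_fmt)
    end = start + length
    if end > len(data):
        raise struct.error("buffer too short for declared length")
    return data[start:end], end

# Two-byte little-endian length (5), the payload, then unrelated trailing data.
data = struct.pack("<H", 5) + b"hello" + b"rest"
payload, pos = unpack_prefixed("<H", data)
print(payload, pos)   # b'hello' 7
```

A "$"-style format code would fold the two steps (read the length, then 
slice) into a single format string.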

Another common format is "ASCIIZ", or a one-byte Pascal string including 
a null terminator. People actually use this:

http://stackoverflow.com/questions/11850950/unpacking-a-struct-ending-with-an-asciiz-string

Which just leaves C-style null terminated strings. c/n/N are all already 
in use; I guess that C (for C-string) or S (for c-String) are 
possibilities.
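
Without such a code, null-terminated strings have to be located manually 
before slicing. A minimal sketch, assuming the terminator is a single 
zero byte; unpack_cstring is a hypothetical helper name, not an existing 
struct function.

```python
def unpack_cstring(data, offset=0):
    """Return (string_without_terminator, offset_after_terminator).

    Hypothetical helper. Raises ValueError if no null byte is found.
    """
    end = data.index(b"\x00", offset)
    return data[offset:end], end + 1

data = b"spam\x00eggs\x00"
first, pos = unpack_cstring(data)
second, pos = unpack_cstring(data, pos)
print(first, second, pos)   # b'spam' b'eggs' 10
```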

All of these seem like perfectly reasonable formats for the struct 
module to support. They're all in use. struct already supports 
variable-width formats. I think it's just a matter of raising one or more 
feature requests, and then doing the work.

I guess this is just my long-winded way of saying +1.



-- 
Steve

