Style help for a Smalltalk-hack

MRAB python at mrabarnett.plus.com
Mon Oct 22 21:33:14 EDT 2012


On 2012-10-23 01:43, Travis Griggs wrote:
> I'm writing some code that does a structured read from a formatted binary file. The code I came up with looks like:
>
> # get the first four bytes, the first gap field
> chunk = byteStream.read(4)
> while chunk:
>      # interpret the gap bytes
>      gap, = struct.unpack('>I', chunk)
>      # suck off the valveCount
>      valveCount, = struct.unpack('>I', byteStream.read(4))
>      # collect the next valveCount signatures
>      signatures = [struct.unpack('>I', byteStream.read(4))[0] for _ in range(valveCount)]
>      self.script.append(ScriptSpan(gap=gap, valveSet=signatures))
>      # now get the next 4 bytes for the gap of the next iteration, it'll be empty if we're at end
>      chunk = byteStream.read(4)
>
> I can't help thinking that there's some better (i.e. more Pythonic) way to do this that doesn't involve using another module (Construct) or exploring generators or something like that. What bugs me is that there are two different styles for reading/decoding values from the byte stream. valveCount and signatures are both paired invocations of unpack() and read(). But to detect the end of the stream (file), I have to split the read() and unpack() of the gap value across 3 different lines of code, and they don't even sit adjacent to each other.
>
> I'm wandering up the Python curve with a passel of Smalltalk experience under my belt, so I expect I'm struggling with trying to map something like this across to Python:
>
> [byteStream atEnd] whileFalse: [
>       gap := (byteStream next: 4) asInteger.
>       valveCount := (byteStream next: 4) asInteger.
>       signatures := (1 to: valveCount) collect: [:_ | (byteStream next: 4) asInteger].
>       self script add: (ScriptSpan gap: gap valveSet: signatures).
> ]
>
> The part that doesn't seem to be there in the standard python library is the idea of an atEnd message for streams, it's inferred as a byproduct of a read().
>
> Please be gentle/kind. I'm still learning. :) TIA
>
Another way you could do it is:

while True:
     chunk = byteStream.read(4)
     if not chunk:
         break
     ...
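
For illustration, here is a minimal sketch of how that loop might look once filled in. It is only a sketch under some assumptions: io.BytesIO stands in for your real byteStream, plain (gap, signatures) tuples stand in for ScriptSpan (which I haven't seen), and read_spans is a made-up helper name.

```python
import io
import struct

def read_spans(byteStream):
    """Return a list of (gap, signatures) tuples read from byteStream.

    Hypothetical helper: tuples are used in place of ScriptSpan.
    """
    spans = []
    while True:
        # An empty read means end of stream, so this is the only EOF check.
        chunk = byteStream.read(4)
        if not chunk:
            break
        gap, = struct.unpack('>I', chunk)
        valveCount, = struct.unpack('>I', byteStream.read(4))
        signatures = [struct.unpack('>I', byteStream.read(4))[0]
                      for _ in range(valveCount)]
        spans.append((gap, signatures))
    return spans

# Two made-up records: gap=10 with two signatures, gap=20 with none.
data = struct.pack('>II2I', 10, 2, 7, 8) + struct.pack('>II', 20, 0)
spans = read_spans(io.BytesIO(data))
```

The read() and the end-of-stream test now sit next to each other at the top of the loop, and the read is written only once.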

And you could fetch multiple signatures in one read:

signatures = list(struct.unpack('>{}I'.format(valveCount),
                                byteStream.read(4 * valveCount)))
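
One thing to watch with a single big read: struct.unpack raises struct.error if the buffer is shorter than the format expects, which would happen on a truncated file. A rough sketch of a wrapper that checks the length first, assuming you'd want an EOFError in that case (read_signatures is a made-up name, and io.BytesIO is only standing in for the real stream):

```python
import io
import struct

def read_signatures(byteStream, valveCount):
    """Read valveCount big-endian unsigned 32-bit values in one read.

    Hypothetical helper; raises EOFError if the stream ends early.
    """
    size = 4 * valveCount
    data = byteStream.read(size)
    if len(data) != size:
        raise EOFError('expected {} bytes, got {}'.format(size, len(data)))
    # A format such as '>3I' unpacks three values in a single call.
    return list(struct.unpack('>{}I'.format(valveCount), data))

# Made-up sample data: three signatures.
sample = struct.pack('>3I', 1, 2, 3)
signatures = read_signatures(io.BytesIO(sample), 3)
```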

By the way, in Python the recommended style for variable names (well,
what you'd call a 'variable' in other languages :-)) is lowercase with
underscores, e.g. "byte_stream".



