[SciPy-User] unpacking binary data from a C structure
Charles R Harris
charlesr.harris at gmail.com
Tue Apr 13 11:27:55 EDT 2010
On Tue, Apr 13, 2010 at 7:20 AM, Tom Kuiper <kuiper at jpl.nasa.gov> wrote:
> Dear list,
>
> here's something I find very strange. I have a C structure defined as:
>
> typedef struct
> {
>   unsigned short spcid;    /* station id - 10, 40, 60, 21 */
>   unsigned short vsrid;    /* vsr1a, vsr1b ... from enum */
>   unsigned short chanid;   /* subchannel id 0,1,2,3 */
>   unsigned short bps;      /* number of bits per sample - 1, 2, 4, 8, or 16 */
>   unsigned long  srate;    /* number of samples per second in kilo-samples per second */
>   unsigned short error;    /* hw err flag, dma error or num_samples error, 0 ==> no errors */
>   unsigned short year;     /* time tag - year */
>   unsigned short doy;      /* time tag - day of year */
>   unsigned long  sec;      /* time tag - second of day */
>   double         freq;     /* in Hz */
>   unsigned long  orate;    /* number of statistics samples per second */
>   unsigned short nsubchan; /* number of output sub chans */
> }
> stats_hdr_t;
>
> The format I expected to pass to unpack in Python's struct module is
> 'HHHH L HHH L d L H'.
> Here's a real header structure as it appears at the head of a file:
>
> 0000000 000d 0001 0006 0008
> 0000010 4240 000f 0000 0000
> 0000020 0000 07da 0064 4730
> 0000030 0001 0000 0000 0000
> 0000040 d800 d31d 421d 03e8
> 0000050 0000 0000 0000 0002
>
> Decoded as unsigned shorts:
>
> 0000000 13 1 6 8
> 0000010 16960 15 0 0
> 0000020 0 2010 100 18224
> 0000030 1 0 0 0
> 0000040 55296 54045 16925 1000
> 0000050 0 0 0 2
>
> Matching these to the stats_hdr_t with 'unpack' notation:
>
> 0000000 H H H H
> 0000010 L1 L2 H ?
> 0000020 ? H H L1
> 0000030 L2 ? ? D1
> 0000040 D2 D3 D4 L1
> 0000050 L2 ? ? H
>
> So the actual format is 'HHHH L H xxxx HH L xxxx d L xxxx H'
> What are all the mystery 4-byte blanks? This works:
>
> buf = fd.read(50)
> header = unpack_from('=4H LH2x 2x2HL4xdL4xH',buf)
>
> Since unpacking binary data must be a fairly common activity in
> scientific circles, I hope you will have some suggestions.
>
>
I presume you didn't produce the data, but as a rule of thumb C structures
should not be used to write out binary data, because the binary layout (field
sizes, padding, and byte order) depends on the compiler and platform and so
won't be portable. Text, netCDF, HDF5, or some other standard data format is
preferable, with text being perhaps the most portable. That said, lots of old
data collection programs write out C structures, and no doubt newer programs
do so also.
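
For a file that already exists, one option is to spell out the on-disk layout
explicitly rather than rely on native alignment. The following is only a
sketch under some assumptions: the bytes are rebuilt from the hex dump in the
question, the dump is taken to show little-endian 16-bit words, the pad bytes
are copied from the format that worked in the question (not derived from the
C declaration), and the names DUMP_WORDS, HEADER_FORMAT, and StatsHeader are
made up for illustration.

```python
import struct
from collections import namedtuple

# 16-bit words copied from the hex dump in the question, in file order.
DUMP_WORDS = """
000d 0001 0006 0008
4240 000f 0000 0000
0000 07da 0064 4730
0001 0000 0000 0000
d800 d31d 421d 03e8
0000 0000 0000 0002
"""

# Assumption: the dump shows little-endian 16-bit words, so packing each
# word with '<H' reproduces the original byte stream.
words = [int(w, 16) for w in DUMP_WORDS.split()]
buf = struct.pack('<%dH' % len(words), *words)

# Same layout as the working format in the question. '<' selects
# little-endian byte order with standard sizes and no implicit padding,
# so every pad byte ('x') is spelled out.
HEADER_FORMAT = '<4H L H 2x 2x 2H L 4x d L 4x H'

# Field names follow the members of stats_hdr_t.
StatsHeader = namedtuple(
    'StatsHeader',
    'spcid vsrid chanid bps srate error year doy sec freq orate nsubchan')

header = StatsHeader._make(struct.unpack_from(HEADER_FORMAT, buf))
print(struct.calcsize(HEADER_FORMAT))  # bytes consumed by the explicit layout
print(header)

# The unpadded format from the question, measured two ways. The '@' (native)
# result depends on the platform and compiler conventions; the '='
# (standard) result does not.
native_size = struct.calcsize('@HHHHLHHHLdLH')
standard_size = struct.calcsize('=HHHHLHHHLdLH')
print(native_size, standard_size)
```

With the dump above, the decoded fields come out as year 2010, day of year
100, and a sample rate of 1000000. The native size printed on the last line
can change from one machine to another, which is why a format with explicit
byte order and pad bytes is safer for reading files written elsewhere.
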
Chuck