struct.pack behavior

Cameron Simpson cs at zip.com.au
Wed Jun 25 23:32:12 EDT 2008


On 25Jun2008 22:38, Steven Clark <steven.p.clark at gmail.com> wrote:
| On Wed, Jun 25, 2008 at 7:03 PM, John Machin <sjmachin at lexicon.net> wrote:
| > On Jun 26, 9:00 am, "Steven Clark" <steven.p.cl... at gmail.com> wrote:
| >> Can anyone explain to me why
| >> struct.pack('HB',1,2) gives 3 bytes, whereas struct.pack('BH',1,2)
| >> gives 4 bytes?
| >>
| > Alignment -- read the manual.
| 
| If "the manual" is the help files for the struct module, I've read it
| several times over. I understand endianness; I don't understand
| alignment. Could anyone give a less cryptic / terse answer?

For efficiency reasons many CPUs require particular primitive data
types (integers/pointers of various sizes) to be placed in memory at
particular boundaries. For example, shorts ("H" above, usually two bytes
and probably always so in the struct module) are often required to be
on even addresses, and longer objects to be on 4 or 8 byte boundaries.

This allows for much more efficient memory access on many platforms
(of course the rules depend on the platform). Although RAM _appears_ to
offer random access to arbitrary bytes, the underlying hardware will often
fetch chunks of bytes in parallel. If a number spanned the boundaries of
such a chunk it would require two fetch cycles instead of one. So
this is avoided for performance reasons.

So, packing "HB" puts a short at offset 0 (even) and then a byte.
Conversely, packing "BH" puts a byte at offset zero but puts the short
at offset 2 (to be even), leaving a gap after the byte to achieve this,
thus the 4 byte size of the result (byte, gap, short).
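
To see this for yourself, here is a small sketch (written for Python 3,
so the print calls and bytes literals are an assumption about your
interpreter). The native-mode sizes are whatever your platform's C
compiler uses, so I've left them for you to run rather than promising
numbers. The "<" prefix selects the struct module's standard mode, which
never inserts padding:

```python
import struct

# Native mode (no prefix, same as "@"): native sizes and native alignment,
# so the results depend on the platform.
hb = struct.pack('HB', 1, 2)
bh = struct.pack('BH', 1, 2)
print(len(hb), repr(hb))
print(len(bh), repr(bh))

# calcsize reports the packed size without building any data.
print(struct.calcsize('HB'), struct.calcsize('BH'))

# Standard mode ("<" = little-endian, standard sizes, no alignment):
# the byte is followed directly by the short, with no gap.
bh_std = struct.pack('<BH', 1, 2)
print(len(bh_std), repr(bh_std))
```

If you need a fixed layout (a file format or a network protocol, say),
one of the standard-mode prefixes ("<", ">", "!" or "=") is usually what
you want; native mode is for matching C structs on the same machine.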

This layout procedure is called "alignment".
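
The procedure can be sketched in a few lines. This is only an
illustration, not how the struct module is implemented: the function
name layout is made up, and it assumes every item must sit at a multiple
of its own size, which is a common rule but not a universal one. Like
struct's native mode, it adds no padding after the last item.

```python
def layout(sizes):
    """Return (offsets, total_size) for items of the given byte sizes.

    Assumption: each item must start at a multiple of its own size.
    """
    offsets = []
    offset = 0
    for size in sizes:
        # Skip forward to the next multiple of this item's size.
        padding = (-offset) % size
        offset += padding
        offsets.append(offset)
        offset += size
    return offsets, offset

print(layout([2, 1]))  # "HB": short then byte -> ([0, 2], 3)
print(layout([1, 2]))  # "BH": byte, gap, short -> ([0, 2], 4)
```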

Cheers,
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Kilimanjaro is a pretty tricky climb. Most of it's up, until you reach the
very, very top, and then it tends to slope away rather sharply.


