[Python-ideas] Direct byte<->int conversions (was Re: bitwise operations on bytes)

Mark Dickinson dickinsm at gmail.com
Sat Aug 15 18:25:02 CEST 2009


On Sat, Aug 15, 2009 at 4:45 PM, Alexandre
Vassalotti<alexandre at peadrop.com> wrote:
> I believe the byteorder option shouldn't default to use the native
> byte-order however. As you mentioned, it would be a bad choice to
> encourage the default behaviour to be platform-dependent. And since
> the primary purpose of the API is long serialization, it would be
> short-sighted to choose the option that cannot be used for
> serialization as the default.
>
> Whether it should default to 'little' or 'big' is pretty much an
> arbitrary choice. In my patch, I chose to default to big-endian since
> it is the standard network byte-order. But maybe the option should
> default to little-endian instead since it is more widely used. In
> addition, the patch is slightly more efficient with little-endian.

Given all this, it sounds like byteorder should be a required argument.
'In the face of ambiguity...' and all that.  As Nick pointed out, we
can always add a default later when a larger set of use-cases has
emerged.
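
To make the ambiguity concrete, here is a minimal sketch. It assumes the
method names and signatures found in current Python 3 (int.to_bytes and
int.from_bytes), which are not the names under discussion in this thread;
the value 0x0A0B0C0D is just an illustration.

```python
import sys

value = 0x0A0B0C0D

# The same integer gives different byte strings depending on byte order.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

# A native-order default would pick one of the two, depending on the platform.
native = value.to_bytes(4, byteorder=sys.byteorder)

# Reading the bytes back only recovers the value if both sides agree.
same_order = int.from_bytes(big, byteorder="big")
wrong_order = int.from_bytes(big, byteorder="little")
```

With the byte order spelled out at every call site, neither the writer nor
the reader depends on the machine the code happens to run on.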

You say that the 'primary purpose of the API is long serialization'.
I'd argue that that's not quite true.  That is, I see two separate
uses:

(1) fixed-size conversions:  e.g., interpreting a three-byte sequence
as an integer for the purposes of bit operations, or converting
an int generated by random.getrandbits(k) to a random byte sequence.
(See http://bugs.python.org/msg69285 and
http://bugs.python.org/msg54262.)

(2) a primitive operation that's useful for serialization
protocols.  Here I'd guess that the details of, e.g., how a negative
integer is serialized would vary from protocol to protocol, so
that the serialization code would in most cases still end up
having to specify the fixed_length argument.
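
The two uses above can be sketched roughly as follows. This again assumes
the int.to_bytes / int.from_bytes spelling from current Python 3 rather
than the names proposed in this thread. The two encode_* helpers are
hypothetical and only illustrate that protocols may disagree about
negative values; they are not taken from any real protocol.

```python
import random

# (1) Fixed-size conversions.
# Interpret a three-byte sequence as an integer for bit operations.
raw = bytes([0x12, 0x34, 0x56])
n = int.from_bytes(raw, byteorder="big")
middle_byte = n & 0x00FF00
inverted = (n ^ 0xFFFFFF).to_bytes(3, byteorder="big")

# Turn k random bits into a byte string of exactly k // 8 bytes.
k = 64
token = random.getrandbits(k).to_bytes(k // 8, byteorder="big")

# (2) A primitive for serialization protocols.
# Two hypothetical conventions for a negative integer; either way the
# caller still has to state the size.
def encode_twos_complement(value, size):
    """Hypothetical protocol A: two's complement in `size` bytes."""
    return value.to_bytes(size, byteorder="big", signed=True)

def encode_sign_magnitude(value, size):
    """Hypothetical protocol B: one sign byte, then `size` magnitude bytes."""
    sign = b"\x01" if value < 0 else b"\x00"
    return sign + abs(value).to_bytes(size, byteorder="big")
```

In both cases the length is known to the caller up front, which is the
reason a required size argument costs so little here.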

I *don't* see int.tobytes and int.frombytes (or whatever the names
turn out to be) as providing integer serialization by themselves.
There's no need for this, since pickle and marshal already do
this job.  Incidentally, the commenters in

http://bugs.python.org/issue467384

have quite a lot to say on this subject.  It's on this basis that
I'm suggesting that the size argument should be required for
int.tobytes.

> On Sat, Aug 15, 2009 at 8:13 AM, Eric Eisner<ede at mit.edu> wrote:
> > sign:
> > patch behavior: default flag: signed=True
> > maybe unsigned as the default?
> Either is fine by me. The advantage with 'signed' as the default is
> 'signed' works with all longs (and not only with non-negative ones).

Agreed.  Of course, this advantage disappears if the size argument
is mandatory.
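
A small sketch of why: once the size is fixed, neither the signed nor the
unsigned interpretation covers every integer. It assumes the current
Python 3 signature int.to_bytes(length, byteorder, signed=...), and the
helper name fits is made up for the illustration.

```python
def fits(value, size, signed):
    """Hypothetical helper: can `value` be stored in `size` bytes?"""
    try:
        value.to_bytes(size, byteorder="big", signed=signed)
    except OverflowError:
        return False
    return True

# One byte, unsigned: 0..255.  One byte, signed: -128..127.
unsigned_ok = [fits(v, 1, signed=False) for v in (-1, 0, 255, 256)]
signed_ok = [fits(v, 1, signed=True) for v in (-129, -128, 127, 128)]

# The same byte means different things under the two interpretations.
as_unsigned = int.from_bytes(b"\xff", byteorder="big", signed=False)
as_signed = int.from_bytes(b"\xff", byteorder="big", signed=True)
```

So with a mandatory size, 'signed' trades the top half of the unsigned
range for negative values rather than accepting strictly more inputs.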

I don't have any strong opinions about the method and parameter
names, so I'll keep quiet on that subject. :)

Mark


