[Python-Dev] PyArg_ParseTuple and 16 bit bitpatterns
Trent Mick
trentm@activestate.com
Tue, 4 Jul 2000 16:42:58 -0700
Hi Jack,
I am your evil man for this one (but then you already knew that).
On Tue, Jul 04, 2000 at 11:50:02PM +0200, Jack Jansen wrote:
> I've adapted PyArg_ParseTuple (and Py_BuildValue) to understand the H
> format specifier, which is meant for 16-bit bitpatterns. (in case you
> didn't follow the discussion last month: the old lowercase h now
> checks values to be in the range -32768..32767, so constants like
> 0x8000 are not acceptable anymore).
I think that the Right Answer is:
b = signed byte
B = unsigned byte
h = signed short
H = unsigned short
i = signed int
I = unsigned int
l = signed long
L = unsigned long
Before my patch we had (no range checking was done so signed vs. unsigned
made no difference):
b = byte
h = short
i = int
l = long
L = LONG_LONG
After my patch (i.e. now) we have:
b = unsigned byte
h = signed short
i = signed int
l = signed long
L = signed LONG_LONG
Notes on that:
- Choosing signed or unsigned for each was based on the common case (or on
what I presumed the common case to be). E.g. unsigned bytes are more common
than signed bytes.
- unsigned byte is 'b' and not (my ideal) 'B', for backward compatibility
reasons
- LONG_LONG as 'L' really sucks because that is the blocker to
PyArg_ParseTuple nirvana (my first list). It should be 'q' and 'Q' for Quad
or something like that.
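To make the "after my patch" list concrete, here is a minimal sketch of the range check each format code implies. This is not the actual PyArg_ParseTuple code: fits_format is a hypothetical helper invented for illustration, and it only covers the codes whose targets fit in a C long.

```c
#include <limits.h>

/* Hypothetical helper, not part of the Python C API.
 * Returns 1 if `value` is in the range the given format code accepts
 * after the patch, 0 if it is out of range, and -1 for a format code
 * that this sketch does not cover. */
int
fits_format(char code, long value)
{
    switch (code) {
    case 'b':   /* unsigned byte */
        return value >= 0 && value <= UCHAR_MAX;
    case 'h':   /* signed short */
        return value >= SHRT_MIN && value <= SHRT_MAX;
    case 'i':   /* signed int */
        return value >= INT_MIN && value <= INT_MAX;
    case 'l':   /* signed long: every long fits */
        return 1;
    default:
        return -1;
    }
}
```

With this reading, fits_format('h', 0x8000) is 0, which is exactly the case that motivated 'H'.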
Your patch is adding (right?):
H = unsigned short
Aside: Now that it will be called 2.0, would we maybe want to go for the Right
Answer? I suspect that a *lot* more people would complain of breakage with
the use of 'L' changing to 'Q', and that I am asking for a lynching.
>
> I haven't added an I and L specifier, because (surprise, surprise:-)
> for 32-bit integers 0x80000000 turns out to be a legal value, unlike
> for their poor 16-bit brethren.
I can't see how 'I' allows 0x80000000 (unless, of course, sizeof(int) > 4 on
your machine) because the 'I' formatter is not in PyArg_ParseTuple. Yes, 'L'
will probably accept 0x80000000, because LONG_LONG is probably 64 bits wide on
your machine.
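As a rough illustration of the sizes involved, the sketch below checks 0x80000000 against the signed and unsigned int ranges. The helper names are hypothetical, and it assumes a compiler with a 64-bit `long long` (what LONG_LONG would typically expand to).

```c
#include <limits.h>

/* Hypothetical helpers for illustration only; neither is a Python API. */

/* 1 if `value` is representable as a signed int on this platform. */
int
fits_signed_int(long long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}

/* 1 if `value` is representable as an unsigned int, i.e. what a
 * range-checked 'I' formatter would accept if it existed. */
int
fits_unsigned_int(long long value)
{
    return value >= 0 && (unsigned long long)value <= UINT_MAX;
}
```

With a 32-bit int, fits_signed_int(0x80000000LL) is 0 while fits_unsigned_int(0x80000000LL) is 1, and the value itself fits comfortably in the 64-bit type.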
>
> I've currently implemented H as meaning unsigned (i.e. 0..0xffff), but
I would suggest allowing [0, USHRT_MAX]. That should be defined in limits.h,
I think.
> on second thoughts I think allowing -32768..0xffff might be better:
> there's probably lots of code out there that passes -1 when all 16
> flag bits should be set. Please let me know if you have strong
I think that uses of -1 should use either USHRT_MAX or (unsigned short)-1.
> opinions on either meaning before I check this in.
>
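A small sketch of the strict reading of 'H' suggested above, together with the spelling of "all flag bits set" that avoids passing -1. Both function names are hypothetical; only USHRT_MAX and the cast come from the discussion.

```c
#include <limits.h>

/* Hypothetical check for the strict reading of 'H': accept [0, USHRT_MAX]. */
int
fits_unsigned_short(long value)
{
    return value >= 0 && value <= USHRT_MAX;
}

/* All bits of an unsigned short set, without passing -1 to the parser.
 * Converting -1 to an unsigned type wraps modulo 2^N, so the result
 * equals USHRT_MAX (0xffff when short is 16 bits wide). */
unsigned short
all_flags_set(void)
{
    return (unsigned short)-1;
}
```

Under this check a caller passing -1 is rejected, while all_flags_set() yields a value that passes.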
> <grumpy mode="on">Note that I'll only adapt PyArg_ParseTuple and the
> gazillion mac-specific occurrences of "h" where a 16-bit pattern is
> needed. I've done only a very cursory check of other occurrences of
> "h", but someone else will have to pick that up if they feel like.
> </grumpy>
I am sorry to have been the cause of work for you. I just think that the
'security' offered by bounds checking all values converted by
PyArg_ParseTuple is important.
Trent
--
Trent Mick
trentm@activestate.com