[Patches] [ python-Patches-1624059 ] fast subclasses of builtin types
SourceForge.net
noreply at sourceforge.net
Sun Feb 25 20:50:53 CET 2007
Patches item #1624059, was opened at 2006-12-28 22:01
Message generated for change (Comment added) made by nnorwitz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1624059&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core (C code)
Group: Python 2.6
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Guido van Rossum (gvanrossum)
Summary: fast subclasses of builtin types
Initial Comment:
This is similar to a patch posted on python-dev a few months ago (or more). I modified it to also handle subclassing exceptions which should speed up exception handling a bit. (This was proposed by Guido based on the original patch.) I also dropped an extra bit that was going to indicate if it was a builtin type or a subclass of a builtin type.
----------------------------------------------------------------------
>Comment By: Neal Norwitz (nnorwitz)
Date: 2007-02-25 11:50
Message:
Logged In: YES
user_id=33168
Originator: YES
Committed rev 53911.
Hopefully the checkin comment explains most of what's going on. I
simplified the patch as much as possible. I like to start with less code.
If there is more speed to be had, that can be optimized later. I didn't
measure the little variations. A long time ago I did measure that it made a
real difference in speed when using an int subclass.
This should help a fair amount for exceptions too.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2007-01-06 06:54
Message:
Logged In: YES
user_id=21627
Originator: NO
File Added: a.c
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2007-01-06 06:24
Message:
Logged In: YES
user_id=21627
Originator: NO
I made a couple of assembler experiments (see attached a.c), with gcc 4.1
on x86.
A "bit mask enumeration" test (f) compiles into four instructions:
    movl 8(%eax), %eax
    andl $-268435456, %eax
    cmpl $1879048192, %eax
    je .L18
(fall-through being the else case)
A single bit test of a flag (g) compiles to two instructions:
    testl $-1073741824, 8(%eax)
    je .L9
(fall-through being the if case)
Adding an identity test (comparison with the address of a global),
followed by a bit mask test (h), compiles into six instructions:
    cmpl $int_type, %eax
    je .L2
    movl 8(%eax), %eax
    andl $-268435456, %eax
    cmpl $1879048192, %eax
    je .L2
(fall-through being the else case)
In the common case, only two of these instructions are executed.
So all-in-all, I would agree with Guido that adding bit flags is more
efficient. However, existing bits cannot be recycled: in existing
binary extension modules, these flags are set, so if the modules don't
get recompiled, the type check would believe that the types are
subtypes.
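
The attached a.c is not reproduced in this thread. As a rough guess at what
the three tests might look like in C, here is a sketch; the struct layout and
the name toy_type are assumptions made for illustration, and only the mask
constants (written in hex) and the names f, g, h and int_type come from the
listings above.

```c
/* Hypothetical reconstruction; the real a.c may differ. */
typedef struct toy_type {
    long refcnt;          /* assumed padding so flags sits at offset 8 on 32-bit x86 */
    void *type;
    unsigned long flags;  /* stands in for tp_flags */
} toy_type;

static toy_type int_type;  /* the global used by the identity test */

/* (f) bit mask enumeration: -268435456 is 0xF0000000, 1879048192 is 0x70000000 */
static int f(const toy_type *t)
{
    return (t->flags & 0xF0000000UL) == 0x70000000UL;
}

/* (g) flag test: -1073741824 is 0xC0000000 */
static int g(const toy_type *t)
{
    return (t->flags & 0xC0000000UL) != 0;
}

/* (h) identity test first, then the same bit mask test as f */
static int h(const toy_type *t)
{
    return t == &int_type || (t->flags & 0xF0000000UL) == 0x70000000UL;
}
```

In h the pointer comparison succeeds without touching memory when the type is
exactly int_type, which would match the remark that only two instructions run
in the common case.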
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2007-01-03 19:59
Message:
Logged In: YES
user_id=6380
Originator: NO
This looks fine, but I have some questions about alternative
implementations:
- Why does the typical PyFoo_Check() macro first call PyFoo_CheckExact()
before calling the fast bit checking macro? Did you measure that this is
in fact faster? True, it means always a pointer deref, so maybe it is --
but OTOH it is more instructions.
- Why not have a separate bit for each type? Then you could make the fast
macro test for (flags & mask) != 0 instead of testing for (flags & mask) ==
value. It would use up all the remaining bits, but I suspect there are
some unused (or reusable) bits in lower positions: 1L<<2 is unused (was
GC), and 1L<<11 also seems unused. And bits 18 through 23! And I'm
guessing that INPLACEOPS (1L<<3) isn't all that interesting any more; they
were introduced in 2.0... So it really looks like you have plenty of bits.
Of course I don't know if it matters; would be worth it perhaps to look at
the machine code.
- Oops, it looks like your comment is off. You claim to be using bits
24-27, leaving 28-31 free, but in fact you're using bits 28-31!
BTW You're introducing quite a few lines over 80 chars. Perhaps cut back a
bit?
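
To make the two layouts in the second question concrete, here is a minimal
sketch. All macro names and bit positions below are hypothetical and chosen
only for illustration; they are not the values from the patch.

```c
/* Layout A (assumed): a 4-bit field in bits 28-31 holding a type code. */
#define FIELD_MASK  (0xFUL << 28)
#define FIELD_INT   (0x1UL << 28)
#define FIELD_LONG  (0x2UL << 28)
#define FIELD_LIST  (0x3UL << 28)   /* shares bit 28 with FIELD_INT */

/* The whole field must be compared, because codes overlap bit-wise. */
static int is_int_enum(unsigned long flags)
{
    return (flags & FIELD_MASK) == FIELD_INT;
}

/* Layout B (assumed): one dedicated bit per type. */
#define BIT_INT   (1UL << 21)
#define BIT_LONG  (1UL << 22)
#define BIT_LIST  (1UL << 23)

/* A single AND against the type's bit is enough. */
static int is_int_bit(unsigned long flags)
{
    return (flags & BIT_INT) != 0;
}

/* With one bit per type, several types can be tested in one go. */
static int is_int_or_long_bit(unsigned long flags)
{
    return (flags & (BIT_INT | BIT_LONG)) != 0;
}
```

Layout A fits up to 15 type codes into four bits but needs a mask followed by
a compare; layout B spends one bit per type and needs only the mask.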
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2006-12-28 22:04
Message:
Logged In: YES
user_id=33168
Originator: YES
I forgot to mention this patch works by using unused bits in tp_flags.
This saves a function call when checking for a subclass of a builtin type.
There's one funky thing about this patch: the change to
Objects/exceptions.c. I didn't investigate why this was necessary, or more
likely I did know why when I added it and have since forgotten. I know that
without adding BASE_EXC_SUBCLASS to tp_flags, test_exceptions fails.
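
The general mechanism can be sketched as follows. This is a toy model, not the
patch itself: toy_type, the TOY_* macros and their bit positions are all
hypothetical, and the real type object and flag values differ.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a type object. */
typedef struct toy_type {
    struct toy_type *base;   /* single inheritance only, for brevity */
    unsigned long flags;     /* plays the role of tp_flags */
} toy_type;

/* Illustrative bit positions, not the values used by the patch. */
#define TOY_INT_SUBCLASS      (1UL << 24)
#define TOY_BASE_EXC_SUBCLASS (1UL << 25)
#define TOY_SUBCLASS_MASK     (TOY_INT_SUBCLASS | TOY_BASE_EXC_SUBCLASS)

/* Slow path: a function call that walks the chain of base types. */
static int toy_is_subtype(const toy_type *t, const toy_type *target)
{
    for (; t != NULL; t = t->base) {
        if (t == target)
            return 1;
    }
    return 0;
}

/* Type creation: a new type copies the subclass bits of its base. */
static void toy_type_init(toy_type *t, toy_type *base, unsigned long own_flags)
{
    t->base = base;
    t->flags = own_flags;
    if (base != NULL)
        t->flags |= base->flags & TOY_SUBCLASS_MASK;
}

/* Fast path: one load and one bit test, no call. */
#define TOY_FAST_SUBCLASS(t, bit) (((t)->flags & (bit)) != 0)
```

In this toy model the root of each family has to set its own bit when it is
created, otherwise nothing below it inherits the bit and the fast check fails
for every subclass. That may be related to why the exception base type needs
BASE_EXC_SUBCLASS, though the thread does not confirm the reason.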
----------------------------------------------------------------------