[Cython] AddTraceback() slows down generators

Sat Jan 28 21:24:35 CET 2012

Vitja Makarov, 28.01.2012 20:58:
> 2012/1/28 mark florisson <markflorisson88 at gmail.com>:
>> On 28 January 2012 19:41, Vitja Makarov <vitja.makarov at gmail.com> wrote:
>>> 2012/1/28 Stefan Behnel <stefan_ml at behnel.de>:
>>>> Stefan Behnel, 27.01.2012 09:02:
>>>>> any exception *propagation* is
>>>>> still substantially slower than necessary, and that's a general issue.
>>>>
>>>> Here's a general take on a code object cache for exception propagation.
>>>>
>>>> https://github.com/scoder/cython/commit/ad18e0208
>>>>
>>>> When I raise an exception in test code that propagates through a Python
>>>> call hierarchy of four functions before being caught, the cache gives me
>>>> something like a 2x speedup in total. Not bad. When I do the same for cdef
>>>> functions, it's more like 4-5x.
>>>>
>>>> The main idea is to cache the objects in a reallocable C array and bisect
>>>> into it based on the C code "__LINE__" of the exception, which should be
>>>> unique enough for a given module.
>>>>
>>>> It's a global cache that doesn't limit the lifetime of code objects (well,
>>>> up to the lifetime of the module, obviously). I don't know if that's a
>>>> problem because the number of code objects is only bounded by the number of
>>>> exception origination points in the C source code, which is usually quite
>>>> large. However, only a tiny fraction of those will ever raise or propagate
>>>> an exception in practice, so the real number of cached code objects will be
>>>> substantially smaller.
>>>>
>>>> Maybe thorough test suites with lots of failure testing would notice a
>>>> difference in memory consumption, even though a single code object isn't
>>>> all that large either...
>>>>
>>>> What do you think?
>>>>
>>>
>>> We already have --no-c-in-traceback flag that disables C line numbers
>>> in traceback.
>>> What about enabling it by default?
>>>
>> I'm quite attached to that feature actually :), it would be pretty
>> annoying to disable that flag every time. And what would disabling
>> that option gain, given that the current code still formats the
>> filename and function name?
> 
> It's rather useful for developers or debugging. Most people
> don't need it.

Not untrue. However, at least a majority of developers should be able to
make use of it when it's there, and code is several times more often built
for testing and debugging than for production. So I consider it a virtue
that it's on by default.

> Here is a simple benchmark:
> # upstream/master: 6.38ms
> # upstream/master (no-c-in-traceback): 3.07ms
> # scoder/master: 1.31ms
> def foo():
>     raise ValueError
> 
> def testit():
>     cdef int i
>     for i in range(10000):
>         try:
>             foo()
>         except:
>             pass
> 
> Stefan's branch wins but:
>  - there is only one item in the cache and it's always hit

Even if there were substantially more, binary search is so fast you'd
hardly notice the difference.

(BTW, I just noticed that my binary search implementation is buggy - not a
complete surprise. I'll add some tests for it.)
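
For illustration, the cache idea quoted above (a growable C array kept
sorted by C line number, searched by bisection) could be sketched roughly as
follows. This is a minimal sketch and not the code from the commit: all
names (cache_entry, code_cache, cache_bisect, cache_find, cache_insert) are
hypothetical, the code object is represented by a plain void pointer, and
reference counting is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not the code from the commit. */
typedef struct {
    int c_line;         /* key: C source line (__LINE__) of the exception */
    void *code_object;  /* stand-in for a PyCodeObject pointer */
} cache_entry;

typedef struct {
    cache_entry *entries;  /* always kept sorted by c_line */
    size_t count;          /* number of entries in use */
    size_t capacity;       /* number of entries allocated */
} code_cache;

/* Lower bound: index of the first entry whose c_line is >= the key. */
static size_t cache_bisect(const code_cache *cache, int c_line) {
    size_t lo = 0, hi = cache->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;  /* avoids overflow of lo + hi */
        if (cache->entries[mid].c_line < c_line)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Return the cached object for c_line, or NULL on a cache miss. */
static void *cache_find(const code_cache *cache, int c_line) {
    size_t pos;
    assert(cache != NULL);
    pos = cache_bisect(cache, c_line);
    if (pos < cache->count && cache->entries[pos].c_line == c_line)
        return cache->entries[pos].code_object;
    return NULL;
}

/* Insert or overwrite an entry. Returns 0 on success, -1 if out of memory. */
static int cache_insert(code_cache *cache, int c_line, void *code_object) {
    size_t pos;
    assert(cache != NULL);
    pos = cache_bisect(cache, c_line);
    if (pos < cache->count && cache->entries[pos].c_line == c_line) {
        cache->entries[pos].code_object = code_object;
        return 0;
    }
    if (cache->count == cache->capacity) {
        /* Grow geometrically so that inserts stay cheap on average. */
        size_t new_capacity = cache->capacity ? cache->capacity * 2 : 16;
        cache_entry *grown =
            realloc(cache->entries, new_capacity * sizeof(cache_entry));
        if (grown == NULL)
            return -1;  /* the old array is still valid and untouched */
        cache->entries = grown;
        cache->capacity = new_capacity;
    }
    /* Shift the tail up by one slot to keep the array sorted. */
    memmove(&cache->entries[pos + 1], &cache->entries[pos],
            (cache->count - pos) * sizeof(cache_entry));
    cache->entries[pos].c_line = c_line;
    cache->entries[pos].code_object = code_object;
    cache->count++;
    return 0;
}
```

A caller would start from a zero-initialised code_cache, try cache_find()
with the current __LINE__ first, and only build a new code object and call
cache_insert() on a miss. A real implementation would also have to own a
reference to each cached object and release it at module cleanup.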

>  - we can still avoid calling PyString_FromString() by making the
> function name and source file name Python constants (I've tried it and
> I get 2.28ms)

I wouldn't mind, but it would be nice to get lazy initialisation for them.

Stefan