Reason why co_filename is no longer interned?

Benjamin Peterson benjamin at python.org
Sun Mar 1 22:10:44 EST 2009


2009/3/1 David Christian <david.christian at gmail.com>:
> On Sun, Mar 1, 2009 at 8:15 PM, Benjamin Peterson <benjamin at python.org> wrote:
>> David Christian <david.christian <at> gmail.com> writes:
>>> This means that where before, you could rely on
>>> <function>.func_code.co_filename == <function1>.func_code.co_filename
>>
>> Regardless of the change's intentionality, you should never rely on that
>> behavior! Interned strings are an implementation detail even at the C level.
>>
>
> Hi Benjamin,
> Thanks for your response.
>
> The code in question is in a tracing function for coverage. It runs
> for every line of Python code, so an extra strcmp for every line
> is expensive.

When you're testing for coverage, is performance really an issue?
Other coverage libraries do this in Python.
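
For reference, a pure-Python line tracer can be sketched with
sys.settrace. This is only a minimal illustration, not the code of any
particular coverage library; the names executed, tracer, sample and
run_traced are made up for the example.

```python
import sys
from collections import defaultdict

# Hypothetical store: filename -> set of executed line numbers.
executed = defaultdict(set)

def tracer(frame, event, arg):
    # Called on 'call' for each new frame; returning the tracer itself
    # asks the interpreter to deliver 'line' events for that frame too.
    if event == "line":
        executed[frame.f_code.co_filename].add(frame.f_lineno)
    return tracer

def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total

def run_traced(func, *args):
    # Install the tracer only for the duration of the call.
    sys.settrace(tracer)
    try:
        return func(*args)
    finally:
        sys.settrace(None)
```

Every traced line pays for a Python-level function call plus a
dictionary lookup keyed by filename, which is the overhead a C tracer
tries to avoid.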

>
> And I'm not relying on the files being equal, just taking advantage of it
> when it's the case.  The code looks like this:
>
> if (frame->f_code->co_filename == last_filename
>     || !strcmp(PyString_AS_STRING(frame->f_code->co_filename),
>                PyString_AS_STRING(last_filename)))
> {
>   < cache-hit!  Use latest coverage object.  >
> }
> else {
>   < cache miss.  Do dictionary lookup of coverage object by filename >
> }
>
> In the code I've looked at, cache hits happen 90+% of the time, which
> means I save a dictionary lookup.  In 2.4, I don't even have to do the
> strcmp, which means that running your code w/ coverage enabled is
> really not that much of a slowdown at all.
>
> Obviously, even with the strcmp, this manner of recording coverage
> data is much faster than running coverage in python, but I was
> wondering if this was a purposeful change or just something that
> happened by accident as part of a larger change.

Well, you could look through the logs of the ast-branch itself.
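
The caching scheme from the quoted C snippet can be sketched in Python
roughly like this. It is an illustration only: CoverageCache and its
attributes are hypothetical names, and the identity test stands in for
the C pointer comparison.

```python
class CoverageCache:
    """Hypothetical one-entry cache in front of a filename -> data dict."""

    def __init__(self):
        self.by_filename = {}      # slow path: full dictionary lookup
        self.last_filename = None  # most recently seen filename object
        self.last_data = None      # coverage data for last_filename
        self.hits = 0
        self.misses = 0

    def lookup(self, filename):
        # Fast path: same object (like the pointer comparison in C);
        # otherwise fall back to comparing contents (like strcmp).
        if filename is self.last_filename or filename == self.last_filename:
            self.hits += 1
            return self.last_data
        # Cache miss: do the dictionary lookup and remember the result.
        self.misses += 1
        self.last_filename = filename
        self.last_data = self.by_filename.setdefault(filename, set())
        return self.last_data
```

The identity test only helps when the same string object is reused. If
co_filename is not interned, two code objects from the same file may
carry equal but distinct strings, and the comparison of contents has to
run instead.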



-- 
Regards,
Benjamin


