[Python-ideas] Why CPython is still behind in performance for some widely used patterns ?

Edward Minnix egregius313 at gmail.com
Fri Jan 26 17:03:31 EST 2018


There are several reasons for the issues you are mentioning.

1. Attribute lookup is much more complicated than you would think.
 (If you have the time, watch https://www.youtube.com/watch?v=kZtC_4Ecq1Y that will explain things better than I can)
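 The series of operations that happens on every `obj.attr` access is complicated. Ignoring __getattr__, __slots__ and metaclasses, the default lookup goes roughly like this:
 _MISSING = object()

 def get_attr(obj, attr):
     cls = type(obj)
     # Search the classes in the MRO for the attribute.
     cls_value = _MISSING
     for klass in cls.__mro__:
         if attr in vars(klass):
             cls_value = vars(klass)[attr]
             break
     getter = getattr(type(cls_value), '__get__', None)
     is_data = (hasattr(type(cls_value), '__set__')
                or hasattr(type(cls_value), '__delete__'))
     # 1. Data descriptors (e.g. properties) win over the instance dict.
     if getter is not None and is_data:
         return getter(cls_value, obj, cls)
     # 2. Then the instance's own __dict__.
     if attr in getattr(obj, '__dict__', {}):
         return obj.__dict__[attr]
     # 3. Then non-data descriptors (e.g. functions, which become bound methods).
     if getter is not None:
         return getter(cls_value, obj, cls)
     # 4. Then plain class attributes.
     if cls_value is not _MISSING:
         return cls_value
     raise AttributeError('Attribute %s not found' % attr)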

 Caching `rule.x` in a local variable therefore means this whole lookup runs once per rule instead of once per inner-loop iteration (n times, where n = len(whatevers)).
2. Function calls

3. Dynamic code makes things harder to optimize
 Python’s object model allows for constructs that are very hard to optimize without knowing about the structure of the data ahead of time.
 For instance, if an attribute is defined by a property, there is no guarantee that obj.attr will return the same value on two consecutive accesses.

 So in simple terms, the power Python gives you over the language makes it harder to optimize the language.

4. CPython’s compiler performs (as a rule) no optimizations
 CPython’s compiler is a fairly direct source-to-bytecode compiler, not an actual optimizing compiler. So beyond constant folding and
 the removal of some kinds of debug code (such as `assert` statements under -O), the compiler isn’t going to optimize things for you.
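
To make points 1, 3 and 4 concrete, here is a minimal sketch. The `Rule` class and the two counting functions are hypothetical, and the "whatevers" are simplified to plain sets instead of objects with an `x` attribute. Because `x` is a property, arbitrary code runs on every access, so the compiler cannot safely hoist `rule.x` out of the inner loop; only the programmer knows that it is safe to do so.

```python
class Rule:
    """Hypothetical rule whose `x` is a property that counts its accesses."""

    def __init__(self, x):
        self._x = x
        self.reads = 0

    @property
    def x(self):
        # Arbitrary code runs on every access, so nothing guarantees that
        # two reads of rule.x behave the same way.
        self.reads += 1
        return self._x


def count_naive(rules, whatevers):
    cnt = 0
    for rule in rules:
        for whatever in whatevers:
            if rule.x in whatever:  # full attribute lookup on every iteration
                cnt += 1
    return cnt


def count_cached(rules, whatevers):
    cnt = 0
    for rule in rules:
        x = rule.x  # attribute lookup once per rule
        for whatever in whatevers:
            if x in whatever:
                cnt += 1
    return cnt


whatevers = [{1, 2}, {2, 3}, {3, 4}]

naive_rules = [Rule(1), Rule(2)]
naive_count = count_naive(naive_rules, whatevers)
naive_reads = sum(r.reads for r in naive_rules)

cached_rules = [Rule(1), Rule(2)]
cached_count = count_cached(cached_rules, whatevers)
cached_reads = sum(r.reads for r in cached_rules)

print(naive_count, naive_reads)    # same count, 2 * 3 property reads
print(cached_count, cached_reads)  # same count, 2 property reads
```

Both versions return the same count, but the naive loop performs len(rules) * len(whatevers) lookups while the cached one performs len(rules).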

So in simple terms, of the languages you mentioned, JavaScript’s object model is substantially less powerful than Python’s, but it is also more straightforward
in terms of what obj.attr means, and the other 3 you mentioned all have statically typed, optimizing compilers, with a straightforward method resolution order.

The things you see as flaws end up being the way Pythonistas can add more dynamic systems into their APIs (and since we don’t have macros,
most of our dynamic operations must be done at run-time).
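
As one illustration of the kind of run-time dynamism meant here, below is a small sketch of a hypothetical `Record` wrapper that exposes dictionary keys as attributes through `__getattr__`. It is only an example of the pattern, not taken from any particular library. The set of valid attributes can change while the program runs, which is exactly what makes `obj.attr` hard to resolve ahead of time.

```python
class Record:
    """Hypothetical wrapper exposing dict keys as attributes at run time."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Called only when the normal lookup fails; _data itself is found in
        # the instance __dict__, so this does not recurse.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


rec = Record({"x": 10})
value_before = rec.x      # resolved through __getattr__
rec._data["y"] = 20       # the set of valid attributes changes at run time
value_after = rec.y
has_z = hasattr(rec, "z")  # missing keys still raise AttributeError
```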

- Ed

On Jan 26, 2018, 16:36 -0500, Pau Freixes <pfreixes at gmail.com>, wrote:
> Hi,
>
> This mail is the consequence of a true story, a story where CPython
> got defeated by Javascript, Java, C# and Go.
>
> One of the teams of the company where I'm working had a kind of
> benchmark to compare the different languages on top of their
> respective "official" web servers such as Node.js, Aiohttp, Dropwizard
> and so on. The test by itself was pretty simple and tried to test the
> happy path of the logic, a piece of code that fetches N rules from
> another system and then applies them to X whatevers also fetched from
> another system, something like this:
>
> def filter(rule, whatever):
>     if rule.x in whatever.x:
>         return True
>
> rules = get_rules()
> whatevers = get_whatevers()
> for rule in rules:
>     for whatever in whatevers:
>         if filter(rule, whatever):
>             cnt = cnt + 1
>
> return cnt
>
>
> Python was almost 10 times slower than the other languages. It's true
> that they didn't optimize the code, but they didn't optimize it for any
> language, so all of them had the same cost in terms of iterations.
>
> Once I saw the code I proposed a pair of changes: removing the call to
> the filter function by inlining it, and caching the rule's
> attributes, something like this:
>
> for rule in rules:
>     x = rule.x
>     for whatever in whatevers:
>         if x in whatever.x:
>             cnt += 1
>
> CPython's performance improved by a factor of 3 to 4 just by doing these "silly" things.
>
> The case of the rule cache is IMHO very striking. We have plenty of
> examples in many repositories where caching non-local
> variables is a widely used pattern. Why hasn't a way to do it
> implicitly and by default been considered?
>
> The slowness of function calls in CPython comes up quite
> often, and it still looks like an unsolved problem.
>
> I'm sure I'm missing many things, and I do not have all of the
> information. This mail is meant to gather the information that might
> help me understand why we are here - CPython - regarding these two
> slow patterns.
>
> This could be considered an unimportant thing, but it's more relevant
> than one might expect, at least IMHO. If the code that you naturally
> write in a language is slow by default and there is an alternative
> that makes it faster, the language is doing something wrong.
>
> BTW: PyPy looks like it is immune [1]
>
> [1] https://gist.github.com/pfreixes/d60d00761093c3bdaf29da025a004582
> --
> --pau