Blog "about python 3"

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Jan 2 23:49:31 EST 2014


Robin Becker wrote:

> For fairly sensible reasons we changed the internal default to use unicode
> rather than bytes. After doing all that and making the tests compatible
> etc etc I have a version which runs in both and passes all its tests.
> However, for whatever reason the python 3.3 version runs slower

"For whatever reason" is right. Unfortunately, there's no real way to tell
from the limited information you give what that reason might be.

Are you comparing against a 2.7 "wide" or "narrow" build? Do your tests use
any so-called "astral characters" (characters in the supplementary planes,
i.e. characters with ord() > 0xFFFF)?
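
If you're not sure which build you have, something like this rough sketch
will tell you. The helper names build_kind and has_astral are my own
invention for illustration; the only thing it relies on is sys.maxunicode,
which is 0xFFFF on a narrow 2.x build and 0x10FFFF on a wide build and on
3.3 or later.

```python
import sys


def build_kind():
    """Return 'narrow' or 'wide' depending on sys.maxunicode."""
    # Narrow 2.x builds report 0xFFFF; wide builds and 3.3+ report 0x10FFFF.
    return "narrow" if sys.maxunicode == 0xFFFF else "wide"


def has_astral(text):
    """Return True if text contains characters outside the BMP."""
    # On a narrow build an astral character is stored as a surrogate pair,
    # so code units in the surrogate range are counted as well.
    return any(
        ord(ch) > 0xFFFF or 0xD800 <= ord(ch) <= 0xDFFF
        for ch in text
    )


if __name__ == "__main__":
    print(build_kind())
    print(has_astral(u"plain ASCII text"))
    print(has_astral(u"emoji: \U0001F600"))
```

Run has_astral over a sample of your test data under both interpreters to
see whether astral characters are in play at all.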

If I remember correctly, some early alpha(?) versions of Python 3.3
consistently ran Unicode operations a small but measurable amount slower
than 3.2 or 2.7. That especially affected Windows. But I understand that
this was sped up in the release version of 3.3.

There are some operations with Unicode strings in 3.3 which unavoidably are
slower. If you happen to hit a combination of such operations (mostly to do
with creating lots of new strings and then throwing them away without doing
much work) your code may turn out to be a bit slower. But that's a pretty
artificial set of code.

Generally, test code doesn't make a good benchmark. Each test only gets run
once, in arbitrary order, and the suite spends a lot of time setting up and
tearing down test instances, so there are all sorts of confounding factors.
This plays merry hell with modern hardware optimizations. In addition, it's quite
possible that you're seeing some other slow down (the unittest module?) and
misinterpreting it as related to string handling. But without seeing your
entire code base and all the tests, who can say for sure?
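
A more trustworthy comparison is to time one small operation many times
with the timeit module and take the best result. Here is a minimal sketch.
The best_time helper and the sample statements are made up for
illustration, they are not taken from your code, so substitute whatever
your paragraph parser actually does.

```python
import timeit


def best_time(stmt, setup="pass", repeat=5, number=1000):
    """Return the best per-call time in seconds over several repeats."""
    # Taking the minimum reduces noise from other processes on the machine.
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    # Hypothetical stand-in for the real workload.
    setup = "text = u'the quick brown fox ' * 200"
    print(best_time("text.split()", setup))
    print(best_time("text.upper()", setup))
```

Run the same script under 2.7 and 3.3 and compare the numbers for each
statement separately, rather than the total for a whole test suite.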


> 2.7 Ran 223 tests in 66.578s
> 
> 3.3 Ran 223 tests in 75.703s
> 
> I know some of these tests are fairly variable, but even for simple things
> like paragraph parsing 3.3 seems to be slower. Since both use unicode
> internally it can't be that can it, or is python 2.7's unicode faster?

Faster in some circumstances, slower in others. If your application
bottleneck is the availability of RAM for strings, 3.3 will potentially be
faster, since its strings can take as little as 1/4 of the memory they need
in a wide build. If your application doesn't use much memory, or if it uses
lots of strings which get created then thrown away, 3.3 may turn out to be
a bit slower.
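
You can see the memory difference for yourself with sys.getsizeof. This is
only a sketch, and it assumes CPython 3.3 or later, where string storage is
1, 2 or 4 bytes per character depending on the widest character in the
string. The per_char_bytes helper is my own name, not anything from the
standard library.

```python
import sys


def per_char_bytes(ch, n=1000):
    """Estimate the storage cost in bytes of each extra copy of ch."""
    # Subtracting two sizes cancels out the fixed per-object header.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


if __name__ == "__main__":
    # Assumes CPython 3.3+; a 2.x build gives a fixed width instead.
    for label, ch in [("ASCII", u"a"),
                      ("BMP", u"\u20ac"),
                      ("astral", u"\U0001F600")]:
        print(label, per_char_bytes(ch))
```

On a 2.7 wide build every character costs the same 4 bytes, which is where
the factor of four comes from for mostly-ASCII text.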


> So far the superiority of 3.3 escapes me, 

Yeah I know, I resisted migrating from 1.5 to 2.x for years. When I finally
migrated to 2.3, at first I couldn't see any benefit either. New style
classes? Super? Properties? Unified ints and longs? Big deal. Especially
since I was still writing 1.5 compatible code and couldn't really take
advantage of the new features.

When I eventually gave up on supporting versions pre-2.3, it was a load off
my shoulders. Now I can't wait to stop supporting 2.4 and 2.5, which will
make things even easier. And the day I can ignore everything below 3.3 will
be a truly happy one.


> but I'm tasked with enjoying 
> this process so I'm sure there must be some new 'feature' that will help.
> Perhaps 'yield from' or 'raise from None' or .......

No, you have this completely backwards. New features don't help you support
old versions of Python that lack those new features. New features are an
incentive to drop support for old versions.


> In any case I think we will be maintaining python 2.x code for at least
> another 5 years; the version gap is then a real hindrance.

Five years sounds about right.



-- 
Steven



