[Python-Dev] Python startup time

Thu Jul 20 02:20:39 EDT 2017

On 7/19/2017 10:05 AM, Nick Coghlan wrote:
> P.S. I'll also note that we're not *actually* limited to resolving
> such conflicts in public venues (even though I think that's a good
> default habit for us to retain): as long as we report the outcome of
> any mutual agreements about design priorities back to the relevant
> public venue (e.g. a tracker issue), there's nothing wrong with
> shifting our attempts to better understand each other's perspectives
> to private email, IRC, video chat, etc.

I expect and hope that there will be discussion of this issue at the 
core developer sprint in September, with summary reports back here on pydev.

> It can even make sense to reach out to other
> core devs for help, since it's almost always easier for someone not
> caught in the midst of an argument to see both sides of it, and
> potentially spot a core of agreement amidst various surface level
> disagreements :)

I always understood the Python development process, both for core and 
users, to be "Make it right; then make it faster", with the second 
clause conditioned on 'while keeping it right' and maybe, especially 
for core development, 'if significantly slow'.  (People can rightly work 
on speed of personal code for other reasons.)  I believe we pretty much 
agree on the principles.  The disagreement seems to be on whether a 
particular case is 'significantly slow'.  I believe that the burden of 
proof is with those who propose a change.

The burden of proof depends on the final qualification: 'without 
adding unnecessary or extreme complexity'.  If there is no added 
complication, the burden is slight.  If there is, we will likely 
disagree about the complexity and its tradeoff with speed.

About 'keeping it right':  It has been mentioned that more complicated 
code *generally* makes it harder to 'see' that the code is (basically) 
correct. The second line of defense is the automated test suite.  I 
think, for instance, that someone interested in changing namedtuple (to 
a faster and presumably more complicated implementation) should check 
the coverage of the current code, with branches checked both ways. 
Then, bring the coverage up to 100% if it is not already, and carefully 
check the tests for possible missing cases.

A small static set of test cases cannot cover everything.  The third 
test of an implementation is accumulated user experience.  A new 
implementation starts at 0.  One way to increase that is to test the 
implementation with 3rd-party code.  Another, I think, is through 
randomized testing.

Proposal 1: Depending on our confidence in a new implementation, 
simulate user experience with randomized tests, perhaps running for 
hours.  Example: we develop a random (unicode) identifier generator that 
starts with any of the legal initial codepoints and continues with a 
random number of legal follow codepoints.  Then test the old and new 
namedtuple with a random class name and a random number of random 
field names. 
A developer could also use third-party packages, like hypothesis.  Code 
and a summary could be uploaded to bpo.  A summary could even go in the 
code file.
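
A rough sketch of what the generator and test in Proposal 1 might look 
like.  Everything named here (random_char, random_identifier, 
check_namedtuple, run_random_tests, the length limits) is made up for 
illustration and is not existing stdlib or test-suite code.  It assumes 
that str.isidentifier is an adequate legality check, and it normalizes 
names to NFKC up front because the compiler does the same to identifiers.

```python
import keyword
import random
import sys
import unicodedata
from collections import namedtuple


def random_char(rng, is_legal):
    """Rejection-sample one codepoint for which is_legal(char) is true."""
    while True:
        c = chr(rng.randrange(sys.maxunicode + 1))
        if is_legal(c):
            return c


def random_identifier(rng, max_follow=6):
    """Return a random identifier usable as a namedtuple field name."""
    while True:
        # One legal initial codepoint ...
        name = random_char(rng, str.isidentifier)
        # ... then a random number of legal follow codepoints.
        for _ in range(rng.randrange(max_follow + 1)):
            # 'a' + c is an identifier exactly when c is a legal follow char.
            name += random_char(rng, lambda c: ('a' + c).isidentifier())
        # Normalize first, since compiled code sees the NFKC form.
        name = unicodedata.normalize('NFKC', name)
        # namedtuple rejects keywords and names starting with '_'.
        if (name.isidentifier() and not keyword.iskeyword(name)
                and not name.startswith('_')):
            return name


def check_namedtuple(factory, rng, max_fields=10):
    """Build one random namedtuple class and check basic properties."""
    typename = random_identifier(rng)
    # A set drops duplicate field names, which namedtuple rejects.
    count = rng.randrange(max_fields + 1)
    fields = sorted({random_identifier(rng) for _ in range(count)})
    values = tuple(rng.random() for _ in fields)
    cls = factory(typename, fields)
    obj = cls(*values)
    assert cls.__name__ == typename
    assert cls._fields == tuple(fields)
    assert tuple(obj) == values
    assert obj == cls(**dict(zip(fields, values)))
    assert all(getattr(obj, f) == v for f, v in zip(fields, values))
    assert obj._asdict() == dict(zip(fields, values))


def run_random_tests(factory, iterations, seed):
    """Run many random checks; the seed makes a failure reproducible."""
    rng = random.Random(seed)
    for _ in range(iterations):
        check_namedtuple(factory, rng)
```

Calling run_random_tests(namedtuple, 100000, seed) once for the old 
factory and once for the new one, with the seed recorded in the summary, 
would give the kind of simulated user experience described above.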

Note 1: Tim Peters did something like this when developing timsort.  He 
provided a nice summary of test cases and time results.

Note 2: Randomized tests require that either a) randomized inputs are 
verified by property or predicate, rather than by hard-coded values, or 
b) inputs are generated from outputs, where either the output or inverse 
generation is randomized.  Tests of sorting can use either 
is_sorted(list(sorted(random_input))) or 
list(sorted(random_shuffle(output))) == output.
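
To make the two styles in Note 2 concrete, here is a small sketch.  The 
helper names (is_sorted, check_by_property, check_by_inverse) and the 
size limits are assumptions for illustration.  The permutation check 
with Counter is an addition of this sketch: is_sorted alone would 
accept a 'sort' that returns an empty list.

```python
import random
from collections import Counter


def is_sorted(items):
    """True if every element is <= its successor."""
    return all(a <= b for a, b in zip(items, items[1:]))


def check_by_property(sort, rng, max_len=50):
    """Style a): random input, output verified by predicate."""
    data = [rng.randrange(100) for _ in range(rng.randrange(max_len + 1))]
    result = list(sort(data))
    assert is_sorted(result)
    # Extra check: the result must be a permutation of the input.
    assert Counter(result) == Counter(data)


def check_by_inverse(sort, rng, max_len=50):
    """Style b): build a known-sorted output, shuffle it, sort it back."""
    output, value = [], 0
    for _ in range(rng.randrange(max_len + 1)):
        # Non-negative steps keep output sorted; a step of 0 adds duplicates.
        value += rng.randrange(3)
        output.append(value)
    shuffled = output[:]
    rng.shuffle(shuffled)
    assert list(sort(shuffled)) == output
```

Neither function hard-codes an expected value, so both can be run for 
as many random inputs as time allows.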

Proposal 2: Add randomized tests here and there in the test suite.  Each 
randomized test x 30 buildbots x 2 runs/day x 365 days/year is about 
22000 random inputs a year.  Since each buildbot would be running a 
slightly different test, we need to act on and not ignore sporadic 
failures.  Victor Stinner's buildbot work is making this feasible.
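
Acting on a sporadic failure is much easier if the failing input can be 
regenerated.  One possible shape for such a test is sketched below: it 
draws a fresh seed on every run and puts the seed in the failure 
message.  The class name and the RANDOM_TEST_SEED environment variable 
are hypothetical, not something the test suite currently defines.

```python
import os
import random
import unittest

# The estimate above: 30 buildbots x 2 runs/day x 365 days/year.
RUNS_PER_YEAR = 30 * 2 * 365  # 21900, i.e. about 22000


class RandomizedSortTest(unittest.TestCase):

    def test_sorted_is_ordered(self):
        # RANDOM_TEST_SEED is a hypothetical variable for replaying a
        # failure; without it, each run gets a different random input.
        seed = int(os.environ.get('RANDOM_TEST_SEED',
                                  random.randrange(2**32)))
        rng = random.Random(seed)
        data = [rng.randrange(1000) for _ in range(rng.randrange(100))]
        result = sorted(data)
        for a, b in zip(result, result[1:]):
            # The seed in the message lets a buildbot failure be rerun.
            self.assertLessEqual(a, b, msg=f'seed={seed}')
```

A failure report would then include something like 'seed=...', and 
rerunning with that value in the environment should reproduce the same 
input locally.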

--
Terry Jan Reedy