Entering a very large number

Fri Mar 30 07:48:49 EDT 2018

On 3/30/18 6:41 AM, bartc wrote:
> On 27/03/2018 04:49, Richard Damon wrote:
>> On 3/26/18 8:46 AM, bartc wrote:
>
>>> Hence my testing with CPython 3.6, rather than on something like 
>>> PyPy which can give results that are meaningless. Because, for 
>>> example, real code doesn't repeatedly execute the same pointless 
>>> fragment millions of times. But a real context is too complicated to 
>>> set up.
>
>> The bigger issue is that these sort of micro-measurements aren't 
>> actually that good at measuring real quantitative performance costs. 
>> They can often give qualitative indications, but the way modern 
>> computers work, processing environment is extremely important in 
>> performance, so these sorts of isolated measure can often be 
>> misleading. The problem is that if you measure operation a, and then 
>> measure operation b, if you think that doing a then b in the loop 
>> that you will get a time of a+b, you will quite often be 
>> significantly wrong, as cache performance can drastically affect 
>> things. Thus you really need to do performance testing as part of a 
>> practical sized exercise, not a micro one, in order to get a real 
>> measurement.
>
> That might apply to native code, where timing behaviour of a 
> complicated  chip like x86 might be unintuitive.
>
> But my comments were specifically about byte-code executed with 
> CPython. Then the behaviour is a level or two removed from the 
> hardware and with slightly different characteristics.
>
> (Since the program you are actually executing is the interpreter, not 
> the Python program, which is merely data. And whatever aggressive 
> optimisations are done to the interpreter code, they are not affected 
> by the Python program being run.)
>
But cache behavior may very well still influence it. A small section 
of byte code may exercise only a small part of the interpreter, which 
might then live entirely (or mostly) in cache and thus run faster, 
while a broader program uses more of the interpreter and may no longer 
fit in the cache. In some ways this can be much amplified compared to 
fully compiled code, as very small changes in byte code can have much 
bigger effects on what gets accessed. You probably do get less 
opportunity for things to speed up by combining pieces, but there is 
still plenty of opportunity to get slowdowns.
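
As a rough way to check the a+b point on your own machine, here is a 
minimal sketch using the standard timeit module. The two fragments and 
the measure() helper are arbitrary examples of mine, not anything from 
the earlier tests, and the numbers will vary by machine and Python 
version, so I am not quoting any.

```python
import timeit


def measure(number=200_000, repeat=5):
    """Time fragment a, fragment b, and a followed by b.

    Returns the best-of-`repeat` total time in seconds for each case.
    The fragments are arbitrary illustrations, not a real workload.
    """
    setup = "x = 12345678901234567890; s = 'abc'"
    frag_a = "y = x * x"        # big-integer multiply
    frag_b = "t = s.upper()"    # method lookup plus call

    def best(stmt):
        # min() of several runs is the usual way to reduce noise
        return min(timeit.repeat(stmt, setup=setup,
                                 number=number, repeat=repeat))

    t_a = best(frag_a)
    t_b = best(frag_b)
    # Both fragments in one loop body. Note that t_a + t_b counts the
    # timing loop's own overhead twice while t_ab counts it once, so
    # the two figures are not expected to match exactly.
    t_ab = best(frag_a + "\n" + frag_b)
    return t_a, t_b, t_ab


if __name__ == "__main__":
    t_a, t_b, t_ab = measure()
    print(f"a alone:      {t_a:.4f} s")
    print(f"b alone:      {t_b:.4f} s")
    print(f"a + b (sum):  {t_a + t_b:.4f} s")
    print(f"a then b:     {t_ab:.4f} s")
```

A loop this small will most likely sit in cache the whole time, so it 
mostly shows how far the sum and the combined figure can drift apart 
from overhead alone. It says little about what happens inside a larger 
program, which is the point.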

Another factor that you run into is lookup time: the mere presence of 
lots of other code in the test module, even if it is not executing, can 
impact the speed at which the tested code runs.
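
One way to probe this is to time the same function in a small namespace 
and in one padded with many unused definitions. This is only a sketch: 
time_in_namespace(), the unused_N padding functions, helper() and 
work() are all names I made up for the illustration, and whether any 
difference shows up at all depends on the interpreter version and the 
machine.

```python
import timeit


def time_in_namespace(extra_names, number=200_000, repeat=5):
    """Time a call to work() inside a namespace holding `extra_names`
    additional, never-called function definitions.

    Returns the best-of-`repeat` total time in seconds.
    """
    ns = {}
    # Hypothetical padding: code that is present but never executed.
    padding = "\n".join(
        f"def unused_{i}(): return {i}" for i in range(extra_names)
    )
    source = (
        padding
        + "\ndef helper(v): return v + 1"
        + "\ndef work(): return helper(1)\n"   # helper is a global lookup
    )
    exec(source, ns)
    return min(timeit.repeat("work()", globals=ns,
                             number=number, repeat=repeat))


if __name__ == "__main__":
    for n in (0, 100, 10_000):
        print(f"{n:>6} extra names: {time_in_namespace(n):.4f} s")
```

If the three figures come out the same within noise, that tells you 
about this particular fragment on this particular build, not that the 
effect cannot occur elsewhere.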

-- 
Richard Damon