The Cost of Dynamism (was Re: Python 2.x or 3.x, which is faster?)

Rustom Mody rustompmody at gmail.com
Mon Mar 21 11:39:04 EDT 2016


On Monday, March 21, 2016 at 7:19:03 PM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 21 Mar 2016 11:59 pm, Chris Angelico wrote:
> 
> > On Mon, Mar 21, 2016 at 11:34 PM, BartC  wrote:
> >> For Python I would have used a table of 0..255 functions, indexed by the
> >> ord() code of each character. So all 52 letter codes map to the same
> >> name-handling function. (No Dict is needed at this point.)
> >>
> > 
> > Once again, you forget that there are not 256 characters - there are
> > 1114112. (Give or take.)
> 
> Pardon me, do I understand you correctly? You're saying that the C parser is
> Unicode-aware and allows you to use Unicode in C source code? Because
> Bart's test is for a (simplified?) C tokeniser, and expecting his tokeniser
> to support character sets that C does not would be, well, Not Cricket, my
> good chap.

Sticking to C and integer switches, one would expect that
switch (n)
{
  case 1000:...
  case 1001:
  case 1002:
  /* ... and so on, consecutively, up to ... */
  case 2000:
  default:
}
would compile into faster/tighter code than
switch (n)
{
  case 1:...
  case 100:
  case 200:
  case 1000:
  case 10000:
  default:
}

IOW, if the compiler can detect an arithmetic progression, or a reasonably dense
subset of one, it can make a jump table.  If not, the code starts deteriorating
into if-else chains.

The same applies to char, even if char is full Unicode: if the switching is over
a small, dense/contiguous subset, a jump table works well (at the assembly level),
and so does a switch at the C level.
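
For example, here is a minimal, hypothetical sketch (mine, not BartC's actual
code) of the kind of per-character dispatch a C-like tokeniser does.  All of the
case labels fall within 7-bit ASCII, so a compiler that judges them dense enough
is free to lower the switch to a single indexed jump:

#include <ctype.h>

/* Hypothetical sketch: classify one character of a C-like token stream.
   The case labels sit in a small ASCII range, so the compiler may emit
   a jump table here rather than an if-else chain. */
enum tok { TOK_PUNCT, TOK_DIGIT, TOK_SPACE, TOK_NAME, TOK_OTHER };

static enum tok classify(int ch)
{
    switch (ch) {
    case '(': case ')': case '{': case '}': case '[': case ']':
    case '+': case '-': case '*': case '/': case '%':
    case ';': case ',': case '=':
        return TOK_PUNCT;
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        return TOK_DIGIT;
    case ' ': case '\t': case '\n':
        return TOK_SPACE;
    default:
        return (isalpha(ch) || ch == '_') ? TOK_NAME : TOK_OTHER;
    }
}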

[And dicts/arrays of functions are ok approximations to that]
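
As a rough illustration of that approximation (again a hypothetical sketch, not
code from the thread), a 0..255 table of function pointers in C dispatches the
same way as the switch: one indexed load plus an indirect call, with all 52
letter codes (plus '_') mapping to the same name-handling function:

#include <stdio.h>

/* Hypothetical sketch of the table-of-functions approach: a 256-entry
   array of handlers indexed by the character code.  Only a few token
   classes are shown; everything else falls through to handle_other. */
typedef void (*handler_t)(int ch);

static void handle_name(int ch)  { printf("name char: %c\n", ch); }
static void handle_digit(int ch) { printf("digit:     %c\n", ch); }
static void handle_other(int ch) { printf("other:     %c\n", ch); }

static handler_t dispatch[256];

static void init_dispatch(void)
{
    for (int i = 0; i < 256; i++)      dispatch[i] = handle_other;
    for (int c = 'a'; c <= 'z'; c++)   dispatch[c] = handle_name;
    for (int c = 'A'; c <= 'Z'; c++)   dispatch[c] = handle_name;
    dispatch['_'] = handle_name;
    for (int c = '0'; c <= '9'; c++)   dispatch[c] = handle_digit;
}

int main(void)
{
    init_dispatch();
    const char *src = "x1 = 42;";
    for (const char *p = src; *p; p++)
        dispatch[(unsigned char)*p](*p);   /* indexed load + indirect call */
    return 0;
}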


