Pyrex speed

John Machin sjmachin at lexicon.net
Sat May 27 20:56:47 EDT 2006


On 28/05/2006 12:10 AM, Gonzalo Monzón wrote:

[good advice snipped]

> 
> Example A:
> This code is more than 80 times faster than a "easy" Python 
> implementation. For every call, it does some bitwise operations and does 
> an array lookup for every string character from argument. Its a lot 
> faster because in Python approach a list lookup is done and it is a lot 
> faster to do a C array lookup -thought that in these C loops no Python 
> type value conversions are needed, if it where the case, C approach 
> would not be so faster than python. I don't know how would perform an 
> array based Python code, but I expect it to be a lot faster than using a 
> list, so Python code can be speed up a lot if you know how to do it.
> 
> // C code:
> int CRC16Table[256]; // Filled elsewhere
> int CalcCRC16(char *str)
> {
>    int crc;
>      for(crc = 0xFFFF; *str != 0; str++) {
>        crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str;

Gonzalo, just in case there are any C compilers out there which need to 
be told:

 >      for(crc = 0xFFFF; *str != 0;) {
 >        crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str++;


>    }
>      return crc;
> }
> 
> # Python code
> gCRC16Table = [] # Filled elsewhere
> def CalcCRC16(astr):
>    crc = 0xFFFFL

Having that L on the end (plus the fact that you are pointlessly 
maintaining "crc" as an *unsigned* 32-bit quantity) will be slowing the 
calculation down -- Python will be doing it in long integers. You are 
calculating a *sixteen bit* CRC! The whole algorithm can be written 
simply so as to not need more than 16-bit registers, and not to pollute 
high-order bits in 17-or-more-bit registers.

>    for c in astr:
>        crc = gCRC16Table[((crc >> 8) & 255)] ^ ((crc & 0xFFFFFF) << 8) ^ 
> ord(c)

Note that *both* the C and Python routines still produce a 32-bit result 
with 16 bits of high-order rubbish -- I got the impression from the 
previous thread that you were going to fix that.

This Python routine never strays outside 16 bits, so avoiding your "& 
255" and a final "& 0xFFFF" (which you don't have).

def CalcCRC16(astr):
     crc = 0xFFFF
     for c in astr:
         crc = gCRC16Table[crc >> 8] ^ ((crc & 0xFF) << 8) ^ ord(c)
     return crc

==============
To the OP:

I'd just like to point out that C code and Pyrex code can gain 
signicantly (as the above example does) by not having to use ord() and 
chr().

As Gonzalo says, read the generated C code. Look for other cases of 
using Python built-ins that could be much faster with a minor bit of 
effort in Pyrex e.g. "max(a, b)" -> "(a) > (b) ? (a) : (b) " or if you 
don't like that, a cdef function to get the max of 2 ints will be *way* 
faster than calling Python's max()



More information about the Python-list mailing list