[Spambayes-checkins] spambayes/spambayes message.py,1.17,1.18

Skip Montanaro skip at pobox.com
Tue Apr 22 15:20:11 EDT 2003


    Tim> Compiled the crlf regex, and added a couple comments

Tim,

I doubt compiling the regex helps.  Running timeit.py suggests it makes no
measurable difference:

    % timeit.py -s "data = 'line 0\r\nline 1\nline 2\rline 3\r\r\nline 4\n' ; import re ; CRLF='\r\n'" "re.sub(r'\r\n|\n|\r', CRLF, data)"
    10000 loops, best of 3: 38.4 usec per loop
    % timeit.py -s "data = 'line 0\r\nline 1\nline 2\rline 3\r\r\nline 4\n' ; import re ; CRLF='\r\n' ; CRLFRE = re.compile(r'\r\n|\n|\r')" "re.sub(CRLFRE, CRLF, data)"
    10000 loops, best of 3: 38.5 usec per loop

I ran both several times.  The above times are the best I came up with.

In fact, looking at the code for sre.sub and sre._compile suggests that
passing a string should be slightly faster than passing a compiled regular
expression, because a pattern string is found in the module's cache of
compiled re's, whereas an already-compiled re is not in that cache.
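
For anyone who wants to repeat the comparison from Python rather than from
the shell, here is a minimal sketch using the timeit module.  The function
names are made up for illustration and are not from message.py, and it
assumes a current Python 3 rather than the interpreter used above.  The
third variant, calling the compiled pattern's own sub() method, is an extra
that was not timed above; it bypasses the module-level re.sub entirely.
Timings will vary by machine and version, so treat this as a way to check
the claim, not as a result.

```python
import re
import timeit

CRLF = "\r\n"
DATA = "line 0\r\nline 1\nline 2\rline 3\r\r\nline 4\n"
CRLFRE = re.compile(r"\r\n|\n|\r")


def sub_with_string(data):
    # Pattern passed as a string; re.sub obtains the compiled form itself.
    return re.sub(r"\r\n|\n|\r", CRLF, data)


def sub_with_compiled(data):
    # Precompiled pattern handed to the module-level function.
    return re.sub(CRLFRE, CRLF, data)


def sub_with_method(data):
    # Precompiled pattern used through its own sub() method.
    return CRLFRE.sub(CRLF, data)


def time_variants(number=10000, repeat=3):
    """Return the best time per call, in microseconds, for each variant."""
    results = {}
    for func in (sub_with_string, sub_with_compiled, sub_with_method):
        best = min(timeit.repeat(lambda: func(DATA), number=number, repeat=repeat))
        results[func.__name__] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, usec in time_variants().items():
        print(f"{name}: {usec:.2f} usec per loop")
```

All three variants produce the same normalized text, so any difference in
the numbers comes from how the pattern reaches the regex engine.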

Skip


