[Speed] Performance comparison of regular expression engines

Maciej Fijalkowski fijall at gmail.com
Sun Mar 6 04:30:15 EST 2016


This is really difficult to read. Can you tell me which column I am looking at?
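
To make the question concrete, here is a minimal sketch of how a table like the ones quoted below could be printed with every column labeled. It rests on assumptions: that the unlabeled first numeric column is a match count and that the remaining columns are timings. The thread gives neither the units nor the original benchmark script, so the helper names (count_with_re, count_with_find, best_ms, format_row), the sample text, and the "ms" labels are all hypothetical and say nothing about the units of the quoted numbers.

```python
import re
import timeit


def count_with_re(pattern, text):
    """Count non-overlapping regex matches of pattern in text."""
    return sum(1 for _ in re.finditer(pattern, text))


def count_with_find(needle, text):
    """Count non-overlapping literal occurrences using str.find."""
    count = 0
    pos = text.find(needle)
    while pos != -1:
        count += 1
        pos = text.find(needle, pos + len(needle))
    return count


def best_ms(func, *args, repeat=3):
    """Best-of-N wall-clock time of a single call, in milliseconds."""
    timings = timeit.repeat(lambda: func(*args), number=1, repeat=repeat)
    return min(timings) * 1000.0


# Every column gets a header, including the match count.
ROW_FORMAT = "%-35s %7s %10s %12s"
HEADER = ROW_FORMAT % ("pattern", "matches", "re (ms)", "find (ms)")


def format_row(pattern, matches, re_ms, find_ms=None):
    """Format one result row; the str.find cell stays empty for non-literals."""
    find_cell = "" if find_ms is None else "%.4g" % find_ms
    return ROW_FORMAT % (pattern, matches, "%.4g" % re_ms, find_cell)


if __name__ == "__main__":
    # Hypothetical stand-in text; the corpus used in the thread is not given.
    text = "Mark Twain wrote about Tom Sawyer and Huckleberry Finn. " * 2000
    cases = [
        ("Twain", True),  # plain literal, so str.find applies too
        ("(?i)Twain", False),
        ("Tom|Sawyer|Huckleberry|Finn", False),
    ]
    print(HEADER)
    for pattern, is_literal in cases:
        matches = count_with_re(pattern, text)
        re_ms = best_ms(count_with_re, pattern, text)
        find_ms = best_ms(count_with_find, pattern, text) if is_literal else None
        print(format_row(pattern, matches, re_ms, find_ms))
```

With a header row like this, the count column can no longer be mistaken for a timing, and an empty str.find cell is clearly a missing measurement rather than a shifted column.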

On Sun, Mar 6, 2016 at 11:21 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 06.03.16 09:14, Maciej Fijalkowski wrote:
>> Any chance you can rerun this on pypy?
>
> Results on PyPy 2.2.1 (I'm not sure I could build the latest PyPy on my computer):
>
>                                                re str.find
>
> Twain                                   5   5.469   3.852
> (?i)Twain                              10   8.646
> [a-z]shing                            165   17.24
> Huck[a-zA-Z]+|Saw[a-zA-Z]+             52   7.763
> \b\w+nn\b                              32     101
> [a-q][^u-z]{13}x                      445   167.6
> Tom|Sawyer|Huckleberry|Finn           314   8.583
> (?i)Tom|Sawyer|Huckleberry|Finn       477    16.3
> .{0,2}(Tom|Sawyer|Huckleberry|Finn)   314   270.9
> .{2,4}(Tom|Sawyer|Huckleberry|Finn)   237     262
> Tom.{10,25}river|river.{10,25}Tom       1   8.461
> [a-zA-Z]+ing                        10079     348
> \s[a-zA-Z]{0,12}ing\s                7160   115.8
> ([A-Za-z]awyer|[A-Za-z]inn)\s          50   16.62
> ["'][^"']{0,30}[?!\.]["']            1618   14.45
>
> Alternative regular expression engines need extension modules and don't work on PyPy for me.
>
> For comparison, results on CPython 2.7.11+:
>
>                                                re  regex    re2   pcre str.find
>
> Twain                                   5   4.423  2.699  8.045   93.4  4.181
> (?i)Twain                              10   50.07  3.563  20.35  185.6
> [a-z]shing                            165   98.68  6.365  23.71   2886
> Huck[a-zA-Z]+|Saw[a-zA-Z]+             52   58.97  50.26  19.52   1016
> \b\w+nn\b                              32   130.1  416.5  18.38  740.7
> [a-q][^u-z]{13}x                      445   406.6  7.935   5886   7137
> Tom|Sawyer|Huckleberry|Finn           314   53.09   59.1  20.33   5377
> (?i)Tom|Sawyer|Huckleberry|Finn       477   281.2  338.5  23.77   7895
> .{0,2}(Tom|Sawyer|Huckleberry|Finn)   314   419.5   1142  20.69   6423
> .{2,4}(Tom|Sawyer|Huckleberry|Finn)   237   410.9   1013  18.99   5224
> Tom.{10,25}river|river.{10,25}Tom       1   63.17  58.31  18.94  260.2
> [a-zA-Z]+ing                        10079   203.8  363.8  43.78 1.583e+05
> \s[a-zA-Z]{0,12}ing\s                7160   127.1  26.65  34.23 1.114e+05
> ([A-Za-z]awyer|[A-Za-z]inn)\s          50   147.6  412.4  21.57   1172
> ["'][^"']{0,30}[?!\.]["']            1618   85.88  86.55  22.22 2.576e+04
>
> And on Jython 2.5.3 with JRE 7:
>
>                                                re str.find
>
> Twain                                   5      34      3
> (?i)Twain                              10     251
> [a-z]shing                            165     564
> Huck[a-zA-Z]+|Saw[a-zA-Z]+             52     281
> \b\w+nn\b                              32     510
> [a-q][^u-z]{13}x                      445    1786
> Tom|Sawyer|Huckleberry|Finn           314     102
> (?i)Tom|Sawyer|Huckleberry|Finn       477    1232
> .{0,2}(Tom|Sawyer|Huckleberry|Finn)   314    1345
> .{2,4}(Tom|Sawyer|Huckleberry|Finn)   237    1353
> Tom.{10,25}river|river.{10,25}Tom       1     305
> [a-zA-Z]+ing                        10079    1211
> \s[a-zA-Z]{0,12}ing\s                7160     571
> ([A-Za-z]awyer|[A-Za-z]inn)\s          50     676
> ["'][^"']{0,30}[?!\.]["']            1618     431
>
>
> _______________________________________________
> Speed mailing list
> Speed at python.org
> https://mail.python.org/mailman/listinfo/speed

