[Python-Dev] Algoritmic Complexity Attack on Python

Scott A Crosby scrosby@cs.rice.edu
30 May 2003 17:02:51 -0500


On Fri, 30 May 2003 15:00:21 -0500, Jeff Epler <jepler@unpythonic.net> writes:

> On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote:
> > On 
> > 
> >    http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png
> > 
> > UHASH exceeds the performance of perl's hash function, which is
> > simpler than your own.
> 
> I notice that you say "with strings longer than around 44-bytes,
> UHASH dominates all the other hash functions, due in no small part to its
> extensive performance tuning and *hand-coded assembly routines.*"
> [emphasis mine]  It's all well and good for people who can run your

We benchmarked it, and without assembly optimizations, uhash still
exceeds perl. Also please note that we did not create uhash. We merely
used it as a high performance universal hash which we could cite and
benchmark.

Freshly computed raw benchmarks on a P2-450 are at the end of this
email. Looking at them now. I think we may have slightly err'ed and
used the non-assembly version of the hash in constructing the graphs,
because the crossover point compared to perl looks to be about 20
bytes with assembly, and about 48 without.

Roughly, they show that uhash is about half the speed on a P2-450
without assembly. I do not have benchmarks on other platforms to
compare it to. However, CW is known be about 10 times worse,
relatively, than jenkin's on a SPARC.

The python community will have to judge whether the performance
difference of the current hash is worth the risk of the attack.

Also, I'd like to thank Tim Peters for telling me about the potential
of degradation that regular expressions may offer.

Scott



Time benchmarking actual hash (including benchmarking overhead) with a
working set size of 12kb.

Time(perl-5.8.0): 12.787 Mbytes/sec with string length 4, buf 12000
Time(uh_cw-1024): 6.010 Mbytes/sec with string length 4, buf 12000
Time(python): 14.952 Mbytes/sec with string length 4, buf 12000
Time(test32out_uhash): 4.584 Mbytes/sec with string length 4, buf 12000
Time(test32out_assembly_uhash): 6.014 Mbytes/sec with string length 4, buf 12000

Time(perl-5.8.0): 29.125 Mbytes/sec with string length 16, buf 12000
Time(uh_cw-1024): 11.898 Mbytes/sec with string length 16, buf 12000
Time(python): 36.445 Mbytes/sec with string length 16, buf 12000
Time(test32out_uhash): 19.169 Mbytes/sec with string length 16, buf 12000
Time(test32out_assembly_uhash): 25.660 Mbytes/sec with string length 16, buf 12000

Time(perl-5.8.0): 45.440 Mbytes/sec with string length 64, buf 12000
Time(uh_cw-1024): 16.168 Mbytes/sec with string length 64, buf 12000
Time(python): 62.213 Mbytes/sec with string length 64, buf 12000
Time(test32out_uhash): 71.396 Mbytes/sec with string length 64, buf 12000
Time(test32out_assembly_uhash): 106.873 Mbytes/sec with string length 64, buf 12000

Time benchmarking actual hash (Including benchmarking overhead) with a
working set size of 6mb.

Time(perl-5.8.0): 8.099 Mbytes/sec with string length 4, buf 6000000
Time(uh_cw-1024): 4.660 Mbytes/sec with string length 4, buf 6000000
Time(python): 8.840 Mbytes/sec with string length 4, buf 6000000
Time(test32out_uhash): 3.932 Mbytes/sec with string length 4, buf 6000000
Time(test32out_assembly_uhash): 4.859 Mbytes/sec with string length 4, buf 6000000

Time(perl-5.8.0): 20.878 Mbytes/sec with string length 16, buf 6000000
Time(uh_cw-1024): 9.964 Mbytes/sec with string length 16, buf 6000000
Time(python): 24.450 Mbytes/sec with string length 16, buf 6000000
Time(test32out_uhash): 16.168 Mbytes/sec with string length 16, buf 6000000
Time(test32out_assembly_uhash): 19.929 Mbytes/sec with string length 16, buf 6000000

Time(perl-5.8.0): 35.265 Mbytes/sec with string length 64, buf 6000000
Time(uh_cw-1024): 14.400 Mbytes/sec with string length 64, buf 6000000
Time(python): 46.650 Mbytes/sec with string length 64, buf 6000000
Time(test32out_uhash): 48.719 Mbytes/sec with string length 64, buf 6000000
Time(test32out_assembly_uhash): 63.523 Mbytes/sec with string length 64, buf 6000000