[Cython] Hash-based vtables

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Tue Jun 5 19:09:37 CEST 2012

On 06/05/2012 07:01 PM, Dag Sverre Seljebotn wrote:
> On 06/05/2012 09:25 AM, Stefan Behnel wrote:
>> Dag Sverre Seljebotn, 04.06.2012 21:44:
>>> This can cause crashes/stack smashes
>>> etc. if there's lower-64bit-of-md5 collisions, but a) the
>>> probability is incredibly small, b) it would only matter in
>>> situations that should cause an AttributeError anyway, c) if we
>>> really care, we can always use an interning-like mechanism to
>>> validate on module loading that its hashes don't collide with
>>> other hashes (and raise an exception "Congratulations, you've
>>> discovered a phenomenal md5 collision, get in touch with cython
>>> devs and we'll work around it right away").
>> I'm not a big fan of such an attitude. If this happens at runtime, it can
>> induce any cost from cheap-at-test-time to
>> hugely-expensive-in-production.
>> Thinking with my evil hat on, this can potentially be data triggered from
>> the outside (e.g. if a JIT compiler is involved at one end), thus possibly
>> even leading to a security hole.
>> We should try to produce software that others can build a business on.
> Well, I'd build a business on something that fails with a 5e-7
> probability any day :-) (given that you trust my estimates in the other
> post; I think they were rather conservative myself)

This was put the wrong way. The chance was 5e-7 that it would fail for 
anybody over the course of human history (and that was a rather 
pessimistic estimate).

So a more "individual tack":

Assume that the process contains 200 MB of method definitions alone, 
with each method definition being an 8-character string. (That would 
make the executable several gigabytes :-))

That puts the probability of that process containing a 64-bit hash 
collision at 10^-34.
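
For anyone who wants to check an estimate of this kind, here is a minimal sketch using the standard birthday approximation, p ~ 1 - exp(-n(n-1)/2 / 2^bits). The assumptions are mine, not from the post: "200 MB" is read as 200e6 bytes, every method name is exactly 8 bytes, and the hashes are uniformly random. The post does not show how the 10^-34 figure was obtained; under the approximation used here the result comes out several orders of magnitude larger (around 1.7e-5), so treat this as an independent check and not as a reconstruction of the original calculation.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Birthday approximation for at least one collision among
    n_items uniformly random hashes of hash_bits bits each."""
    pairs = n_items * (n_items - 1) / 2      # number of distinct pairs
    space = 2.0 ** hash_bits                 # number of possible hash values
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x
    return -math.expm1(-pairs / space)


# Assumptions (not stated this precisely in the post):
BYTES_OF_NAMES = 200 * 10**6   # "200 MB" read as 200e6 bytes
NAME_LENGTH = 8                # each method name is 8 bytes

n_methods = BYTES_OF_NAMES // NAME_LENGTH
p64 = collision_probability(n_methods, 64)

print(f"{n_methods} methods, 64-bit hash: p ~ {p64:.2e}")
```

Changing the second argument (for example to 128) shows how quickly the probability drops as the hash gets wider.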

