Generating valid identifiers

Laszlo Nagy gandalf at shopzeus.com
Thu Jul 26 14:08:59 EDT 2012


>> * Would it be a problem to use CRC32 instead of SHA? (Since security is
>> not a problem, and CRC32 is faster.)
> What happens if you get a collision?
>
> That is, you have two different long identifiers:
>
> a.b.c.d...something
> a.b.c.d...anotherthing
>
> which by bad luck both hash to the same value:
>
> a.b.c.d.$AABB99
> a.b.c.d.$AABB99
>
> (or whatever).
Yes, that was the question. How do I avoid that? (Of course, using the
full SHA-256 hash value would make a collision practically impossible.)
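
One way to guard against this, as a rough sketch only: keep a table of the short names already handed out, and lengthen the hash suffix whenever two different long identifiers would map to the same short one. The names shorten and ShortNameRegistry, the "$" separator position and the 30-character limit are assumptions made for illustration, not something settled in this thread. The same bookkeeping would work with zlib.crc32 in place of SHA-256.

```python
import hashlib


def shorten(identifier, max_len=30, digest_chars=8):
    """Return identifier unchanged if it fits, else prefix + '$' + hex digest."""
    if len(identifier) <= max_len:
        return identifier
    keep = max_len - digest_chars - 1  # room left for the readable prefix
    if keep < 0:
        raise ValueError("max_len too small for the requested digest length")
    digest = hashlib.sha256(identifier.encode("utf-8")).hexdigest()
    return identifier[:keep] + "$" + digest[:digest_chars]


class ShortNameRegistry:
    """Remembers which long identifier owns each short name (hypothetical helper)."""

    def __init__(self, max_len=30):
        self.max_len = max_len
        self._owner = {}  # short name -> original identifier

    def register(self, identifier):
        # Try longer and longer digests until the short name is free
        # or already belongs to this same identifier.
        for digest_chars in range(8, 65, 2):
            short = shorten(identifier, self.max_len, digest_chars)
            owner = self._owner.setdefault(short, identifier)
            if owner == identifier:
                return short
        raise ValueError("could not find a unique short name for %r" % identifier)
```

Registering the same identifier twice returns the same short name, so the mapping stays stable as long as the registry is kept around (or rebuilt in the same order).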
>> * Can somebody think of a
>> better algorithm, that would give a bigger chance of recognizing the
>> original identifier from the modified one?
> Rather than truncating the most significant part of the identifier, the
> field name, you should truncate the least important part, the middle.
>
> a.b.c.d.e.f.g.something
>
> goes to:
>
> a.b...g.something
>
> or similar.
Yes, this is a good idea. Thank you.
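
A minimal sketch of the middle-truncation idea, assuming dot-separated identifiers and a "..." marker as in the example above. The function name elide_middle and the max_len parameter are made up for illustration. It drops components from the middle one at a time, keeping slightly more of the tail than the head, until the result fits.

```python
def elide_middle(identifier, max_len=30, sep="."):
    """Shorten a dotted identifier by replacing middle components with '...'."""
    if len(identifier) <= max_len:
        return identifier
    parts = identifier.split(sep)
    # Drop 1, 2, ... middle components; always keep the first and the last one.
    for drop in range(1, len(parts) - 1):
        kept = len(parts) - drop
        n_tail = (kept + 1) // 2  # favour the tail, where the field name lives
        n_head = kept - n_tail
        candidate = sep.join(parts[:n_head]) + "..." + sep.join(parts[-n_tail:])
        if len(candidate) <= max_len:
            return candidate
    raise ValueError("cannot shorten %r to %d characters" % (identifier, max_len))


print(elide_middle("a.b.c.d.e.f.g.something", max_len=18))  # a.b...g.something
```

Two different identifiers that differ only in the dropped components would still end up with the same short name, so this does not remove the need for some collision check.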




