Generating a unique identifier

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Sep 7 22:58:33 EDT 2007


On Fri, 07 Sep 2007 08:47:58 -0700, Paul Rubin wrote:

> Paul Rubin <http://phr.cx@NOSPAM.invalid> writes:
>> def unique_id():
>>    return os.urandom(10).encode('hex')
> 
> Sorry, make that 32 or 40 instead of 10, if the number of id's is large,
> to make birthday collisions unlikely.

I did a small empirical test, and with 16 million ids, I found no 
collisions.

However, I did find that trying to dispose of a set of 16 million short 
strings caused my Python session to lock up for twenty minutes until I 
got fed up and killed the process. Should garbage-collecting 16 million 
strings really take 20+ minutes?


> If you don't want the id's to be that large, you can implement a Feistel
> cipher using md5 or sha as the round function pretty straightforwardly,
> then just feed successive integers through it. That also guarantees
> uniqueness, at least within one run of the program.  I have some sample
> code around for that, let me know if you need it.

I'm not sure that I need it, but I would certainly be curious to see it.


Thanks,


-- 
Steven.



More information about the Python-list mailing list