Generating a unique identifier

Paul Rubin
Sat Sep 8 00:15:50 EDT 2007


Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:
> > Sorry, make that 32 or 40 instead of 10, if the number of id's is large,
> > to make birthday collisions unlikely.
> 
> I did a small empirical test, and with 16 million ids, I found no 
> collisions.

16 million 32-byte ids?  With string and dictionary overhead that's
probably on the order of 1 GB.  Anyway, 16 bytes is enough, as
mentioned elsewhere.
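
As a rough check on the "16 bytes is enough" claim, the standard birthday
approximation p ~= 1 - exp(-n(n-1)/(2N)) can be evaluated directly. This is
only a sketch: the names collision_probability and make_id are made up for
illustration, and it assumes the ids are drawn uniformly from os.urandom.

```python
import binascii
import math
import os

def collision_probability(n_ids, n_bytes):
    """Birthday-bound estimate of at least one collision among n_ids
    ids drawn uniformly at random from a space of 2**(8*n_bytes)."""
    space = 2.0 ** (8 * n_bytes)
    exponent = -n_ids * (n_ids - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is extremely small
    return -math.expm1(exponent)

def make_id(n_bytes=16):
    """Hypothetical helper: a random id as a hex string (2 chars per byte)."""
    return binascii.hexlify(os.urandom(n_bytes)).decode("ascii")

if __name__ == "__main__":
    print(collision_probability(16000000, 16))
    print(make_id())
```

For 16 million 16-byte ids the estimate comes out around 4e-25, so a
collision from random chance is not a practical concern at that size.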

> However, I did find that trying to dispose of a set of 16 million short 
> strings caused my Python session to lock up for twenty minutes until I 
> got fed up and killed the process. Should garbage-collecting 16 million 
> strings really take 20+ minutes?

Maybe your system was thrashing, or maybe the GC was happening during
allocation (there was some discussion of that a while back).
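
One rough way to test the second guess is to time building and dropping the
set with the cyclic collector switched on and off. The sketch below is only
a diagnostic under assumptions: time_build_and_drop is a made-up name, the
ids are random bytes from os.urandom, and the timings depend heavily on the
machine. If the two runs barely differ, the collector is probably not the
culprit and thrashing becomes the more likely explanation.

```python
import gc
import os
import time

def time_build_and_drop(n_ids, use_gc=True, n_bytes=16):
    """Return (build_seconds, drop_seconds) for a set of n_ids random ids.

    use_gc selects whether the cyclic garbage collector is enabled
    while the set is built and released.
    """
    was_enabled = gc.isenabled()
    if use_gc:
        gc.enable()
    else:
        gc.disable()
    try:
        t0 = time.perf_counter()
        ids = set(os.urandom(n_bytes) for _ in range(n_ids))
        t1 = time.perf_counter()
        # Dropping the last reference frees the set and its strings in CPython
        del ids
        t2 = time.perf_counter()
    finally:
        # Restore whatever collector state the caller had
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
    return t1 - t0, t2 - t1

if __name__ == "__main__":
    for flag in (True, False):
        build, drop = time_build_and_drop(1000000, use_gc=flag)
        print("gc=%s build=%.2fs drop=%.2fs" % (flag, build, drop))
```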

> > If you don't want the id's to be that large, you can implement a Feistel
> I'm not sure that I need it, but I would certainly be curious to see it.

I posted some code elsewhere in the thread.
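
For readers without the rest of the thread, here is a minimal sketch of the
general idea. It is not the code referred to above. It assumes a 32-bit
counter split into two 16-bit halves, four rounds, and SHA-256 as the round
function; the names feistel_encrypt, feistel_decrypt and _round_value, and
the key handling, are illustrative choices only. Because a Feistel network
is a permutation, distinct counter values always map to distinct ids, so
short ids can be unique without relying on the birthday bound.

```python
import hashlib
import struct

def _round_value(half, round_no, key):
    """Round function: 16 bits derived from SHA-256(key, round, half)."""
    data = key + struct.pack(">BH", round_no, half)
    digest = hashlib.sha256(data).digest()
    return struct.unpack(">H", digest[:2])[0]

def feistel_encrypt(n, key, rounds=4):
    """Map a 32-bit integer to another 32-bit integer, one-to-one."""
    if not 0 <= n < 2 ** 32:
        raise ValueError("n must fit in 32 bits")
    left, right = n >> 16, n & 0xFFFF
    for i in range(rounds):
        # Classic Feistel step: (L, R) -> (R, L xor F(R))
        left, right = right, left ^ _round_value(right, i, key)
    return (left << 16) | right

def feistel_decrypt(n, key, rounds=4):
    """Inverse of feistel_encrypt for the same key and round count."""
    if not 0 <= n < 2 ** 32:
        raise ValueError("n must fit in 32 bits")
    left, right = n >> 16, n & 0xFFFF
    for i in reversed(range(rounds)):
        # Undo one step: (L', R') -> (R' xor F(L'), L')
        left, right = right ^ _round_value(left, i, key), left
    return (left << 16) | right

if __name__ == "__main__":
    key = b"example key"  # placeholder; use a real secret in practice
    for counter in range(5):
        print("%08x" % feistel_encrypt(counter, key))
```

Feeding successive counter values through feistel_encrypt gives ids that
look unrelated to each other but can never collide. This toy version is
meant to show the structure, not to serve as a vetted cipher.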


