List of Numbers

Steven Taschuk staschuk at telusplanet.net
Sat Apr 5 16:18:16 EST 2003


Quoth Simon Faulkner:
> I have a list of about 5000 numbers in a text file - up to 14 digits
> each.
> 
> I need to check for duplicates.
> 
> What would people suggest as a good method?

On Unixy systems,
	sort file | uniq -d
is probably easiest.  The approach is to sort the file and check
for consecutive duplicates.  This would be easy to implement in
Python too.
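
A minimal Python sketch of that sort-then-compare-neighbours idea
follows.  The function name duplicates_by_sorting is made up for
illustration, and the sketch assumes one number per line.  It compares
the stripped lines as text, so "007" and "7" count as different values.

```python
def duplicates_by_sorting(lines):
    """Return each value that occurs more than once, in sorted order.

    `lines` is any iterable of strings, for example an open text file
    with one number per line (an assumption of this sketch).
    """
    # Strip whitespace and skip blank lines, then sort so that equal
    # values end up next to each other.
    values = sorted(line.strip() for line in lines if line.strip())

    dups = []
    # Walk over adjacent pairs; equal neighbours mean a duplicate.
    for prev, cur in zip(values, values[1:]):
        # Report each duplicated value once, however often it repeats.
        if prev == cur and (not dups or dups[-1] != cur):
            dups.append(cur)
    return dups


sample = ["3\n", "1\n", "3\n", "2\n", "1\n", "3\n"]
print(duplicates_by_sorting(sample))
```

To run it on a real file, pass the open file object instead of the
sample list.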

(This approach is in general more memory-intensive than a solution
using a hash table instead of sorting: sorting holds every line,
while a hash table need hold only the distinct values, so the
difference grows with the number of duplicates.  But five thousand
14-digit numbers in ASCII representation, counting a newline after
each, is 75000 bytes, which is negligible.)
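
For comparison, a hash-table version might look like the sketch below,
using Python's built-in set.  The name duplicates_by_hashing is again
hypothetical, and the same one-number-per-line assumption applies.

```python
def duplicates_by_hashing(lines):
    """Return the set of values that occur more than once.

    `lines` is any iterable of strings, one number per line
    (an assumption of this sketch).
    """
    seen = set()   # every distinct value encountered so far
    dups = set()   # values encountered at least twice
    for line in lines:
        value = line.strip()
        if not value:
            continue  # ignore blank lines
        if value in seen:
            dups.add(value)
        else:
            seen.add(value)
    return dups


sample = ["3\n", "1\n", "3\n", "2\n", "1\n", "3\n"]
print(sorted(duplicates_by_hashing(sample)))
```

Only the distinct values are kept in memory, and the input is read in
a single pass without sorting.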

-- 
Steven Taschuk                                        staschuk at telusplanet.net
"Study this book; read a word then ponder on it.  If you interpret the meaning
 loosely you will mistake the Way."         -- Musashi, _A Book of Five Rings_




