A sets algorithm

Paulo da Silva p_s_d_a_s_i_l_v_a_ns at netcabo.pt
Sun Feb 7 21:22:53 EST 2016


At 21:46 on 07-02-2016, Paulo da Silva wrote:
> Hello!
> 
> This may not be strictly a Python question, but ...
> 
> Suppose I already have a class MyFile that has an efficient method (or
> operator) to compare two MyFile objects for equality.
> 
> What is the most efficient way to obtain all sets of equal files (of
> course each set must have more than one file - all single files are
> discarded)?
> 

After reading all the suggestions, I decided to start with
defaultdict(list), as Oscar first suggested, applied in several steps:
first grouping by size, and then by partial contents and/or "strong"
hashes, as Cem suggested.
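
A rough sketch of that staged grouping, to make the idea concrete. It is
only an illustration under some assumptions: it works on plain file paths
rather than MyFile objects, the helper names (refine, head_bytes,
sha256_digest, duplicate_groups) are made up for this example, "partial
contents" is taken to mean the first 4096 bytes, and SHA-256 stands in
for the "strong" hash.

```python
import hashlib
import os
from collections import defaultdict


def refine(groups, key):
    """Split each group by key(path); keep only subgroups with 2+ files."""
    refined = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        # Files alone in their bucket cannot have a duplicate: drop them.
        refined.extend(g for g in buckets.values() if len(g) > 1)
    return refined


def head_bytes(path, n=4096):
    """Partial contents: the first n bytes (n is an arbitrary choice)."""
    with open(path, "rb") as f:
        return f.read(n)


def sha256_digest(path, chunk_size=1 << 20):
    """'Strong' hash of the whole file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def duplicate_groups(paths):
    """Group paths by size, then by head bytes, then by full hash."""
    groups = [list(paths)]
    # Cheapest key first, so expensive keys only see surviving candidates.
    for key in (os.path.getsize, head_bytes, sha256_digest):
        groups = refine(groups, key)
    return groups
```

Each stage only looks at files that survived the previous one, so the
full hash is computed for few files. Equal hashes make equality very
likely but do not prove it; the equality method of MyFile could be used
as a final check inside each resulting group.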

Many thanks to everyone who responded for all the helpful suggestions.
If I find something better, I'll report it here.
Paulo



