[Tutor] Sorting of files based on filesize
C Smith
smichr at hotmail.com
Tue Apr 12 08:40:43 CEST 2005
--request for a method of sorting disk files based on size so as to
fill backup disks--
You may want to check out the karp.py routine posted at
http://aspn.activestate.com/ASPN/Mail/Message/python-Tutor/749797
Right now it is coded to split N numbers into 2 groups that have sums
as nearly identical as possible. The set of numbers it is coded to test
are are the first 50 square roots of integers; you would replace these
with a list of N file sizes that are close to 2X a single disk capacity
and it would tell you how to split up the group into two
nearly-identically sized groups. It's "very, very, very clever, and
runs in an eyeblink" says Tim Peters. You might even want to use Tim's
greedy "knapsack filler" approach that he initially proposed as part of
the thread above which will try random selections from a list of values
and keep track of the one that came closest to a target value. Your
target value would be 2X the storage limit and then you could use the
karp routine to split it nicely. Tim's greedy approach is at
http://mail.python.org/pipermail/tutor/2001-August/008075.html
An alternative to breaking your list of file sizes into sublists that
are close to 2X your disk capacity would be to generalize the algorithm
to break the numbers into M groups rather than 2 groups...but I'm not
sure how easy that will be.
You're going to love using the karp for this problem :-)
/c
More information about the Tutor
mailing list