Best way to compare the contents of two directories

Robin Siebler robin.siebler at palmsource.com
Sat Sep 11 20:40:43 EDT 2004


I have two directory trees that I want to compare and I'm trying to
figure out what the best way of doing this would be.  I am using walk
to get a list of all of the files in each directory.
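
For reference, the listing step might look like the minimal sketch below. It assumes "walk" means os.walk. The function name list_relative_files is hypothetical, and os.path.relpath needs a newer Python than was current when this was posted. Storing paths relative to each root is one way to make the two lists directly comparable even though the trees sit on different drives.

```python
import os

def list_relative_files(root):
    """Return a sorted list of file paths under root, relative to root."""
    relative_paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Drop the root prefix so the same file has the same key in both trees
            relative_paths.append(os.path.relpath(full_path, root))
    return sorted(relative_paths)
```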

I am using this code to compare the file lists:

def compare_files(first_list, second_list, first_dir, second_dir):
        missing = in_first_only(first_list, second_list)
        for item in missing:
            index = first_list.index(item)
            print first_list[index] + ' does not exist in ' + second_dir[index]
            first_list.pop(index); first_dir.pop(index)
        return first_list, second_list, first_dir, second_dir
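
The helper in_first_only is not shown in this post. One plausible version, offered only as a guess at its behaviour, keeps the order of the first list and uses a set for the membership test:

```python
def in_first_only(first_list, second_list):
    """Return the items of first_list that are absent from second_list, in order."""
    second_set = set(second_list)  # set lookups avoid rescanning second_list each time
    return [item for item in first_list if item not in second_set]
```

This only gives meaningful results if both lists hold comparable entries, such as paths relative to each tree's root rather than full paths on different drives.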

However, before I actually compare the files, I want to compare the
directories and, if a directory is missing from either set, report
it:

dir_list_a = ['d:\\results\\foldera\\','d:\\results\\folderb\\','d:\\results\\folderc\\']
dir_list_b = ['c:\\results\\foldera\\','c:\\results\\folderb\\']

output:
   'folderc' exists in d:\results but not in c:\results
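
A minimal sketch of one way to produce that report is shown below. It assumes each list holds the immediate subdirectories of a single root, and that the roots are passed in separately. The names folder_names and report_missing_dirs are hypothetical. ntpath is used so that the Windows-style paths are parsed the same way on any platform.

```python
import ntpath  # applies Windows path rules regardless of the host OS

def folder_names(dir_list):
    """Return the set of final path components, e.g. 'foldera'."""
    # Strip any trailing separator first, otherwise basename returns ''
    return set(ntpath.basename(path.rstrip('\\/')) for path in dir_list)

def report_missing_dirs(root_a, dir_list_a, root_b, dir_list_b):
    """Return one message per folder name present under only one root."""
    names_a = folder_names(dir_list_a)
    names_b = folder_names(dir_list_b)
    messages = []
    for name in sorted(names_a - names_b):
        messages.append("'%s' exists in %s but not in %s" % (name, root_a, root_b))
    for name in sorted(names_b - names_a):
        messages.append("'%s' exists in %s but not in %s" % (name, root_b, root_a))
    return messages
```

Because only the last path component is compared, this handles a single directory level. Nested trees would need the full path relative to each root.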


I am using splitall (from the Python Cookbook) to split the paths into
their parts and appending the result to a list, but I can't figure out
the best way to compare the contents of the resulting two lists, and I
think I am starting to make things *too* complicated:

import os

def splitall(path):
    """
    Split a path into all of its parts.

    Source: Python Cookbook
    Credit: Trent Mick
    """
    allparts = []
    while 1:
        parts = os.path.split(path)
        if parts[0] == path:
            allparts.insert(0, parts[0])
            break
        elif parts[1] == path:
            allparts.insert(0, parts[1])
            break
        else:
            path = parts[0]
            allparts.insert(0, parts[1])
    return allparts

After using this, I end up with this:

dir_list_a = [['d:\\', 'results', 'foldera', 'd:\\', 'results', 'folderb',
               'd:\\', 'results', 'folderc']]
dir_list_b = [['d:\\', 'results', 'foldera', 'd:\\', 'results', 'folderb']]
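
In the lists above, the parts of every path have been run together into one flat list, so the boundaries between directories are lost. A possible alternative, sketched here under the assumption that the root of each tree is known, keeps one tuple of parts per directory and drops the root prefix so the two trees can be compared as sets. The names relative_parts and compare_trees are hypothetical, and the splitting is a simple stand-in rather than the Cookbook's splitall.

```python
import ntpath  # applies Windows path rules regardless of the host OS

def relative_parts(path, root):
    """Return the components of path below root as a tuple."""
    # Assumes path lies inside root and both use backslash separators
    path_parts = [p for p in ntpath.normpath(path).split('\\') if p]
    root_parts = [p for p in ntpath.normpath(root).split('\\') if p]
    return tuple(path_parts[len(root_parts):])

def compare_trees(root_a, dir_list_a, root_b, dir_list_b):
    """Return (only_in_a, only_in_b) as sorted lists of relative part tuples."""
    parts_a = set(relative_parts(p, root_a) for p in dir_list_a)
    parts_b = set(relative_parts(p, root_b) for p in dir_list_b)
    return sorted(parts_a - parts_b), sorted(parts_b - parts_a)
```

With the example lists from earlier in this post, compare_trees('d:\\results', dir_list_a, 'c:\\results', dir_list_b) would report ('folderc',) as present only in the first tree. The standard library's filecmp.dircmp class, with its left_only and right_only attributes, may also be worth a look when both trees are on disk.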


