Finding duplicate file names and modifying them based on elements of the path

Larry.Martell@gmail.com larry.martell at gmail.com
Wed Jul 18 18:20:51 EDT 2012


I have an interesting problem I'm trying to solve. I have a solution
almost working, but it's super ugly, and know there has to be a
better, cleaner way to do it.

I have a list of path names that have this form:

/dir0/dir1/dir2/dir3/dir4/dir5/dir6/file

I need to find all the file names (basenames) in the list that are
duplicates, and for each one that is a dup, prepend dir4 to the
filename as long as the dir4/file pair is unique. If there are
multiple dir4/files in the list, then I also need to add a sequence
number based on the sorted value of dir5 (which is a date in ddMONyy
format).

For example, if my list contains:

/dir0/dir1/dir2/dir3/qwer/09Jan12/dir6/file3
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/file1
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/file2
/dir0/dir1/dir2/dir3/xyz/08Jan12/dir6/file1
/dir0/dir1/dir2/dir3/qwer/07Jan12/dir6/file3

Then I want to end up with:

/dir0/dir1/dir2/dir3/qwer/09Jan12/dir6/qwer_01_file3
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/abcd_file1
/dir0/dir1/dir2/dir3/abcd/08Jan12/dir6/file2
/dir0/dir1/dir2/dir3/xyz/08Jan12/dir6/xyz_file1
/dir0/dir1/dir2/dir3/qwer/07Jan12/dir6/qwer_00_file3

My solution involves multiple maps and multiple iterations through the
data. How would you folks do this?



More information about the Python-list mailing list