Arrange files according to a text file

MRAB python at mrabarnett.plus.com
Sat Aug 27 19:48:20 EDT 2011


On 28/08/2011 00:18, Ric at rdo.python.org wrote:
> Thank you so much. The code worked perfectly.
>
> This is what I tried using Emile's code. The only time it picked the
> wrong name from the list was when the file was named like this:
>
> Data Mark Stone.doc
>
> How can I fix this? Hope I am not asking too much?
>
Have you tried the alternative word orders, "Mark Stone" as well as
"Stone, Mark", picking whichever name has the best ratio for either?
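
A rough sketch of what I mean, reusing the ignore() function from your
code. The helper names swap_order and best_owner are made up for
illustration, and the four names are just the sample ones from earlier
in the thread, so treat it as a starting point rather than a tested
solution:

```python
from difflib import SequenceMatcher as SM

def ignore(x):
    # Same junk test as in the quoted code: spaces, commas and full stops.
    return x in ' ,.'

def swap_order(name):
    # Hypothetical helper: "Stone, Mark" -> "Mark Stone".
    # A name without a comma is returned unchanged.
    last, sep, first = name.partition(',')
    if not sep:
        return name
    return first.strip() + ' ' + last.strip()

def best_owner(filename, names):
    # Hypothetical helper: score each name by the better of its two
    # word orders, then return the name with the highest score.
    def score(name):
        return max(SM(ignore, filename, variant).ratio()
                   for variant in (name, swap_order(name)))
    return max(names, key=score)

# Sample names from earlier in the thread (assumed, not your real list).
names = ["Adler, Jack", "Smith, John", "Smith, Sally", "Stone, Mark"]
filename = "Data Mark Stone.doc"
print("%s : %s" % (filename, best_owner(filename, names)))
```

Taking the maximum over both spellings lets a name written as
"First Last" line up as one run instead of matching on the surname
alone. Note that it still always returns some name, even for a file
that contains none of them.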
>
> import os
> from difflib import SequenceMatcher as SM
>
> path = r'D:\Files '
> txt_names = []
>
>
> with open(r'D:/python/log1.txt') as f:
>      for txt_name in f.readlines():
>          txt_names.append(txt_name.strip())
>
> def ignore(x):
>       return x in ' ,.'
>
> for filename in os.listdir(path):
>       ratios = [SM(ignore,filename,txt_name).ratio() for txt_name in txt_names]
>       best = max(ratios)
>       owner = txt_names[ratios.index(best)]
>       print filename,":",owner
>
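
For the remaining step in the original request (a directory per name,
with the matching files moved into it), something like the following
might do. It is only a sketch: move_to_owner_dir is a hypothetical
helper name, and it assumes the directories should be created inside
the same folder that holds the files, which the thread does not
actually say.

```python
import os
import shutil

def move_to_owner_dir(path, filename, owner):
    # Hypothetical helper: create <path>/<owner> if it does not exist yet,
    # then move <path>/<filename> into it. Returns the directory used.
    target_dir = os.path.join(path, owner)
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    shutil.move(os.path.join(path, filename),
                os.path.join(target_dir, filename))
    return target_dir
```

If the owner directories do live inside path, os.listdir(path) will
return them as well on a later run, so the loop would want an
os.path.isfile() check before matching.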
>
>
>
>
> On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebille<emile at fenx.com>
> wrote:
>
>> On 8/27/2011 1:15 PM Ric at rdo.python.org said...
>>>
>>> Hello Emile ,
>>>
>>> Thank you for the code below. I have not encountered SequenceMatcher
>>> before and will have to take a closer look at it.
>>>
>>> My question is: would it work for a text file listing about 25k
>>> names and a directory with, say, 100 files inside?
>>
>> Sure.
>>
>> Emile
>>
>>
>>>
>>> Thank you once again.
>>>
>>>
>>> On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille<emile at fenx.com>
>>> wrote:
>>>
>>>> On 8/27/2011 10:03 AM Ric at rdo.python.org said...
>>>>> Hello,
>>>>>
>>>>> What would be the best way to accomplish this task?
>>>>
>>>> I'd do something like:
>>>>
>>>>
>>>> usernames = """Adler, Jack
>>>> Smith, John
>>>> Smith, Sally
>>>> Stone, Mark""".split('\n')
>>>>
>>>> filenames = """Smith, John - 02-15-75 - business files.doc
>>>> Random Data - Adler Jack - expenses.xls
>>>> More Data Mark Stone files list.doc""".split('\n')
>>>>
>>>> from difflib import SequenceMatcher as SM
>>>>
>>>>
>>>> def ignore(x):
>>>>       return x in ' ,.'
>>>>
>>>>
>>>> for filename in filenames:
>>>>       ratios = [SM(ignore,filename,username).ratio() for username in usernames]
>>>>       best = max(ratios)
>>>>       owner = usernames[ratios.index(best)]
>>>>       print filename,":",owner
>>>>
>>>>
>>>> Emile
>>>>
>>>>
>>>>
>>>>> I have many files in separate directories. Each file name
>>>>> contains a person's name, but never in the same spot.
>>>>> I need to find that name, which is listed in a large
>>>>> text file in the following format: last name, comma,
>>>>> then first name. Last names can be duplicated.
>>>>>
>>>>> Adler, Jack
>>>>> Smith, John
>>>>> Smith, Sally
>>>>> Stone, Mark
>>>>> etc.
>>>>>
>>>>>
>>>>> The file names don't necessarily follow any standard
>>>>> format.
>>>>>
>>>>> Smith, John - 02-15-75 - business files.doc
>>>>> Random Data - Adler Jack - expenses.xls
>>>>> More Data Mark Stone files list.doc
>>>>> etc
>>>>>
>>>>> I need some way to pull the name from the file name, find it in the
>>>>> text list, create a directory based on the name in the list
>>>>> ("Smith, John") and move all files named with that client's name
>>>>> into that directory.
>>>>
>>



