Find if a file existing within 1000s of folder/sub-folder - each file has a unique presence

Tim Golden mail at timgolden.me.uk
Thu May 21 05:06:33 EDT 2015


On 21/05/2015 09:07, Steven D'Aprano wrote:
> On Thursday 21 May 2015 15:34, chaotic.sid at gmail.com wrote:
> 
>> So I was trying to dir /s /b using python.
>> Now since the file's path name is computed using other part of the code, I
>> am feeding in a variable here and somehow it does not seem to work.. :(
>>
>> Intent is to run following command from python
>> dir /s /b "c:/abc/def/ghjmain\features\XYZ\*<filename>"
> 
> I don't use Windows and have no idea what the /s and /b flags do, but 
> something like this should work:

Just for the uninitiated: /b gives a "bare" listing (no heading or
summary information) and /s includes the files in all subdirectories.

[ ... snip os.walk example ...]

> path = "c:/abc/def/ghj/main/features/XYZ/*" + str(tempID) + ".cli"
> command = "dir /s /b " + path
> output = subprocess.check_output(command, shell=True)
> 
> Actually, I think you don't need the shell argument. Try this instead:

This is one of the few cases on Windows where you actually *do* need
shell=True. shell=True invokes "cmd.exe", which is needed for commands
such as dir and copy, because they aren't standalone executables but
subcommands of cmd.exe.
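
For illustration, a minimal sketch of that call. The helper names
build_dir_command and find_with_dir are made up for this example, and the
use of ntpath.normpath is my own assumption: it turns forward slashes into
backslashes, since cmd.exe may otherwise read them as switch prefixes. The
subprocess call itself will only do anything useful on Windows.

```python
import ntpath
import subprocess


def build_dir_command(pattern):
    """Return the cmd.exe command line for a bare, recursive dir listing."""
    # Normalise to backslashes so "/" is not mistaken for a switch.
    return 'dir /s /b "{}"'.format(ntpath.normpath(pattern))


def find_with_dir(pattern):
    """Run dir via cmd.exe and return the matching paths (Windows only)."""
    try:
        # shell=True is what hands the command line to cmd.exe,
        # which is the only thing that knows what "dir" means.
        output = subprocess.check_output(
            build_dir_command(pattern), shell=True, universal_newlines=True
        )
    except subprocess.CalledProcessError:
        # A non-zero exit status (for example, nothing matched) ends up here.
        return []
    return output.splitlines()
```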

I agree with Steven that os.walk, or something derived from it, is
definitely the way to go here, unless the OP has tried it and found it
to be too slow. Clearly, some kind of caching would help since the idea
seems to be to find, one at a time, a filename in any one of a large
hierarchy of directories.
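
As a rough sketch of the os.walk approach without any caching: the
function name find_matching and the example pattern are hypothetical, and
fnmatch is just one way of doing the wildcard match that dir would
otherwise do.

```python
import fnmatch
import os


def find_matching(start_from, pattern):
    """Yield the full path of every file under start_from matching pattern."""
    for dirpath, dirnames, filenames in os.walk(start_from):
        # fnmatch.filter keeps only the names matching the wildcard pattern
        for filename in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, filename)


# Hypothetical usage: stop at the first hit, since each file is unique.
# first = next(find_matching("c:/temp", "*1234.cli"), None)
```

Because it's a generator, taking only the first result means the walk
stops as soon as the file turns up rather than visiting every directory.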

Python 3.5 has the brand-new os.scandir which does a more efficient job
than os.listdir, using the underlying OS facilities on each platform and
caching where possible. But that's really leading edge, although Ben
Hoyt (the author) maintains an up-to-date version on GitHub:

https://github.com/benhoyt/scandir
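
A minimal sketch of a recursive search built on os.scandir, assuming
Python 3.5 or later. The names scan_tree and locate are invented for this
example; on later Python versions the os.scandir call can also be wrapped
in a with statement to close the iterator promptly.

```python
import os


def scan_tree(path):
    """Recursively yield an os.DirEntry for each file under path."""
    for entry in os.scandir(path):
        # is_dir/is_file can usually be answered from data the OS already
        # returned with the directory listing, avoiding an extra stat call.
        if entry.is_dir(follow_symlinks=False):
            yield from scan_tree(entry.path)
        elif entry.is_file():
            yield entry


def locate(start_from, filename):
    """Return the full path of the first file called filename, or None."""
    for entry in scan_tree(start_from):
        if entry.name == filename:
            return entry.path
    return None
```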


Just for the exercise, here's code which builds a dictionary mapping
filename to location(s) found:

<code>
#!python3
import os

START_FROM = "c:/temp"

# Walk the tree once, mapping each filename to every directory it appears in
files = {}
for dirpath, dirnames, filenames in os.walk(START_FROM):
    for filename in filenames:
        files.setdefault(filename, []).append(dirpath)

# After that, each lookup is a single dictionary access
filename_to_find = "temp.txt" ## input("Filename: ")
found_in = files.get(filename_to_find)
if found_in:
    print("Found in:", ", ".join(found_in))
else:
    print("Not found")

</code>


TJG


