newb question: file searching

Steve M sjmaster at gmail.com
Tue Aug 8 18:26:53 EDT 2006


jaysherby at gmail.com wrote:
> Okay.  This is almost solved.  I just need to know how to have each
> entry in my final list have the full path, not just the file name.

from http://docs.python.org/lib/os-file-dir.html:

walk() generates the file names in a directory tree, by walking the
tree either top down or bottom up. For each directory in the tree
rooted at directory top (including top itself), it yields a 3-tuple
(dirpath, dirnames, filenames).

dirpath is a string, the path to the directory. dirnames is a list of
the names of the subdirectories in dirpath (excluding '.' and '..').
filenames is a list of the names of the non-directory files in dirpath.
Note that the names in the lists contain no path components. To get a
full path (which begins with top) to a file or directory in dirpath, do
os.path.join(dirpath, name).

--------

So walk yields a 3-tuple, not just a filename. You seem to be somewhat
aware of this, since you refer to files[2] in your list comprehension,
but the comprehension keeps only the bare file names and never joins
them with files[0], the directory path.

Try this (untested):

import os

def get_image_filepaths(target_folder):
    """Return a list of filepaths (path plus filename) for all images
    in target_folder or any subfolder."""
    images = []
    for dirpath, dirnames, filenames in os.walk(target_folder):
        for filename in filenames:
            # compare case-insensitively so .JPG and .Gif also match
            normalized_filename = filename.lower()
            if normalized_filename.endswith(('.jpg', '.gif')):
                # dirpath begins with target_folder, so this is the full path
                filepath = os.path.join(dirpath, filename)
                images.append(filepath)
    return images

images = get_image_filepaths(os.getcwd())



> Also, I've noticed that files are being found within hidden
> directories.  I'd like to exclude hidden directories from the walk, or
> at least not add anything within them.  Any advice?

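Decide how you identify a hidden directory, then skip those directories
during the walk. Note that dirpath is a full path that begins with top,
so testing whether dirpath itself starts with '.' will not work; test
the individual directory names instead. With topdown=True you can also
remove names from dirnames in place, and walk will not descend into
them.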
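
A rough sketch of one way to do that, assuming "hidden" means the Unix
convention of a directory name that begins with '.'. The function name
get_visible_image_filepaths is made up for this example, and hidden
files (as opposed to hidden directories) are not filtered out:

```python
import os

def get_visible_image_filepaths(target_folder):
    """Like get_image_filepaths, but skip directories whose name
    starts with '.' (assumed definition of "hidden")."""
    images = []
    for dirpath, dirnames, filenames in os.walk(target_folder, topdown=True):
        # Prune in place: with topdown=True, walk() only descends into
        # the names still present in dirnames.
        dirnames[:] = [d for d in dirnames if not d.startswith('.')]
        for filename in filenames:
            if filename.lower().endswith(('.jpg', '.gif')):
                images.append(os.path.join(dirpath, filename))
    return images
```

The slice assignment (dirnames[:] = ...) matters here: rebinding
dirnames to a new list would leave the list that walk is using
untouched, so the hidden directories would still be visited.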

>
> Here's what I've got so far:
>
> def getFileList():
> 	import os
> 	imageList = []
> 	for files in os.walk(os.getcwd(), topdown=True):
> 		imageList += [file for file in files[2] if file.endswith('jpg') or
> file.endswith('gif')]
> 	return imageList



