Python glob and raw string

Neil Cerutti neilc at norwich.edu
Thu Jan 16 13:14:30 EST 2014


On 2014-01-16, Xaxa Urtiz <urtizvereaxaxa at gmail.com> wrote:
> Hello everybody, I've got a little problem. I've made a script
> that looks for some files in some directories. Typically my
> folders are organized like this:
>
> [share]
> folder1
> ->20131201
> -->file1.xml
> -->file2.txt
> ->20131202
> -->file9696009.tmp
> -->file421378932.xml
> etc.
> So basically, in the share I've got some folders
> (folder1, folder2, ...) and inside these folders I've got
> folders whose names are dates (20131201, 20131202, 20131203,
> etc.), and inside them I want to find all the xml files.
> So what I've done is iterate over all the folder1/2/3 that
> I want and look, for each one, for the xml files with this:
>
> for f in glob.glob(dir +r"\20140115\*.xml"):
> ->yield f
>
> dir is the folder1/2/3. Everything is ok, but I want to do
> something like this:
>
> for i in range(10,16):
> ->for f in glob.glob(dir +r"\201401{0}\*.xml".format(i)):
> -->yield f
>
> but the glob does not find any files (and of course there are
> some xml files, and the old way found them).
> Any help would be appreciated :)

I've done this two different ways. The simple way is very similar
to what you are now doing. It sucks because I have to manually
maintain the list of subdirectories to traverse every time I
create a new subdir.
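
For reference, the simple way might look roughly like the sketch below. The names find_xml, base and subdirs are made up for illustration, and the dated folder names are taken from the question; this is not the production code.

```python
import glob
import os


def find_xml(base, subdirs):
    """Yield every .xml path found in base/<subdir> for each listed subdir."""
    for sub in subdirs:
        # os.path.join supplies the separator, so the pattern needs
        # no backslashes and no raw strings.
        for fname in glob.glob(os.path.join(base, sub, '*.xml')):
            yield fname


# The hand-maintained list of subdirectories (hypothetical values).
dated = ['201401{0}'.format(i) for i in range(10, 16)]
```

Calling find_xml('folder1', dated) would then walk only the listed date folders, which is exactly the part that has to be kept up to date by hand.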

Here's the other way, using glob and isdir from os.path, adapted
from actual production code.

import glob
import os

class Miner:
    def __init__(self, archive):
        # setup goes here; prepare to acquire the data
        self.descend(os.path.join(archive, '*'))

    def descend(self, path):
        for fname in glob.glob(os.path.join(path, '*')):
            if os.path.isdir(fname):
                self.descend(fname)
            else:
                self.process(fname)

    def process(self, path):
        # Do what I want done with an actual file path.
        # This is where I add to the data.
        pass

In your case you might not want to process unless the path also
looks like an xml file.
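
One way that check could look, as a rough sketch rather than the production code: the same glob-and-isdir recursion, written as a standalone generator that only yields paths ending in .xml. The function name xml_files is made up for illustration, and matching on the extension (case-insensitively) is an assumption about what "looks like an xml file" should mean.

```python
import glob
import os


def xml_files(path):
    """Recursively yield paths under `path` whose names end in .xml."""
    for fname in glob.glob(os.path.join(path, '*')):
        if os.path.isdir(fname):
            # Recurse into subdirectories, as descend() does above.
            for found in xml_files(fname):
                yield found
        elif fname.lower().endswith('.xml'):
            # Assumption: the extension alone decides what counts as xml.
            yield fname
```

With this, "for f in xml_files('myxmldir'):" would visit every xml file at any depth without a list of date folders to maintain.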

mine = Miner('myxmldir')

Hmmm... I might be doing too much in __init__. ;)

-- 
Neil Cerutti



