parsing directory for certain filetypes

Robert Bossy Robert.Bossy at jouy.inra.fr
Mon Mar 10 10:28:37 EDT 2008


royG wrote:
> hi
> i wrote a function to parse a given directory and make a sorted list
> of  files with .txt,.doc extensions .it works,but i want to know if it
> is too bloated..can this be rewritten in more efficient manner?
>
> here it is...
>
> from string import split
> from os.path import isdir,join,normpath
> from os import listdir
>
> def parsefolder(dirname):
>     filenms=[]
>     folder=dirname
>     isadr=isdir(folder)
>     if (isadr):
>         dirlist=listdir(folder)
>         filenm=""
>   
This last line is unnecessary: variable scope rules in Python are a bit 
different from what we're used to. You're not required to 
declare/initialize a variable; you're only required to assign a value 
before it is referenced.
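
A small sketch of that rule (last_name is a made-up helper, not part of 
your code): the loop variable is never declared up front, and the only 
failure case is referencing it before anything has been assigned.

```python
def last_name(names):
    # 'current' is not initialized beforehand; the for statement binds it
    # on each iteration.
    for current in names:
        pass
    # Fine when names is non-empty. With an empty list nothing was ever
    # assigned, so this raises UnboundLocalError (a subclass of NameError).
    return current
```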


>         for x in dirlist:
>              filenm=x
> 	     if(filenm.endswith(("txt","doc"))):
>                  nmparts=[]
> 		 nmparts=split(filenm,'.' )
>                  if((nmparts[1]=='txt') or (nmparts[1]=='doc')):
>   
I don't get it. You've already checked that filenm ends with "txt" or 
"doc"... What is the purpose of these three lines?
Btw, again, nmparts=[] is unnecessary.
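
If the intent of the second test was to look at the extension itself, 
here is one possible sketch using os.path.splitext. The helper name 
has_wanted_extension and its default tuple are only assumptions for 
illustration. Note that nmparts[1] is the second dot-separated piece, 
not the last one, so a name such as "a.b.txt" would be rejected by the 
original check.

```python
import os.path

def has_wanted_extension(filename, extensions=(".txt", ".doc")):
    # splitext splits off the last extension only:
    # "a.b.txt" -> ("a.b", ".txt"), "footxt" -> ("footxt", "")
    return os.path.splitext(filename)[1] in extensions
```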

>                       filenms.append(filenm)
>         filenms.sort()
>         filenameslist=[]
>   
Unnecessary initialization.

>         filenameslist=[normpath(join(folder,y)) for y in filenms]
> 	numifiles=len(filenameslist)
>   
numifiles is never used, so this line can go as well.

> 	print filenameslist
> 	return filenameslist
>   

Personally, I'd use glob.glob:


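import os.path
import glob

def parsefolder(folder):
    # Match only the .txt files directly inside folder.
    path = os.path.normpath(os.path.join(folder, '*.txt'))
    # glob.glob already returns a list, so it can be sorted in place.
    lst = glob.glob(path)
    lst.sort()
    return lst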


I leave adding .doc files to you as an exercise. But I must say (to 
whoever's listening) that I was a bit disappointed that 
glob('*.{txt,doc}') didn't work.
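
Since glob does not expand brace patterns, one possible way to cover 
several extensions is to run one pattern per extension and merge the 
results. This is only a sketch: the name parsefolder_multi and the 
extensions parameter are assumptions, not anything glob provides.

```python
import glob
import os.path

def parsefolder_multi(folder, extensions=("txt", "doc")):
    found = []
    for ext in extensions:
        # One glob call per extension, e.g. folder/*.txt then folder/*.doc
        pattern = os.path.join(folder, "*." + ext)
        found.extend(glob.glob(pattern))
    # Normalize each path and return them as one sorted list.
    return sorted(os.path.normpath(p) for p in found)
```

Calling parsefolder_multi(".") would then list the .txt and .doc files 
of the current directory in sorted order.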

Cheers,
RB


