file handling
Alex Martelli
alex at magenta.com
Thu Aug 10 07:49:34 EDT 2000
<tjernstrom at my-deja.com> wrote in message
news:8mtt9m$90j$1 at nnrp1.deja.com...
> I'm new to Python and have trouble finding an answer to a simple
> question.
>
> I have written a small script that processes HTML files. All I need to do
> is find a way to send all html files in a directory to this file while
> ignoring the rest of the files in this directory (and do the same for
> all the subdirectories).
> Here's my short script:
>
> import re
> def ProcessFile(file, spath, tpath):
[snip]
> I'd really appreciate help or some tips on where in the documentation to
> look for answers.
The Library Reference is what you want -- it's available online and
is also part of the Python installation.
If it weren't for the 'all the subdirectories' part, then:
    def processAll(spath, tpath):
        import glob
        import os.path
        for file in glob.glob(os.path.join(spath, "*.html")):
            ProcessFile(file, '', tpath)
might work. Note that glob.glob() preserves the directory part of
each path it returns, and you probably don't want the trouble of
os.path.split-ting it again; hence the '' passed as the second
argument to ProcessFile.
[ProcessFile would also be well advised to use os.path functions
rather than string-level operations, by the way.]
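To illustrate the os.path remark: a minimal sketch, assuming tpath
names a target directory and that ProcessFile needs one path to read
from and one to write to. build_paths is a hypothetical helper; the
body of ProcessFile was snipped, so its real logic may differ.

```python
import os.path

def build_paths(file, spath, tpath):
    """Hypothetical helper: return (source, target) paths for one file."""
    # os.path.join ignores an empty leading component, so spath == ''
    # leaves a glob-style path such as 'site/index.html' unchanged.
    source = os.path.join(spath, file)
    # Assumption: the output keeps the same file name inside tpath.
    target = os.path.join(tpath, os.path.basename(file))
    return source, target
```

This works the same way whether the caller passes a bare file name
plus its directory, or a full path plus ''.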
Since you need to look in all the subdirectories too, you'll
need os.path.walk to walk the tree and fnmatch.fnmatch
to do the selection. Here's a decent approach:
    import os.path
    import fnmatch

    def ishtml(file):
        return fnmatch.fnmatch(file, '*.html')

    # os.path.walk calls this once per directory as
    # visitor(arg, dirname, names); tpath arrives as arg.
    def processadir(tpath, path, files):
        for file in filter(ishtml, files):
            ProcessFile(file, path, tpath)

    def processAll(spath, tpath):
        os.path.walk(spath, processadir, tpath)
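A note for readers on current Python: os.path.walk was removed in
Python 3, and os.walk is its replacement. The following is a rough
equivalent of the approach above, offered as a sketch rather than a
drop-in. The names find_html and process_all are hypothetical, and
process_file stands in for the poster's ProcessFile, which is assumed
to take (file, spath, tpath) as in the quoted script.

```python
import fnmatch
import os

def find_html(spath):
    """Yield (directory, filename) for each *.html file under spath."""
    # os.walk visits spath and every subdirectory below it.
    for dirpath, dirnames, filenames in os.walk(spath):
        # fnmatch.filter keeps only the names matching the pattern.
        for name in fnmatch.filter(filenames, '*.html'):
            yield dirpath, name

def process_all(spath, tpath, process_file):
    # process_file is a stand-in for ProcessFile(file, spath, tpath).
    for dirpath, name in find_html(spath):
        process_file(name, dirpath, tpath)
```

Passing the function in as an argument keeps the sketch
self-contained; calling ProcessFile directly would work just as well.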
Alex