recursive file editing
Peter Otten
__peter__ at web.de
Sun Apr 4 06:11:25 EDT 2004
TaeKyon wrote:
> I'm a python newbie; here are a few questions relative to a
> problem I'm trying to solve; I'm wondering if python is the best
> instrument or if awk or a mix of bash and sed would be better:
>
> 1) how would I recursively descend
> through all files in all subdirectories of
> the one in which my script is called ?
>
> 2) each file examined should be edited; IF a string of this type is found
>
> foo.asp=dev?bar (where bar can be a digit or an empty space)
>
> it should always be substituted with this string
>
> foo-bar.html (if bar is an empty space the new string is foo-.html)
>
> 3) the names of files read may themselves be of the sort foo.asp=dev?bar;
> the edited output file should also be renamed according to the same rule
> as above ... or would this be better handled by a bash script ?
>
> Any hints appreciated
>
The following code comes with no warranties. Be sure to back up valuable data
before trying it. You may need to edit the regular expressions. Call the
script with the directory you want to process.
Peter

import os, re, sys

class Path(object):
    def __init__(self, folder, name):
        self.folder = folder
        self.name = name
    def _get_path(self):
        return os.path.join(self.folder, self.name)
    path = property(_get_path)
    def rename(self, newname):
        # only touch the file system if the name actually changes
        if self.name != newname:
            os.rename(self.path, os.path.join(self.folder, newname))
            self.name = newname
    def processContents(self, operation):
        # open() instead of file(): the file() builtin is gone in Python 3
        with open(self.path) as f:
            data = f.read()
        newdata = operation(data)
        # rewrite the file only if the operation changed something
        if data != newdata:
            with open(self.path, "w") as f:
                f.write(newdata)
    def __str__(self):
        return self.path

def files(rootfolder):
    # yield a Path for every file below rootfolder, at any depth
    for folder, folders, files in os.walk(rootfolder):
        for name in files:
            yield Path(folder, name)

# file names: foo.asp=dev?bar -> foo-bar.html
fileExpr = re.compile(r"^(.+?)\.asp\=dev\?(.*)$")
filePattern = r"\1-\2.html"

# file contents: same rule, for names delimited by quotes (or a leading slash)
textExpr = re.compile(r"([/\'\"])(.+?)\.asp\=dev\?(.*?)([\'\"])")
textPattern = r"\1\2-\3.html\4"

if __name__ == "__main__":
    for f in files(sys.argv[1]):
        f.rename(fileExpr.sub(filePattern, f.name))
        f.processContents(lambda s: textExpr.sub(textPattern, s))
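
Before letting the script loose on real files, it may help to try the two regular expressions on a few strings in isolation. The following is only a sketch: the helper names new_name and new_text and the sample inputs are made up for illustration, and the sample link assumes the names appear inside quoted attributes, which may not match your data.

```python
import re

# Same patterns as in the script above, without the redundant escapes.
file_expr = re.compile(r"^(.+?)\.asp=dev\?(.*)$")
file_pattern = r"\1-\2.html"
text_expr = re.compile(r"([/'\"])(.+?)\.asp=dev\?(.*?)(['\"])")
text_pattern = r"\1\2-\3.html\4"

def new_name(name):
    # hypothetical helper: apply the file-name rule to one name
    return file_expr.sub(file_pattern, name)

def new_text(text):
    # hypothetical helper: apply the content rule to a chunk of text
    return text_expr.sub(text_pattern, text)

if __name__ == "__main__":
    # made-up sample inputs; nothing on disk is touched
    for sample in ["foo.asp=dev?3", "foo.asp=dev?", "index.html"]:
        print(sample, "->", new_name(sample))
    print(new_text('<a href="/foo.asp=dev?3">x</a>'))
```

Names that do not contain .asp=dev? come back unchanged, so rename() in the script leaves those files alone. Note that the content pattern only fires when the name is followed by a quote character; unquoted occurrences would need a different expression.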