Critique a newbie's "Smart-copy" class please...?
Alex Martelli
alex at magenta.com
Sun Dec 19 13:39:20 EST 1999
Mostly to learn, partly to end up with a useful artefact,
I've written a 'smart-copy' class, intended to take as
input a file (or similar) which may include embedded
Python expressions and statements, and copy it to an
output file (or similar) with substitutions.
Statements and expressions are identified through
arbitrary regular-expression objects passed in to the
class, for maximum convenience (so regexes can be
chosen that will be unlikely to conflict with whatever
kind of material is being copied); a 'globals' dictionary
for the evaluation is also passed in.
For example, with the test-code:

if __name__=='__main__':
    "Test: stdin->stdout with full processing"
    import re
    import sys
    x=23
    rex=re.compile('@([^@]+)@')
    rbe=re.compile('\+')
    ren=re.compile('-')
    rco=re.compile('=')
    cop = copier(rex, globals(), rbe, ren, rco)
    cop.copy(sys.stdin)

expressions will be identified by being enclosed
in '@...@'; statements are on lines starting with
+, may be continued with lines starting with
=, and are terminated with lines starting with -.
So, for example, if standard input is
a first line (x is @x@)
+for w in range(1,5):
hello @w@ there
+ if w%2:
the number @w@ is odd
= else:
the number @w@ is even
-
-
a last line (w is now @w@)
standard output will be:
a first line (x is 23)
hello 1 there
the number 1 is odd
hello 2 there
the number 2 is even
hello 3 there
the number 3 is odd
hello 4 there
the number 4 is even
a last line (w is now 4)
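
The expression half of this can be tried in isolation. What follows is only a minimal sketch, assuming Python 3 rather than the Python the class was written for; the helper name expand is hypothetical and not part of the class. It shows nothing more than the regex-plus-eval substitution applied to a single line.

```python
import re

# Marker convention taken from the test code: expressions sit between @...@.
rex = re.compile(r'@([^@]+)@')

def expand(line, namespace):
    """Hypothetical helper: replace every @expr@ in line with str(eval(expr))."""
    def repl(match):
        # group(1) is the text between the markers; evaluate it in namespace
        return str(eval(match.group(1), namespace))
    # sub() calls repl once per match, so many expressions per line are fine
    return rex.sub(repl, line)

print(expand("a first line (x is @x@)", {'x': 23}))
# prints: a first line (x is 23)
```

Passing a function to sub() is what allows several matches per line without any explicit looping.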
After not a little soul-searching I've settled for
'slurping' the file being read with .readlines(),
and a recursive set-up for nested statements
(mostly intended for if, for, while thingies). It
seems like the simplest set of choices here.
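
The recursive set-up can be pictured without any file handling at all. This is a hedged sketch assuming Python 3 (where exec is a function); record is a hypothetical stand-in for the real block-copying method, and the line numbers are the ones the sample input above would produce. Each statement line is turned into a one-line compound statement whose body is a call back into the copier for the enclosed range of lines.

```python
calls = []

def record(start, end):
    "Hypothetical stand-in for the copier method: note the requested line range"
    calls.append((start, end))

# The statement is looked up under the name '_cb' when the text is exec'd.
namespace = {'_cb': record}

# '+for w in range(1,5):' whose body is lines [2,8) of the sample input
stat = 'for w in range(1,5):'
stat = '%s _cb(%s,%s)' % (stat, 2, 8)
exec(stat, {}, namespace)
print(stat)        # for w in range(1,5): _cb(2,8)
print(len(calls))  # 4, one call per loop iteration

# '+ if w%2:' continued by '= else:', covering lines [4,5) and [6,7)
del calls[:]
stat = 'if w%2:'
stat = '%s _cb(%s,%s)\n%s' % (stat, 4, 5, 'else:')
stat = '%s _cb(%s,%s)' % (stat, 6, 7)
for w in (1, 2):
    namespace['w'] = w
    exec(stat, {}, namespace)
print(calls)       # [(4, 5), (6, 7)]
```

Because the body of every statement is just another call on a line range, nesting needs no extra machinery: the inner call finds and executes its own statements.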
I think the key feature still missing is including
other files, which would probably also be done
best by recursion, but before doing that I'd
like some critique of my style and choices by
more experienced Pythonistas. I'm unsure
about everything: how verbose or detailed
doc strings and comments should be
in this language, whether I've gone overboard in
making things part of my object's state rather
than passing them in as arguments, and so on.
So, thanks in advance for whatever critiques and
suggestions you may offer.
"""
Smart-copy of a "template" file to another file
(e.g., to standard output, for CGI/&c purposes).
The file is copied, except that Python expressions
embedded in it are expanded (and some Python
statements, such as conditionals and loops, can
also be embedded, on a line-oriented basis).
The Python expressions are identified through a
compiled regular-expression object, which, on each
and every line, is used for a _search_ on the string;
if a MatchObject "match" results, match.group(1) is
eval'd as a Python expression, and substituted in
place; a dictionary for the evaluation must also be
passed. Many such matches per line are possible.
Statements can also be embedded; this is mostly
intended to be used with if/elif/else, for, while.
Statement-related lines are recognized through 3
more regular-expression objects that are passed in,
one each for 'statement', 'continuation', 'finish',
used for reg-ex 'match' (i.e., from line-start).
On a line matching the 'stat' or 'cont' re, the text
after the match is the statement itself (the beginning
statement, or the continuation statement; the latter
makes sense mostly for, e.g., 'else' and 'elif'), while
the rest of a 'finish' line is ignored. Statements can
be properly nested; all statements end at end of
file unless 'finish' lines were found sooner.
"""
import sys
import string

class _nevermatch:
    "Polymorphic with a regex that never matches"
    def match(self, line):
        return 0

class copier:
    "Smart-copier class, see module's docstring"

    def copyblock(self,i,last):
        "Main copy method: process lines [i,last) of block"
        def repl(match,self=self):
            "return the eval of a found expression, for replacement"
            return '%s' % eval(match.group(1),self.globals,self.locals)
        block = self.locals['_bl']
        while i<last:
            line = block[i]
            match = self.restat.match(line)
            if match:   # a statement starts "here" (at line [i])
                # i is the last line to _not_ process
                stat = string.strip(match.string[match.end(0):])
                j=i+1   # look for 'finish' r.e. from here onwards
                nest=1  # count nesting level of statements
                while j<last:
                    line = block[j]
                    # first look for nested statements or 'finish' lines
                    if self.restend.match(line):    # found a statement-end
                        nest = nest - 1     # update (decrease) nesting
                        if nest==0: break   # j is the first line to _not_ process
                    elif self.restat.match(line):   # found a nested statement
                        nest = nest + 1     # update (increase) nesting
                    elif nest==1:   # look for continuation only at this nesting
                        match = self.recont.match(line)
                        if match:   # found a continuation-statement
                            nestat = string.strip(match.string[match.end(0):])
                            stat = '%s _cb(%s,%s)\n%s' % (stat,i+1,j,nestat)
                            i=j     # again, i is the last line to _not_ process
                    j=j+1
                stat = '%s _cb(%s,%s)' % (stat,i+1,j)
                # for debugging, uncomment...: print "-> Executing: {"+stat+"}"
                exec stat in self.globals,self.locals
                i=j+1
            else:       # normal line, just copy with substitution
                self.ouf.write(self.regex.sub(repl,line))
                i=i+1

    def __init__(self, regex, dict, restat=None, restend=None, recont=None,
                 ouf=sys.stdout):
        "Initialize self's data fields"
        self.regex = regex
        self.globals = dict
        self.locals = { '_cb':self.copyblock }
        self.restat = restat or _nevermatch()
        self.restend = restend or _nevermatch()
        self.recont = recont or _nevermatch()
        self.ouf = ouf

    def copy(self, inf=sys.stdin, block=None):
        "Entry point: copy-with-processing a file, or block of lines"
        if not block: block = inf.readlines()
        self.locals['_bl'] = block
        self.copyblock(0, len(block))
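
For anyone wanting to try the example on a current interpreter, here is a hedged Python 3 rendering of the same logic. It is a sketch under assumptions, not the posted code: the names Copier and NeverMatch are hypothetical, exec is called as a function, string methods replace the string module, the stream defaults are resolved at call time rather than at definition time, and in-memory streams stand in for stdin and stdout.

```python
import io
import re
import sys

class NeverMatch:
    "Stands in for a regex that never matches"
    def match(self, line):
        return None

class Copier:
    "Python 3 sketch of the copier class (hypothetical name)"
    def __init__(self, regex, namespace, restat=None, restend=None,
                 recont=None, ouf=None):
        self.regex = regex
        self.globals = namespace
        self.locals = {'_cb': self.copyblock}
        self.restat = restat or NeverMatch()
        self.restend = restend or NeverMatch()
        self.recont = recont or NeverMatch()
        self.ouf = ouf if ouf is not None else sys.stdout

    def copyblock(self, i, last):
        "Process lines [i,last) of the current block"
        def repl(match):
            return str(eval(match.group(1), self.globals, self.locals))
        block = self.locals['_bl']
        while i < last:
            line = block[i]
            match = self.restat.match(line)
            if match:                       # a statement starts at line i
                stat = line[match.end(0):].strip()
                j = i + 1
                nest = 1
                while j < last:
                    line = block[j]
                    if self.restend.match(line):
                        nest -= 1
                        if nest == 0:
                            break           # j is the matching 'finish' line
                    elif self.restat.match(line):
                        nest += 1
                    elif nest == 1:         # continuations count only here
                        match = self.recont.match(line)
                        if match:
                            nestat = line[match.end(0):].strip()
                            stat = '%s _cb(%s,%s)\n%s' % (stat, i + 1, j, nestat)
                            i = j
                    j += 1
                stat = '%s _cb(%s,%s)' % (stat, i + 1, j)
                exec(stat, self.globals, self.locals)
                i = j + 1
            else:                           # normal line: copy with substitution
                self.ouf.write(self.regex.sub(repl, line))
                i += 1

    def copy(self, inf=None, block=None):
        "Entry point: copy-with-processing a file, or block of lines"
        if not block:
            block = (inf if inf is not None else sys.stdin).readlines()
        self.locals['_bl'] = block
        self.copyblock(0, len(block))

# The sample input from the post, fed through in-memory streams.
template = io.StringIO(
    "a first line (x is @x@)\n"
    "+for w in range(1,5):\n"
    "hello @w@ there\n"
    "+ if w%2:\n"
    "the number @w@ is odd\n"
    "= else:\n"
    "the number @w@ is even\n"
    "-\n"
    "-\n"
    "a last line (w is now @w@)\n")
out = io.StringIO()
cop = Copier(re.compile(r'@([^@]+)@'), {'x': 23},
             re.compile(r'\+'), re.compile('-'), re.compile('='), out)
cop.copy(template)
print(out.getvalue(), end='')
```

Run on the sample input, this sketch should reproduce the ten output lines listed earlier, from "a first line (x is 23)" to "a last line (w is now 4)".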
Alex