Critique a newbie's "Smart-copy" class please...?

Alex Martelli alex at magenta.com
Sun Dec 19 13:39:20 EST 1999


Mostly to learn, partly to end up with a useful artefact,
I've written a 'smart-copy' class, intended to take as
input a file (or similar) which may include embedded
Python expressions and statements, and copy it to an
output file (or similar) with substitutions.

Statements and expressions are identified through
arbitrary regular-expression objects passed in to the
class, for maximum convenience (so regex's can be
chosen that will be unlikely to conflict with whatever
kind of material is being copied); a 'globals' dictionary
for the evaluation is also passed in.
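
A minimal sketch of just the expression half of this
idea, written in present-day Python 3 syntax and not
taken from the class below (the helper name substitute
and the sample line are made up for illustration):

```python
import re

def substitute(line, regex, namespace):
    """Replace every regex match in line with the eval of its group(1)."""
    def repl(match):
        # group(1) holds the text of the embedded Python expression
        return str(eval(match.group(1), namespace))
    return regex.sub(repl, line)

rex = re.compile(r'@([^@]+)@')
print(substitute('x is @x@, twice that is @x*2@', rex, {'x': 23}))
# prints: x is 23, twice that is 46
```

Because the regex object is a parameter, the delimiters
can be swapped for anything that does not clash with the
material being copied.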

For example, with the test-code:

if __name__=='__main__':
    "Test: stdin->stdout with full processing"
    import re
    import sys
    x=23
    rex=re.compile('@([^@]+)@')
    rbe=re.compile('\+')
    ren=re.compile('-')
    rco=re.compile('=')
    cop = copier(rex, globals(), rbe, ren, rco)
    cop.copy(sys.stdin)

expressions will be identified by being enclosed
in '@...@'; statements start on lines beginning with
+, may be continued by lines beginning with =,
and are terminated by lines beginning with -.
So, for example, if standard input is

a first line (x is @x@)
+for w in range(1,5):
  hello @w@ there
+   if w%2:
    the number @w@ is odd
=   else:
    the number @w@ is even
-
-
a last line (w is now @w@)

standard output will be:

a first line (x is 23)
  hello 1 there
    the number 1 is odd
  hello 2 there
    the number 2 is even
  hello 3 there
    the number 3 is odd
  hello 4 there
    the number 4 is even
a last line (w is now 4)


After not a little soul-searching I've settled for
'slurping' the file being read with .readlines(),
and a recursive set-up for nested statements
(mostly intended for if, for, while thingies).  It
seems like the simplest set of choices here.
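
To illustrate the nesting part, here is a rough sketch
(Python 3 syntax, not the class itself) of how a nesting
counter can find the 'finish' line that closes a given
statement. The function name find_finish is hypothetical,
and unlike copyblock below it scans to the end of the
list rather than to a 'last' bound:

```python
import re

def find_finish(lines, i, restat, restend):
    """Return the index of the 'finish' line closing the statement at
    lines[i], or len(lines) if the statement runs to end of input."""
    nest = 1          # the statement at lines[i] is already open
    j = i + 1
    while j < len(lines):
        if restend.match(lines[j]):
            nest -= 1
            if nest == 0:
                break             # j closes the statement opened at i
        elif restat.match(lines[j]):
            nest += 1             # a nested statement opens here
        j += 1
    return j

restat = re.compile(r'\+')
restend = re.compile('-')
lines = [
    'a first line (x is @x@)',
    '+for w in range(1,5):',
    '  hello @w@ there',
    '+   if w%2:',
    '    the number @w@ is odd',
    '=   else:',
    '    the number @w@ is even',
    '-',
    '-',
    'a last line (w is now @w@)',
]
print(find_finish(lines, 1, restat, restend))   # prints: 8
print(find_finish(lines, 3, restat, restend))   # prints: 7
```

The lines strictly between the statement and its finish
line form the body, which is what gets handed back to
the copy method recursively.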

I think the key feature still missing is including
other files, which would probably also be done
best by recursion, but before doing that I'd
like some critique of my style and choices by
more experienced Pythonistas.  I'm unsure
about everything, from how verbose or detailed
doc strings and comments should be in this
language, to whether I've gone overboard in
making things part of my object's state rather
than passing them in as arguments, etc, etc...

So, thanks in advance for whatever critiques and
suggestions you may offer.


"""
    Smart-copy of a "template" file to another file
    (e.g., to standard output, for CGI/&c purposes).

    The file is copied, except that Python expressions
    embedded in it are expanded (and some Python
    statements, such as conditionals and loops, can
    also be embedded, on a line-oriented basis).

    The Python expressions are identified through a
    compiled regular-expression object, which, on each
    and every line, is used for a _search_ on the string;
    if a MatchObject "match" results, match.group(1) is
    eval'd as a Python expression, and substituted in
    place; a dictionary for the evaluation must also be
    passed.  Many such matches per line are possible.

    Statements can also be embedded; this is mostly
    intended to be used with if/elif/else, for, while.

    Statement-related lines are recognized through 3
    more regular-expression objects that are passed in,
    one each for 'statement', 'continuation', 'finish',
    used for reg-ex 'match' (i.e., from line-start).

    The 'stat' and 'cont' re's are followed by the
    corresponding statement lines (beginning statement,
    and continuation statement -- the latter makes
    sense mostly for, e.g., 'else', and 'elif') while the
    rest of the 'finish' lines is ignored. Statements can
    be properly nested; all statements end at end of
    file unless 'finish' lines were found sooner.
"""

import sys
import string

class _nevermatch:
    "Polymorphic with a regex that never matches"
    def match(self, line):
        return 0

class copier:
    "Smart-copier class, see module's docstring"
    def copyblock(self,i,last):
        "Main copy method: process lines [i,last) of block"
        def repl(match,self=self):
            "return the eval of a found expression, for replacement"
            # str() rather than '%s' %, which misbehaves if the value is a tuple
            return str(eval(match.group(1),self.globals,self.locals))
        block = self.locals['_bl']
        while i<last:
            line = block[i]
            match = self.restat.match(line)
            if match:   # a statement starts "here" (at line [i])
                # i is the last line to _not_ process
                stat = string.strip(match.string[match.end(0):])
                j=i+1   # look for 'finish' r.e. from here onwards
                nest=1  # count nesting level of statements
                while j<last:
                    line = block[j]
                    # first look for nested statements or 'finish' lines
                    if self.restend.match(line):    # found a statement-end
                        nest = nest - 1     # update (decrease) nesting
                        if nest==0: break   # j is the first line to _not_ process
                    elif self.restat.match(line):   # found a nested statement
                        nest = nest + 1     # update (increase) nesting
                    elif nest==1:   # look for continuation only at this nesting
                        match = self.recont.match(line)
                        if match:                   # found a continuation-statement
                            nestat = string.strip(match.string[match.end(0):])
                            stat = '%s _cb(%s,%s)\n%s' % (stat,i+1,j,nestat)
                            i=j     # again, i is the last line to _not_ process
                    j=j+1
                stat = '%s _cb(%s,%s)' % (stat,i+1,j)
                # for debugging, uncomment...: print "-> Executing: {"+stat+"}"
                exec stat in self.globals,self.locals
                i=j+1
            else:       # normal line, just copy with substitution
                self.ouf.write(self.regex.sub(repl,line))
                i=i+1
    def __init__(self, regex, dict, restat=None, restend=None, recont=None,
                 ouf=sys.stdout):
        "Initialize self's data fields"
        self.regex      = regex
        self.globals    = dict
        self.locals     = { '_cb':self.copyblock }
        self.restat     = restat or _nevermatch()
        self.restend    = restend or _nevermatch()
        self.recont     = recont or _nevermatch()
        self.ouf        = ouf
    def copy(self, inf=sys.stdin, block=None):
        "Entry point: copy-with-processing a file, or block of lines"
        if not block: block = inf.readlines()
        self.locals['_bl'] = block
        self.copyblock(0, len(block))


Alex





