From jeremy@cnri.reston.va.us Sun Mar 5 17:58:12 2000
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Sun, 5 Mar 2000 12:58:12 -0500 (EST)
Subject: [Compiler-sig] copyright/license BS (was: P2C stuff)
In-Reply-To:
References:
Message-ID: <14530.41012.487618.285485@bitdiddle.cnri.reston.va.us>

Getting back to this thread a little late...

>>>>> "GS" == Greg Stein writes:

  GS> No... this is saying "do whatever. I don't care." In no way do I
  GS> believe anybody *is* trying to claim ownership. I'm simply
  GS> saying that Jeremy (and/or whoever) can do what they want. Do
  GS> whatever. No need to check with me.

I guess I'm not clear on the status of p2c. The main reason I asked
was to avoid having two out-of-sync copies of transformer.py in the
world.

  GS> Heck... many of the modules that I've written, I call Public
  GS> Domain. In other words: I'm not even asserting a copyright!

Except that P2C has a copyright notice, and is not in the public
domain. That's the other reason I asked (though a secondary reason).

Jeremy

From jeremy@cnri.reston.va.us Mon Mar 6 19:28:12 2000
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 6 Mar 2000 14:28:12 -0500 (EST)
Subject: [Compiler-sig] example checkers based on compiler package
Message-ID: <14532.1740.90292.440395@goon.cnri.reston.va.us>

There was some discussion on python-dev over the weekend about
generating warnings, and Moshe Zadka posted a selfnanny that warned
about methods that didn't have self as the first argument.

I think these kinds of warnings are useful, and I'd like to see a more
general framework for them built on the Python abstract syntax
originally from P2C. Ideally, they would be available as command line
tools and integrated into GUIs like IDLE in some useful way.

I've included a couple of quick examples I coded up last night based
on the compiler package (recently re-factored) that is resident in
python/nondist/src/Compiler.
The analysis on the one that checks for name errors is a bit of a
mess, but the overall structure seems right.

I'm hoping to collect a few more examples of checkers and generalize
from them to develop a framework for checking for errors and reporting
them.

Jeremy

------------ checkself.py ------------
"""Check for methods that do not have self as the first argument"""

from compiler import parseFile, walk, ast, misc

class Warning:
    def __init__(self, filename, klass, method, lineno, msg):
        self.filename = filename
        self.klass = klass
        self.method = method
        self.lineno = lineno
        self.msg = msg

    _template = "%(filename)s:%(lineno)s %(klass)s.%(method)s: %(msg)s"

    def __str__(self):
        return self._template % self.__dict__

class NoArgsWarning(Warning):
    super_init = Warning.__init__

    def __init__(self, filename, klass, method, lineno):
        self.super_init(filename, klass, method, lineno,
                        "no arguments")

class NotSelfWarning(Warning):
    super_init = Warning.__init__

    def __init__(self, filename, klass, method, lineno, argname):
        self.super_init(filename, klass, method, lineno,
                        "self slot is named %s" % argname)

class CheckSelf:
    def __init__(self, filename):
        self.filename = filename
        self.warnings = []
        self.scope = misc.Stack()

    def inClass(self):
        if self.scope:
            return isinstance(self.scope.top(), ast.Class)
        return 0

    def visitClass(self, klass):
        self.scope.push(klass)
        self.visit(klass.code)
        self.scope.pop()
        return 1

    def visitFunction(self, func):
        if self.inClass():
            classname = self.scope.top().name
            if len(func.argnames) == 0:
                w = NoArgsWarning(self.filename, classname, func.name,
                                  func.lineno)
                self.warnings.append(w)
            elif func.argnames[0] != "self":
                w = NotSelfWarning(self.filename, classname, func.name,
                                   func.lineno, func.argnames[0])
                self.warnings.append(w)
        self.scope.push(func)
        self.visit(func.code)
        self.scope.pop()
        return 1

def check(filename):
    # the visitor is bound to "checker" so that it does not rebind
    # (and so clobber) this function's own global name
    global p, checker
    p = parseFile(filename)
    checker = CheckSelf(filename)
    walk(p, checker)
    for w in checker.warnings:
        print w

if __name__ == "__main__":
    import sys

    # XXX need to do real arg processing
    check(sys.argv[1])

------------ badself.py ------------
def foo():
    return 12

class Foo:
    def __init__():
        pass

    def foo(self, foo):
        pass

    def bar(this, that):
        def baz(this=that):
            return this
        return baz

def bar():
    class Quux:
        def __init__(self):
            self.sum = 1
        def quam(x, y):
            self.sum = self.sum + (x * y)
    return Quux()

------------ checknames.py ------------
"""Check for NameErrors"""

from compiler import parseFile, walk
from compiler.misc import Stack, Set

import __builtin__
from UserDict import UserDict

class Warning:
    def __init__(self, filename, funcname, lineno):
        self.filename = filename
        self.funcname = funcname
        self.lineno = lineno

    def __str__(self):
        return self._template % self.__dict__

class UndefinedLocal(Warning):
    super_init = Warning.__init__

    def __init__(self, filename, funcname, lineno, name):
        self.super_init(filename, funcname, lineno)
        self.name = name

    _template = "%(filename)s:%(lineno)s %(funcname)s undefined local %(name)s"

class NameError(UndefinedLocal):
    _template = "%(filename)s:%(lineno)s %(funcname)s undefined name %(name)s"

class NameSet(UserDict):
    """Track names and the line numbers where they are referenced"""
    def __init__(self):
        self.data = self.names = {}

    def add(self, name, lineno):
        l = self.names.get(name, [])
        l.append(lineno)
        self.names[name] = l

class CheckNames:
    def __init__(self, filename):
        self.filename = filename
        self.warnings = []
        self.scope = Stack()
        self.gUse = NameSet()
        self.gDef = NameSet()
        # _locals is the stack of local namespaces
        # locals is the top of the stack
        self._locals = Stack()
        self.lUse = None
        self.lDef = None
        self.lGlobals = None # var declared global
        # holds scope,def,use,global triples for later analysis
        self.todo = []

    def enterNamespace(self, node):
##        print node.name
        self.scope.push(node)
        self.lUse = use = NameSet()
        self.lDef = _def = NameSet()
        self.lGlobals = gbl = NameSet()
        self._locals.push((use, _def, gbl))

    def exitNamespace(self):
##        print
        self.todo.append((self.scope.top(), self.lDef, self.lUse,
                          self.lGlobals))
        self.scope.pop()
        self._locals.pop()
        if self._locals:
            self.lUse, self.lDef, self.lGlobals = self._locals.top()
        else:
            self.lUse = self.lDef = self.lGlobals = None

    def warn(self, warning, funcname, lineno, *args):
        args = (self.filename, funcname, lineno) + args
        self.warnings.append(apply(warning, args))

    def defName(self, name, lineno, local=1):
##        print "defName(%s, %s, local=%s)" % (name, lineno, local)
        if self.lUse is None:
            self.gDef.add(name, lineno)
        elif local == 0:
            self.gDef.add(name, lineno)
            self.lGlobals.add(name, lineno)
        else:
            self.lDef.add(name, lineno)

    def useName(self, name, lineno, local=1):
##        print "useName(%s, %s, local=%s)" % (name, lineno, local)
        if self.lUse is None:
            self.gUse.add(name, lineno)
        elif local == 0:
            self.gUse.add(name, lineno)
            self.lUse.add(name, lineno)
        else:
            self.lUse.add(name, lineno)

    def check(self):
        for s, d, u, g in self.todo:
            self._check(s, d, u, g, self.gDef)
        # XXX then check the globals

    def _check(self, scope, _def, use, gbl, globals):
        # check for NameError
        # a name is defined iff it is in def.keys()
        # a name is global iff it is in gdefs.keys()
        gdefs = UserDict()
        gdefs.update(globals)
        gdefs.update(__builtin__.__dict__)
        defs = UserDict()
        defs.update(gdefs)
        defs.update(_def)
        errors = Set()
        for name in use.keys():
            if not defs.has_key(name):
                firstuse = use[name][0]
                self.warn(NameError, scope.name, firstuse, name)
                errors.add(name)

        # check for UndefinedLocalNameError
        # order == use & def sorted by lineno
        # elements are lineno, flag, name
        # flag = 0 if use, flag = 1 if def
        order = []
        for name, lines in use.items():
            if gdefs.has_key(name) and not _def.has_key(name):
                # this is a global ref, we can skip it
                continue
            for lineno in lines:
                # append takes one argument, so pass an explicit tuple
                order.append((lineno, 0, name))
        for name, lines in _def.items():
            for lineno in lines:
                order.append((lineno, 1, name))
        order.sort()
        # ready contains names that have been defined or warned about
        ready = Set()
        for lineno, flag, name in order:
            if flag == 0: # use
                if not ready.has_elt(name) and not errors.has_elt(name):
                    self.warn(UndefinedLocal, scope.name, lineno, name)
                    ready.add(name) # don't warn again
            else:
                ready.add(name)

    # below are visitor methods

    def visitFunction(self, node, noname=0):
        for expr in node.defaults:
            self.visit(expr)
        if not noname:
            self.defName(node.name, node.lineno)
        self.enterNamespace(node)
        for name in node.argnames:
            self.defName(name, node.lineno)
        self.visit(node.code)
        self.exitNamespace()
        return 1

    def visitLambda(self, node):
        return self.visitFunction(node, noname=1)

    def visitClass(self, node):
        for expr in node.bases:
            self.visit(expr)
        self.defName(node.name, node.lineno)
        self.enterNamespace(node)
        self.visit(node.code)
        self.exitNamespace()
        return 1

    def visitName(self, node):
        self.useName(node.name, node.lineno)

    def visitGlobal(self, node):
        for name in node.names:
            self.defName(name, node.lineno, local=0)

    def visitImport(self, node):
        for name in node.names:
            self.defName(name, node.lineno)

    visitFrom = visitImport

    def visitAssName(self, node):
        self.defName(node.name, node.lineno)

def check(filename):
    global p, checker
    p = parseFile(filename)
    checker = CheckNames(filename)
    walk(p, checker)
    checker.check()
    for w in checker.warnings:
        print w

if __name__ == "__main__":
    import sys

    # XXX need to do real arg processing
    check(sys.argv[1])

------------ badnames.py ------------
# XXX can we detect race conditions on accesses to global variables?
#     probably can (conservatively) by noting variables _created_ by
#     global decls in funcs
import string
import time

def foo(x):
    return x + y

def foo2(x):
    return x + z

a = 4

def foo3(x):
    a, b = x, a

def bar(x):
    z = x
    global z

def bar2(x):
    f = string.strip
    a = f(x)
    import string
    return string.lower(a)

def baz(x, y):
    return x + y + z

def outer(x):
    def inner(y):
        return x + y
    return inner

From Moshe Zadka Tue Mar 7 05:25:43 2000
From: Moshe Zadka (Moshe Zadka)
Date: Tue, 7 Mar 2000 07:25:43 +0200 (IST)
Subject: [Compiler-sig] Re: example checkers based on compiler package
In-Reply-To: <14532.1740.90292.440395@goon.cnri.reston.va.us>
Message-ID:

On Mon, 6 Mar 2000, Jeremy Hylton wrote:

> I think these kinds of warnings are useful, and I'd like to see a more
> general framework for them built on the Python abstract syntax
> originally from P2C. Ideally, they would be available as command line
> tools and integrated into GUIs like IDLE in some useful way.

Yes! Guido already suggested we have a standard API to them. One thing
I suggested was that the abstract API include not only the input (one form
or another of an AST), but the output: so IDEs wouldn't have to parse
strings, but get a warning class. Something like:

An output of a warning can be a subclass of GeneralWarning, and should
implement the following methods:

	1. line-no() -- returns an integer
	2. columns() -- returns either a pair of integers, or None
	3. message() -- returns a string containing a message
	4. __str__() -- comes for free if inheriting GeneralWarning,
	                and formats the warning message.

> I've included a couple of quick examples I coded up last night based
> on the compiler package (recently re-factored) that is resident in
> python/nondist/src/Compiler. The analysis on the one that checks for
> name errors is a bit of a mess, but the overall structure seems right.

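The warning interface proposed above might be sketched roughly as
follows. This is only an illustration, not code from the thread: the
constructor signature and the NotSelfWarning subclass are assumptions,
and because "line-no" is not a valid Python identifier the sketch
spells that method lineno(). It is written for a current Python 3
interpreter rather than the 1.5-era one used elsewhere in this thread.

```python
class GeneralWarning:
    """Hypothetical base class for checker output (names are assumptions)."""

    def __init__(self, lineno, message, columns=None):
        self._lineno = lineno
        self._message = message
        self._columns = columns  # (start, end) pair, or None if unknown

    def lineno(self):
        # The proposal writes this as "line-no()"; renamed here because
        # a hyphen cannot appear in a Python method name.
        return self._lineno

    def columns(self):
        return self._columns

    def message(self):
        return self._message

    def __str__(self):
        # Subclasses get this formatting for free.
        where = "line %d" % self.lineno()
        if self.columns() is not None:
            where += ", columns %d-%d" % self.columns()
        return "%s: %s" % (where, self.message())


class NotSelfWarning(GeneralWarning):
    """Example subclass: the first method argument is not named self."""

    def __init__(self, lineno, argname):
        GeneralWarning.__init__(self, lineno,
                                "self slot is named %s" % argname)


w = NotSelfWarning(12, "this")
print(w)
```

An IDE could then call w.lineno() and w.columns() directly instead of
parsing the formatted string.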
One thing I had trouble with is that in my implementation of selfnanny,
I used Python's stack for recursion while you used an explicit stack.
It's probably because of the visitor pattern, which is just another
argument for co-routines and generators.

> I'm hoping to collect a few more examples of checkers and generalize
> from them to develop a framework for checking for errors and reporting
> them.

Cool!
Brainstorming: what kind of warnings would people find useful? In
selfnanny, I wanted to include checking for assignment to self, and
checking for "possible use before definition of local variables" sounds
good. Another check could be a CP4E "checking that no two identifiers
differ only by case". I might code up a few if I have the time...

What I'd really want (but it sounds really hard) is a framework for
partial ASTs: warning people as they write code.

--
Moshe Zadka .
http://www.oreilly.com/news/prescod_0300.html

From mwh21@cam.ac.uk Tue Mar 7 08:31:23 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 07 Mar 2000 08:31:23 +0000
Subject: [Compiler-sig] Re: example checkers based on compiler package
In-Reply-To: Moshe Zadka's message of "Tue, 7 Mar 2000 07:25:43 +0200 (IST)"
References:
Message-ID:

Moshe Zadka writes:

> On Mon, 6 Mar 2000, Jeremy Hylton wrote:
>
> > I think these kinds of warnings are useful, and I'd like to see a more
> > general framework for them built on the Python abstract syntax
> > originally from P2C. Ideally, they would be available as command line
> > tools and integrated into GUIs like IDLE in some useful way.
>
> Yes! Guido already suggested we have a standard API to them. One thing
> I suggested was that the abstract API include not only the input (one form
> or another of an AST), but the output: so IDEs wouldn't have to parse
> strings, but get a warning class.

That would be seriously cool.

> Something like:
>
> An output of a warning can be a subclass of GeneralWarning, and should
> implement the following methods:
>
> 	1. line-no() -- returns an integer
> 	2. columns() -- returns either a pair of integers, or None
> 	3. message() -- returns a string containing a message
> 	4. __str__() -- comes for free if inheriting GeneralWarning,
> 	                and formats the warning message.

Wouldn't it make sense to include function/class name here too? A
checker is likely to know, and it would save reparsing to find it out.

[little snip]

> > I'm hoping to collect a few more examples of checkers and generalize
> > from them to develop a framework for checking for errors and reporting
> > them.
>
> Cool!
> Brainstorming: what kind of warnings would people find useful? In
> selfnanny, I wanted to include checking for assignment to self, and
> checking for "possible use before definition of local variables" sounds
> good. Another check could be a CP4E "checking that no two identifiers
> differ only by case". I might code up a few if I have the time...

Is there stuff in the current Compiler code to do control flow
analysis? You'd need that to check for use before definition in
meaningful cases, and also if you ever want to do any optimisation...

> What I'd really want (but it sounds really hard) is a framework for
> partial ASTs: warning people as they write code.

I agree (on both points).

Cheers,
M.

--
very few people approach me in real life and insist on proving they
are drooling idiots.                -- Erik Naggum, comp.lang.lisp

From DavidA@ActiveState.com Wed Mar 8 00:24:12 2000
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 7 Mar 2000 16:24:12 -0800
Subject: [Compiler-sig] FYI: python CVS snapshots now include nondist subtree
Message-ID:

If I didn't screw up, the nightly CVS snapshots available at

  http://starship.python.net/crew/da/pythondists/

should now include the nondist subtree.

--david

From jstok@bluedog.apana.org.au Mon Mar 13 10:43:20 2000
From: jstok@bluedog.apana.org.au (Jason Stokes)
Date: Mon, 13 Mar 2000 21:43:20 +1100
Subject: [Compiler-sig] Is this the place to talk about the CNRI Python implementation?
Message-ID: <000001bf8cdb$41417ac0$4be60ecb@jstok>

That is to say, the main Python interpreter? I know this is the compiler
sig, but there doesn't appear to be a list for hacking on the main
implementation. Yet there must be. Can anyone help me out?

From guido@python.org Mon Mar 13 14:47:23 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 13 Mar 2000 09:47:23 -0500
Subject: [Compiler-sig] Is this the place to talk about the CNRI Python implementation?
In-Reply-To: Your message of "Mon, 13 Mar 2000 21:43:20 +1100." <000001bf8cdb$41417ac0$4be60ecb@jstok>
References: <000001bf8cdb$41417ac0$4be60ecb@jstok>
Message-ID: <200003131447.JAA19202@eric.cnri.reston.va.us>

> That is to say, the main Python interpreter? I know this is the compiler
> sig, but there doesn't appear to be a list for hacking on the main
> implementation. Yet there must be. Can anyone help me out?

The Python newsgroup is the best place to start.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mhammond@skippinet.com.au Wed Mar 15 02:12:11 2000
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 15 Mar 2000 13:12:11 +1100
Subject: [Compiler-sig] Update for list.append change
Message-ID:

The P2C file "transformer.py" was bitten by the list.append change.

Here is a diff for the version in the CVS tree of the compiler - Bill or
Greg will also need to update P2C itself...

Mark.

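For readers unfamiliar with the list.append change mentioned here:
append used to accept several arguments and store them as one tuple,
and newer interpreters insist on exactly one argument. The sketch
below reproduces the pattern from the diff with placeholder values
("type" and "node" are stand-ins, not names from transformer.py), and
is written for a current Python 3 interpreter.

```python
results = []

# Old spelling, as on the "<" lines of the diff: several positional
# arguments. Current interpreters reject this with a TypeError.
try:
    results.append("type", "node")
    accepted = True
except TypeError:
    accepted = False

# Fixed spelling, as on the ">" lines: one argument, an explicit tuple.
results.append(("type", "node"))

print(accepted)
print(results)
```

The extra pair of parentheses is the whole fix; the stored value is
the same tuple the old call produced implicitly.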
RCS file: /projects/cvsroot/python/nondist/src/Compiler/compiler/transformer.py,v
retrieving revision 1.8
diff -r1.8 transformer.py
572c572
< results.append(type, self.com_node(nodelist[i]))
---
> results.append( (type, self.com_node(nodelist[i])) )
839c839
< clauses.append(expr1, expr2, self.com_node(nodelist[i+2]))
---
> clauses.append( (expr1, expr2, self.com_node(nodelist[i+2])) )
961c961
< items.append(self.com_node(nodelist[i]), self.com_node(nodelist[i+2]))
---
> items.append( (self.com_node(nodelist[i]), self.com_node(nodelist[i+2])) )

From gstein@lyra.org Thu Mar 16 12:17:30 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 16 Mar 2000 04:17:30 -0800 (PST)
Subject: [Compiler-sig] Update for list.append change
In-Reply-To:
Message-ID:

Fixed and checked in. Thanx!

-g

On Wed, 15 Mar 2000, Mark Hammond wrote:
> The P2C file "transformer.py" was bitten by the list.append change.
>
> Here is a diff for the version in the CVS tree of the compiler - Bill or
> Greg will also need to update P2C itself...
>
> Mark.
>
> RCS file:
> /projects/cvsroot/python/nondist/src/Compiler/compiler/transformer.py,v
> retrieving revision 1.8
> diff -r1.8 transformer.py
> 572c572
> < results.append(type, self.com_node(nodelist[i]))
> ---
> > results.append( (type, self.com_node(nodelist[i])) )
> 839c839
> < clauses.append(expr1, expr2, self.com_node(nodelist[i+2]))
> ---
> > clauses.append( (expr1, expr2, self.com_node(nodelist[i+2])) )
> 961c961
> < items.append(self.com_node(nodelist[i]),
> self.com_node(nodelist[i+2]))
> ---
> > items.append( (self.com_node(nodelist[i]),
> self.com_node(nodelist[i+2])) )
>
>
> _______________________________________________
> Compiler-sig mailing list
> Compiler-sig@python.org
> http://www.python.org/mailman/listinfo/compiler-sig
>

--
Greg Stein, http://www.lyra.org/

From ludvig.svenonius@excosoft.se Fri Mar 31 17:17:39 2000
From: ludvig.svenonius@excosoft.se (Ludvig Svenonius)
Date: Fri, 31 Mar 2000 19:17:39 +0200
Subject: [Compiler-sig] __getattr__ inflexibility
Message-ID:

I was wondering about the __getattr__-built-in method. Currently it is
called only if the attribute could not be found in the instance dictionary.
Would it not be more flexible to -always- call it upon referencing an
attribute, thus allowing programmers to override the default behaviour of
simply returning the value matching the name (for example, the instance
could dispatch an event before returning the value, or update it from an
outside source). What I'm missing in Python is a feature to define derived
member fields that don't simply contain static values, but rather dynamic
ones (like method return values) but in every other respect behave like a
normal member (included in the dir() listing, but referenced without using
parentheses).

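The behaviour described in the paragraph above can be seen in a short
experiment. This is a sketch for a current Python 3 interpreter (where
__getattr__ keeps the same fallback-only semantics); the Node class,
its "calls" list and the returned string are made up for illustration.

```python
class Node:
    def __init__(self):
        self.nodeName = "#text"   # stored in the instance dictionary
        self.calls = []           # names that actually reached __getattr__

    def __getattr__(self, name):
        # Only reached when the normal lookup has already failed.
        self.calls.append(name)
        if name == "nodeValue":
            return "fetched on demand"
        raise AttributeError(name)


n = Node()
print(n.nodeName)              # found normally; __getattr__ is not consulted
print(n.nodeValue)             # not found, so __getattr__ supplies a value
print(n.calls)
print("nodeValue" in dir(n))   # the simulated member is missing from dir()
```

Only nodeValue is recorded in n.calls, and it does not appear in the
dir() listing, which is the combination the message complains about.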
The reason I'm asking for this is that I am trying to create an API as
syntactically similar to the W3C XML DOM Core
(http://www.w3.org/TR/REC-DOM-Level-1/) as possible, but the actual objects
behind the interface are only stubs that reference a dynamic instance tree
inside a C++-application embedding the Python interpreter. Thus, I would
like to be able to reference DOM members such as Node.nodeValue using the
ordinary Python syntax (n.nodeValue) but the actual value cannot be
represented as a normal Python member, since it must be fetched from within
the C++ application (via a supplied extension). I could accomplish this by
defining methods to retrieve the value instead of members, but the DOM
standard defines that these values should be member fields, not methods, so
in order to achieve syntactical compliance with DOM, I have to somehow
intervene when the member is being referenced, and manually update its value
before it is returned.

I tried using __getattr__ for this, but because of the limitation that it is
only called if the attribute is not found in the instance dictionary, it
didn't work. I could get the reference syntax to work by just manually
checking the attribute names in __getattr__, calling the extension functions
and returning the values, but then the dir() function will not list the
"simulated" members, because they are not actually in the dictionary, and if
I try to get around this by putting them there, then __getattr__ won't be
called.

Perhaps there is another way to do what I'm trying to accomplish, but if
not, would it not be a good idea to change the semantics of __getattr__ to
be more similar to __setattr__, so attribute referencing behaviour can be
overridden to allow things like composite and derived members?

==================================================================
 Ludvig Svenonius - Researcher
 Excosoft AB * Electrum 420 * Isafjordsgatan 32c
 SE-164 40 Kista * Sweden
 Phones: +46 8 633 29 58 * +46 70 789 16 85
 mailto:ludvig.svenonius@excosoft.se
==================================================================

From thomas.heller@ion-tof.com Fri Mar 31 19:48:45 2000
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 31 Mar 2000 21:48:45 +0200
Subject: [Compiler-sig] __getattr__ inflexibility
References:
Message-ID: <046701bf9b4a$20919440$4500a8c0@thomasnotebook>

> I was wondering about the __getattr__-built-in method. Currently it is
> called only if the attribute could not be found in the instance dictionary.
> Would it not be more flexible to -always- call it upon referencing an
> attribute, thus allowing programmers to override the default behaviour of
> simply returning the value matching the name (for example, the instance
> could dispatch an event before returning the value, or update it from an
> outside source). What I'm missing in Python is a feature to define derived
> member fields that don't simply contain static values, but rather dynamic
> ones (like method return values) but in every other respect behave like a
> normal member (included in the dir() listing, but referenced without using
> parentheses).

This could be achieved by simply allowing mapping objects
instead of only dictionaries.
As I pointed out in a post to python-dev,
(see http://www.python.org/pipermail/python-dev/2000-March/004448.html)
the changes to Objects/classobject.c would be very small and would have
nearly no impact on performance.

The requirements I have are somewhat similar to what you describe.

Thomas Heller
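
A note for readers on later Python versions: neither change discussed
in this thread is needed there, because the property built-in (added
in Python 2.2) gives a computed attribute that is read without
parentheses and still shows up in dir(). The sketch below assumes a
current Python 3 interpreter; the Backend class is a hypothetical
stand-in for the C++ application and its extension module.

```python
class Backend:
    """Hypothetical stand-in for the embedding C++ application."""

    def __init__(self):
        self.value = "first"

    def fetch_node_value(self):
        return self.value


class Node:
    def __init__(self, backend):
        self._backend = backend

    @property
    def nodeValue(self):
        # Runs on every n.nodeValue reference, so the result is
        # fetched afresh each time instead of being cached.
        return self._backend.fetch_node_value()


backend = Backend()
n = Node(backend)
first = n.nodeValue
backend.value = "second"      # the "outside source" changes
second = n.nodeValue
print(first, second)
print("nodeValue" in dir(n))
```

Because the property lives on the class, dir() lists it alongside
ordinary members while every read still goes through the getter.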