Jeremy Hylton : weblog : February 2004 last modified Thu Mar 17 01:11:16 2005

Jeremy Hylton's Web Log, February 2004

Rob Page on CMSWatch List

Monday, February 02, 2004

Rob Page made the CMSWatch list of "20 Leaders to Watch in 2004:"

Sometimes open-source guys wear suits. For Rob Page, ex-U.S. Marine and head of Zope Corporation (founders of the open-source Zope platform), 2004 will likely be "all business" as the company sheds some of its extraneous Python R&D work to focus on content management modules. While much of the global Zope community's CMS development energy shifts to the lighterweight (but quite usable) Plone spin-off, Zope Corporation is placing a big bet on semi-commercial software add-ons -- under "visible source" licensing terms -- targeted for specific industries, like media and higher education....

Noted by Paul Everitt. CMSWatch is run by Barry's neighbor Tony Byrne.

Aikido language?

Tuesday, February 03, 2004

A new scripting language from Sun Labs, Aikido. No idea what's new about this language.

Perhaps it's just a question of pitching the language better. I had an immediate positive reaction to Groovy: I see why this is cool. I didn't see anything cool in Aikido.

Google bombing

Tuesday, February 03, 2004

The Times had a short piece about "Google bombing" a couple of weeks ago, including a discussion of whether Google was worried about people tampering with its results. Ed Felten explains why the question of tampering doesn't make a lot of sense:

Google is not a mysterious Oracle of Truth but a numerical scheme for aggregating the preferences expressed by web authors. It's a form of democracy -- call it Googlocracy.

I was making a similar point to Heather last week. To claim that tampering is a problem presupposes that there is a specific correct ranking of search results for a particular query that search engines strive to attain. Rather, Google has a specific scheme that returns useful results, based on the links people make. If I add a link for miserable failure, why is that link any less valid than a link to Python documentation?

Problems with nested scopes and lazy evaluation

Wednesday, February 04, 2004

It's a strain to combine the new generator expressions feature and nested scopes. The standard scoping rules are inconvenient for the lazy evaluation of generator expressions, but it's very strange for a list comprehension and an apparently identical generator expression to produce different results.

There is a lively discussion in the Sourceforge patch tracker concerning generator expressions. The crux of the matter is the difference between generator expressions and list comprehensions:

>>> lst = [lambda x:x+y for y in range(3)]
>>> for f in lst:print f(0)
...
2
2
2

>>> gen = (lambda x:x+y for y in range(3))
>>> for f in gen:print f(0)
...
0
1
2

The target of the generator expression is evaluated in a copy of the environment that the gen expr was defined in. The nested target scope captures the values of names at definition time rather than run time. It will be a wart if these two expressions yield different sequences.
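Here is a minimal sketch of when the two forms can diverge. It assumes a Python 3 interpreter (the session above is Python 2) and the late-binding rules that generator expressions eventually shipped with, not the early-binding patch under discussion. Under those rules the difference depends only on when each lambda is called.

```python
# Assumes Python 3, where free variables in a generator expression
# are looked up when the lambda is called (late binding).

# List comprehension: all three lambdas exist before any is called,
# so each one sees the final value of y.
lst = [lambda x: x + y for y in range(3)]
eager_results = [f(0) for f in lst]

# Generator expression consumed lazily: each lambda is called while
# y still holds the value for that iteration.
gen = ((lambda x: x + y) for y in range(3))
lazy_results = [f(0) for f in gen]

# Generator expression drained first, then called: same as the list.
drained = list((lambda x: x + y) for y in range(3))
drained_results = [f(0) for f in drained]

print(eager_results)    # [2, 2, 2]
print(lazy_results)     # [0, 1, 2]
print(drained_results)  # [2, 2, 2]
```

The lazy case matches the second session above only because each function is called before the generator advances.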

Armin Rigo says he's ready to rant about problems with nested scopes. "As far as I'm concerned it is yet another reason why free variable bindings ("nested scopes") are done wrong in Python currently :-(" I wonder what the other reasons are?

In the generator expressions discussion, the question has been phrased in terms of late binding versus early binding, where late binding means the standard namespace rules and early binding means copying the value of a binding when the function is defined. (Early binding works like default argument values.)
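The default-argument comparison can be sketched as follows, assuming Python 3. The names late and early are made up for illustration.

```python
# Late binding: the lambda looks up y when it is called, after the
# comprehension has finished and y has its final value.
late = [lambda x: x + y for y in range(3)]

# Early binding: the default value of y is computed when each lambda
# is created, so every lambda keeps its own copy.
early = [lambda x, y=y: x + y for y in range(3)]

late_results = [f(0) for f in late]
early_results = [f(0) for f in early]

print(late_results)   # [2, 2, 2]
print(early_results)  # [0, 1, 2]
```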

In the absence of rebinding of free variables, I don't think there are any cases where early or late binding makes a difference for plain nested functions. It would be unusual for a nested function to depend on changes to a free variable, because changes could only occur while the block containing the actual binding was being executed. On the other hand, Guido is now inclined to allow rebinding of free variables. I think it would make a big difference then.
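To illustrate why rebinding would matter, here is a sketch that assumes Python 3's nonlocal statement, the form that rebinding of free variables eventually took (this post predates it). The function names are hypothetical.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebinds the free variable owned by make_counter
        count += 1

    def current():
        # Late binding: sees whatever count is bound to at call time.
        # With early binding this would always return 0.
        return count

    return increment, current


increment, current = make_counter()
before = current()
increment()
after = current()

print(before, after)  # 0 1
```

The rebinding happens after make_counter has returned, which is exactly the case that cannot arise without a rebinding statement.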

Scheme doesn't seem to have the same problems with its namespaces. I suspect that lexical scoping and side-effects have problematic interactions. The functional style typical of Scheme avoids those problems in many cases. Or, set! stands out in a way that "for x in range(3)" doesn't. (More generally, I wonder how other strict languages integrate lazy features -- imperative languages in particular.)

It looks like delayed evaluation interacts very poorly with side effects. The examples discussed on the list involved a free variable in a generator expression that is rebound after the gen expr is created. I think all of the examples involve for loops, where the target of the for loop is used in the target expression. This kind of code will always be delicate.

On python-dev, Tim Peters has posted the only real world examples of generator expressions. He notes:

Whenever I've written a list-of-generators, or in the recent example a generator pipeline, I have found it semantically necessary, without exception so far, to capture the bindings of the variables whose bindings wouldn't otherwise be invariant across the life of the generator. [If] it turns out that this is always, or nearly almost always, the case, across future examples too, then it would just be goofy not to implement generator expressions that way ("well, yes, the implementation does do a wrong thing in every example we had, but what you're not seeing is that the explanation would have been a line longer had the implementation done a useful thing instead" <wink>).

The example Tim posted was about a pipeline of predicates on a text stream. He can use generator expressions to create a lazy filter that yields a token only if each predicate returns True. It is delicate code; the first few times I read it, it was hard to understand. I'd feel better about generator expressions if the examples were easier to read, but maybe that will just come with time.

    pipe = source
    for p in predicates:
        # add a filter over the current pipe, and call that the new pipe
        pipe = (e for e in pipe if p(e))

How would you re-write the example to use the late binding approach that I advocated on python-dev earlier?

    def extend(pipe, pred):
        return (e for e in pipe if pred(e))

    pipe = source
    for p in predicates:
        pipe = extend(pipe, p)

This seems clearer, because it avoids any free variables in the generator expression. It does take a few more lines of code.
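A runnable comparison of the two pipelines, assuming Python 3 and its late-binding generator expressions. The predicates is_even and is_big and the integer source are made up for illustration; Tim's example worked on a text stream.

```python
def extend(pipe, pred):
    # pred is a local here, so each generator keeps its own predicate.
    return (e for e in pipe if pred(e))


def is_even(n):
    return n % 2 == 0


def is_big(n):
    return n > 4


predicates = [is_even, is_big]

# Inline version: every filter looks up p when it runs, which is after
# the loop has finished, so only the last predicate is ever applied.
pipe = iter(range(10))
for p in predicates:
    pipe = (e for e in pipe if p(e))
inline_result = list(pipe)

# Helper version: each filter is tied to the predicate it was given.
pipe = iter(range(10))
for p in predicates:
    pipe = extend(pipe, p)
helper_result = list(pipe)

print(inline_result)  # [5, 6, 7, 8, 9]
print(helper_result)  # [6, 8]
```

Under late binding the inline loop silently drops is_even, which is the behavior Tim's examples argue against; the helper gives the intended result under either rule.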

A related complaint, first voiced by David Beazley, is that the target of the for loop in a list comprehension should not be visible outside of the list comp expression. If the for target happens to have the same name as a local variable, you get a conflict. It's a relatively easy mistake to make, because the list comprehension is just an expression. Expressions don't bind names, right? It even looks distinct, since it's wrapped in brackets.

Guido has recently changed his mind about the issue and agrees with David that the list comprehension name bindings should not be visible to the enclosing function. It should behave as if the list comprehension were defined in a nested function.

Thus,

L = [x for x in range(3)]

would be equivalent to

def listcomp():
    return [x for x in range(3)]

L = listcomp()

The implementation could just use renaming and avoid the expense of a function call, although producing useful variable names in error messages would be tricky.
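Python 3 ended up with this scoping rule, so the equivalence can be checked directly. A small sketch, assuming a Python 3 interpreter; the outer value "outer" is arbitrary.

```python
x = "outer"

# The comprehension's loop variable lives in its own scope in Python 3.
L = [x for x in range(3)]


def listcomp():
    return [x for x in range(3)]


L2 = listcomp()

print(L)   # [0, 1, 2]
print(L2)  # [0, 1, 2]
print(x)   # outer
```

In Python 2, the same code would leave x bound to 2 after the comprehension.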

It's also, strictly speaking, not compatible with the current definition, which allows odd expressions like "[x for x in x]" where x is a local variable defined earlier in the code block. It could be made to work with a special renaming approach, but the semantics would sound odd: The namespace of list comp targets is separate from the namespace of list comp generator expressions.
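For what it's worth, Python 3 as shipped keeps this odd form working: the first iterable of a comprehension is evaluated in the enclosing scope, and only the loop target is private. A sketch assuming Python 3, with a hypothetical function name:

```python
def demo():
    x = [1, 2, 3]
    # The iterable x is the local list; the target x is private to
    # the comprehension and does not overwrite the local.
    doubled = [x * 2 for x in x]
    return doubled, x


doubled, original = demo()

print(doubled)   # [2, 4, 6]
print(original)  # [1, 2, 3]
```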

PyCon Schedule Posted

Saturday, February 21, 2004

The PyCon schedule and list of talks are both online. It may be the strongest program ever for a Python conference.

There is a great collection of interesting topics: implementation talks including Jim Hugunin on IronPython and talks on type inference, PyPy, and Spy; applications like VoIP, kitchen design, and spam filtering; and a full session on Pyrex by Paul Prescod. There's one session on Zope and two others on Web programming; there are fewer Twisted talks, but the two scheduled ones look good.

It's great to see an informal, low-cost conference attracting so many good presentations. I was a little surprised that there were so few Zope talks. There were a lot of Zope talks at EuroPython; maybe Zope users and developers are more likely to be there.

The PyCon sprints are going to be a bigger part of PyCon this year. We had four small, two-day sprints last year. There are more sprints this year and many more sprinters -- 75 people registered at last count. There's lots of Zope here -- Zope 3, Zope 2, and Plone -- and Chandler, Twisted, reStructuredText, and a generic Python core and sundry topics sprint. With luck, there will also be one on Mailman or the email package. Neal Norwitz and I are hoping to polish off the new AST compiler by then.