[Python-Dev] generator expression syntax

Robert Mollitor mollitor at earthlink.net
Wed Mar 24 00:46:37 EST 2004


Hi,

If I may, I would like to make a comment about the generator expression 
syntax.

The idea of generator expressions seems good, but the proposed syntax
seems a little wrong to me.

First, the syntax is too dependent on the parentheses.  To my mind, this is
a fourth meaning for parentheses.  (1) Parentheses group and associate
expressions: "(a * (b + c))", "if (a and b)".  (2) Parentheses construct
tuples: "(1, 2, 3)", "()", "('a',)".  (3) Parentheses enclose argument lists
(arguably a special case of tuple-constructor): "def f(a, b, c)",
"obj.dump()", "class C (A, B)".  And now (4*) generator expressions:
"(x*x for x in list)".  I realize that in some sense the parentheses are not
part of the expression syntax (since we wouldn't require
"sum((x * x for x in list))"), but they are mandatory nonetheless because you
can't have "a = x*x for x in list".  This seems like it is stretching a
familiar construct too far.
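
To make the point about mandatory parentheses concrete, here is a minimal sketch. It is written for Python 3 and assumes the generator expression syntax as it appears in PEP 289; the helper name `is_valid` and the sample `data` are made up for illustration.

```python
data = [1, 2, 3]

# As the sole argument of a call, the call's own parentheses are enough.
total = sum(x * x for x in data)

# In an assignment, the expression needs parentheses of its own.
squares = (x * x for x in data)


def is_valid(source):
    """Return True if `source` compiles as Python source code."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The bare form is rejected by the parser; the wrapped forms are accepted.
bare_ok = is_valid("a = x*x for x in data")
wrapped_ok = is_valid("a = (x*x for x in data)")
double_ok = is_valid("sum((x*x for x in data))")
```

The doubled parentheses in the last line are legal but not required, which is the "not quite part of the syntax, yet mandatory" situation described above.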

Second, it looks like a "tuple comprehension".  The list comprehension
construct yields a list.  A generator expression looks like it should yield a
tuple, but it doesn't.  In fact, the iterator object that is returned is not
even subscriptable.  While

	def f(arg):
		for a in arg:
			print a
	f(x*x for x in (1,2,3))

will work as expected,

	def g(arg):
		print arg[1:]
	g(x*x for x in (1,2,3))

will not.
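
A small sketch of that failure and two possible workarounds, written for Python 3. The function name `squares` is only for illustration; `itertools.islice` is the standard library tool for slicing an iterator lazily.

```python
from itertools import islice


def squares():
    # A fresh generator each time, since a generator can be consumed only once.
    return (x * x for x in (1, 2, 3))


# Slicing the generator object directly raises TypeError.
try:
    squares()[1:]
    slicing_failed = False
except TypeError:
    slicing_failed = True

# Workaround 1: materialize into a tuple first, then slice.
as_tuple = tuple(squares())[1:]

# Workaround 2: skip the first item lazily with islice.
lazy_tail = list(islice(squares(), 1, None))
```

Either workaround has to be chosen by the caller, which is the point: the object only looks like a tuple.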

Third, it seems Lisp-ish or Objective-C-ish and not Pythonic to me.  I
realize that is just a style thing, but that's the flavor I get.

Fourth, it seems like this variable binding thing will trip people up
because it is not obvious that a function is being defined.  Lambdas have
variable binding issues, but that is obviously a special construct.  The
current generator expression construct is too deceptively simple-looking for
its own good, in my opinion.  My (admittedly weak) understanding of the
variable binding issue is that the behavior of something like

	a = (1,2,3)
	b = (x*x for x in a)
	a = (7,8,9)
	for c in b:
		print c

is still up in the air.  It seems that whichever way it is resolved, the
result will be confusing for various reasons.
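
For reference, in Python as eventually released (2.4 and later, per the final text of PEP 289), the iterable of the first `for` clause is evaluated as soon as the generator expression is created. A minimal sketch of the example above, rewritten for Python 3:

```python
a = (1, 2, 3)
b = (x * x for x in a)  # `a` is evaluated here, when `b` is created
a = (7, 8, 9)           # rebinding `a` afterwards does not affect `b`

result = list(b)        # iterates over the original (1, 2, 3)
```

Under that rule the loop sees the squares of the first tuple, not the second. Names used elsewhere in the expression are still looked up late, so the rule settles only this particular case.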


OK, I am not completely sure this will work to everyone's satisfaction,
but here is my proposal:  replace the

	(x*x for x in a)

construct with

	lambda: yield x*x for x in a

CONS

  - "Ick, lambdas"
  - It's longer

PROS

  - Lambdas are funky, but they are explicitly funky: look up 'lambda' in the
    index and go to that section of the book

  - Use the variable binding rules of lambdas and people will be as happy with
    that as they are with lambdas in the first place (for better or worse)

  - No new meaning for parentheses is introduced

  - The grammar change is straightforward (I think):

	replace

		lambdef: 'lambda' [varargslist] ':' test

	with

		lambdef: 'lambda' ( varargslist ':' test | ':' ( test | 'yield' test gen_for ))

	or

		lambdef: 'lambda' ( varargslist ':' test | ':' ( test | 'yield' test [gen_for] ))

(The last variant would allow a single-element generator to be specified.
Maybe not terribly useful, but as useful as

	def f(a): yield a

I suppose)
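
To make the comparison with `def` concrete, here is one possible reading of the two proposed forms, spelled with generator functions that already exist. This is a sketch for Python 3; the function names `squares` and `single` are made up, and the mapping onto the proposed `lambda: yield ...` spelling is an assumption about its intended meaning, not something any interpreter implements.

```python
def squares(a):
    """Roughly what `lambda: yield x*x for x in a` is meant to produce."""
    for x in a:
        yield x * x


def single(a):
    """Roughly what the single-element form `lambda: yield a` would produce."""
    yield a


g = squares((1, 2, 3))
first = next(g)            # generators produce values on demand
rest = list(g)             # the remaining values

only = list(single("spam"))
```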

So here would be the recasting of some of the examples in PEP 289:

	sum(lambda: yield x*x for x in range(10))

	d = dict(lambda: yield k, func(k) for k in keylist)

	sum(lambda: yield a.myattr for a in data)

	g = lambda: yield x**2 for x in range(10)
	print g.next()

	reduce(operator.add, lambda: yield x**2 for x in range(10))

	# assuming we don't use list_for instead of gen_for in the grammar, I guess
	lambda: yield x for x in (1, 2, 3)

	# Now if we use lambda behavior, then I don't think we would have free
	# variable binding, so
	x = 0
	g = lambda: yield x for c in "abc"	# The 'c' variable would not be exported
	x = 1
	print g.next()	# prints 1 (current x), not 0 (captured x)
	x = 2
	print g.next()	# would it print 2 now?  Obviously I don't have a firm
			# grasp on the bytecode implementation

	# I think the following would work, too
	for xx in lambda: yield x*x for x in range(10):
		print xx

	# If so, how's this for decadent
	for i,a in lambda: yield i,list[i] for i in range(len(list)):
		print i, a
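
The question about `x` in the examples above can be tried with an ordinary generator expression, which follows the same late lookup of free variables that a lambda does. A minimal sketch for Python 3 (where `g.next()` is spelled `next(g)`); the function name `demo` is only for illustration, and this shows the behavior of the existing construct, not of the proposed `lambda: yield` form.

```python
def demo():
    x = 0
    g = (x for c in "abc")  # `c` stays local to the generator expression
    x = 1
    first = next(g)         # `x` is looked up now, so this sees 1
    x = 2
    second = next(g)        # looked up again, so this sees 2
    return first, second


first, second = demo()
```

So with lambda-style binding the answer to "would it print 2 now?" would be yes: each step reads the current value of `x`.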


I hope this provided some food for constructive thought.


Robert Mollitor



