macro FAQ

Sat Aug 23 20:42:23 EDT 2003

"Andrew Dalke" <adalke at mindspring.com> writes:

> Here's a proposed Q&A for the FAQ based on a couple recent threads.
> Appropriate comments appreciated

Such an FAQ item is probably a good idea.

However, I think that we should distil out much of the opinion
(particularly the opinions about opinions of others :-) and try to
base it on fact:

- what are macros (C-like and Lisp-like),

- what are the technical difficulties with introducing them into
  Python

> X.Y:  Why doesn't Python have macros like in Lisp or Scheme?
> 
> Before answering that, a clarification on what 'macro' means.
> A Lisp macro is a way of modifying code when that code is first
> defined.  It can rearrange the structure of the code, and add and
> remove parts of it.  Unlike C's #define macro language, Lisp
> macros understand the structure of the expression and the
> context (or "closure") in which it is found.

A closure is a function which "remembers" variables from an enclosing
lexical scope (also available in Python, since 2.1). Lisp macros
merely replace themselves with some source code (their expansion)
before the compiler sees them. I'm not sure what you mean by "macros
... understand the context".
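
For anyone who wants the term spelled out, here is a minimal sketch of
a closure in Python. The names make_adder and add are made up for
illustration; nothing here comes from the proposed FAQ text.

```python
def make_adder(n):
    # n belongs to the enclosing lexical scope of add()
    def add(x):
        # add() still "remembers" n after make_adder() has returned
        return x + n
    return add

add_five = make_adder(5)
result = add_five(10)
```

Note that nothing is rewritten or expanded here: the closure is an
ordinary run-time object, which is why the word does not really fit a
description of what macros do.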

> Here's a simple example of what a macro might do for Python.
> A complaint about Python is that constructors for simple data
> types require typing the parameter names three times, as in
> 
> class Country:
>     def __init__(self, name, capitol):
>         self.name = name
>         self.capitol = capitol

[snip]

Probably not the most exciting or edifying example, particularly
given that metaclasses should be able to do it.
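
As a rough sketch of what I mean, assuming a current Python 3 (the
metaclass syntax was different in 2003), something along these lines
would remove the repetition. AutoInit and the "fields" attribute are
names invented for this illustration, not an existing convention.

```python
class AutoInit(type):
    """Hypothetical metaclass: builds __init__ from a 'fields' tuple."""

    def __new__(mcls, name, bases, namespace):
        fields = tuple(namespace.get("fields", ()))
        if fields and "__init__" not in namespace:
            def __init__(self, *args):
                if len(args) != len(fields):
                    raise TypeError("%s expects %d arguments, got %d"
                                    % (name, len(fields), len(args)))
                # Bind each positional argument to the matching field name
                for field, value in zip(fields, args):
                    setattr(self, field, value)
            namespace["__init__"] = __init__
        return super().__new__(mcls, name, bases, namespace)


class Country(metaclass=AutoInit):
    fields = ("name", "capitol")   # each name is now typed only once


france = Country("France", "Paris")
```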

I'm sure we'll be able to come up with a better one. I'll put some
ideas forward at the end, to act as a starting point.

> Macros work well in Lisp or Scheme, where the code is the data, all
> written pretty much as a parse tree.

This is very important, and is central to the problem of "how do we do
it in Python?"

> The second problem is that Python's code blocks do not
> store the original parse tree, so there's nothing to manipulate.

[...] 

> Nowhere will you find the original parse tree.

This is typically true in Lisp as well, particularly after you have
compiled your code. (Common Lisp _must_ have a compiler, but may _in
addition_ have an interpreter; the standard defines a function which
returns the source code of a function, but it is allowed to say
"sorry, you can't have it".)

The point in Lisp is that you _write_ your code in the form of a parse
tree, in Lisp lists, and you can pass your parse tree to a macro
without having to re-format it, and have the macro easily manipulate
it. It's not clear how you would do this in Python.

But you seem to be implying that macros go off and find the parse
trees of code that has already been compiled, and then change it?
(Hint: they don't.)

> The deeper question is, should Python include macros?
>
> People with a Lisp background might have a problem understanding
> this viewpoint.

For the record, I am not advocating the inclusion of macros in
Python. (I do object to the suggestion that macros are responsible for
some supposed "fragmentation" of Lisp.)

> More importantly, the detractors -- including those with plenty of
> experience using macros in Lisp -- argue that macros cause dialects
> to form.

Could you please give me a reference to someone "with plenty of
experience using macros in Lisp" arguing this?

I just don't believe it. (That's not to say that it's not true.)

> Macros can modify other code to make it fit the problem better,

What do you mean by "other" code? Macros modify the code that is
passed to them as an argument, transforming it before the compiler
gets to see it. I get the impression that you believe that macros can
somehow modify code from other parts of the program. They can't.

Ultimately, macros just save you a lot of typing of source code. (And
thereby save you a lot of bugs.) If you can't type it as source code,
then a macro can't do it.

> while functions only use other code but make no modifications.

This only strengthens the above suspicion, but I'm not really sure
what you mean, here.

> This makes them very powerful but means that understanding a section
> of code requires also knowing about any macros which might use the
> code.

What do you mean by "macros using code"?

The macros are part of the code, just like functions are. To
understand the code, you must understand what the macros and functions
do.

> In an extreme case which wouldn't be used in real projects, every *
> could be replaced with a +.

This almost completely convinces me that you are very confused about
what macros can achieve.

Either, you are suggesting that one might write a macro which replaces
every * operator with a + operator, and then pass _the entire source
code_ of a project to it (possible, but patently absurd); or you are
suggesting that it is possible to write a macro "over here" which,
somehow, surreptitiously modifies existing source code "over there".

a) Macros cannot do the latter.

b) You can achieve something very similar in Python, by re-binding
   attributes of __builtins__.
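
To illustrate point b), here is a sketch under the assumption of a
current Python 3, where the module is called builtins (in the Python 2
of this thread it is spelled __builtin__). The function surprising_abs
is a made-up name, and it is deliberately wrong.

```python
import builtins

original_abs = builtins.abs

def surprising_abs(x):
    # Deliberately wrong: returns the negated absolute value
    return -original_abs(x)

builtins.abs = surprising_abs          # rebinding "over here" ...
try:
    hijacked = abs(-3)                 # ... changes what abs means "over there"
finally:
    builtins.abs = original_abs        # always restore the real function

restored = abs(-3)
```

This is ordinary run-time rebinding, not source transformation, yet it
produces exactly the kind of action at a distance that macros are being
accused of.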

Think of the point of macros in another way. The point is _not_ to
take existing source code, and to change it. The point is to take
something that is not (necessarily) valid source code, and turn it
into valid source code, before the compiler gets to see it. Why?
Because this allows you to create a shorthand notation, and use macros
to expand it to real code. Think in terms of encoding design patterns.

> (As an aside, some proponents argue that macros and
> functions are essentially the same.  Alex Martelli made
> an interesting observation about one difference:  it's often
> worthwhile to turn a set of code into a function even if it
> is only called from one place, because it helps simplify
> the code into smaller chucks which are easier to understand.
> Macros, on the other hand, should almost never be used
> unless they are used many times.)

I broadly agree. (I think :-) 

A function written to be called from a single location does not really
abstract anything; it just moves some code elsewhere to make the
original location look less hairy. Rarely would you need to use a
macro for this purpose, but in such cases there would be no objection
to writing a single-invocation macro.

Macros which truly abstract something can be difficult to write and
difficult to read; in such cases it is important that the cost of
creating a robust macro, and the cost incurred by others in trying to
understand it, be offset by the savings made by its _repeated_
usage.

But then, functions which provide an abstraction are also more
difficult to write and understand than ones which merely "move code
out of the way", and you'd have to think twice whether the abstraction
is really useful, before deciding to pay the price for writing it, and
making readers understand it.

Even in this respect, there is no clear-cut distinction between
functions and macros.

> With only one or a small group of people working together
> on a project there is little problem.  Macros help in developing
> idioms specific to the problem and group.  When groups
> share code, they also share idioms, and anyone who has had
> to deal with UK vs. US English knows the effect idioms have
> in mututal understanding.
> 
> Those against macros say their benefits do not outweigh
> the likelihood that the Python community will be broken up
> into distinct subsets, where people from one group find it
> hard to use code from another.

I believe that anyone reaching such a conclusion can only do so
on the basis of a misunderstanding of what macros can do.

People in the music software "group" will find it hard to use code
from people writing software for bioinformatics ... with or without
macros.  This has nothing to do with macros.

OK, I promised some examples of macros. Note that I have not shown a
single line of Lisp code in these threads, because I agree that it is
likely to be meaningless to most readers of this group. I'll try to
continue without having to resort to Lisp examples.

==== Example 1: a debugging aid ================================

One little macro I have for debugging purposes (let's call it "show")
allows me to print out the source code of a number of expressions,
along with their values. At this point, note that Python's horribly
irregular syntax <0.5 wink> already starts causing problems: should
the macro look like a block of code, or should it look like a function
call? In Lisp both look identical.

The "block" version of the show macro invocation:

    show:
        sin(x)
        a
        a + b + c

The "function" version of the show macro invocation:

    show(sin(x), a, a + b + c)

In both cases, the action of the macro should be to replace itself
with its expansion, _before_ the compiler gets to see the source code.
The expansion should look like this:

    print "sin(x) =>", sin(x)
    print "a =>", a
    print "a + b + c =>", a + b + c

Note the key points:

1) The macro receives some data, and transforms it into valid Python
   source code

2) This happens before compile time

3) Nothing "outside" the macro call gets affected by the macro's
   action.
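
The three points above can be sketched as an ordinary function that
plays the part of the macro expander. This is only an illustration
under assumptions: expand_show is a made-up name, the expressions have
to be handed over as strings, and the exec call has to be written by
hand, which is exactly the inconvenience a real macro would remove. It
is written for Python 3, so print is a function rather than a
statement.

```python
def expand_show(expressions):
    """Return source text that prints each expression and its value."""
    lines = []
    for expr in expressions:
        # %r turns the label into a quoted string literal in the output
        lines.append("print(%r, %s)" % (expr + " =>", expr))
    return "\n".join(lines)

# Point 1: data (expression text) goes in, valid source code comes out.
expansion = expand_show(["a", "a + b + c"])

# Point 2: only now does the compiler get to see the expanded code.
code = compile(expansion, "<show expansion>", "exec")

# Point 3: the expansion runs in the namespace we hand it, nothing else
# is touched.
exec(code, {"a": 1, "b": 2, "c": 3})
```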

Can this be implemented using metaclasses, or any other existing
Python mechanism?

==== Example 2: Alexander Schmolck's updating classes ==============

Alexander recently expressed the desire to have all existing instances
of a class be updated, when he changes the source of his class, and
re-evaluates the definition.

This might be achieved by doing something like the following:

    temp = MyClass
    class MyClass:
        blah
    temp.__dict__ = MyClass.__dict__

I'm not so much interested in the fine Python details of exactly what
needs to be modified (__dict__, __bases__, whatever); what is
important is that at least 3 distinct steps need to be taken, and
trying to bundle these three steps (this pattern) into a function
is impossible, because you can't pass the text of a class definition
body to a function. (You could pass it as a string, but then your IDE
would treat it as a string and refuse to indent it for you, and
writing classes this way would be very unnatural.)

A macro might help as follows. Write a macro called updating_class,
which you would use instead of the built-in class statement. This
macro might work according to an algorithm like this:

    class_name = <the first token of what was passed in>
    the_body = <everything that was passed in, except the first token>
    # each piece of the expansion must end its own line
    expansion  = "temp = %s\n" % class_name
    expansion += "class %s%s\n" % (class_name, the_body)
    expansion += "temp.__dict__ = %s.__dict__\n" % class_name
    return expansion

Here I'm assuming that the return value of the macro replaces the
macro call in the source code, before the compiler gets to see the
source code (which is pretty much what happens in Lisp).

So,

    updating_class foo:
        blah

would turn into

    temp = foo
    class foo:
        blah
    temp.__dict__ = foo.__dict__
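
The same transformation can be sketched as runnable Python working on
source text. This is an illustration only: expand_updating_class is a
made-up name, the parsing is deliberately naive (one header line, then
the body), and the result is merely compiled to check that it is valid
source, not executed, since the details of what to assign to are beside
the point.

```python
def expand_updating_class(source):
    """Expand an 'updating_class NAME:' form into plain Python source."""
    header, body = source.split("\n", 1)
    keyword, rest = header.split(None, 1)
    if keyword != "updating_class":
        raise ValueError("not an updating_class form")
    class_name = rest.rstrip(":").strip()
    # The three steps of the pattern, each on its own line(s)
    return (
        "temp = %s\n" % class_name
        + "class %s:\n%s\n" % (class_name, body.rstrip("\n"))
        + "temp.__dict__ = %s.__dict__\n" % class_name
    )

macro_call = "updating_class foo:\n    blah\n"
expansion = expand_updating_class(macro_call)

# The expansion is ordinary source code that the compiler accepts.
compile(expansion, "<updating_class expansion>", "exec")
```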

Again, note that any code not appearing "inside" the macro call, is
not affected by the macro in any way ... almost ...

... This second example also serves the purpose of demonstrating free
variable capture (the other type of variable capture is called macro
argument capture). Variable capture is what is (according to the
Schemers) unhygienic about CL macros. What is it? Well, note that, if a
variable called "temp" had already existed in our enclosing scope, the
updating_class macro expansion would have clobbered it.

Scheme gets around this by ensuring that "temp" lives inside a local
scope of the macro. The CL philosophy respects the fact that sometimes
you deliberately _want_ to capture some variables. CL allows you to
protect yourself against variable capture by using the "gensym"
function; a function which creates a symbol (think of it as an
identifier), which is guaranteed not to have existed before.
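
A rough Python analogue, as a sketch only: this gensym is a made-up
helper that picks a name not present in a set of names the caller says
are in use. CL's gensym is stronger than this, since the symbol it
returns is a fresh object that no source text can refer to by accident.

```python
import itertools

def gensym(taken, prefix="_tmp"):
    """Return a name that is not in 'taken', and record it as used."""
    for n in itertools.count(1):
        name = "%s%d" % (prefix, n)
        if name not in taken:
            taken.add(name)
            return name

# Hypothetical set of identifiers already used in the enclosing scope
names_in_scope = {"temp", "foo", "_tmp1"}
fresh = gensym(names_in_scope)

# The updating_class expansion, with the fresh name in place of "temp"
expansion = (
    "%s = foo\n" % fresh
    + "class foo:\n    blah\n"
    + "%s.__dict__ = foo.__dict__\n" % fresh
)
```

With the temporary named this way, an existing variable called "temp"
in the enclosing scope is no longer clobbered by the expansion.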

This has turned out rather long, but I hope that it demystifies these
scary exotic macros for at least one or two people on the list.