[Python-Dev] Re: Proposal: get rid of compilerlike.py

Eric S. Raymond esr@thyrsus.com
Sat, 1 Sep 2001 17:52:10 -0400


Guido van Rossum <guido@python.org>:
> I have now re-read that discussion; it's in the archives starting this
> message:
> 
> http://mail.python.org/pipermail/python-dev/2001-August/016629.html

As have I.  All the stuff in this thread was before the checkin; 
you were in fact mistaken about the timing of most of the discussion.

> There were several suggestions to merge it with fileinput and some
> suggestions to restructure it.  You seem to have ignored these except
> the criticism on the name "ccframe" (by choosing an even worse name
> :-).

I did not ignore these suggestions (one that I took was Greg Ward's
suggestion that, after all, just throwing an exception was the right
thing).  And I was in fact planning to merge this thing with fileinput.

Then I looked at what would have to be done to the documentation of
fileinput -- in fact, I edited together a combined fileinput
documentation page.  The result was a mess that convinced me that this
does indeed need to be a separate module.  There wasn't enough
coherence between the old fileinput stuff and my entry points to even
make the *documentation* look like a logical unit, let alone the code.

> > What is going on here?  Is it possible that you are mistaken about the
> > timing of the checkin, and that what you thought was discussion afterwards
> > was discussion before?  Or am I somehow missing listmail?
> 
> Your mail was probably broken -- it wouldn't be the first time :-(.

In the event, my mail was not broken.

> There are two posts in the archives that start with a quote from the
> checkin mail:
> 
> http://mail.python.org/pipermail/python-dev/2001-August/017131.html
> http://mail.python.org/pipermail/python-dev/2001-August/017132.html

Right...one of which completely misses the point by suggesting that
this is a filter framework, and the other one of which is a "me too"
basically addressing the naming issue.  Guido, you are yourself
*notorious* for dismissing naming issues with "that's unimportant" and
"we can fix it later".  How can you criticize me for doing likewise?
 
> > As for process issues...I agree that we need better procedures and
> > criteria for what goes into the library.  As you know I've made a
> > start on developing same, but my understanding has been that *you*
> > don't think you'll have the bandwidth for it until 2.2 is out.
> 
> That's not an excuse for you to check in random bits of code.

So what, exactly, makes this 'random'?  

That, Guido, is not a rhetorical question.  We don't have any
procedures.  We don't have any guidelines.  We don't have any history
of anything but discussing submissions on python-dev before somebody
with commit access checks them in.  If no -1 votes and the judgment of
somebody with commit privileges who has already got a lot of stuff
in the library is not sufficient, *what is*?

I'm not trying to be difficult here, but this points at a weakness in
our way of doing things.  I want to play nice, but I can't if I don't
know your actual rules.  I don't know what *would* have been sufficient if
what I did was not.  I don't think anyone else does, either.

> Some comments on the code:

This is the sort of critique I was looking for two weeks ago, not a bunch
of bikeshedding about how the thing should be named.
 
> - A framework like this should be structured as a class or set of
>   related classes, not a bunch of functions with function arguments.
>   This would make the documentation easier to read as well; instead of
>   having a bunch of functions you pass in, you customize the framework
>   by overriding methods.

Yes, I thought of this.  There's a reason I didn't do it that way.
Method override would work just fine as a way to pass in the filename
transformer, but not the data transformer.

The problem is this: the driver or "go do it" method of your
hypothetical class (the one you'd pass sys.argv[1:]) can't know which
overridden method to call in advance, because which one is right would
depend on the argument signature of the hook function -- does it take
filelike objects, does it take two strings, etc.  Actually it's worse
than that; two of the cases (the sponge and the line-by-line
filtering) aren't even distinguishable by type signature.

So, what the driver function could do is step through three method
names looking to see which if any is overridden in the user-created
subclass.  But would that really be a gain in clarity over having three
functions in the module?  I'm willing to listen if you think the
answer is "yes" and want to tell me why, but it didn't seem so to me.
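A minimal sketch of the probing driver described above, for illustration only.
The class name and the three hook names (transform_files, transform_text,
transform_line) are hypothetical and are not taken from compilerlike.py; the
driver here works on an in-memory string rather than on sys.argv[1:] to keep
the example short.

```python
import io


class HookDriver:
    """Hypothetical base class; none of these names come from compilerlike.py."""

    # Checked in this order; the first overridden hook wins.
    HOOK_NAMES = ("transform_files", "transform_text", "transform_line")

    def transform_files(self, infile, outfile):
        raise NotImplementedError

    def transform_text(self, text):
        raise NotImplementedError

    def transform_line(self, line):
        raise NotImplementedError

    def overridden_hook(self):
        """Return the name of the first hook the subclass overrides, or None."""
        for name in self.HOOK_NAMES:
            if getattr(type(self), name) is not getattr(HookDriver, name):
                return name
        return None

    def run(self, text):
        """Apply whichever hook the subclass supplied to `text`."""
        hook = self.overridden_hook()
        if hook == "transform_files":
            outfile = io.StringIO()
            self.transform_files(io.StringIO(text), outfile)
            return outfile.getvalue()
        if hook == "transform_text":
            return self.transform_text(text)
        if hook == "transform_line":
            lines = text.splitlines(True)  # keep line endings
            return "".join(self.transform_line(line) for line in lines)
        raise TypeError("subclass must override one of %r" % (self.HOOK_NAMES,))


class Upcase(HookDriver):
    def transform_line(self, line):
        return line.upper()
```

The dispatch in run() still has three branches, so the subclass version moves
the three-way choice into the driver rather than removing it.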

There's something else I could have done.  I could have required that
the hook function use specific unique formal argument names in each of the
three cases and then had the driver code use inspect to dispatch among
them -- but that seemed even more klugey.
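A sketch of what that inspect-based dispatch might look like. The required
parameter names (infile/outfile, text, line) and the labels they map to are
assumptions made up for this example, and inspect.signature is the modern
spelling of the lookup, not what 2001-era code would have used.

```python
import inspect

# Hypothetical convention: the hook's formal parameter names select the mode.
MODES_BY_PARAMETERS = {
    ("infile", "outfile"): "files",
    ("text",): "sponge",
    ("line",): "line-by-line",
}


def classify_hook(hook):
    """Return the mode implied by the hook's formal parameter names."""
    names = tuple(inspect.signature(hook).parameters)
    try:
        return MODES_BY_PARAMETERS[names]
    except KeyError:
        raise TypeError("unrecognized hook parameters: %r" % (names,))


def strip_trailing_space(line):
    return line.rstrip() + "\n"
```

Under this convention classify_hook(strip_trailing_space) selects the
line-by-line mode purely because the parameter is spelled "line"; renaming it
would change the behaviour, which is the fragility the paragraph above
objects to.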

Maybe there is a really elegant and low-overhead method of wrapping
these functions in a class, and I have just not found it yet.  But if
so, it is not (as you appear to believe) for lack of looking.  If you
have an insight that I have missed, I will cheerfully accept
instruction on this issue.

> - The name "compilerlike" is a really poor choice (there's nothing
>   compiler-like in the code).

No, there isn't.  It's called "compilerlike" because it's a framework
for making compilerlike interfaces out of functions.  But I'm not 
attached to that name; CompilerFramework or something of that sort
would be fine.
 
> - I would like to see failure to open the file handled differently (so
>   the caller can issue decent error message for inaccessible input
>   files without having to catch all IOError exceptions), but again
>   this is a policy issue that should be customizable.

Originally the code fielded file I/O errors by complaining
to stderr and then exiting.  At least two respondents argued that it
should simply throw an exception and let the caller do policy, and
upon reflection I came to agree with this (this is one of those
suggestions you thought I was ignoring).

I realize it's tempting to try and embed a range of policy options in
the module to save time, but unless we can have reasonable confidence
that they will cover all important cases I don't judge the complexity 
overhead to be worth it.  Again, I am open to instruction on this.
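A rough sketch of the "throw an exception and let the caller do policy"
arrangement, under assumed names (transform_file and main are hypothetical,
not the module's entry points). In current Python, IOError is an alias of
OSError, so the sketch catches OSError.

```python
import sys


def transform_file(path, transform):
    """Read `path` and return transform(contents); open errors propagate."""
    with open(path) as infile:
        return transform(infile.read())


def main(paths, transform, out=sys.stdout, err=sys.stderr):
    """Caller-side policy: report unreadable files and carry on."""
    status = 0
    for path in paths:
        try:
            out.write(transform_file(path, transform))
        except OSError as exc:
            err.write("%s: %s\n" % (path, exc.strerror))
            status = 1
    return status
```

A different caller could instead abort on the first failure or skip the file
silently; the library code stays the same either way.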

> - The policy of not writing the output if it's identical to the input
>   should be optional.  There are contexts (like when the tool is
>   invoked by a Makefile) where not writing the output could be
>   harmful: if you touch the input without changing it, Make would
>   invoke the tool over and over again because the output doesn't get
>   touched by the tool.

Interesting point.  A better rule, perhaps, would be to suppress
writing of output only if both the content *and* the transformed
filename are identical -- that would avoid doing a spurious touch
on a no-op modification in place, without confusing Make.
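The proposed rule can be sketched as a small predicate. The function name and
its arguments are hypothetical; this shows the rule only and is not code from
the module.

```python
def should_write(in_name, out_name, in_text, out_text):
    """Skip the write only when both the filename and the content are unchanged."""
    return not (in_name == out_name and in_text == out_text)
```

With this rule, an unchanged in-place run (same name, same content) is not
touched, while a run that maps foo.in to foo.out always writes foo.out, so
Make sees a fresh target even if the content happens to be identical.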

>                      Moreover, there seems to be some bugs: if the
>   output is the same as the input, the output file is not written even
>   if a filename transformation was requested (making Make even less
>   happy); when a transformation is specified by a string, an undefined
>   variable 'stem' is used.  Hasty work, Eric. :-(

I'll take the hit for this; my test framework should have covered that case
and didn't, because I was in a hurry to get in before the freeze.  However,
I know the other cases work because I'm *using* them.

OK, so here's how I see it:

1. I made a minor implementation error with one case; this can be fixed.

2. You were mistaken in believing that (a) there was no discussion or
endorsement of the idea beforehand, and that (b) I did not defend or
justify the design.

3. Some of the respondents simply missed the point; this thing is *not* 
a framework for creating filters, and shouldn't be named like one or
put in the wrong library bin because of it.

4. There is room for technical debate about the interface design, but no
choice I'm aware of that is *obviously* better than three functions -- the
class-wrapper approach would have unobvious problems doing the hook function
dispatch properly.

5. I was trying to do the right thing, but we sorely lack a useful set of
norms for what constitutes `good' vs. `bad' library checkins.  I am 
actively interested in helping solve this problem.
-- 
		Eric S. Raymond <http://www.tuxedo.org/~esr/>

Non-cooperation with evil is as much a duty as cooperation with good.
	-- Mohandas Gandhi