lazy (?) evaluation, request for advice

Fri Jan 7 22:24:07 EST 2000

[Alex Martelli]
> Here we go again.
> ...

> ... [arguments of the form "I'm implementing my language
>      using Python, therefore it has Python semantics"]

The imposition of lazy semantics onto unattributed names cannot be explained
by reference to the Python language, without detailing your implementation.
No Python programmer using your language could account for their program's
runtime behavior, since such behavior is not provided by Python directly,
and there is (as you had determined already in your original post) no way
even to *trick* it into such behavior:  you have to emulate (or otherwise
interfere with) key parts of Python's runtime evaluation yourself to get the
behavior you want.  So names in your language do not mean what they mean to
a Python programmer -- they have different behavior.  That's about all there
is to it!

You don't want Python's builtin behavior, and that's OK by me -- don't
*call* it Python anyway and I'll have no argument (although I'll still have
advice you dislike ...).

"eval" etc are irrelevant; those just happen to be what you're using (or may
use) to implement your variant semantics.

This is not the same as designing the API of a library or the like.  The
lowest level of what the user sees works in ways that Python does not, in
areas where Python does not provide hooks to change the way it works.

> ...
> If nothing changes except for speed,

I count speed as a crucial aspect of behavior here.  So do you (you've said
so often enough <wink>).  Lazy evaluation has purely semantic consequences
too in a language with side-effects (changes in when -- or even if -- things
get called can have very visible consequences).
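
To make the side-effect point concrete, here is a minimal sketch.  All the
names in it (noisy, pick_first_eager, pick_first_lazy, demo) are made up for
illustration and have nothing to do with the implementation under
discussion; "lazy" is simulated with explicit lambdas, since plain Python
always evaluates arguments before the call.

```python
log = []

def noisy(label, value):
    # The side effect: record that this argument was actually evaluated.
    log.append(label)
    return value

def pick_first_eager(a, b):
    # Ordinary Python: both arguments were evaluated before we got here.
    return a

def pick_first_lazy(a, b):
    # Arguments arrive as zero-argument callables; only `a` is ever called.
    return a()

def demo():
    del log[:]
    eager_result = pick_first_eager(noisy("a", 1), noisy("b", 2))
    eager_calls = list(log)
    del log[:]
    lazy_result = pick_first_lazy(lambda: noisy("a", 1),
                                  lambda: noisy("b", 2))
    lazy_calls = list(log)
    return eager_result, eager_calls, lazy_result, lazy_calls
```

Both versions return the same value, but the eager call records evaluating
"a" and "b" while the lazy one records only "a".  If noisy() wrote to a file
or bumped a counter, the two programs would visibly differ.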

> how can what-used-to-be-Python's semantics suddenly become
> "my own private and badly designed language"'s...?

While I have suggested several times that it would be better to let your
users write in straight Python (yes, with an ordinary "import this, that"
module UI), I've said very little about your design.  I certainly didn't
call it "badly designed".  If anyone asked <wink>, I would *not* say it's
badly designed.  I merely think it (as I said before) falls into a classic
little-language trap (providing ungeneralizable magic catering to its first
envisioned applications).  It's a classic trap *because* it looks so
attractive at first.

> ...
> So, I'll have a textfield where (as one of the different
> optional ways to express what it is they're looking for)
> users can enter a Python expression (an expression in the
> language which you loudly claim is not Python but rather
> "my own specially purpose-designed language") -- if they do
> that then click a certain pushbutton, the expression gets
> (perhaps computed, then) evaluated as above.

I agree with Tres that it would be nice if this GUI showed (or offered to
show) a translation into Python -- as he said, "learn by example".  When
they get to writing complex functions, to do things you haven't anticipated,
they'll need to learn how to get things done for themselves.

Obviously, the harder it is *to* supply a straight Python equivalent, the
more dubious your original claim that "I'd much rather let the user write
the deal-selection function in Python" (you do realize we're still arguing
about that single sentence -- indeed, the last two words of it <0.7 wink>?).

>> From what you've said so far, it doesn't *look* like anything
>> fancier than a regexp replace is needed; e.g.,

> Looks to me like that could be slow if the vocabulary is
> pretty large.  E.g., hundreds of possible words, of which
> maybe 4 to 6 will typically be used in one expression.

Again, it's impossible to be truly helpful here before you define your
language fully:  if you're not defining your own language, how is it that
all your questions are about language translation and implementation
techniques <0.1 wink>?  The most appropriate techniques depend on details of
the source language, and c.l.py hasn't seen a definition.  It's like asking
"What's the best way to sort?" without saying what is being sorted.

Anyway, that you may have hundreds of special keywords wasn't mentioned
before.  Yes, the Python/Perl flavor of backtracking regexp engine will bog
down given hundreds of alternatives; but if you're going to do the
translation once and then run the function thousands (hundreds of
thousands?) of times, does that matter?  Time it and see!  The regexp engine
runs at C speed, so it's not obvious that speed is an issue here.
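
A rough sketch of the translate-once idea, under loudly stated assumptions:
the vocabulary and the "deal." prefix are invented here (the real language
hasn't been defined), and build_translator is a hypothetical helper, not
anything from the original suggestion.  The pattern is compiled a single
time, so the cost of the big alternation is paid per translation, not per
evaluation of the translated function.

```python
import re

def build_translator(mapping):
    """Return a function that rewrites each whole-word key of `mapping`.

    `mapping` is a hypothetical {source_word: python_text} vocabulary.
    """
    # Longest words first, so a word is never shadowed by one of its
    # own prefixes inside the alternation.
    words = sorted(mapping, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in words)
    pattern = re.compile(r"\b(?:" + alternation + r")\b")

    def translate(expr):
        # One left-to-right pass; replaced text is not rescanned.
        return pattern.sub(lambda m: mapping[m.group(0)], expr)

    return translate
```

For example, build_translator({"price": "deal.price", "qty": "deal.qty"})
turns "price * qty > 100" into "deal.price * deal.qty > 100".  Wrapping a
call to translate in the standard timeit module is one way to "time it and
see" with a realistic vocabulary.  Note this sketch does not skip string
literals or attribute names.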

> ...
> I have no intention of somehow forbidding users from having
> string literals in the expression they can enter, although
> offhand I'm not sure what use they could have for them in
> such an expression.

Then you have a harder parsing problem.  I'd suggest looking at these next:

1) What (the current CVS version of) pyclbr.py (std library) does
   to skip over string literals rapidly.  This is ad hoc regexp
   trickery.

2) tokenize.py (std library).  High-level parsing of "Python-like"
   syntax.  Delivers a pure token stream, not a parse tree.  But
   that shouldn't matter unless you intend to introduce e.g.
   new block constructs.  No canned way to convert back to source,
   but very easy to write that.

3) Marc-Andre Lemburg's mxTextTools pkg (Starship), provided you
   can tolerate shipping with an extension module (i.e., the
   search engine in that is coded in C).  This will be the
   fastest; a non-backtracking recognition DFA can be built that
   recognizes keywords in one pass no matter how many there are;
   ditto for skipping over strings; etc.

4) John Aycock's SPARK system (see DejaNews for a current URL).
   This is wonderful for rapid prototyping of translators, but
   not swift compared to the alternatives.
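
As an illustration of option 2, here is a minimal sketch using the
tokenize module as it exists in current Python (the interface differs from
the 2000-era one).  rename_names and the mapping are hypothetical, and the
sketch assumes a single-line expression.  Because string literals come back
as STRING tokens, words inside them are left alone for free.

```python
import io
import tokenize

def rename_names(expr, mapping):
    """Replace NAME tokens that appear in `mapping`; assumes one line."""
    pieces = []
    pos = 0  # how far into `expr` we have copied so far
    readline = io.StringIO(expr).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.NAME and tok.string in mapping:
            start_col, end_col = tok.start[1], tok.end[1]
            pieces.append(expr[pos:start_col])   # untouched text before it
            pieces.append(mapping[tok.string])   # the replacement
            pos = end_col
    pieces.append(expr[pos:])  # whatever follows the last replacement
    return "".join(pieces)
```

Splicing by column offsets keeps the user's original spacing, which is the
"convert back to source" step done by hand.  With the mapping
{"price": "deal.price", "name": "deal.name"}, the input
'price > 10 and name == "price"' becomes
'deal.price > 10 and deal.name == "price"'.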

...

>> Didn't work, eh?

> Apparently not.

Ya -- I picked that up.  Thanks for the confirmation <wink>.

offense-is-where-you-look-for-it-ly y'rs  - tim