Web devel with Python. What's the best route?

Thu Jan 11 04:28:16 EST 2001

"Sam Penrose" <spenrose at well.com> wrote in message
news:spenrose-149DD9.22095710012001 at news.dnai.com...
    [snip]
> > +for i in range(len(results)):
> >     <TR> <TD><B>@i@</B></TD> <TD>@results[i]@</TD> </TR>
> > -
> >
> > All the presentation logic goes into the template, all of
> > the computation into the Python CGI script, and my little
    [snip]
> 1) There is an irreducible need to either put Python statements into
> HTML or HTML tidbits into Python. You've chosen the former; my
> preference is for the latter:

Or else place both Python and HTML pieces into something 'above'
either, yes (it doesn't _have_ to be HTML, of course; any kind of
textual 'presentation language' will do just as well).  Against
the 'something above either' approach (side note)...:
    Even if the (somewhat richer/more complicated/more flexible and
    indirect) '3-pronged' approach is used, the presentation language
    template will still have to be using SOME kinds of markers to
    signify [a] substitution points for Python-computed values, [b]
    pieces that need to be repeated and conditions for repetition,
    [c] conditionally-included pieces (could be framed as a special
    case of [b] with 0 or 1 'repetitions').  [b] in particular seems
    to require some sort of minilanguage, and I don't see a need to
    invent, document, and implement a special-purpose minilanguage
    for something I can better express directly in Python.
So, back to the 2-pronged approaches only...:

For me, the deciding issue is separating issues of computation and
presentation.  With the presentation language (whatever it might
be) embedding Python pieces, all of the presentation issues are
fully concentrated in the 'template' file; the only coupling is through
the names of the variables being presented (there's nothing that
stops a template-author from putting business logic in the Python
snippets he or she embeds, but the convention is that this is "just
not done" -- no enforcement... highly Pythonesque IMHO:-).

There is an underlying asymmetry: I'm quite willing to commit to
Python, only, as the language dealing with computation (as far as
the interface to presentation is concerned; if any other language
is involved with computing the results, Python will be dealing
with that, presenting a Python-only view at the interface anyway).

On the other hand, I'm *not* willing to commit to HTML, only, as
the language dealing with presentation; I want emitting RTF, or
PostScript, etc. (or even different 'kinds' of HTML -- dialects
for different user-agents, 'skins', etc), to be as easy as using
a different presentation-language-template file (the broad class
of 'textual presentation languages' seems a satisfactory set).
[Actually, what I _do_ have that IS important to me in that toy
application are different HTML templates for different *human*
languages -- just one Python, since the computations are the
same, but different templates anyway; maybe supporting multiple
human languages is not important to you, but, working in a
country whose native language is not English, it looms large].

I'm not sure that it is this asymmetry that makes the direction
I've chosen for the embedding so satisfactory to me, but I do
suspect it's that.

> 2) You appear to use '+' and '@' (and '-'?) as non-Python, non-HTML
> markers to indicate the border between the two languages. One of the

Right, that's what I was using in this example (the '-' at line start
indicates 'de-dent', i.e., block-end; not strictly needed if one trusts
whitespace, I guess, but what with whitespace-eating viruses about to
be unchained, I played it safe:-).

> reasons I don't like systems involving executable language statements
> embedded in HTML, such as Zope, is that they all seem to require such
> markers. There are two costs associated with this:
>    i. Yet Another Syntax to learn. Someone who knows Python and HTML
> (me, for example) will not know how the syntax works. It looks as though
> you are using the '+' (an otherwise valid HTML character) to indicate
> the beginning of a Python block and the paired '@' markers for string
> substitution--but there might be regular expressions involved. I can't
> tell.

No, of course you can't tell from an undocumented 3-line example -- sorry,
I did not mean to imply one could!  Particularly given that...:

There is a class of programmers, to which I appear to belong, who are
past masters at delaying key architectural decisions through insertion
of possibly-unwarranted generality in "infrastructure" modules.  In
my case, it comes from a lifetime spent programming (to-me) fascinating
software infrastructure and regularly clashing with a slight variant
of one of Parkinson's Laws -- and herein lies a tale.

The original version of the Law I have in mind: time spent discussing
decisions at a board-meeting is inversely proportional to the actual
financial impact of each decision.  Rationale: a decision swinging huge
amounts of money is likely to need lots of research and possibly some
specialist background to even start approaching understanding, and
the great majority of board members won't have the needed background and
will not have bothered to do lots of homework in advance anyway.  Such
decisions often end up rubber-stamped because of that -- the proponent is
the only one who really understands the pros and cons, he or she gives a
skewed presentation reflecting a specific viewpoint, and the board members
are not willing to display too much ignorance (or the fact that they
have NOT read through the 400-page document, supplied in advance of the
meeting, that summarizes the points:-).

OTOH, board-meetings also take lots of trivial decisions, and _those_
often require no special background or preparation -- anybody can argue
hotly and at length about whether the 2000 pounds budgeted for repainting
the walls of the executives' bathroom is or isn't a reasonable amount,
far more easily than they can judge whether the 2 million pounds budgeted
for an overhaul of the firm's information-systems is.  Whence, MUCH
more heated discussion on the thousands, than on the millions.

The programming equivalent is, "syntax sugar".  I've made many deep,
and sometimes subtly wrong, design decisions over the last 25 years.
But when I go present my prototype, and propose turning it into a key
production element which will help make or break our software's
overall success, hardly ever does anybody perk up and say, "I don't
understand why, in the transaction protocol you present on page 84,
you specify _synchronous_ handshaking -- what will *that* do to our
scalability on multiprocessor systems, haven't you accidentally ended
up introducing unwarranted processor-affinity?".  Oh no, the debates
are invariably about my choice of commas as opposed to semicolons (or
vice versa) as item-separators in debug-dumps, or something equally
"crucial".  THAT is something everybody's sure to have a strong opinion
about, even if they think processor affinity is a fancy new ice-cream
flavour.

So, over the years, one learns the (not necessarily optimal) habit of
leaving such silly surface issues flexible and easily generalized --
perhaps even exposing them as subsystem configuration variables or the
like.  Saves just too much time in design-review meetings!-(

So, in this vein, do you think I *hard-wired* the way embedded Python
expressions and statements are specified in the templates?!  *Not on
your life*!  They're *arguments* to the 'expand-this-template' function;
*regular expressions*, even, just so as to ensure I can wiggle out of
whatever syntax-sugar quibble is thrown across the meeting room.  (Yeah,
yeah, I know this was a little toy utility for my own use, and NOT bloody
likely to ever see the light of a design-review meeting room -- but
certain habits are hard to unlearn!-).

You can probably read between the lines that I'm not _proud_ of this
meta-decision to shirk decision, in the general case.  Here, though,
it was easy to rationalize, because I did in fact need this generality
to enable _whatever_ textual presentation language as the underlying
language... I could NOT rely on specific characteristics of HTML.
Also, I *was* keen to allow stuff that would survive intact any
round-trip through my favourite HTML editors (Arachnophilia & such).

>    ii. Another layer of magic characters to parse in BOTH HTML and
> Python--quick, how do you tell the difference between the '@' marking a
> Python variable, the '@' in an email address and the '@' an end user
> typed as an abbreviation for 'at' in a data chunk that you are
> re-inserting into your template for some reason?

I don't get the difference between the 2nd and 3rd cases -- it seems
to boil down to 'how do you tell the difference between occurrences
of the RE that define an embedded expression, and ones which just
happen'.  Answer: you pick an RE that doesn't "just happen".  The RE
I always end up using for embedded Python expressions in HTML is
'@([^@]+)@' (the group in it defines what goes back to eval) -- it
so happens that one never needs _two_ @-signs _on the same line_ in
my HTML usage (if two email addresses are given, they can be given
on separate lines since the intervening linebreak is no trouble).
If that wasn't the case (e.g. with a different textual presentation
language, or a different HTML style using <PRE>, etc), you'd just
choose a different RE -- '@@([^@]+)@' or whatever you like best --
see what I mean about wiggling out of syntax-sugar issues?-)
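To make the expression-substitution step concrete, here is a minimal
sketch in Python.  The function name expand_line and the sample data are
made up for illustration (this is not the actual utility), and eval is
only reasonable here on the assumption that the template author is
trusted:

```python
import re

# The RE quoted above: a pair of '@' signs around a Python expression.
# The group is the text handed to eval.
EXPR_RE = re.compile(r'@([^@]+)@')

def expand_line(line, namespace, expr_re=EXPR_RE):
    """Replace each embedded expression with str() of its value.

    expand_line is a hypothetical name; the RE is a parameter so a
    different marker syntax can be swapped in without touching the code.
    """
    def substitute(match):
        # Evaluate the expression against the values computed in Python.
        return str(eval(match.group(1), {}, namespace))
    return expr_re.sub(substitute, line)

values = {'results': ['spam', 'eggs'], 'i': 1}
print(expand_line('<TD><B>@i@</B></TD> <TD>@results[i]@</TD>', values))

# A single '@' on a line (an email address, say) never matches the RE.
print(expand_line('Mail: alex@example.com', values))

# With two addresses on one line, a stricter RE avoids the clash.
strict = re.compile(r'@@([^@]+)@')
print(expand_line('x@y.com, z@w.com: @@1+1@', values, strict))
```

The last call is the "choose a different RE" escape hatch: the two
addresses are left alone and only the '@@...@' span is evaluated.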

> What about '+' in URLs
> and Python statements? Presumbaly there are rules (Yet Another Syntax),

Embedded Python statements are line-oriented, which is hardly 'another'
syntax given that Python itself is; so, the '+' (or whatever else -- will
you believe, I made *that* into an RE too...!-) only matters if it's the
very first character on the line ('^\+(.*)$', if you please, with the '+'
escaped so it's taken literally... I'm surely NOT proud of that, but it
does give lots of wiggleroom:-).
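A rough sketch of how such a line-oriented expander might hang together,
under stated assumptions: every statement line opens a block (for, if,
while) that a '-' line closes, there is no else/elif support, and the
name expand_template and its parameters are invented for this example
rather than taken from the real utility.  It turns the template into
Python source and runs it, so the template must be trusted:

```python
import re

def expand_template(template, namespace,
                    stmt_re=r'^\+(.*)$',      # statement line: '+' in column 0
                    end_re=r'^-\s*$',         # block end ('de-dent') line
                    expr_re=r'@([^@]+)@'):    # embedded expression
    """Expand a template; the marker syntax is passed in as REs."""
    stmt_re, end_re, expr_re = map(re.compile, (stmt_re, end_re, expr_re))
    source, depth = [], 0
    for line in template.splitlines():
        stmt = stmt_re.match(line)
        if stmt:
            # Statement line: copy it into the generated code and indent
            # whatever follows (assumes it is a compound statement).
            source.append('    ' * depth + stmt.group(1).strip())
            depth += 1
        elif end_re.match(line):
            depth -= 1
        else:
            # Presentation line: emit it when the generated code runs.
            source.append('    ' * depth + '_emit(%r)' % line)

    output = []
    def emit(line):
        # Substitute embedded expressions using the current variables.
        output.append(expr_re.sub(lambda m: str(eval(m.group(1), env)), line))

    env = dict(namespace, _emit=emit)
    exec('\n'.join(source), env)
    return '\n'.join(output)

TEMPLATE = """<TABLE>
+for i in range(len(results)):
<TR> <TD><B>@i@</B></TD> <TD>@results[i]@</TD> </TR>
-
</TABLE>"""

print(expand_template(TEMPLATE, {'results': ['spam', 'eggs']}))
```

Since the three REs are plain arguments, a template for RTF or some other
textual presentation language can pick markers that don't clash with it.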

> conventions (documented or otherwise), workarounds for troublesome
> cases, etc., but you've inherently added a third layer of logic, whose
> interaction with the Python and HTML layers will, in non-trivial
> projects, require non-trivial management.

Sorry, but I have to disagree on the latter point -- this *is* pretty
trivial stuff.  Hardwiring the RE's (which would then not really need
to be REs any more, at least for the statements) would in fact simplify
things (if the presentation-language flexibility could be dropped).

> 3) Those costs seem to have been worth it for you and many others
> (Digital Creations not least). I don't wish to argue with how others get
> their work done if that's what works for them. I do want to note that
> straight Python and HTML work fine for projects involving thousands of
> lines of Python in a couple dozen modules and hundreds of HTML files.

But if, for example, you need to be preparing your output in English,
or Italian, or French, depending on a 'session-state' variable, how
do you handle that without some form or other of template-substitution?

WITH the template thingy, I can just take a reference template (e.g.
the English one) and have it translated by somebody whose main skill
will be natural language translation -- all they have to understand
about Python, HTML, or the RE's in play to merge the two, is "enough
to avoid breaking things"... they just translate the textual parts,
after all -- having done a couple of templates with some supervision,
they've "learned the rules" (the tiny subset thereof they actually
need to care about) and can just proceed.
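As an illustration only, picking the template from a session-state
language variable could look something like this.  The TEMPLATES table,
the render function and the sample sentences are all hypothetical; real
templates would more likely live in one file per language:

```python
import re

# Hypothetical per-language templates: same Python values, different text.
TEMPLATES = {
    'en': 'You have @count@ new messages.',
    'it': 'Hai @count@ nuovi messaggi.',
    'fr': 'Vous avez @count@ nouveaux messages.',
}

def render(language, values, default='en'):
    """Expand the template for the session's language (fallback: default)."""
    template = TEMPLATES.get(language, TEMPLATES[default])
    return re.sub(r'@([^@]+)@',
                  lambda m: str(eval(m.group(1), {}, values)),
                  template)

print(render('it', {'count': 3}))
print(render('de', {'count': 3}))   # no German template: falls back to 'en'
```

The translator only ever touches the strings in the table; the markers
and the Python side stay the same across languages.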

An alternate (but more complex) approach would ensure translators
need only see a "catalog of messages", strictly pieces of natural
language text *with* markers (you really can't do without some kind
of marker-syntax, as word-order can change between languages even
among items that are program-output) -- this makes the translators' lives
slightly simpler at the price of more substantial program logic
(merging the natural-language-message-catalog, the presentation-
language-template, *and* the program results -- be it on the fly,
or, partly beforehand).  In my previous experiences with programs
needing multiple natural languages' output (in somewhat different
settings), I've found this approach somewhat inferior: isolating
the messages reduces the amount of context available to translators, and
risks impairing translation-quality; and I believe translators are
generally smart enough people to learn what pieces of the template
they should leave alone, particularly if you give them an easy way
to run output-tests displaying the effects of their work on a given
template-page.
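For comparison, a sketch of the message-catalog alternative, assuming
Python's own '%(name)s' formatting as the marker syntax.  The CATALOG
contents, the message key and both function names are invented for the
example; it shows why named markers are needed once word order changes:

```python
# Hypothetical catalog: one entry per message key and human language.
# Named markers let each translation place the values where it needs them.
CATALOG = {
    'en': {'changed': 'File %(filename)s was changed by %(user)s.'},
    'it': {'changed': '%(user)s ha modificato il file %(filename)s.'},
}

def message(language, key, **values):
    """Look up a message and fill in the program-computed values."""
    return CATALOG[language][key] % values

def render(language, key, template='<P>@msg@</P>', **values):
    """Three-way merge: catalog text + presentation template + results."""
    return template.replace('@msg@', message(language, key, **values))

print(render('en', 'changed', filename='index.html', user='Alex'))
print(render('it', 'changed', filename='index.html', user='Alex'))
```

Note how the translator sees only the isolated sentence here, which is
exactly the loss of context mentioned above.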

Alex