Why isn't Python king of the hill?

Mon Jun 4 16:31:08 EDT 2001

A further episode in the discussion between Martijn and myself:

> Or you can do evil and cool stuff with AccessRules (the whole
> SiteRoot stuff).
> [ ... quotes from me about controller-centered models ... ]
> What exactly do you mean by embedding controller elements in the
> view? I'm not entirely sure what you'd refer to as the 
> controller here.

I'm not sure how to manipulate AccessRules; I think this might be
one of the Zope features that we currently hafta do without.

I think maybe I'm running along here under the assumption that
everyone more or less understands Model/View/Controller, and
understands it in the same way that I do. Let me try to define my
terms.

The "View" element of an application is exclusively responsible
for presentation. Your forms and your reports, basically; in a
web application, this is your HTML and the code that generates
your HTML.

The "Controller" element responds to user input and interfaces it
with the business objects of the model. The controller is in
some ways like glue code, but this is (or can be) where you might
have form validation logic, or where you translate your form
fields into the attributes of a data object. The controller
element is typically the center of your application; it might be
compared to the main() function in a C program. It presents
various views when required, retrieves model objects, and glues
them all together.

The "Model" refers to your business objects and your business
logic. The model should describe the data elements in your
problem domain, and should include methods for manipulating them
and performing calculations upon them.

Again, this is my understanding of the MVC super-architecture;
others may have a different take on pieces of it. I know a lot of
people try very hard to push everything down to the model; I tend
to want to keep the logic and data that describes the problem
domain in the model, the stuff that describes the current
application in the controller, and the stuff that implements the
current deployment medium in the view. This promotes maximum
reuse, and makes it easier to re-target your application. Going
to client/server with an applet or even an application written in
Swing isn't exactly trivial, but it's an exercise in rewriting
the view and some of your controller, not a start-from-scratch.
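
To make the split concrete, here is a minimal sketch in Python.
Every name in it (Account, render_account, handle_deposit) is
made up for illustration; it's one way to carve things up under
my definitions above, not anybody's framework.

```python
# Hypothetical names throughout; a sketch of the split, not a framework.

class Account:
    """Model: problem-domain data and the logic that operates on it."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


def render_account(account, error=None):
    """View: presentation only; swap this out to re-target the app."""
    lines = ["<h1>%s</h1>" % account.owner,
             "<p>Balance: %d</p>" % account.balance]
    if error:
        lines.append("<p class='error'>%s</p>" % error)
    return "\n".join(lines)


def handle_deposit(account, form):
    """Controller: turn raw form input into model calls, pick a view."""
    try:
        account.deposit(int(form.get("amount", "")))
    except ValueError as exc:
        # Bad or missing input: redisplay the same view with a message
        return render_account(account, error=str(exc))
    return render_account(account)
```

Moving to a different front end would mean replacing
render_account and part of handle_deposit; Account stays put.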

> Hm, yes, I think this makes sense (though it could lead to horrible
> URLs). Of course you don't give the details here so I'm not
> entirely sure I get the entire picture right.

The URLs need not be any more "horrible" than in any other web
application, including any Zope application. If your main servlet
responds to all URLs that begin with '/myapp', any sub-path is
something that the servlet can switch on. The difference here is
that /myapp doesn't map to a folder object either on the
filesystem or in some content repository. It's mapped in the
application server to mean "invoke this servlet"; additional path
information is passed on to the servlet.

So if http://myhost/myapp points at my main controller servlet,
then the "myapp" servlet needs to know what to do when you access
the URL http://myhost/myapp/login. In some cases, it will
directly invoke some action, which will in turn decide whether
this is an initial request or a post of form-values, and then
perform the appropriate processing. That processing may entail
the include of or forward to a JSP page. In other cases, the main
controller servlet may defer handling to another servlet. This is
where your local architecture can grow baroque, especially if you
don't make the decisions about how things will be done up front.
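
The dispatch described above might look something like this in
Python. This is a sketch, not the servlet API: FrontController,
register, and handle are hypothetical names, and the request is
reduced to a path, a method, and a dict of form values.

```python
class FrontController:
    """Owns every URL under its mount point and switches on the rest."""

    def __init__(self, mount="/myapp"):
        self.mount = mount
        self.actions = {}

    def register(self, sub_path, action):
        self.actions[sub_path] = action

    def handle(self, url_path, method="GET", form=None):
        # The mount point is not a folder; it just means "invoke me"
        if url_path != self.mount and not url_path.startswith(self.mount + "/"):
            return 404, "not mine"
        sub_path = url_path[len(self.mount):] or "/"
        action = self.actions.get(sub_path)
        if action is None:
            return 404, "no action for %s" % sub_path
        return 200, action(method, form or {})


def login(method, form):
    # The action decides: initial request, or a post of form values?
    if method == "POST":
        return "checking credentials for %s" % form.get("user", "?")
    return "login form"


app = FrontController()
app.register("/login", login)
```

Deferring to another servlet would amount to registering another
FrontController-like object as the action for a sub-path.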

Meanwhile, your controller has the request before anything else.
That means the controller can take action based on form values or
other request parameters. As far as I know, Zope provides a
solution for the most common application of this (user
authentication), but does not provide much in the way of a
toolkit for other similar problems.

An example might be a look-and-feel switch -- if you have a
cobranded application, you might serve from a different batch of
pages, or use decorations and stylesheets from a different
directory hierarchy. 
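
As a rough sketch of that switch: the controller looks at a
request parameter and picks the directory to serve from. The
"brand" parameter, the brand names, and the directory layout
are all assumptions made up for the example.

```python
import posixpath

# Hypothetical brands and directory layout
BRANDS = {"acme": "brands/acme", "globex": "brands/globex"}
DEFAULT_SKIN = "brands/default"


def skin_root(params):
    """Pick the page/stylesheet hierarchy from a request parameter."""
    return BRANDS.get(params.get("brand"), DEFAULT_SKIN)


def resolve(params, resource):
    """Map a logical resource name onto the chosen hierarchy."""
    return posixpath.join(skin_root(params), resource)
```

Because the controller sees the request first, none of the pages
themselves need to know that cobranding exists.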

You /can/ do these things in Zope, it just requires hacking
around the way Zope is set up to serve pages, not invoke
processing logic.

> [Model/View/Controller as JSP/Servlets/Beans]

I think Model/View/Controller as Beans/JSP/Servlets is more
accurate, but yes. :)

> [JSP pages get compiled, not interpreted]
> 
> We're not going to accomplish this in Python, though of course
> you could still write (possibly with C extensions) a fast
> interpreter for web pages in Python. Anyway, we're generally
> not using Python because insane speeds are the issue, so while
> this isn't a major bottleneck for me I'm not going to bother
> too much with this.

Or you could take your Zope objects and turn them into actual
python code objects that generate HTML and perform some actions
when called. It's not going to suddenly challenge C for
near-hardware speeds, but it's a significant step ahead of
doing the lexical parsing at runtime -- at least the Python
interpreter is written in C.
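
Something along these lines, as a sketch. The {{ ... }} syntax
is invented for the example (it is not DTML), and
compile_template and render are hypothetical names. The point is
only that the lexical work happens once, and each request just
evaluates a code object.

```python
import re

# Hypothetical template syntax: {{ expression }}
TOKEN = re.compile(r"\{\{(.*?)\}\}")


def compile_template(text):
    """Parse once; return a code object that renders the page."""
    pieces = []
    for i, part in enumerate(TOKEN.split(text)):
        if i % 2:
            # Odd chunks are the captured expressions
            pieces.append("str(%s)" % part.strip())
        elif part:
            # Even chunks are literal text
            pieces.append(repr(part))
    source = "''.join([%s])" % ", ".join(pieces)
    return compile(source, "<template>", "eval")


def render(code, **namespace):
    # Templates are assumed to be trusted; eval is not a sandbox
    return eval(code, {"__builtins__": {"str": str}}, namespace)


page = compile_template("<p>Hello, {{ name }}! You have {{ n + 1 }} items.</p>")
```

Calling render(page, name="Martijn", n=2) reuses the compiled
object; only the namespace changes from request to request.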

I don't think it's a huge priority, but it's one of the things
the Java stuff does to reduce runtime load, which in turn
promotes better scaling and better overall performance.

> > Your JSP pages have access to at least four levels of state
> > (there might be another I'm forgetting): application, session,
> > request, and page. The first three correspond to the same state
> > repositories managed by the servlet. The 'page' context is for
> > local objects.
> 
> Right, in Zope you'd have 'acquired things', session object (if you
> have Core Session Tracking or something similar installed), REQUEST
> object and folder context, for these. I'm just trying to translate
> the concepts.

Almost.

I think the acquired things don't map onto application state
quite like you suggest. The application context is one
repository, shared by all the sessions running inside a single
process. It isn't initialized with web-content properties; it's
a place to stick objects at runtime.

I don't know what the session object gets you, or how it gets it
for you, but I assume that's correct. The REQUEST object does map
to the request context. The page context maps more closely to the
namespace of a DTML document.

> > You can forward control to other servlets or JSP pages, and you
> > can also include pages.
> 
> What do you mean by 'forwarding control'? Control of what? 
> 
> [ snip my explanations ]
> 
> Ah, I think I more or less understand. I'm not sure what the use of 
> forward is compared to include, though. Why'd you ever want to forward
> control and never come back? Generate half a page with one JSP page,
> and the other with another? 

<dtml-return> and RESPONSE.redirect() provide more-or-less
similar functionality in DTML, as compared with the use of
<dtml-var> to include a method.

My typical page processing paradigm is to submit form data to the
page that renders the form. If all the data are acceptable and no
processing errors arise, I forward on to another page. If any of
the data are not acceptable, I redisplay the current page with error
messages. Others do it other ways.
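
Stripped down to a sketch, with hypothetical field names and a
made-up forward target, the paradigm looks like this:

```python
def validate(form):
    """Return a dict of field name -> error message (empty if clean)."""
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "name is required"
    if not form.get("age", "").isdigit():
        errors["age"] = "age must be a whole number"
    return errors


def signup_page(method, form):
    """The page that renders the form also receives its submission.

    Returns ("render", errors) to (re)display this page, or
    ("forward", target) to hand control to the next page.
    """
    if method != "POST":
        return ("render", {})          # initial request: blank form
    errors = validate(form)
    if errors:
        return ("render", errors)      # redisplay with messages
    return ("forward", "/myapp/welcome")
```

The forward is what answers "why never come back": once the
data are good, this page has nothing left to say.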

> > You can associate a bean with an object, and map form properties
> > onto the bean's attributes. I think that this is most useful when
> > you're programming pure JSP, but it's pretty useful when you're
> > not.
> 
> Right, Zope's REQUEST object does this automatically (the 
> 'form' subobject).

Except that your "form" subobject is one standard object, not an
object customized for your particular form. The servlet API
allows you to pick the attributes off the REQUEST object, too;
it's just somewhat more convenient to load up a bean.
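
A sketch of the difference, in Python rather than Java:
LoginBean and populate are hypothetical, and the form is just a
dict of strings. The bean is customized for one particular form,
so it knows its own fields and their types.

```python
class LoginBean:
    """One bean per form: it knows its own fields and their types."""

    # Hypothetical fields, each with a converter from the raw string
    FIELDS = {
        "user": str,
        "remember": lambda value: value == "on",
        "attempts": int,
    }

    def __init__(self):
        self.user = ""
        self.remember = False
        self.attempts = 0


def populate(bean, form):
    """Copy matching form values onto the bean, converting as we go."""
    for name, convert in bean.FIELDS.items():
        if name in form:
            setattr(bean, name, convert(form[name]))
    return bean
```

Fields the bean doesn't declare are ignored, which is a good part
of the convenience: the rest of the code deals with typed
attributes instead of a bag of strings.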

> > I've discussed session management before. Sounds like they're
> > doing some of that in newer Zope? That's good. It's a very hard
> > thing to do well. I would be concerned about ZEO and ZODB being
> > used in their unadulterated form to handle session management for
> > a high traffic site. It just seems like there'd be too much I/O
> > for ZODB/ZEO to handle it efficiently.
> 
> They're working on various improvements to that too, of course.
> ZODB without undo for sessions for instance. ZODB layered on
> top of various stuff, like the Berkeley database. By default
> Core Session Tracking just keeps the sessions in RAM and they
> expire after a customizable time.
>
> > Servlet containers implement session management for you. The
> > servlet containers are presently not obliged to make these
> > sessions either persistent or distributable, but some of them do,
> > including at least one of the freely-available containers.
> > Persistent, distributable sessions provide good load balancing
> > and good failover.
> 
> Right, the intention in Zope is to do this with ZEO and Core Session
> Tracking.

The current solution is a reasonable solution, but it doesn't
scale well. Your front-end load balancing hasta be session-aware,
so it can redirect to the right instance of Zope.

Using ZEO and Core Session Tracking could be The Right Thing. It
certainly seems like the two technologies could be combined
reasonably. Until it's arrived, though, you're still rolling your
own sessions if you deploy on more than one box.

> > Servlets get a request and a response object passed to them.
> 
> Like Zope's DTML pages, and of course you can pass those along
> to Python Scripts or other python code.

Yes, but the servlets ARE Java code, which means that you don't
NEED to stick a block of code at the top of every page, designed
to invoke them.

> Neat. With Zope exporting subfolders or simply copying 
> Data.fs is relatively
> painless, but you need to copy over any external methods and special
> Python products manually, which isn't hard but an extra step.

Yeah, Zope isn't all that hard to roll out, as long as you do
everything in Zope. You can still get sync problems with the
external dependencies, though. If you use a lot of those (and I
suspect that you will in a logic-intensive app), then you're
likely going to hafta come up with some other deployment
mechanism.

> > Adding EJB entails a few sacrifices and a few big gains. Sticking
> > your servlets into the J2EE "framework" means that you hafta put
> > your "application" level state into an EJB or a database (or at
> > least knowing that anything you stick into the application
> > storage can't be counted on to persist any longer than the
> > request). You really need to make anything you put into the
> > session area serializable (though there are a couple exceptions
> > to that rule). There are a couple other rules that go along with
> > that, I can't put fingers on them off the top of my head. All the
> > sacrifices are really pretty minimal.
> 
> Okay, parts of this is a small problem with ZODB as well, of course.
> I suspect ZODB makes this even easier, as it exploits Python's
> flexibility.

Well, except that we're not talking about just ZODB or ZEO here,
we're talking about distributing dynamic (by which I mean
run-time) state. That includes things like concurrent access
issues and synchronization of access, RMI techniques, and
whatever else it takes to implement these things.

It's also not just Session stuff. If you could publish arbitrary
objects to the ZEO store and retrieve them by key -- something
I'm sure is possible, but not sure how difficult -- you'd be
close.

> Sort of like ZEO with Core Session Tracking. :) Of course this
> is still being heavily developed, but it seems to be getting
> near offering equivalent features.

I can't speak to that, because I haven't used (and can't really,
at the moment) Core Session Tracking, let alone using it with
ZEO.

> > The J2EE stuff makes deployment still one more step simpler, too.
> > It pushes parameters that might change during runtime out one
> > notch further, so that deployment configuration can be separated
> > from the details of servlet instantiation, URI mappings, and
> > other miscellany concerning the interaction of the app and the
> > app server.
> 
> Hm, I'm not sure what this part means, exactly. I can't
> translate it to any Zope terms, which is how I tend to think
> about things, myself.

I might argue that trying to put everything into Zope terms is
part of what's limiting your comprehension. :) Zope is really a
very different animal.

Still, if you were to take your Zope application, and all its
external methods, and all the little mini-servers that helped
distribute your backend processing job, and stuck them all in a
big tar file, then had one configuration file that detailed ALL
the configuration details that would vary from machine to machine
within an installation, and all the details that would vary from
installation to installation -- if you assembled that tarball and
that config file, you'd have the deployment mechanism used by the
Enterprise stuff.

> Yes, I'm not trotting them out so much to do 'see, we have that
> too!', though there is a component in that, but to learn more
> about both by comparing the two, and to see if there are
> interesting directions which Zope is missing. I think Java's
> system works better for model/view/controller type situations
> right now; Zope's framework for that is currently still in
> design phases. Java also has clear (standardized) interfaces
> for it, something which Zope needs a lot more of. 

All of which I'll agree with. And I'm not trying to suggest that
Zope is a poor product, either. I just don't feel like it's a
good choice for a web-based application -- a website with some
dynamic elements, maybe, and I could even see it being used for
an online store.

And I confess that the more we talk about these things, the more
optimistic I am about the future. Java's solutions and toolkits
are great (or at least pretty good), but I would like a good,
solid alternative to all that wordy Java.

> I don't know how to compare ZEO with the scalability features
> you sketched out. I would guess it's a question of tradeoffs;
> which is better probably depends on your requirements and
> development style.

I think ZEO is really cool. I like it a lot, but it seems like it
only answers about half my concerns. That's also true about the
Enterprise JavaBeans -- the Java Data Objects are really more
interesting to me from the standpoint of my dynamic data model,
but both ZEO and EJB do really well for providing keyed access to
persistent objects. I think ZEO is a little more light-weight
than EJB? I'm not sure how effectively you can "synchronize" the
ZEO stuff against multiple threads / processes contending for the
same data.

> > Like I said once before,
> > maybe we can beat Java to the punch here.
> 
> Python certainly has a lot of flexibility to work in its favor here,
> though I don't know if we have the development resources in the
> Python/Zope world to go fast on this. These discussions often seem to
> devolve into feature wishlists and pie-in-the-sky designing.

That's maybe true, but like I said, I started work on it at one
point, and probably put a few solid days of effort into it. What
I've arrived at might be less than perfect -- in fact, it IS less
than perfect -- but it was a running start at something that
would marshal a database table into an object.
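
To give a flavor of what "marshal a database table into an
object" means, here is a minimal sketch using the sqlite3 module
from today's Python standard library. It is not the code I
wrote; Record, fetch_all, and the breakfast table are invented
for the example.

```python
import sqlite3


class Record:
    """A plain object whose attributes are the row's columns."""

    def __init__(self, **columns):
        self.__dict__.update(columns)


def fetch_all(conn, table):
    # The table name is interpolated, so it must come from trusted code
    cursor = conn.execute("SELECT * FROM %s" % table)
    names = [col[0] for col in cursor.description]
    return [Record(**dict(zip(names, row))) for row in cursor]


# Hypothetical table, in memory, just to exercise the mapping
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE breakfast (id INTEGER PRIMARY KEY, dish TEXT, eggs INTEGER)"
)
conn.execute(
    "INSERT INTO breakfast (dish, eggs) VALUES (?, ?)", ("spam and eggs", 2)
)
rows = fetch_all(conn, "breakfast")
```

After this, rows[0].dish and rows[0].eggs read like ordinary
attributes, with no SQL in sight at the point of use.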

Even at that level of utility, it's a very, very useful tool.

> [ snip lots, including a description of EOF's qualifiers ]
> Would this get translated into a query into a relational database?

Yes. EOF provides a couple different ways to do things -- you can
build up these qualifiers piecewise, or you can write out the
query string. But then EOF goes and traces it through its
internal mapping of entities to database elements, and produces a
SQL query that retrieves the appropriate object.
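
The piecewise style can be sketched roughly like this. These are
not EOF's classes or method names; Eq, And, select, and the
COLUMNS mapping are hypothetical stand-ins for the idea of
tracing attribute names through an entity-to-column mapping.

```python
class Eq:
    """Qualifier: attribute equals value."""

    def __init__(self, key, value):
        self.key, self.value = key, value

    def to_sql(self, columns):
        # Translate the object attribute into its database column
        return "%s = ?" % columns[self.key], [self.value]


class And:
    """Qualifier: all of the nested qualifiers must hold."""

    def __init__(self, *parts):
        self.parts = parts

    def to_sql(self, columns):
        clauses, params = [], []
        for part in self.parts:
            clause, values = part.to_sql(columns)
            clauses.append("(%s)" % clause)
            params.extend(values)
        return " AND ".join(clauses), params


# Hypothetical mapping of entity attributes to table columns
COLUMNS = {"dish": "DISH_NAME", "eggCount": "EGG_CT"}


def select(table, qualifier, columns=COLUMNS):
    where, params = qualifier.to_sql(columns)
    return "SELECT * FROM %s WHERE %s" % (table, where), params
```

The caller only ever mentions dish and eggCount; the column
names live in one place, the mapping.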

> > When you fetch these objects from the database, the EOF puts
> > "Faults" in the place of relationships. When you first try to
> > read from a fault object, it runs off and fetches the object from
> > the database.
> 
> A proxy pattern.

Yes, and very effective in reducing the strain on the database
while still preserving the whole notion of composed objects.
There's also a means by which you can pre-fetch some of the
relationships -- that way, if you're planning an operation that
will iterate over all the eggs in a spam_and_eggs_breakfast, you
don't need to go back to the database multiple times.

> > But that's not all! EOF also provides sophisticated caching and
> > data source controls. EOF also provides multiple 'editing
> > contexts' with undo capabilities and the ability to nest editing
> > contexts so that changes can be made within the context of other
> > changes, and committed incrementally, or abandoned en masse.
> 
> Does this use relational database transaction features?
> (with subtransactions?)

It doesn't require relational database transaction features, and
I believe that it provides access to transactions at a more
direct level. EOF's "editing contexts" are in-memory
representations. Basically, every time you create an editing
context, any object modified in that context is first copied from
its parent context, then modified. If you abandon the changes,
the context closes and you never write the changes back onto the
parent context. Commit them, and all your changed objects are
promoted back into the parent context.

This is especially handy when you're building sequences of web
pages several pages deep before the final commit. You can nest
your editing contexts, and when the user hops back four pages,
you can either make the hop with him and toss the deeper changes,
or even just hop around with him and wait for him to commit the
whole lot of them.
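
A sketch of the copy-from-parent behavior, under my own
understanding of it. EditingContext here is a hypothetical
Python class, not EOF's, and objects are just dicts looked up
by key.

```python
import copy


class EditingContext:
    """In-memory scratch space layered on top of a parent context."""

    def __init__(self, parent=None):
        self.parent = parent
        self.objects = {}

    def lookup(self, key):
        """Read through to the nearest context that holds the key."""
        if key in self.objects:
            return self.objects[key]
        if self.parent is None:
            raise KeyError(key)
        return self.parent.lookup(key)

    def edit(self, key):
        """First edit copies the object from the parent chain."""
        if key not in self.objects:
            self.objects[key] = copy.deepcopy(self.lookup(key))
        return self.objects[key]

    def commit(self):
        """Promote changed objects into the parent context."""
        if self.parent is not None:
            self.parent.objects.update(self.objects)
        self.objects = {}

    def abandon(self):
        """Drop the changes; the parent never sees them."""
        self.objects = {}
```

One context per page in a multi-page sequence gives you the
"hop back and toss the deeper changes" behavior for free: abandon
the inner contexts and the outer ones are untouched.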

> *part* of the good effects you attribute to standard APIs can
> be dealt with by good API *documentation* instead. Another part
> deals with user interface; I do think Zope's web UI helps
> people who have little programming experience tremendously;
> I've seen this happen several times. A hot topic in the Zope
> world has to do with explicit interfaces, and clear
> documentation for them. This inspired PEP 245, currently under
> consideration for 2.2. While I don't agree with some of the
> details of that PEP, I do think the ability to have a form of
> explicit interfaces (run time checked if checked for at all!
> ability to lie about implementing an interface! this is Python :)
> is very important to the future health of Zope and similar large
> Python systems.

I think I agree with you here. I've worked around the interface
issue in the past (subclass empty classes, then test to see if
my object is an instance of a particular empty class), but it
seems like being able to unambiguously promise to implement a
certain interface is much /clearer/.
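
The workaround, as a sketch with made-up class names:

```python
class Renderable:
    """Empty marker class: 'I promise to provide render()'."""


class Page(Renderable):
    def render(self):
        return "<p>hello</p>"


class Liar(Renderable):
    """Claims the interface but never implements it."""


def implements(obj, marker):
    # Run-time check only; nothing verifies the promise is kept
    return isinstance(obj, marker)
```

Both Page() and Liar() pass the implements() test against
Renderable, which is exactly the weakness: the marker records
the promise, not whether it was kept.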

I think Zope's web UI is a mixed blessing. It cuts down on the
learning curve to get in and start making changes, but in the
end, it becomes an impediment to being able to work smoothly -- a
simple CVS archive of the DTML would be more practical later in
the game.

And yeah, documentation could correct a lot, especially if the
documentation addressed best practices and common problems
experienced when developing larger and more complicated sites.
But I can't comment extensively on Zope -- like I said, I'm stuck
on an old version. That means all the docs that I've read point
people at ways of doing things that don't work for me, and even
most of the advice on the mailing list refers to features I can't
access. Many of my issues with Zope may be that Zope as I know it
isn't nearly as sophisticated as Zope as it could be. :)

Always a pleasure,
--G.