[melbourne-pug] Web App Architecture

Sat Mar 17 10:12:20 CET 2012

Hi Adam,

We've given this quite a bit of thought, and seem to have come to the same conclusions as you and Javier. So yeah, we think it's generally a good idea for fairly complex apps, but not so relevant for content sites.

Our original web app (a hosted email marketing platform similar to Campaign Monitor, MailChimp, etc., but aimed at organisations with more complex requirements) was based on CherryPy, with Mako for templates and SQLAlchemy as the ORM. We didn't want to go down the monolithic framework path, primarily because I'd worked on a reasonably complex project in Rails a few years earlier and found that I spent all my time trying to work around the restrictions it imposed on me (I'm sure it's much better now; this was pre-1.0 Rails in 2005 or thereabouts).

About two years ago we re-wrote everything from the ground up, switching from EC2 to a dedicated VMware cluster, from MySQL to Postgres, and from server-side templating in the CherryPy web app to an app architected essentially as you describe, with a few separate Tornado processes providing JSON APIs called from client-side JavaScript, and another Tornado process handling things like initial login and real-time notifications (long polling).

This had a few practical advantages for us:
• We were able to unify all data access into a single set of HTTP APIs, used by our web app as well as by customers' applications;
• Navigation/interaction latency is almost eliminated because we're able to intelligently retrieve most data before the customer requests it;
• We're able to support things like live editing and instant display of changes (records created, deleted, modified) among all connected clients without much extra work;
• Since we were already using proper bookmarkable links within the application (originally via jquery.address.js, more recently via HTML5 history navigation), it was easy to add more "view state" to the URL, so in fact bookmarkability and browser history navigation *improved* when we switched to an all-AJAX app;
• Separating things into one process per logical unit of functionality made performance tuning, scaling and code deployments much easier.
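
The long-polling notifications and instant display of changes described above can be sketched roughly as follows. This is a minimal illustration using Python's standard-library asyncio rather than Tornado itself, and the names ChangeFeed, wait_for_change and publish are hypothetical, not taken from the real application. Each connected client parks a request until something changes; a single publish call then answers every waiting client at once.

```python
import asyncio


class ChangeFeed:
    """Holds pending long-poll requests and wakes them all when a record changes."""

    def __init__(self):
        self._waiters = set()

    async def wait_for_change(self, timeout=30.0):
        """Park one client's long poll until a change arrives or the poll times out."""
        future = asyncio.get_running_loop().create_future()
        self._waiters.add(future)
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return None  # nothing happened; the client is expected to poll again
        finally:
            self._waiters.discard(future)

    def publish(self, change):
        """Deliver one change to every client that is currently waiting."""
        for future in list(self._waiters):
            if not future.done():
                future.set_result(change)


async def demo():
    feed = ChangeFeed()
    # Two hypothetical browser sessions, each with an open long poll.
    polls = [asyncio.ensure_future(feed.wait_for_change(timeout=1.0)) for _ in range(2)]
    await asyncio.sleep(0)  # let both polls register before publishing
    feed.publish({"event": "modified", "record_id": 42})
    return await asyncio.gather(*polls)
```

Running asyncio.run(demo()) hands the same change record to both waiting polls. A poll that times out returns None, which the client would treat as a cue to reconnect.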

However, we've also encountered the following general challenges:
• Browser support: we gave up on supporting any IE version prior to 8, not because of non-compliance with standards but because their JavaScript engines are too slow to provide decent performance with the amount of data we're displaying;
• Security: loading everything through AJAX means you need to pass tokens in every single request to avoid XSRF issues (http://stackoverflow.com/a/6075862);
• Staff skill-sets: going down this path means you're doing serious application development in JS, not just adding nifty-but-minor features with a few jQuery calls, so either your front-end developers will have to take on the whole client-side application, or your back-end developers will need to learn JavaScript (not such a big deal now with Node etc);
• Blocking database queries: not really an issue with CherryPy since each request runs in a separate thread, but Tornado handles all requests on a single thread, so a slow query holds up every other request until it returns. You really need to either keep all queries well under a second, or set up a thread pool to insulate the Tornado main thread from blocking queries.
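
As a rough illustration of the thread-pool option in the last point, the sketch below pushes a blocking call onto a worker pool so the event loop stays free for other requests. It uses standard-library asyncio as a stand-in for Tornado's IOLoop; slow_query, handle_request, DB_POOL and the pool size are all hypothetical, and time.sleep stands in for a real blocking database driver call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size; in practice it would be matched to the DB connection pool.
DB_POOL = ThreadPoolExecutor(max_workers=4)


def slow_query(seconds):
    """Stand-in for a blocking database call."""
    time.sleep(seconds)
    return "rows after %.1fs" % seconds


async def handle_request(seconds):
    """Request handler that awaits the query without blocking the event loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(DB_POOL, slow_query, seconds)


async def demo():
    ticks = 0

    async def heartbeat():
        # Represents other requests; it only advances while the loop is free.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.ensure_future(heartbeat())
    result = await handle_request(0.2)
    beat.cancel()
    return result, ticks
```

If slow_query were called directly inside handle_request, the heartbeat would not tick at all during the query. With the executor it keeps ticking, which is the insulation the thread pool is meant to provide.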

To give you some context for your second and third questions, our application has six developers: one designer who does HTML, CSS, and presentational JS; a back-end developer who does Python and SQL; two who do everything from SQL to CSS; a customer support guy who also does Python and SQL in support of specific customer requirements; and an infrastructure guy who we share 50% with another company. We support a few hundred users total, with probably 30-40 people logged in at any given time. Many of our users spend the majority of their working days inside our application, so ease of use, responsiveness and reliability are really important to us. The other side of our application is public-facing, and runs things like "view online" versions of emails, some basic web forms/landing pages, unsubscribe pages, signup pages, and tracks clicks and opens. That side also uses Tornado and Postgres (as well as a bit of CherryPy), and handles a couple of million requests per day.

If I were to re-write our system from scratch using the tools and techniques available now, as opposed to two years ago, I'd still use a few Tornado processes on the server side, but I'd structure communications between client and server very differently. Our current approach uses a "REST" (in the sense of "what many people call REST but actually isn't at all") API for everything. Instead, I'd create a true REST API for the application's core resources, and then do all of the RPC stuff (which doesn't map neatly to REST) via web sockets. On the client side, I'd still use underscore.js/jqote2-style templates, but I'd build the application on backbone.js rather than rolling our own framework. I'd also make much more use of HTML5 storage for longer-term caching.
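
To make the REST/RPC split a little more concrete, here is a minimal sketch of the RPC half: a dispatcher for JSON messages of the kind that might arrive over a web socket. The message shape ("id", "method", "params"), the rpc decorator and the campaign.send_test action are all assumptions made up for illustration. The socket transport itself is left out, so handle_message just maps one incoming string to one reply string.

```python
import json

RPC_METHODS = {}


def rpc(name):
    """Register a function under the name clients put in the "method" field."""
    def register(func):
        RPC_METHODS[name] = func
        return func
    return register


@rpc("campaign.send_test")  # hypothetical action: an operation, not a resource
def send_test(campaign_id, to):
    return {"campaign_id": campaign_id, "queued_for": to}


def handle_message(raw):
    """Decode one JSON message from the socket, run the call, encode the reply."""
    request = json.loads(raw)
    func = RPC_METHODS.get(request.get("method"))
    if func is None:
        reply = {"id": request.get("id"), "error": "unknown method"}
    else:
        reply = {"id": request.get("id"), "result": func(**request.get("params", {}))}
    return json.dumps(reply)
```

Core resources such as a campaign or a contact would stay behind plain HTTP verbs and URLs, while verb-like actions such as "send a test" go through the dispatcher, with the "id" field letting the client match replies to requests.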

I'm not going to put myself out there as a source of best practice, but I do have an extensive and painful library of suboptimal practice which I'd be happy to share…

Cheers,
Ben

On 17/03/2012, at 18:39, Adam MacLeod wrote:

> Hello Happy People of MPUG,
> 
> I have recently been hearing suggestions that rendering HTML on the
> server side is becoming less relevant as JavaScript MVC (and other)
> frameworks are developing.
> 
> Personally I feel that having all of the client rendering logic live
> inside the browser as a client to an API based server application
> seems like a decent way to go. I'm sure no one needs to hear this but
> I believe it should allow the front end developers to write client
> interaction code independently and the back end developers to focus on
> data, security, speed and scalability.
> 
> As I see it there will be a MC (model-controller) type back-end that
> supplies JSON to a model in a JavaScript MVC framework like
> backbone.js, ember.js or what have you. I feel that the existing
> Django/RoR type mega-framework is simply too heavy for this new MC
> backend. Perhaps one of the micro-frameworks like Flask, Pyramid or
> similar might be suitable.
> 
> I am extremely interested in hearing people's opinions on the following:
> 
> - Whether this is/will ever be a good idea.
> - The size/scope of applications that this is suitable for (personal
> website? small-time start up? buzzword-driven-development? Twitter?)
> - The current best practices regarding this type of architecture.
> 
> I guess the real meat of my message comes down to the following:
> 
> - Which back-end would you use in Python for an application written
> in this style?
> 
> Hoping to hear everyone chip in a few cents to this discussion :)
> 
> Cheers!
> Adam MacLeod