Long: Python Is Really Middleware

Tim Daneliuk tundra at tundraware.com
Tue Jul 31 05:50:03 EDT 2001


              *** Python Is Really "Middleware" ***
                         Tim Daneliuk
              Copyright (c) 2001, TundraWare Inc.

             Feel free to pass it around however you like,
             so long as you don't change, delete, or add
             anything, and preserve the author and copyright
             info above. 
             

Introduction
-------------

There has been an ongoing discussion here (comp.lang.python) recently
about the relevance of Python, C, and many other languages as
"systems" languages.  This started as a brief (Ha!) response, and
evolved into the Epistle below.


A Little History 
(Or, "Why C Isn't Really The Most Common Systems Language Around")
------------------------------------------------------------------

I would venture to guess that there is at least as much, and maybe
*more* "systems code in Forth as in C.  Forth is the language of
choice for a great many real-time and embedded systems like
those used by NASA and the Aerospace/Military folks.  It is compact,
easily ported, very fast, and has a pretty interesting programming
paradigm (something of a cross between a high-level macro assembler,
an HP RPN calculator, and C itself).  I'm no Forth expert by any means,
but it is very widely used - it's just invisible, the way really
good code should be ;)  It even shows up in general systems programming.
IIRC, the Mac firmware bootstrap loader is written in Forth (?) and
I am certain that part of the FreeBSD bootloader is in Forth - I just
looked at the code a while ago.

For general OS development, 'C' is certainly the dominant language of
choice nowadays.  But I would point out that among the very biggest
bodies of systems code, one that has been around a *very* long time, is
IBM's OS/390 (or whatever they call it these days - MVS by any other
name), a big part of which is written in assembler.

As I've mentioned here before, the *biggest* (size of data * work arrival
rate * number of users...) TP systems in the world (Visa, Mastercard,
American Airlines, United Airlines...) run on TPF, a special transaction
OS which is pretty much *all* assembler. (And no, you couldn't get
a C based OS to do what TPF does even if you did have a couple hundred
million dollars to redo it, IMHO.)

But, for most (systems) things, 'C' is a fine choice.  I've done everything
from real-time kernel work to OS utilities in 'C'.  But... times are a' 
changin'.  Systems languages are great when you need fast, compact code.
But they crater big time when you have really big development efforts
(only Microsoft thinks an OS needs to be huge - on most scales, an OS
is a medium-sized effort compared, say, to an ERP system, a transaction
manager, or even something as mundane as Payroll and GL).  Systems
languages also tend to reek when you have lots of programmers involved,
because the language makes it so easy for programmers to shoot each other
in the foot (or, in some malevolent cases, The Back).

Before bandwidth and memory got cheap (256M = $39 today), we either had
to throw mainframes at problems or write in efficient languages like 'C'.
Well, we don't have to do it for most things any more.  That's why I like
the 'C'-Python combo.  Python is still a little rough around the edges
(as the many PEP discussions reveal), but it is a superb general purpose
programming language, very well suited to the vast majority of applications
and utility programming tasks.  It has a way to go in efficiency - we
really do need, and will soon see, native code compilers - but it makes
the *programmer* efficient in ways I've never seen in over 20 years of
doing this stuff.  I've been leading technology teams for the better part of
the last decade - this means I wrote very little code myself - and I'm
amazed at how fast I can sling *correct* Python even with my rusty old
middle-aged programming chops. (I forgot just how much FUN programming
can and should be. For an example of how the Elderly program, go have
a giggle at:  http://www.tundraware.com/Software/hb/HB.tar.gz ;)

Programming language debates have always been with us - you youngsters
missed doozies in the Assembler-FORTRAN and, later, the FORTRAN-COBOL
wars.  But what motivates these wars never goes away:

- How do we write more correct, maintainable, code *faster*?  

- How do we minimize applications dependency on particular 
  systems and infrastructure choices?

- How do we spend most of our time/energy/money on our applications
  logic and not the plumbing underneath?

- How do we make enough money so <gender/species of choice> will like us?



A Little Philosophy
-------------------

Cheap Energy was literally the fuel of the first Industrial Revolution.
Cheap Software is the fuel of this Economic Revolution.  For those of you
keeping score and who still read instead of watching Jerry Springer, the
current high-tech market behavior is remarkably similar to what took
place in the auto manufacturing consolidations at the beginning of the
last century.  He/She who wins the Software Battle wins this economy.

The winning technology will not emerge victorious because of paradigmatic 
elegance, but because of a proven track record in meeting the criteria above.
By that measure, C++ is a clear loser.  Perl does well, but not in
the "maintainability" arena.  Java started out very well, but is starting
to show its middle-aged spread because too much about using Java involves
"optional" additions (J2EE, JMS, EJB, EIEIO...) which have poor implementation
track records across vendors.  Moreover, too many of the initiatives in
the Java world start and end with the Web, even when the intended audience is
behind the server.   The really hard/interesting/commercially lucrative 
opportunities are not in/on the web.  They are in the back rooms of 
multi-billion dollar corporations who need way more than just a shiny
new interface.

(For those of you Neo-Marxists in the audience who think that "Information
Just Wants To Be FREE", I should mention that the survivability of
any technology has always been primarily a function of commercial adoption,
at least in the long run.  How many people write in COBOL today? (Many!)
How many program in Eiffel? Snobol? ML?  Oberon?  Haskell?  'Nuff said.
Economic Reality trumps Bad Collectivist Theory every time. Thank-You,
Adam Smith.) 

This, BTW, is something that Microsoft seems to be really getting,
and even then, only recently.  .NET may be a "distributed web infrastructure"
in drag, but make no mistake about it, .NET is Microsoft's Trojan Horse
to get themselves entrenched in the *back office* of the biggest technology
buyers in the commercial world.  The only things holding them back at the
moment are:  

- They have a really hard time admitting to themselves that
  the world was/is/will remain heterogeneous.  For .NET to really win the
  day, it has to run *equally* well on Win32/*nix/MVS/AS400/Tandem...
  I remember talking to a couple of their Enterprise computing people
  almost 10 years ago who really *got* this, but were frustrated by
  the Top Brass' lack of vision.  Had Gates and Ballmer even just Rented
  A Clue on this issue in the 80s, Microsoft *would* own the world -
  and then Gates could have fired Janet Reno!

- Their OS core is so UI-centric, it is doubtful that it can really
  make the jump to being a serious back-room, large-scale
  contender.  Yes, Win2K is a quantum leap forward for them, but then,
  its predecessors were such irredeemable garbage that a (big) step forward was
  long overdue.  Moreover, there's more to this picture than a clean kernel.
  There are issues of systems management, interoperability, recovery from
  failure, and such where Win32 is just plain *dreadful*. (OK, your enterprise
  servers just Blue Screened.  Every minute of downtime is $1M of lost
  income.  Lessee now, that's about $5-$10M per reboot for a typical
  Win32 barf event.)

But... they have a Big Bag Of Money, *really* smart people working there, 
and an unrelenting focus on their future.  (That's why the only way their 
competitors can stay in the game is to go whining to the DOJ about how 
"unfair" the real world is -  And I'm a lifelong *Unix* weenie!)


A Big Opportunity
-----------------

This is the real reason to stay on top of Python.  The vagaries of
the "Mine is bigger than yours"  battles between McNealy, Gates, and
Ellison will be with us until they aren't.  Python lives outside
this Billionaire Battlezone because none of the warring factions own the
technology and it plays nicely with them all.  It is precisely because
both Perl and Python have avoided choosing sides in these silly
ego competitions that they have survived and will survive.  Just watch, it's
already happening.  CIOs and CTOs are being asked to make significant
decisions about their companies' technology future.  Picking .NET
means choosing C# (yeah, yeah, Microsoft pitches the language neutrality
of the CLR, but where do you suppose they will innovate? In Perl,
Python, or C#?)  Picking Sun, and most every other Unix, means
picking Java. 

But there's something even more profound here that has dawned on me
as I've explored Python.  Every serious CIO/CTO you'll meet
(No, not the ones that don't shave yet and spent Everyone Else's
Money in the last 5 years) will tell you that infrastructure is
a necessary evil.  It is applications logic that runs their world.
The problem is that applications live a really long time - 20 years
is not unusual - but they have to change infrastructure like
underwear (every two or three years ;))  This means that in the lifetime
of one business application, there may be 5 or 6 major upheavals in
operating systems, disk farms, communications technologies, coffee
makers...  This just *kills* big IT operations in costs - not the
costs of buying all this crap, but the cost of keeping those old
apps running across all these changes.  And, no, rewriting the apps
is not a realistic option.  YOU try convincing the CIO at Schwab she
needs to rewrite their trading system because you have a "Really Cool
New OS Upgrade."

To respond to this problem, large IT shops have increasingly
turned to "Middleware" - a layer of code between the application
and the infrastructure to "insulate" the apps from the actual
syntax and semantics of the underlying system.  Examples of this
include (and these are all somewhat arguable) RPCs, JMS, and
ODBC.  Even the original 'sockets' implementation at Berkeley
had this in mind - the code has provision (never well/completely
implemented) for address families other than AF_UNIX and AF_INET.  In
principle, we should have seen AF_SNA...
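
The same address-family idea survives, visibly, in Python's own socket
module.  A minimal sketch of the "insulation" point (the helper below
is hypothetical, and AF_UNIX exists only on Unix-like systems):

    import socket

    def open_stream(endpoint):
        # Hide the address family from the caller - the middleware
        # idea in miniature.  A ("host", port) tuple means TCP/IP;
        # a plain string means a Unix domain socket path.
        if isinstance(endpoint, tuple):
            family = socket.AF_INET
        else:
            family = socket.AF_UNIX
        s = socket.socket(family, socket.SOCK_STREAM)
        s.connect(endpoint)
        return s

    # The application code is identical either way:
    #   open_stream(("localhost", 8080))    # TCP/IP
    #   open_stream("/tmp/app.sock")        # local Unix socket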

But, there's a rub.  When you buy Middleware, you are
(if you did your homework right) removing a large part of the
dependency the apps have on networking, OS, and so forth. 
BUT, you are marrying the Middleware vendor until that
application goes away.  Middleware vendors know this, so they
act like drug dealers: The first shot is cheap - thereafter, they'll
get their pound of flesh when you HAVE to have the latest version
of their product to run your General Ledger or Billing System
on the newest FuzzWuzzy 101 supercomputer (that *still* takes 5
minutes to boot Windows 3000).

Back In The Day, Middleware was the only way to crack the dependence
between apps and the underlying plumbing.  The cost of machinery
and network bandwidth prevented you from having a generic
Middleware layer between every OS service and the application.
You abstracted only those things (like networking, especially)
that you expected would change a lot, because it was simply too
computationally intensive (== $$$) to abstract everything in the system.

BUT, as I said before, things are changin'.  Machine and network
bandwidth are really cheap.  What we can now afford to do is 
abstract pretty much all the OS services that modern applications
need so that they never actually directly touch the "plumbing".
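
Python's standard library already does this in the small.  A minimal
sketch (modern Python assumed; file and directory names invented):

    import os
    import tempfile

    # The same few lines behave sensibly on Unix, Windows, or anything
    # else Python runs on; the interpreter and library absorb the
    # platform differences, so the application never touches them.
    workdir = tempfile.mkdtemp()                  # right place on any OS
    report = os.path.join(workdir, "report.txt")  # right separator on any OS
    f = open(report, "w")
    f.write("hello, platform-neutral world\n")
    f.close()
    print(os.path.getsize(report), "bytes at", report)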

<sidebar>

This is, more-or-less, *exactly* what .NET is all about (and for
that matter, J2EE).  Microsoft (and Sun) finally figured out that
the battle really isn't about operating systems or languages 
(those are the "Trojan Horse" to which I alluded above).  
The battle is about object and content interoperability.
.NET will "fix" these problems for you, and all you have to do is
go down to the Crossroads in Seattle (or Silicon Gulch) and 
sell your soul to, um, ... Microsoft (or Sun), *forever* because 
their middleware will *always* be in your shorts, no matter how 
many times you want to change them.

</sidebar>

But, techno-politics aside, the *best* place to abstract a
system is *IN THE PROGRAMMING LANGUAGE*.  Let interpreter,
compiler, and library writers worry about the faucets, pipes,
and fittings of a modern system, and present a unified model
to applications writers.  To do this, you have to have 
several things:

- A runtime environment that can handle this level of abstraction
  without killing performance or requiring ghastly amounts of
  computer to do the Usual Things Applications Do.
  Why? Because the first sell to The Boss is an economic one.
  If you have The Answer, but it needs 30 teraflops to run
  General Ledger - You lose, thanks for playing.

- A *small* core language with a *big, standardized library*
  that does most of the Usual Things Applications Do.
  Why? Because a big language is hard to port and hard to optimize.
  Libraries need to be standardized so that Applications Do The Things
  They Do in mostly the same way across different OS, networking, and
  distribution infrastructures.  You'll never have 100% system transparency,
  but you can get close.

- Stability in the core language.  Why?  Because if CIOs hate
  infrastructure churn, they REALLY hate applications churn.
  Remember, they want to focus on their golf games, not whether
  their programming language of choice returns floats or floors
  in a division. (I just *couldn't* resist ;)))

- A meaningful version/feature control system *within* the language.
  Why?  So applications can survive across language upgrades.


Both Sun and Microsoft will tell you that this is precisely what
(Java/J2EE, C#/.NET) give you.  BUT, that's cuz they want to
get married to you - forever - with no possible future divorce.

Now notice, ahem, cough, cough, Python does *exactly* these things:

- The level of programming abstraction is just about perfect
  for large applications, and the machinery needed to run it
  is quite reasonable.

- The core language is very small and (reasonably) stable.

- The "Batteries Included" modules approach of Python cover
  a huge part of What Applications Programs Do and this gets
  richer release by release.  This Standard Set Of Abstractions
  is the technical core of why Python is the ideal middleware.

- Constructs like "from __future__ import ..." give the programmer the
  ability to build armor into their programs in anticipation of
  language evolution.  It has been mentioned here, and I agree
  heartily, that a "requires ..." verb needs to be added to let the
  programmer stipulate the minimum versions of the Python language and
  libraries needed to run properly.
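
A minimal sketch of both halves (the "requires" verb does not exist -
it has only been proposed - so the version gate below is a hand-rolled
stand-in, and the version numbers are purely illustrative):

    from __future__ import division  # the float-vs-floor debate, armored

    import sys

    # Poor man's "requires" verb: refuse to run on an interpreter that
    # is too old.  (On an interpreter that predates the feature above,
    # the __future__ import itself fails first - cruder armor, but
    # armor all the same.)
    if sys.version_info < (2, 2):
        raise SystemExit("this application requires Python 2.2 or later")

    print(7 / 2)   # 3.5 wherever this program allows itself to run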


But Python has one very important commercial advantage: it's
not owned by a vendor.  So long as they don't mirror a vendor's
systems architecture too closely, CIOs can deploy applications
which have an excellent chance of surviving Yet Another Infrastructure
Upgrade.


Flies In The Ointment
---------------------

Well, it's never really that simple.  Even with something as powerful
as Python there are plenty of issues that make the Applications-
Infrastructure boundary forever problematic.  There are also
commercial concerns:

- If applications are deployed (in ANY language) using a distribution
  topology and architecture which closely mirrors the underlying
  infrastructure's topology, migration to other infrastructures
  is Really Painful (DAMHIKT).  Fer instance, if you write a Python
  application which depends on socket datagram broadcasting, and
  then have to accommodate a new (or old) network that does not support
  broadcast, umm, you have a problem (see the first sketch after this
  list).  SEMANTICS MATTER.

- Infrastructure vendors *always* offer stuff that is hard/impossible
  to do at higher layers of abstraction.  It is inevitable that real
  systems will have to reach into the guts now and then to get things
  done.  The issue here is whether the CIO and team are savvy enough
  to localize this sort of thing in places that can later be easily
  changed.  STRUCTURE MATTERS.

- The Pain And Suffering of infrastructure churn has caused more than
  one CIO to lose their job.  Knowing this, some technology leaders
  resist doing those upgrades "On their watch."  In this situation,
  applications are forced to cope with infrastructure deficiencies
  by coding around them.  This leads to REALLY ugly, hard-to-maintain
  systems.  CHANGING UNDERWEAR MATTERS.

- As a matter of living in the Real World, it is always preferable
  to Buy rather than Build applications.  The promise of both
  EJB and COM+ (and now J2EE and .NET) was that you'd be able to 
  buy at least major subsystems and plug them together with a lot 
  less effort than writing them from scratch.  This has turned out
  to be laughably not true, at least insofar as behind-the-server
  enterprise-class applications go.  (Who cares about the web,
  it's just a better VT100 ;)  IN-STOCK AT K-MART MATTERS.

  Now, there's not a lot of that kind of software being vended that
  is Python-based AFAIK, but there is a kind of Python 
  "Trojan Horse" here we can exploit to sneak in when they're
  not looking - it's OK, they'll thank us later.   There are two
  problems *every* large IT shop has.  These problems never go away
  and anyone who helps solve any part of them will be a Hero.
  These problems are: 1) Making the old applications talk to each
  other and to new media like Da Web and Mobile.  2) Normalizing
  data for exchange between and among old and new apps.  For you
  XML-weenies: XML, in-and-of-itself, cannot do this no matter
  how many times you say "semantic markup".  The data I'm talking
  about it domain specific, requires human intelligence to
  understand and encode in the first place. XML will help, but
  we need specialized loaders and tools to do all the heavy
  lifting.  Crack some part of these two problems - and Python
  is ideal for both, so long as the performance issues don't
  get in the way - and you'll live Happily Ever After - or until
  the CTO starts losing at golf and needs another "win".
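
Two quick sketches to make the above concrete.  First, the broadcast
trap: per the STRUCTURE MATTERS point, keep the one broadcast-dependent
call behind a tiny wrapper so that a network with no broadcast costs
you one class, not an application rewrite.  (All names below are
hypothetical, and payloads are byte strings.)

    import socket

    class Announcer:
        # The ONLY place in the application that knows the network
        # supports datagram broadcast.
        def __init__(self, port):
            self.port = port
            self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)

        def announce(self, payload):
            self.sock.sendto(payload, ("<broadcast>", self.port))

    class UnicastAnnouncer(Announcer):
        # Drop-in replacement for a network with no broadcast: same
        # interface, different plumbing underneath.
        def __init__(self, port, peers):
            self.port = port
            self.peers = peers
            self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

        def announce(self, payload):
            for host in self.peers:
                self.sock.sendto(payload, (host, self.port))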

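Second, the data-normalization problem.  A toy "specialized loader"
(record layout, region codes, and tag names all invented for
illustration): the domain knowledge baked into the little tables below
is exactly the part that XML, by itself, cannot supply.

    from xml.sax.saxutils import escape

    # Human-encoded domain knowledge - the real "heavy lifting":
    LEGACY_LAYOUT = [("cust_id", 0, 8), ("region", 8, 10), ("balance", 10, 19)]
    REGION_CODES = {"01": "NORTH-AMERICA", "02": "EMEA"}

    def normalize(record):
        # Turn one fixed-width mainframe record into a normalized,
        # tagged element suitable for exchange between apps.
        fields = {}
        for name, start, end in LEGACY_LAYOUT:
            fields[name] = record[start:end].strip()
        fields["region"] = REGION_CODES.get(fields["region"], "UNKNOWN")
        fields["balance"] = "%.2f" % (int(fields["balance"]) / 100.0)  # cents -> dollars
        inner = "".join(["<%s>%s</%s>" % (name, escape(fields[name]), name)
                         for name, start, end in LEGACY_LAYOUT])
        return "<customer>%s</customer>" % inner

    print(normalize("00012345" + "01" + "000012599"))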


A Little Warning
----------------

I worry about one thing and one thing only in the Python world.  It
is something I have witnessed in *every* new technology I've
ever seen.  Python is dangerously close to becoming a victim of 
Feeping Creaturism - not so much In Fact, but rather in this community's
mindset of forever wanting to fiddle one more feature into the 
language.  This will KILL Python commercially if it happens 
because too many variations on the core language theme make 
deployment and management of real systems too expensive.  

I, for one, would like to see a date picked for a permanent moratorium
on the language proper, after which, only bug fixes and new modules 
could be added.  After that date, language changes would have to be 
part of some new language ("Grail"?) which would owe no allegiance to
Python at all.


It is now 4.45am and I must crawl back into my coffin before dawn.
Flap, flap, flap .... Creaaaak, Slam.
------------------------------------------------------------------------------
Tim Daneliuk
tundra at tundraware.com


