[IPython-dev] The hybrid rant I just posted to c.l.py

Ville Vainio vivainio at kolumbus.fi
Tue Jun 29 10:09:59 EDT 2004


I said I would do some advocacy, so here goes:

Pythonic Nirvana - towards a true Object Oriented Environment
=============================================================

IPython (by Fernando Perez) recently gained the basic abilities of a
system shell (like bash); the changes are still in CVS and will be in
the next release. Actually, using it as a shell is rather
comfortable. This all made me think...

Why do we write simple scripts to do simple things? Why do we
serialize data to flat text files in order to process them? Everything
could be so much simpler and immensely more powerful if we operated on
data *directly* - we should operate on objects, lists, dictionaries,
functions, not files. We shouldn't have to write scripts to perform
our routines - and way too often we don't even try to automate them,
because moving the data between scripts is too troublesome, involving
file formats, argv parsing and other tedious stuff.

If we use a shell with full Python capabilities, we can introduce new
functionality for integrating our environments very easily. Consider a
project framework for managing a software project::

   >> import pf
   >> pf.projects
  
   --> [<project 'foo'>, <project 'bar'>]

   >> p = pf.projects[0]   # we want to work with project 'foo'
   >> headers = Set([f for f in p.files() if f.endswith(".h")])
   >> srcs = Set(p.files()) - headers
   # headers that at least one source file refers to
   >> found = Set([h for h in headers if find_any(h, srcs)])

   >> r = findradio("kazoo classics")[0] # some diversion needed
   >> music.play   # manipulates our music player
   >> music.vol 100

   # back to work...
   >> notfound = headers - found

   # who has made a header that is not used?
   >> jackasses = Set([p.author(file) for file in notfound])


Now we have the names of our victims in the 'jackasses' variable. We want
to make that variable accessible to all in our project team, in a
persistent store. We use tab completion to find the databases::

   >> export jackasses <TAB>

   Completions: "fooproject_db" "barproject_db" "private"

Note how tab completion for "export" notices that we are trying to
enter the second parameter. It knows that it is the database name, and
the completion mechanism (written for export) dynamically queries the
list of databases for which we are allowed to export data. After
seeing the choices we choose one of them::

   >> export jackasses "fooproject_db" name="slackers"
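An argument-aware completer like the one described above could be
sketched roughly as follows. This is only an illustration in plain
Python, not IPython's actual completion API: the function name
complete_export, the WRITABLE_DATABASES list and the namespace argument
are all made up for the example.

```python
# Hypothetical data: databases the current user may export to.
# A real completer would query this list dynamically.
WRITABLE_DATABASES = ["fooproject_db", "barproject_db", "private"]


def complete_export(line, namespace):
    """Return completions for a partially typed "export" command.

    line is the text typed so far; namespace maps variable names to
    objects and stands in for the interactive namespace.
    """
    words = line.split()
    # A trailing space means the user has started a new, empty word.
    if line.endswith(" ") or not words:
        words.append("")
    if words[0] != "export":
        return []
    arg_index = len(words) - 1        # 1 = variable name, 2 = database
    prefix = words[-1].strip('"')
    if arg_index == 1:
        candidates = sorted(namespace)
    elif arg_index == 2:
        candidates = WRITABLE_DATABASES
    else:
        candidates = []
    return [c for c in candidates if c.startswith(prefix)]


first = complete_export("export ja", {"jackasses": [], "p": None})
second = complete_export("export jackasses ", {"jackasses": []})
```

The point is only that the completer looks at which argument is being
typed and picks the source of candidates accordingly.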

Now the list of guys is accessible to everybody. We also compose a
volatile email to everybody::

   >> xemacs tmp  

   >> for m in [Mail(to=name, bodyfile="tmp") for name in jackasses]: m.send()

And rat the guys to the management::

  >> xemacs tmp2

  # mail contents
  #  The following guys have not been doing their jobs:
  #  @("\n".join(jackasses))

  # run the file through the EmPy template expansion system
  >> cont = open("tmp2").read().expand()

  >> Mail(to=p.boss, body=cont).send()

Notice how the jackasses variable was used inside the mail. We can also
schedule some extra hours for the guys to check if their headers are
needed, create a cron script to monitor that they have fixed the bugs,
etc.
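The idea of expanding a live variable inside the mail body can be tried
out today with nothing but the standard library. The sketch below uses
string.Template as a stand-in, not EmPy, so the placeholder is written
$names instead of an @(...) expression. The expand helper and the
sample names are made up for the example.

```python
from string import Template

# Hypothetical contents of the 'jackasses' variable.
jackasses = ["alice", "bob"]

# Stand-in for the contents of the "tmp2" file.
template_text = (
    "The following guys have not been doing their jobs:\n"
    "$names\n"
)


def expand(text, **variables):
    """Substitute $placeholders in text with the given variables."""
    return Template(text).substitute(**variables)


body = expand(template_text, names="\n".join(jackasses))
```

Unlike EmPy, string.Template only substitutes names and cannot evaluate
arbitrary expressions, so the join has to happen on the Python side.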

The boss might want to fire them at once::

  >> l = import "slackers"
  >> [e.fire() for e in Employee(l)]

Or the boss might want to do some more extensive firing to invigorate
the company::

  >> ent = MyEnterprise()
 
  Auth needed!
  Password: ******

  >> st = stats(ent.allemployees())
  >> avgperf = st.average_performance


  >> def dead_weight(emp):
  ..   if emp.performance() < avgperf: return True
  ..   return False

  >> ent.fire_if(dead_weight)
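For what it's worth, a fire_if that takes a predicate is only a few
lines. The following is a toy sketch: the Employee and Enterprise
classes, their attributes and the sample scores are all invented for
the example, and no authentication is modelled.

```python
from statistics import mean


class Employee:
    def __init__(self, name, score):
        self.name = name
        self._score = score

    def performance(self):
        return self._score


class Enterprise:
    def __init__(self, employees):
        self.employees = list(employees)

    def fire_if(self, predicate):
        """Remove every employee for whom predicate is true; return them."""
        fired = [e for e in self.employees if predicate(e)]
        self.employees = [e for e in self.employees if not predicate(e)]
        return fired


ent = Enterprise([Employee("a", 3), Employee("b", 5), Employee("c", 10)])

# Computed once, before anyone is fired, so the threshold does not move.
avgperf = mean(e.performance() for e in ent.employees)


def dead_weight(emp):
    return emp.performance() < avgperf


fired = ent.fire_if(dead_weight)
```

Passing the predicate as a function is what makes the one-liner in the
session possible: the selection rule lives in the caller's namespace.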

Typing all that might seem like a lot of work. However, a lot of it
will probably be implemented as functions aggregating the
functionality. Most of the lines here could be factored out to
a function (note that I didn't say script)::
 
   def unused_headers(prj):
     """ returns the header files that are not used in the project """
     ... implementation ...
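To give an idea of the size of such a function, here is one possible
sketch. It does not use the imaginary pf framework: instead of a
project object it takes a plain dictionary mapping file names to file
contents, and it only looks at #include lines in non-header files. Both
of those choices are assumptions made for the example.

```python
import os
import re

# Matches lines such as  #include "foo.h"  or  #include <foo.h>
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def unused_headers(files):
    """Return the header files that no source file includes.

    files maps file names to their text (a stand-in for a project).
    """
    headers = {name for name in files if name.endswith(".h")}
    sources = set(files) - headers
    included = set()
    for src in sources:
        for target in INCLUDE_RE.findall(files[src]):
            included.add(os.path.basename(target))
    return {h for h in headers if os.path.basename(h) not in included}


project = {
    "a.h": "",
    "b.h": "",
    "main.c": '#include "a.h"\nint main(void) { return 0; }\n',
}
unused = unused_headers(project)
```

Headers are matched by base name only, so two headers with the same
name in different directories would be confused with each other.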


With conventional scripting techniques, nobody would want to do all
this. With the pythonic approach, creating this kind of business
intelligence is a breeze, eliminating tons of tedious routine!

Obviously this all can be done in specific scripts, which start doing
the thing "from the start" - open database connections, ask the user
to select the project, etc. However, writing such scripts is a lot of
work. With the interactive, much more dynamic approach, pieces of
functionality can be implemented one by one, and they become usable
immediately.

I can imagine that for power users and "knowledge workers", this type
of system would yield immense power. The payback is also incremental -
the whole system grows and gets refactored, streamlining the
process. In the end most of it will probably be driven by a simple
GUI, especially the "fire below average employees" function, which
should not be run in a yearly cron job - only when needed. Various GUI
threads could be running in the same Python process, manipulating the
same namespace.

What needs to be done
---------------------

Not surprisingly, "we're not there yet".

- IPython needs some refactoring (the codebase isn't quite scalable
  and extensible enough yet). Fernando can use some help.

- A flexible persistence system needs to be integrated. ZODB, perhaps?

- Domain specific modules (like project / "employee management"
  systems) need to be implemented. This luckily mostly writes itself.

- Flexible but easy-to-use protocols need to be specified for
  documenting the functions, argument expansion, GUI interaction etc. A
  GUI module should display the documentation for the "current" function
  and possible arguments, so there's no need to press tab at all times.
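As for the persistence item in the list above, the export / import pair
from the session can be approximated with the standard library while a
real object database is being chosen. The sketch below uses shelve as a
stand-in, not ZODB. The function names, the temporary database path and
the sample data are invented for the example, and import_ has a
trailing underscore because import is a reserved word in Python.

```python
import os
import shelve
import tempfile


def export(obj, db_path, name):
    """Store obj under name in the shelve database at db_path."""
    with shelve.open(db_path) as db:
        db[name] = obj


def import_(db_path, name):
    """Load the object stored under name from the database at db_path."""
    with shelve.open(db_path) as db:
        return db[name]


# A throwaway location standing in for "fooproject_db".
db_path = os.path.join(tempfile.mkdtemp(), "fooproject_db")

export({"alice", "bob"}, db_path, name="slackers")
slackers = import_(db_path, "slackers")
```

shelve pickles whatever it is given and has no notion of users or
concurrent access, so it only covers the "persistent" part, not the
"accessible to everybody" part.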

Still, all in all, we're almost there. This has the same "feel" of
tight integration that I imagine the Lisp Machine guys were
experiencing, but Python is much more scripting-friendly and easier to
learn.

Ergo, world domination.




