Design: Idiom for classes and methods that are customizable by the user?

Dirk Bächle tshortik at gmx.de
Thu May 12 16:51:43 EDT 2016


Hi there,

I'm one of the SCons (http://www.scons.org) developers and am working on a partial redesign of our architecture. I'd like to come up 
with a decent proposal for extending and rewriting our code base, such that it's easier for a user to override and replace some 
parts of the functionality.


Concrete example
================

We're a build system and all the code is written in Python. Even more so, our build scripts (build description files) are Python 
scripts...allowing the user to have the full power of this beautiful language at his/her fingertips whenever it's needed.
The individual build steps are file-oriented, meaning that our "targets" are always either files or directories (simplified).
Each of these files and directories is tracked by an instance of the "Node" class, which can provide information such as:

   - Did this (source) file change?
   - Which are my children (meaning, on which other Nodes do I depend)?
   - Which commands, or "Actions", are required to build this Node (could be a Program or a Library, for example)?

We then have a "Taskmaster" which operates on the set of known "Node"s (think: "Strategy"). It loops over the Nodes and tries to 
find one that isn't up-to-date. If all its dependencies are met (=all its children are up-to-date), the Node is ready to get built 
(again, largely simplified).
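
The selection rule just described can be sketched roughly as follows. All names here ("Node" as a tiny stand-in, "up_to_date", 
"next_ready") are hypothetical and chosen for illustration only; this is not the actual SCons code, which handles far more cases.

```python
class Node(object):
    """Hypothetical stand-in for a Node tracking one file or directory."""
    __slots__ = ("name", "children", "up_to_date")

    def __init__(self, name, children=(), up_to_date=False):
        self.name = name
        self.children = list(children)
        self.up_to_date = up_to_date


def next_ready(nodes):
    """Return the first out-of-date Node whose children are all up-to-date."""
    for node in nodes:
        if not node.up_to_date and all(c.up_to_date for c in node.children):
            return node
    return None


# hello.c -> hello.o -> hello, with only the source file being current
src = Node("hello.c", up_to_date=True)
obj = Node("hello.o", children=[src])
prog = Node("hello", children=[obj])
nodes = [prog, obj, src]

order = []
candidate = next_ready(nodes)
while candidate is not None:
    order.append(candidate.name)
    candidate.up_to_date = True  # pretend the build step succeeded
    candidate = next_ready(nodes)
```

In this toy setup the object file is picked before the program, because the program's only child is not up-to-date on the first pass.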

Every now and then, users are unhappy with the way this Taskmaster proceeds. One peculiar detail is that our 
"default" Taskmaster always deletes the old target file before re-building it...and in special situations this may be unwanted.
So it would be good to offer a set of Taskmasters from which the user can choose. Even better would be if the user could add 
a Taskmaster of his own (probably derived from the "original") and activate it...from a build description file (=Python script), so 
without touching the core sources.

In short, the user should be able to slip a new Taskmaster under the covers of SCons...and it should behave as if it 
had been built in.


My current approach
===================

I'm currently following the "Factory" pattern (more or less) as I know it from C++ and similar languages. In the Taskmaster module I 
have a dictionary:


# The built-in Taskmasters, keyed by the name used to select them
types = {'default': DefaultTaskmaster,
         'noclean': NocleanTaskmaster}

def create(key, targets, top, node):
    """ Simple factory for creating an actual Taskmaster,
        based on the given key. Unknown keys fall back to
        the DefaultTaskmaster.
    """
    return types.get(key, DefaultTaskmaster)(targets, top, node)

def add(key, taskm_class):
    """ Register the given Taskmaster class, if its key doesn't
        exist yet.
    """
    if key not in types:
        types[key] = taskm_class


with two supporting functions. I'm leaving out all the boilerplate stuff here, like parsing command-line options for the 
to-be-instantiated Taskmaster type and so on.
But this is the core idea so far, simple and certainly "pythonic" to some degree...
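
To illustrate how this would look from the user's side, here is a self-contained sketch. The two built-in classes are reduced to 
placeholders, and the "prepare" hook as well as the "KeepTargetTaskmaster" class are made-up names for this example, not existing 
SCons API.

```python
class DefaultTaskmaster(object):
    """Placeholder for the built-in class; the real one does far more."""

    def __init__(self, targets, top, node):
        self.targets = targets
        self.top = top
        self.node = node

    def prepare(self, target):  # hypothetical hook, for illustration only
        return ["remove " + target, "build " + target]


class NocleanTaskmaster(DefaultTaskmaster):
    """Placeholder for the second built-in variant."""

    def prepare(self, target):
        return ["build " + target]


# Registry and the two supporting functions, as in the module above
types = {'default': DefaultTaskmaster,
         'noclean': NocleanTaskmaster}


def create(key, targets, top, node):
    return types.get(key, DefaultTaskmaster)(targets, top, node)


def add(key, taskm_class):
    if key not in types:
        types[key] = taskm_class


# --- what a user could write in a build description file ---
class KeepTargetTaskmaster(DefaultTaskmaster):
    """User-defined variant that never deletes the old target."""

    def prepare(self, target):
        return ["build " + target]


add('keep', KeepTargetTaskmaster)
add('default', KeepTargetTaskmaster)  # ignored, the key already exists

tm = create('keep', ['hello'], '.', None)
fallback = create('no-such-key', ['hello'], '.', None)
```

Note that with this "add", a user cannot replace a built-in key such as 'default'; whether that is desirable is a design choice.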


My questions
============

- Is this a good approach that I could use for other parts of the architecture as well, e.g. the Node class mentioned above?
- Are there other options that have stood the test of time under operational conditions? If yes, I'd be interested to get links and 
pointers to the corresponding projects (I'm prepared to read stuff and do further investigations on my own).


When talking about "other options" I'm mainly thinking in the direction of "plugins" (Yapsy?)...not sure whether this idea could 
really fly.


Some criteria
=============

- The approach should work under Python 2.7.x as well as Python 3.x. We're currently rewriting our code (which runs only under 
Python 2.7 so far) into a common codebase, using "futurize".
- It should be applicable to classes and simple methods within arbitrary modules. One specialized scheme for classes and methods 
each would be okay though.
- Heavy-weight mechanisms are a no-go...in large build projects we have to initialize 800k instances of the Node class, and more.
- Some of our classes, especially Node, use the "__slots__" mechanism in order to save memory (and yes, it's necessary! ;) ). So, 
certain techniques (=metaclasses?) are probably not compatible with that...
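
One detail of the "__slots__" criterion that matters once users derive their own classes: a subclass only stays free of a 
per-instance "__dict__" if it declares "__slots__" itself (an empty tuple is enough). A small sketch, with hypothetical class names 
and attributes:

```python
class Node(object):
    """Stand-in for a slotted base class (attribute names are made up)."""
    __slots__ = ("path", "state")

    def __init__(self, path):
        self.path = path
        self.state = None


class UserNode(Node):
    """User subclass that keeps the memory savings."""
    __slots__ = ()  # no new attributes, and no per-instance __dict__

    def describe(self):
        return "node:" + self.path


class LeakyNode(Node):
    """User subclass without __slots__: instances get a __dict__ again."""
    pass


slim = UserNode("a.c")
leaky = LeakyNode("b.c")

slim_has_dict = hasattr(slim, "__dict__")    # False
leaky_has_dict = hasattr(leaky, "__dict__")  # True

leaky.anything = 1  # works, stored in the instance __dict__
try:
    slim.anything = 1
    slim_rejected = False
except AttributeError:
    slim_rejected = True
```

So any scheme that lets users plug in their own Node class would have to document (or check) this, otherwise the memory savings are 
silently lost for derived classes.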


Your advice and input on these thoughts are welcome. If required, I can provide more details in a certain area...but I didn't want 
to make this initial email too long.

Thanks a lot in advance for your answers and best regards,

Dirk Baechle


