Portrait of a "real life" __metaclass__

Mark Shroyer usenet-mail at markshroyer.com
Sat Nov 10 06:34:49 EST 2007


On 2007-11-10, Jonathan Gardner <jgardner.jonathangardner.net at gmail.com> wrote:
> On Nov 9, 7:12 pm, Mark Shroyer <usenet-m... at markshroyer.com> wrote:
>> I guess this sort of falls under the "shameless plug" category, but
>> here it is: Recently I used a custom metaclass in a Python program
>> I've been working on, and I ended up doing a sort of write-up on it,
>> as an example of what a "real life" __metaclass__ might do for those
>> who may never have seen such a thing themselves.
>>
>> http://markshroyer.com/blog/2007/11/09/tilting-at-metaclass-windmills/
>>
>> So what's the verdict?  Incorrect?  Missed the point completely?
>> Needs to get his head checked?  I'd love to hear what
>> comp.lang.python has to (anthropomorphically) say about it.
>>
>
> Kinda wordy.

Yeah, my fingers sort of latch onto the keyboard sometimes and just
won't let go.  Sorry about that ;)

> Let me see if I got the point:
>
> - You already had a bunch of classes that did age matching on date
> time objects.
>
> - You were building a class that matched emails.
>
> - You wanted to reuse the code for age matching to do email matching
> (based on the message's age)
>
> - So you wrote a metaclass that replaced the match() method with a
> proxy that would either dispatch to the old match() method (if it was
> a datetime object) or dispatch to the new match() method (which
> matched based on the message's date.)

Close, but not quite.  The proxy *always* dispatches to
AgeSpec.match(), but if the result of that method is an AgeSpec
itself, then the proxy wraps the result back up in a Matcher, which
works out rather conveniently for the rest of the application.
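
A minimal sketch of that proxy behavior, not the code from the
write-up.  It assumes a simplified AgeSpec covering one calendar
month; the names ProxyMeta, AgeMatcher, proxied and adapt are
hypothetical, and it uses Python 3's metaclass= syntax where
2007-era code would set a __metaclass__ attribute instead:

```python
import datetime


class AgeSpec:
    """Hypothetical stand-in: matches datetimes within one calendar month."""

    def __init__(self, year, month):
        self.year, self.month = year, month

    def match(self, when):
        return (when.year, when.month) == (self.year, self.month)

    def previous(self):
        # Returns another AgeSpec, which the proxy below re-wraps.
        if self.month == 1:
            return AgeSpec(self.year - 1, 12)
        return AgeSpec(self.year, self.month - 1)


class ProxyMeta(type):
    """Adds a forwarding method for every name listed in `proxied`."""

    def __new__(mcs, name, bases, namespace):
        for method_name in namespace.get("proxied", ()):
            namespace[method_name] = mcs._make_proxy(method_name)
        return super().__new__(mcs, name, bases, namespace)

    @staticmethod
    def _make_proxy(method_name):
        def proxy(self, *args):
            # Translate arguments (here: message -> datetime), then
            # always dispatch to the wrapped AgeSpec.
            args = [self.adapt(arg) for arg in args]
            result = getattr(self.spec, method_name)(*args)
            # Only the return value gets special treatment: an AgeSpec
            # result is wrapped back up in the matcher class.
            if isinstance(result, AgeSpec):
                return type(self)(result)
            return result

        proxy.__name__ = method_name
        return proxy


class AgeMatcher(metaclass=ProxyMeta):
    proxied = ("match", "previous")

    def __init__(self, spec):
        self.spec = spec

    @staticmethod
    def adapt(message):
        # Assumption: a "message" is a dict carrying a datetime.
        return message["date"]


matcher = AgeMatcher(AgeSpec(2007, 11))
older = matcher.previous()  # an AgeMatcher wrapping AgeSpec(2007, 10)
```

Because previous() hands back an AgeMatcher rather than a bare
AgeSpec, calling code can keep treating every result as a matcher.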

> Sounds Java-y, if that's even a word.

Yeah, you pretty much nailed my original background right there.  On
the other hand, I've also done a lot of work in Perl and C, and
pride myself on striving not to resort to OO patterns where they
aren't really useful.  So let me try to defend my reasoning, if it
is in fact defensible...

> Too many classes, not enough functions. You can tell you are doing
> Java in Python when you feel the urge to give everything a name
> that is a noun, even if it is completely based on a verb, such as
> "matcher". My opinion is that if you want to talk about doing
> something in Python such as matching, start writing functions
> that match and forget the classes. Classes are for real nouns,
> nouns that can do several distinct things.
>
> What would I have done? I wouldn't have had an age matching class. I
> would have had a function that, given the datetime and a range
> specification, would return true or false. Then I would've written
> another function for matching emails. Again, it takes a specification
> and the email and returns true or false.

There isn't much difference between

  match_calendar_month(2007, 11, message)

and

  m = CalendarMonthMatcher(2007, 11)
  m.match(message)

so of course you're right that, were that all I'm doing with these
matchers, it would be a waste to implement them as classes.  But
take for example two of my app's mailbox actions -- these aren't
their real names, but for clarity let's call them ArchiveByMonth and
SaveAttachmentsByMonth.  The former moves messages from previous
months into an archival mbox file ./archives/YYYY/MM.mbox
corresponding to each message's month, and the latter saves message
attachments into a directory ./attachments/YYYY/MM/.  Each of these
actions would work by using either match_calendar_month() or
CalendarMonthMatcher().match() to perform its action on all messages
within a given month; then it iterates through previous months and
repeats until there are no more messages left to be processed.

In my object-oriented implementation, this iteration is performed by
calling m.previous() on the current matcher, much like the
simplified example in my write-up.  Without taking the OO approach,
on the other hand, both types of actions would need to compute the
previous month themselves; sure, that's not an especially burdensome
task, but it really seems like the wrong place for that code to
reside.  (And if you tackle this by writing another method to return
the requisite (year, month) tuple, and apply that method alongside
wherever match_calendar_month() is used...  well, at that point
you're really just doing object-oriented code without the "class"
keyword.)
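
To make the previous() iteration concrete, here is a rough sketch
rather than the application's real code.  The message format (a dict
with a "date" key), the label() method and the group_by_period()
helper are all assumptions made for illustration:

```python
import datetime


class CalendarMonthMatcher:
    """Hypothetical matcher for messages dated within one calendar month."""

    def __init__(self, year, month):
        self.year, self.month = year, month

    def match(self, message):
        date = message["date"]
        return (date.year, date.month) == (self.year, self.month)

    def previous(self):
        """Return the matcher for the month before this one."""
        if self.month == 1:
            return CalendarMonthMatcher(self.year - 1, 12)
        return CalendarMonthMatcher(self.year, self.month - 1)

    def label(self):
        # Mirrors the YYYY/MM layout of the archive paths.
        return "%04d/%02d" % (self.year, self.month)


def group_by_period(matcher, messages):
    """Walk backwards via matcher.previous() until every message is placed.

    Assumes no message is dated later than the starting period;
    otherwise the loop would never finish.
    """
    remaining = list(messages)
    groups = {}
    while remaining:
        matched = [m for m in remaining if matcher.match(m)]
        if matched:
            groups[matcher.label()] = matched
        remaining = [m for m in remaining if not matcher.match(m)]
        matcher = matcher.previous()
    return groups
```

The action never computes "the month before" itself; it only asks the
matcher, so any object offering match(), previous() and label() could
be handed to the same loop.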

Furthermore, suppose I want to save attachments by week instead of
month: I could then hand the SaveAttachmentsByPeriod action a
WeekMatcher instead of a MonthMatcher, and the action, using the
matcher's common interface, does the job just as expected.  (This is
an actual configuration file option in the application; the nice
thing about taking an OO approach to this app is that there's a very
straightforward mapping between the configuration file syntax and
the actual implementation.)

It could be that I'm still "thinking in Java," as you rather
accurately put it, but here the object-oriented approach seems
genuinely superior -- cleaner and, well, with better encapsulated
functionality, to use the buzzword.

> If I really wanted to pass around the specifications as objects, I
> would do what the re module does: have one generic object for all the
> different kinds of age matching possible, and one generic object for
> all the email objects possible. These would be called,
> "AgeMatchSpecification", etc... These are noun-y things. Here,
> however, they are really a way of keeping your data organized so you
> can tell that that particular dict over there is an
> AgeMatchSpecification and that one is an EmailMatchSpecification. And
> remember, the specifications don't do the matching--they merely tell
> the match function what it is you wanted matched.

Oddly enough, the re module was sort of my inspiration here:

  import re
  my_regex = re.compile("abc")
  my_regex.match("some string")

(Sure, re.compile() is a factory function that produces SRE_Pattern
instances rather than the name of an actual class, but it's still
used in much the same way.)

> Now, part of the email match specification would probably include bits
> of the date match specification, because you'd want to match the
> various dates attached to an email. That's really not rocket science
> though.
>
> There wouldn't be any need to integrate the classes anymore if I did
> it that way. Plus, I wouldn't have to remember a bunch of class names.
> I'd just have to remember the various parameters to the match
> specification for age matching and a different set of parameters for
> the email matching.

You're sort of missing the bigger picture of this application,
although that's not your fault at all, as I never fully described
it to begin with.  The essence of this project is that I have a
family of mailbox actions (delete, copy, archive to mailbox, archive
by time period, ...) and a family of email matching rules (match
read messages, match messages with attachments, match messages of a
certain size, match messages by date, ...) of which matching by date
is only one subtype -- but there are even many different ways to
match by date (match by number of days old, match by specific
calendar month, match by specific calendar month *or older*, match
by day of the week, ...); not to mention arbitrary Boolean
combinations of other matching rules (and, or, not).

My goal is to create a highly configurable and extensible app, in
which the user can mix and match different "action" and "matcher"
instances to the highest degree possible.  And using class
definitions really facilitates that, to my Java-poisoned mind.  For
example, if the user writes in the config file

  actions = (
    (
      # Save attachments from read messages at least 10 days old
      mailbox => (
        path => '/path/to/maildir',
        type => 'maildir',
      ),
      match => (
        type => And,
        p => (
          type => MarkedRead,
          state => True,
        ),
        q => (
          type => DaysOld,
          days => 10,
        ),
      ),
      action => (
        type => SaveAttachments,
        destination => '/some/directory/',
      ),
    ),
  )

(can you tell I've been working with Lighttpd lately?)

then my app can easily read in this dictionary and map the
user-specified actions directly into Matcher and Action instances;
and this without me having to write a bunch of code to process
boolean logic, matching types, action parameters, and so on into a
program flow that has a structure needlessly divergent from the
configuration file syntax.  It also means that, should a user
augment the program with his own Matcher or Action implementation,
as I intend to make it easy to do, then those implementations can be
used straightaway without even touching the code for the
configuration file reader.
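
A minimal sketch of that mapping, assuming the configuration has
already been parsed into nested Python dicts.  The MATCHER_TYPES
registry, the build_matcher() function and the message format (a dict
with "read" and "date" keys) are hypothetical, not taken from the
application:

```python
import datetime


class MarkedRead:
    def __init__(self, state):
        self.state = state

    def match(self, message):
        return message["read"] == self.state


class DaysOld:
    def __init__(self, days):
        self.days = days

    def match(self, message):
        age = datetime.date.today() - message["date"]
        return age.days >= self.days


class And:
    def __init__(self, p, q):
        self.p, self.q = p, q

    def match(self, message):
        return self.p.match(message) and self.q.match(message)


# Registry from the config file's "type" names to matcher classes.
MATCHER_TYPES = {cls.__name__: cls for cls in (MarkedRead, DaysOld, And)}


def build_matcher(config):
    """Recursively turn a nested config dict into a matcher instance."""
    params = dict(config)
    cls = MATCHER_TYPES[params.pop("type")]
    kwargs = {}
    for key, value in params.items():
        # A nested dict with its own "type" is a sub-matcher.
        if isinstance(value, dict) and "type" in value:
            value = build_matcher(value)
        kwargs[key] = value
    return cls(**kwargs)


config = {
    "type": "And",
    "p": {"type": "MarkedRead", "state": True},
    "q": {"type": "DaysOld", "days": 10},
}
matcher = build_matcher(config)
```

In this sketch a user-supplied matcher only has to be added to
MATCHER_TYPES; build_matcher() itself stays unchanged.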

As for the decision to use a metaclass proxy to my AgeSpec classes,
I'm fully prepared to admit wrongdoing there.  But I still believe
that an object-oriented design is the best approach to this problem,
at least considering my design goals.  Or am I *still* missing the
point?

Thanks for your input!
Mark

-- 
Mark Shroyer
http://markshroyer.com/


