Ruby Impressions

Sun Jan 13 00:11:39 EST 2002

Adam Spitz:
>When I look at a piece of code like this...
>
>class Person
>  def initialize(name, age, gender)
>    @name, @age, @gender = name, age, gender
>  end
>end
>
>...my brain parses it, thinks, "Oh, it's just initializing some
>attributes," and then thinks of the whole chunk as one "unit".

Sure, and my brain sees the same in

class Person:
  def __init__(self, name, age, gender):
    self.name, self.age, self.gender = name, age, gender

or (more commonly written as)

class Person:
  def __init__(self, name, age, gender):
    self.name = name
    self.age = age
    self.gender = gender

and in similar constructs in C++ and Perl and ...

>Call it
>an idiom, or a pattern, or whatever you want - point is, I see the
>whole initialize() method as one thing, and I want to give a name to
>that thing.

You can do that in Python too.  For clarity, the code below doesn't
handle keyword arguments.  Support for those is possible, but a bit
tedious and error-prone to write on the fly:

class SpitzInitialize:
  def __init__(self, *args):
    if len(args) != len(self._allowed_spitz_names):
      raise TypeError("Expecting %d values, got %d" %
                      (len(self._allowed_spitz_names), len(args)))
    for k, v in zip(self._allowed_spitz_names, args):
      setattr(self, k, v)
    self.initialize()
  def initialize(self):
    pass

This can be used in your code as a mixin, where inheritance is used for
code reuse.

class Person(SpitzInitialize):
  _allowed_spitz_names = ["name", "age", "gender"]
  def initialize(self):
    ...
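
As a rough sketch of the keyword-argument support mentioned above (the
class name KeywordSpitzInitialize is made up for this example, and it
makes no attempt at default values), something like this would do:

```python
class KeywordSpitzInitialize:
  # Subclasses list their attribute names, in positional order.
  _allowed_spitz_names = []

  def __init__(self, *args, **kwargs):
    names = self._allowed_spitz_names
    if len(args) > len(names):
      raise TypeError("Expecting at most %d values, got %d" %
                      (len(names), len(args)))
    # Positional values fill the leading names.
    values = dict(zip(names, args))
    for k, v in kwargs.items():
      if k not in names:
        raise TypeError("Unexpected keyword argument %r" % (k,))
      if k in values:
        raise TypeError("Got multiple values for %r" % (k,))
      values[k] = v
    missing = [k for k in names if k not in values]
    if missing:
      raise TypeError("Missing values for: %s" % ", ".join(missing))
    for k in names:
      setattr(self, k, values[k])
    self.initialize()

  def initialize(self):
    pass

class Person(KeywordSpitzInitialize):
  _allowed_spitz_names = ["name", "age", "gender"]

person = Person("Andrew", gender="male", age=31)
```

That is already three separate error checks for what started out as a
one-line assignment.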

Another way is to produce the __init__ function you want, like this

from __future__ import nested_scopes  # needed in Python 2.1; the default in 2.2

def init_attributes(*argnames):
  def __init__(self, *args):
    if len(args) != len(argnames):
      raise TypeError("Expecting %d values, got %d" %
                      (len(argnames), len(args)))
    for k, v in zip(argnames, args):
      setattr(self, k, v)
    if hasattr(self, "initialize"):
      self.initialize()
  return __init__

class Person:
  __init__ = init_attributes("name", "age", "gender")

This actually works.  Here's a test case.

>>> person = Person("Andrew", "31", "male")
>>> person.name
'Andrew'
>>> person.age
'31'
>>> person.gender
'male'
>>>

You could also parameterize the init_attributes function to call any
function you may want, instead of 'initialize'.
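
A minimal sketch of that parameterization, assuming the hook is named
by a string passed as the first argument (the names
init_attributes_with_hook, setup and is_adult are invented for this
example):

```python
def init_attributes_with_hook(hook_name, *argnames):
  def __init__(self, *args):
    if len(args) != len(argnames):
      raise TypeError("Expecting %d values, got %d" %
                      (len(argnames), len(args)))
    for k, v in zip(argnames, args):
      setattr(self, k, v)
    # Call the named method, if the class defines one.
    hook = getattr(self, hook_name, None)
    if hook is not None:
      hook()
  return __init__

class Person:
  __init__ = init_attributes_with_hook("setup", "name", "age", "gender")
  def setup(self):
    self.is_adult = self.age >= 18

person = Person("Andrew", 31, "male")
```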

A problem with both approaches is the lack of support for inheritance.
Neither gives a subclass a way to pass arguments up the inheritance
tree.  I looked at your Wiki page, and the same seems to hold true
for your Ruby examples.
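
To make the inheritance problem concrete, here is a sketch using a
trimmed copy of the generator above (without the initialize hook).  The
Student, PlainPerson and PlainStudent classes are made up for the
comparison:

```python
def init_attributes(*argnames):
  def __init__(self, *args):
    if len(args) != len(argnames):
      raise TypeError("Expecting %d values, got %d" %
                      (len(argnames), len(args)))
    for k, v in zip(argnames, args):
      setattr(self, k, v)
  return __init__

class Person:
  __init__ = init_attributes("name", "age", "gender")

class Student(Person):
  # The generated __init__ replaces Person.__init__ instead of calling it,
  # so the parent's names have to be repeated by hand.
  __init__ = init_attributes("name", "age", "gender", "school")

class PlainPerson:
  def __init__(self, name, age, gender):
    self.name, self.age, self.gender = name, age, gender

class PlainStudent(PlainPerson):
  def __init__(self, name, age, gender, school):
    # The hand-written version passes its arguments up explicitly.
    PlainPerson.__init__(self, name, age, gender)
    self.school = school

student = Student("Cindy", 20, "female", "Hypothetical U")
plain = PlainStudent("Cindy", 20, "female", "Hypothetical U")
```

Both objects end up with the same attributes, but if Person ever grows
a new attribute, Student has to be edited too.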

Another problem is that these are *nonstandard*, which means
they are harder to understand, harder to debug, and harder to maintain.

Another problem is that once you start down this route you also start
wanting variations.  Keyword parameters.  Optional parameters.  Default
values.  Checking against a list of allowed values.  And more.

There is not, nor can there be, consensus on the best way to support
all of these.  Your simple case, where the initialization function
only sets the attributes, isn't common enough in my experience to
warrant special-casing it at the expense of extra language
complexity.

And it's so easy to write this code that if you really need it for
your code you can easily add it... but other aspects of your code
are likely to suffer.

>So I create
>a method called init_attributes (which I'm not going to show here,
>because it's kinda hairy, but you can find it on that Wiki page I
>mentioned), and from that point on I can express the "Oh, it's just
>initializing some attributes" concept directly in my code:

As far as I can tell, it's dynamically calling the Ruby compiler on
the fly.  That seems horrendously complicated to me.  Doesn't the
code I show above do essentially what you want?

> The point is that it expresses my intention directly, and
>contains less duplication. This makes my code *easier* to read, as
>long as the reader knows what the init_attributes() subroutine does.

Bingo!  And the reader won't know what it means without digging
through the rest of the system.  And that poor soul won't know that

  init_attributes(name = (None, ["Andrew", "Brandy", "Cindy", "Dylan"], None),
                  age = (18, None, None),
                  gender = (None, ["male", "female"],
                            "Sorry, we're a bit old-fashioned here"))

says that:
  "name" can take one of four possible values
  "age" can take any value, with a default value of 18
  "gender" can take only two values, and if those values aren't used then
      the error message for the TypeError exception is as given
>Maybe now you're going to argue that we should never create any
>subroutines, because they force readers to go to the extra trouble of
>looking up what they do. I won't have an answer to that.

Ohh!  Is that a strawman argument?  Or the fallacy of the excluded middle?
Perhaps the fallacy of the false dilemma?  Maybe even all three?  I never
did remember the names of all the rhetorical tricks.  Basically, no one
ever made the argument you offer here as the counter to your proposal.

The argument is that having too many constructs (especially subtly
different ones!) for the same action makes a language harder to learn,
use, and maintain.  It is entirely possible for added subroutines to make
things more complicated than a solution without them.  The skill in
programming comes in knowing what is appropriate.

I do not think what you want is appropriate enough to go in the core
language.  It is appropriate enough for local cases.

For example, I have one module where the docstring for the module lists all
of the available attributes (there are a dozen or so).  During module
initialization the docstring is parsed to identify each parameter's name,
its one-line description, and whether it is optional or required.
This information is used by the __init__ to make sure that the needed
kwargs are given.
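
A rough sketch of the idea, not the actual module: the docstring layout
("name (required|optional): description"), the EXAMPLE_DOC string that
stands in for the module docstring, and the names parse_parameters,
PARAMETERS and Settings are all assumptions made for illustration.

```python
import re

# Stand-in for a module docstring; the layout is assumed for this sketch.
EXAMPLE_DOC = """Settings for one run.

Parameters:
  infile (required): path of the input file
  outfile (required): path of the output file
  verbose (optional): print progress messages
"""

_PARAM_LINE = re.compile(r"^\s+(\w+) \((required|optional)\): (.+)$")

def parse_parameters(doc):
  # Return {name: (is_required, description)} for each parameter line.
  params = {}
  for line in doc.splitlines():
    m = _PARAM_LINE.match(line)
    if m:
      name, kind, description = m.groups()
      params[name] = (kind == "required", description)
  return params

# Parsed once, when the module is first imported.
PARAMETERS = parse_parameters(EXAMPLE_DOC)

class Settings:
  def __init__(self, **kwargs):
    for name in kwargs:
      if name not in PARAMETERS:
        raise TypeError("Unknown parameter %r" % (name,))
    for name, (is_required, description) in PARAMETERS.items():
      if is_required and name not in kwargs:
        raise TypeError("Missing required parameter %r (%s)" %
                        (name, description))
      # Optional parameters that were not given are stored as None.
      setattr(self, name, kwargs.get(name))

settings = Settings(infile="in.txt", outfile="out.txt")
```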

This was written so I wouldn't have *four* different lists of arguments
(docstring, parameter listing, local variable, class attribute).  On the
other hand, it does force all the callers to use keyword arguments, which
is *good* in this case because it forces additional self-documentation
in the caller.  The caller is part of a configuration file that many people,
including many non-Python programmers, will reference.  Seeing the names
there is useful, since positional arguments would force most people to
look up the API.

                    Andrew
                    dalke at dalkescientific.com