Unification of Methods and Functions

Wed May 12 14:45:43 EDT 2004

On 10 May 2004 16:53:06 -0700, moughanj at tcd.ie (James Moughan) wrote:

>David MacQuigg <dmq at gain.com> wrote in message news:<889t90tdl9o9t25cv5dj6k5rnktuce0jin at 4ax.com>...
>> On 8 May 2004 07:07:09 -0700, moughanj at tcd.ie (James Moughan) wrote:

< snip topics we have finished >

>You are not solving a problem; that's the problem. :)  If there were a
>real programming task then it would be more trivial to show why your
>object model is broken.

I could give you an example from IC Design, but for the course I
teach, I chose to use a similar hierarchy based on something everyone
would understand - a taxonomy of animals.  Nothing in this example is
something you wouldn't find in a real program to model an integrated
circuit.  Instead of animal names like Cat, we would have the names of
cells in the hierarchy, names like bgref25a.  Instead of a variable to
count the number of animals at each level, we might have several
variables to track the total current on each of several supply lines.
Like the counts in the Animals.py hierarchy, we need the total current
to each cell, including all of its subcells.

I'm sure there are other examples from other specialties.  In
accounting, I can imagine a hierarchy of accounts, with a total for
each account including all of its subaccounts.  Don't just assume that
the problem isn't real because you haven't encountered it in your
work.

<snip>

>If you can't take it below 70 pages and you only have 4 hours... maybe
>it's not such a great idea to try this?  I can't see your students
>benefiting from what you're proposing to do, if you have so little
>time.

I think I could do it in 30 pages and 4 hours total (lecture, lab,
and homework), but not if I need to cover the topics that both Mark
Lutz and I consider important to basic OOP in the current version of
Python.  The 30 pages assumes the unification of methods and functions
that I have proposed.

<snip> 

>> >> >OK: "The whole idea of having these structures in any program is
>> >> >wrong."
>> >> >
>> >> >Firstly, the program uses a class hierarchy as a data structure.  That
>> >> >isn't what class hierarchies are designed for, and not how they should
>> >> >be used IMO. But it's what any bright student will pick up from the
>> >> >example.
>> >> 
>> >> The classes contain both data and functions.  The data is specific to
>> >> each class.  I even show an example of where the two-class first
>> >> example forced us to put some data at an inappropriate level, but with
>> >> a four class hierarchy, we can put each data item right where it
>> >> belongs.
>> >> 
>> >
>> >The data is not specific to the class.  It's specific to the class and
>> >its subclasses.  Subclasses should be dependent on the superclass,
>> >and generally not the other way around.
>> 
>> What data are we talking about?  numMammals is specific to Mammal.
>> genus is specific to Feline, but *inherited* by instances of a
>> subclass like Cat.
>
>The numAnimals etc... data, which is stored in Animals but gets
>arbitrarily altered by the actions of subclasses of Animal, and
>therefore is not specific to animal; it doesn't represent the state of
>the Animal class or of Animal objects, but of a whole bunch of
>subclasses of Animal.

The total current to an IC is the sum of the currents to all of its
subcircuits.  That current is a single number, for example, 35
microamps.  It has a name "Iss".  Iss is a characteristic of the IC
which appears in data sheets, etc.  It is a variable representing the
state of the entire IC.  It does not represent the state of any
subcircuit in the IC, even though it gets "altered" whenever one of
those subcircuit currents changes.
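
To make the Iss point concrete, here is a minimal sketch in current
Python 3.  It assumes the cells are modeled as a containment tree of
instances, and it computes the total on demand instead of storing it,
which is a different choice from the stored counters in Animals_2.
The Cell class, the iss_uA property, and the cell names other than
bgref25a are made up for illustration.

```python
class Cell:
    """A circuit cell with its own supply current plus any number of subcells."""

    def __init__(self, name, own_current_uA=0.0):
        self.name = name
        self.own_current_uA = own_current_uA   # current drawn by this cell alone
        self.subcells = []

    def add(self, subcell):
        self.subcells.append(subcell)
        return subcell

    @property
    def iss_uA(self):
        # Total current to this cell, including all of its subcells.
        return self.own_current_uA + sum(c.iss_uA for c in self.subcells)


chip = Cell("chip")                       # hypothetical top-level IC
bgref = chip.add(Cell("bgref25a", 20.0))
osc = chip.add(Cell("osc1", 10.0))        # hypothetical subcircuit
bias = chip.add(Cell("bias1", 5.0))       # hypothetical subcircuit

print(chip.iss_uA)                        # 35.0

bgref.own_current_uA += 2.0               # 2 more microamps in the bandgap core
print(chip.iss_uA)                        # 37.0
```

The total belongs to the chip, yet it changes whenever any subcircuit
current changes.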

Looks like this whole argument comes down to what we mean by the word
"specific".  Let's drop it and focus on the more interesting topics in
this thread.

>> >> Nothing in the Bovine class can affect anything in a Cat.  Feline and
>> >> Bovine are independent branches below Mammal.  Adding a Mouse class
>> >> anywhere other than in the chain Cat - Feline - Mammal - Animal cannot
>> >> affect Cat.  Could you give a specific example?
>> >> 
>> >
>> >Say someone adds a mouse class but doesn't call the constructor for
>> >Mammal.  The data produced by mammal and therefore cat is now
>> >incorrect, as instances of mouse are not included in your count.  In a
>> >real example, anything might be hanging on that variable - so e.g.
>> >someone adds some mouse instances and the program crashes with an
>> >array index out of bounds (or whatever the Pythonic equivalent is :) )
>> >, or maybe we just get bad user output.  This type of behaviour is
>> >damn-near impossible to debug in a complex program, because you didn't
>> >change anything which could have caused it.  It's caused by what you
>> >didn't do.
>> 
>> These are normal programming errors that can occur in any program, no
>> matter how well structured.  I don't see how the specific structure of
>> Animals.py encourages these errors.
>
>Imagine if your structure had been implemented as one of the basic
>structures of, say, Java.  That is, some static data in the Object
>class stores state for all the subclasses of Object.  Now, someone
>coming along and innocently creating a class can break Object -
>meaning that may break anything with a dependency on Object, which is
>the entire system.  So I write a nice GUI widget and bang! by some
>bizarre twist it breaks my program somewhere else because of an error
>in, say, the StringBuffer class.  This is analogous to what you are
>implementing here.

I'll need an example to see how these general worries can affect the
Animals_2 hierarchy.  What I see is quite robust.  I added a Feline
class between Mammal and Cat, and I had to change only two lines in
the Cat class.  (And I could avoid even that if I had used a "super"
call instead of a direct call to the Mammal functions.)
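
The effect of a "super" call can be sketched as follows.  This is a
minimal illustration in current Python 3 syntax, not the actual
Animals_2 code; the show() bodies and the printed text are invented
for the example.

```python
class Animal:
    def show(self):
        print("Animal: a living thing")


class Mammal(Animal):
    def show(self):
        super().show()          # chain to whichever class comes next
        print("Mammal: warm-blooded")


class Feline(Mammal):           # the class inserted between Mammal and Cat
    def show(self):
        super().show()
        print("Feline: genus feline")


class Cat(Feline):              # only the class statement names the parent
    def show(self):
        super().show()          # no mention of Mammal or Feline here
        print("Cat: says meow")


Cat().show()
```

Inserting Feline then means editing the base class in Cat's class
statement; the body of Cat.show() stays as it is.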

>While errors are always going to happen, OOP calls on some conventions
>to minimize them.  The most absolutely vital of these is that it's
>clear what can break what.  Generally I should never be able to break
>a subsystem by breaking its wrapper; definitely I should never be
>able to break a superclass by breaking its subclass; and I
>*certainly* shouldn't be able to break a part of the system by
>changing something unconnected to it.  The whole of OOP derives, more
>or less directly, from these principles.  Expressions like 'A is a
>part/type of B' derive from this philosophy, not the other way around.

Sounds good.

>Your program breaks with this concept.  It allows an event in Cat to
>affect data in Mammal and in Animal, which also has knock-on effects
>for every other subclass of these.  Therefore it is bad object
>oriented programming.

We are modeling the real world here.  When you add a lion to a zoo,
you add one to the count of all animals.  When you add 2 microamps to
the core currents in a bandgap voltage reference, you add that same 2
microamps to the total supply current.

I'm no expert in OOP, but what I have seen so far is not nearly as
clear in structure as the original Animals_2 example.

>It takes us back to the days before even structured programming, when
>no-one ever had any idea what the effects of altering or adding a
>piece of code would be.
>
>It is therefore not a good teaching example. :)

I'll need to see something better before I abandon the current example.
The problem may be our expectations of OOP.  I see classes as modeling
the real world, including variables that are altered by changes in
subclasses.  You seem to have some computer science notion of what a
class should be.  I'm not saying it's wrong, but unless it helps me
solve my real-world problems, in a better way than what I am doing
now, I won't use it.

I'm reminded of the criticism Linus Torvalds got when he first
published Linux.  The academic community thought it was the worst,
most fundamentally flawed design they had ever seen.  It did not fit
some expectation they had that a "microkernel" architecture was the
proper way to design an OS.  Luckily, Mr. Torvalds was not dependent
on their approval, and had the confidence to move ahead.

>> >> I'm not sure what you mean by "side effects" here.  The show()
>> >> function at each level is completely independent of the show()
>> >> function at another level.
>> >
>> >But the inventory data isn't independent.  It's affected by classes
>> >somewhere else in the hierarchy.  Worse, it's done implicitly.
>> 
>> The "inventory data" actually consists of independent pieces of data
>> from each class. ( numCats is a piece of inventory data from the Cat
>> class.)  I'm sorry I just can't follow this.
>>
>
>numMammals OTOH is not just a piece of data from one class - it's a
>piece of data stored in one class, but which stores data about events
>in many different classes, all of which are outside its scope.

Exactly as we see in objects in the real world.

>> >> Chaining them together results in a
>> >> sequence of calls, and a sequence of outputs that is exactly what we
>> >> want.  The nice thing about separating the total "show" functionality
>> >> into parts specific to each class is that when we add a class in the
>> >> middle, as I did with Feline, inserted between Mammal and Cat, it is
>> >> real easy to change the Cat class to accommodate the insertion.
>> >> 
>> >> Python has a 'super' function to facilitate this kind of chaining.
>> >> Michele Simionato's 'prototype.py' module makes 'super' even easier to
>> >> use. Instead of having Cat.show() call Mammal.show() I can now just
>> >> say super.show() and it will automatically call the show() function
>> >> from whatever class is the current parent.  Then when I add a Feline
>> >> class between Mammal and Cat, I don't even need to change the
>> >> internals of Cat.
>> >
>> >That's fine - providing you're not using a class hierarchy to store
>> >data.  It's not the act of calling a method in a super-class which is
>> >a bad idea, it's the way you are making *the numbers outputted* from
>> >cat dependent of actions taken *or not taken* in another class
>> >*completely outside cat's scope*.
>> 
>> Seems like this is the way it has to be if you want to increment the
>> counts for Cat and all its ancestors whenever you create a new
>> instance of Cat.  Again, I'm not understanding the problem you are
>> seeing.  You seem to be saying there should be only methods, not data,
>> stored in each class.
>> 
>
>That's the way it has to be, if you want to write it like that. 
>However there is nothing to say that a given problem must use a
>certain class structure.  If you come up with a solution like this
>then it's near-guaranteed that there was something badly wrong with
>the way you modelled the domain.  Either the program shouldn't need to
>know the number of instances which ever existed of subclasses of
>mammal or else your class structure is wrong.

Trust me, the need is real.  We just need to find the optimum example
to show how Python solves the problem.

In my work as a software product engineer, I've learned to deal with
two very common criticisms.  1) The user doesn't need to do that.  2)
The user is an idiot for not understanding our wonderful methodology.
These are generally irrefutable arguments that can only be trumped by
a customer with a big checkbook.  I generally don't engage in these
arguments, but on one occasion, I couldn't resist.  I was trying to
show an expert how a complicated feature could be done much more
easily with simpler functions we already had in our program.

His argument was basically -- every expert in this company disagrees
with you, and you're an idiot for not understanding how our new
feature works.  I replied that I was the one who wrote the User Guide
on that feature.  He started to say something, but it was only a
fragment of a word, and it kind of fell on the table and died.  There
were a few seconds of silence, while he tried to figure out if he could
call me a liar.  I just looked right at him without blinking.

Forget what you have learned in books.  Think of a real zoo.  Think
how you would write the simplest possible program to do what Animals_2
does -- keep track of all the different classes of animals, and
display the characteristics of any animal or class, including
characteristics that are shared by all animals in a larger grouping.

>And, as general rule, you should think carefully before using classes
>to store data; that's typically what objects are for.  I used static
>data in programs quite a lot before I realised that it too-often bit
>me later on.

Classes *are* objects.  I think you mean instances.  I make a
distinction between class variables and instance variables, depending
on whether the variable is different from one instance to another.
Every instance has a different cat.name, but all cats share the genus
"feline".  In fact, they share that genus with all other members of
the Feline class.  That is why I moved it from Cat to Feline as soon
as our example was big enough to include a Feline class.
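
A minimal sketch of that distinction, in current Python 3.  The cat
names are invented, and the classes are cut down to the two variables
under discussion; this is not the Animals_2 code itself.

```python
class Feline:
    genus = "feline"            # class variable: one value shared by all felines


class Cat(Feline):
    def __init__(self, name):
        self.name = name        # instance variable: differs from cat to cat


garfield = Cat("Garfield")
fluffy = Cat("Fluffy")

print(garfield.name, fluffy.name)      # two different names
print(garfield.genus, fluffy.genus)    # both found on Feline through inheritance
print("genus" in vars(garfield))       # False: nothing is stored on the instance
```

The genus lookup falls through from the instance to Cat and then to
Feline, which is why the variable can live at the Feline level.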

>> >> In one syntax we need special "static methods" to handle calls where a
>> >> specific instance is not available, or not appropriate.  In another
>> >> syntax we can do the same thing with one universal function form.
>>
>> To try and get to the bottom of this, I re-wrote the Animals.py
>> example, following what I think are your recommendations on moving the
>> static methods to module-level functions.  I did not move the data out
>> of the classes, because that makes no sense to me at all.
>>
>
>*Sigh*  No, I must say that doesn't help much. :-\
>
>As I said, there is something wrong with the whole idea behind it; the
>design needs refactoring, not individual lines of code.
>
>Having said that, I'll try to redact the issues as best I can, on the
>basis that it may illustrate what I mean.
>
>OK: start with the basics.  We need iterative counting data about the
>individual elements of the hierarchy.
>
>The first thing is that we need to factor out the print statements. 
>Your back-end data manipulation modules should never have UI elements
>in them.  So, whatever form the data manipulation comes in, it should
>be abstract.

You are adding requirements to what I already have.  OK if it doesn't
slow the introductory presentation too much.

>Secondly, we want to keep the data stored in each class local to that
>class.  So, Mammal can store the number of Mammals, if that turns out
>to be a good solution, but not the number of its subclasses.  OTOH we
>could remove the data from the classes altogether.

Think of a real zoo.  If you ask the zookeeper how many animals he
has, will he tell you only the number that are animals, but are not
also lions or tigers or any other species?  That number would be zero.

I really do want numMammals to display the total number of all
mammals, whether or not they are a member of some other class in
addition to Mammal.

If I were to guess at your objection to this, I would assume you are
worried that the different counters will get "out-of-sync", if for
example, someone directly changes one of these variables, rather than
calling the appropriate functions to make a synchronized change.

My answer to that is to make the counter variables private.  I've
added a leading underscore to those names.  numMammals is now
_numMammals.
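
Roughly, the idea looks like this.  It is a sketch in current Python 3
under my own assumptions, not the posted Animals_2 code: the counter
names other than _numMammals, the inventory() helper, and the animal
names are hypothetical.  Note that the leading underscore is only a
convention; Python does not enforce it.

```python
class Animal:
    _numAnimals = 0                 # leading underscore: private by convention

    def __init__(self):
        Animal._numAnimals += 1


class Mammal(Animal):
    _numMammals = 0

    def __init__(self):
        super().__init__()          # keeps Animal's count in step
        Mammal._numMammals += 1


class Feline(Mammal):
    _numFelines = 0

    def __init__(self):
        super().__init__()
        Feline._numFelines += 1


class Cat(Feline):
    _numCats = 0

    def __init__(self, name):
        super().__init__()
        self.name = name
        Cat._numCats += 1


class Bovine(Mammal):
    _numBovines = 0

    def __init__(self):
        super().__init__()
        Bovine._numBovines += 1


def inventory():
    """Read the counters in one place instead of touching them directly."""
    return {
        "Animal": Animal._numAnimals,
        "Mammal": Mammal._numMammals,
        "Feline": Feline._numFelines,
        "Cat": Cat._numCats,
        "Bovine": Bovine._numBovines,
    }


Cat("Garfield")
Cat("Fluffy")
Bovine()
print(inventory())
```

As long as every constructor chains to its parent, creating a Cat
raises the Cat, Feline, Mammal, and Animal counts together, so
_numMammals really is the total of all mammals.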

>Thirdly, it would probably be nice if we had the ability to implement
>the whole thing in multiple independent systems.  Currently the design
>only allows one of "whatever-we're-doing" at a time, which is almost
>certainly bad.

???

>After a bit of brainstorming this is what I came up with.  It's not a
>specific solution to your problem; instead it's a general one.  The
>following class may be sub-classed and an entire class-hierarchy can
>be placed inside it.  It will then generate automatically the code to
>keep a track of and count the elements of the class hierarchy,
>returning the data you want at a method call.
>
>This is done with a standard OO tool, the Decorator pattern, but
>ramped up with the awesome power of the Python class system. :)

My non-CIS students are not familiar with the Decorator pattern.  I
fear that will make this example incomprehensible to them.

>class Collective:
>    class base: pass
>
>    def startup(self, coll, root):
>        #wrapper class to count creations of classes
>        self.root = root
>        class wrapper:
>            def __init__(self, name, c):
>                self.mycount = 0
>                self.c = c
>                self.name = name
>            def __call__(self, *arg):
>                tmp = self.c(*arg)
>                self.mycount += 1
>                return tmp
>        self.wrapper = wrapper
>        #replace every class derived from root with a wrapper
>        #plus build a table of the
>        self.wrap_list = []
>        for name, elem in coll.__dict__.items():
>            try:
>                if issubclass(elem, self.root):
>                    tmp = wrapper(name, elem)
>                    self.__dict__[name] = tmp
>                    self.wrap_list.append(tmp)
>            except: pass
>
>    #when subclassing, override this
>    #call startup with the class name
>    #and the root of the class hierarchy
>    def __init__(self):
>        self.startup(Collective, self.base)
>
>    #here's the stuff to do the counting
>    #this could be much faster with marginally more work
>    #exercise for the reader... ;)
>
>    def get_counts(self, klass):
>        counts = [ (x.c, (self.get_sub_count(x), x.name)) \
>            for x in self.super_classes(klass) ]
>        counts.append( (klass.c,
>            (self.get_sub_count(klass), klass.name)) )
>        counts.sort(lambda x, y: issubclass(x[0], y[0]))
>        return [x[-1] for x in counts]
>
>    def get_sub_count(self, klass):
>        count = klass.mycount
>        for sub in self.sub_classes(klass):
>            count += sub.mycount
>        return count
>    def super_classes(self, klass):
>        return [x for x in self.wrap_list
>            if issubclass(klass.c, x.c) and not x.c is klass.c]
>    def sub_classes(self, klass):
>        return [x for x in self.wrap_list
>            if issubclass(x.c, klass.c) and not x.c is klass.c]
>
>So we can now do:
>
>class animal_farm(Collective):
>    class Animal: pass
>    class Mammal(Animal): pass
>    class Bovine(Mammal): pass
>    class Feline(Mammal): pass
>    class Cat(Feline): pass
>    def __init__(self):
>        self.startup(animal_farm, self.Animal)
>
>
>a_farm = animal_farm()
>cat = a_farm.Cat()
>feline = a_farm.Mammal()
>print a_farm.get_counts(a_farm.Feline)
>
>>>> [(2, 'Animal'), (2, 'Mammal'), (1, 'Feline')]
>
>
>The above code is 51 lines with about 10 lines of comments.  For a
>project of any size, this is a heck of an investment; I believe it
>would take a fairly determined idiot to break the system, and *most
>importantly*, they would be able to trace back the cause from the
>effect fairly easily.

This is an impressive bit of coding, but I can assure you, as an
introduction to OOP, it will blow away any non-CIS student.  It may
also be difficult to modify, for example, if we want to do what
Animals_2 does, and provide a custom display of characteristics for
each class.

One possibility is to make this an Animals_3 example.  Animals_1 was a
simple two-class structure.  It served to introduce instance
variables, and some basic concepts like inheritance.  When we moved to
Animals_2, we pointed out the limitations of Animals_1, like not
having enough classes to put variables like 'genus' where they really
belong.

Maybe we should go one more step, and make this a third example.  We
can point out the limitations of Animals_2 in the introduction to
Animals_3.  I can see the benefit of moving the print statements to
the top level.  This is needed if we ever want to make the classes in
Animals_2 work in some kind of framework with other classes.  The
show() functions in Animals_2 could be modified to return a list of
strings instead of printing directly to the console.
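
Something like the following is what I have in mind.  It is only a
sketch in current Python 3, assuming each show() returns its own lines
appended to those of its parent; the text of each line and the cat's
name are invented.

```python
class Animal:
    def show(self):
        return ["Animal: a living thing"]


class Mammal(Animal):
    def show(self):
        return super().show() + ["Mammal: warm-blooded"]


class Feline(Mammal):
    genus = "feline"

    def show(self):
        return super().show() + ["Feline: genus " + self.genus]


class Cat(Feline):
    def __init__(self, name):
        self.name = name

    def show(self):
        return super().show() + ["Cat: named " + self.name]


# All printing now happens at the top level, outside the classes.
lines = Cat("Garfield").show()
print("\n".join(lines))
```

A framework that wants the data for a GUI or a log file can take the
list and ignore the print statement.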

I've posted your program as Solution 3 to the exercise at
http://ece.arizona.edu/~edatools/Python/Exercises/  Could you give us
a brief description of the advantages and disadvantages compared to
the original?  I'm not able to do that, because I'm having difficulty
restating what you have said above in terms that students will
understand.  I cannot, for example, explain why your solution is more
robust.

>Admittedly the solution is on the complicated side, though perhaps
>someone with more experience than me could simplify things. 
>Unfortunately, a certain amount of complexity is just a reflection of
>the fact that your demands strain the OO paradigm right to its limit.
> You could possibly implement the same thing in Java with a Factory
>pattern, and perhaps the reflection API.

Your vast experience may be blinding you to the problems non-CIS
students will have with these more complex solutions.  I may be
pushing a paradigm to some limit, but these are real-world problems
that should be easily solved with a good OOP language.

-- Dave

>(Of course I'm none too sure I could do that after many years of
>hacking Java vs a few weeks of Python!)
>
>
>> Take a look at http://ece.arizona.edu/~edatools/Python/Exercises/ and
>> let me know if Animals_2b.py is what you had in mind.  If not, can you
>> edit it to show me what you mean?
>>
>> -- Dave