Language design

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Sep 13 01:08:04 EDT 2013


On Thu, 12 Sep 2013 20:23:21 -0700, Mark Janssen wrote:

>>> Really?  Are you saying you (and the community at-large) always derive
>>> from Object as your base class?
>>
>> Not directly, that would be silly.
> 
> Silly?  "Explicit is better than implicit"... right?

If I'm inheriting from str, I inherit from str explicitly:

class MyStr(str): ...

and then str in turn inherits from object explicitly. I certainly do not 
inherit from object and then re-implement all the string methods from 
scratch:

class MyStr(object):
    def __new__(cls, value): ...
    def upper(self): ... 
    def lower(self): ...
    # and so on...

That would be ridiculous, and goes against the very idea of inheritance. 
But nor do I feel the need to explicitly list the entire superclass 
hierarchy:

class MyStr(str, object):
    ...


which would be silly. Only somebody who doesn't understand how 
inheritance works in Python would do that. There's simply no need for it, 
and in fact it would be actively harmful for larger hierarchies.
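
To make that concrete, here is a small sketch (assuming Python 3; the class name Broken is made up for illustration). Inheriting from str alone already puts object into the method resolution order, and spelling out object by hand in the wrong position can make the class statement fail outright:

```python
class MyStr(str):
    """Inherits every string method; object comes along via str."""


# object is already at the end of the method resolution order.
mro_names = [cls.__name__ for cls in MyStr.__mro__]
print(mro_names)

# Listing the hierarchy by hand is redundant at best. Put object ahead of
# one of its own subclasses and Python cannot build a consistent MRO.
try:
    class Broken(object, str):
        pass
except TypeError as err:
    mro_error = str(err)
    print("TypeError:", mro_error)
else:
    mro_error = None
```

The more base classes you list by hand, the easier it is to get the order wrong like this.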


>>> But wait is it the "base" (at the bottom of the hierarchy) or is it
>>> the "parent" at the top?  You see, you, like everyone else has been
>>> using these terms loosely, confusing yourself.
>>
>> Depends on whether I'm standing on my head or not.
>>
>> Or more importantly, it depends on whether I visualise my hierarchy
>> going top->down or bottom->up. Both are relevant, and both end up with
>> the *exact same hierarchy* with only the direction reversed.
> 
> Ha,  "only the direction reversed".  That little directionality that
> you're passing by so blithely is the difference between whether you're
> talking about galaxies or atoms.

It makes no difference whether I write:

    atoms -> stars -> galaxies

or

    galaxies <- stars <- atoms

nor does it make any difference if I write the chain starting at the top 
and pointing down, or at the bottom and pointing up.

Your objection implies that writing family trees with the most distant 
ancestor (the root of the tree) at the top of the page somehow confuses 
people into thinking that perhaps they are the progenitor of people who 
lived generations earlier. That's absurd -- people simply are not as 
stupid as you think.


> The simplicity of Python has seduced you into making an "equivocation"
> of sorts.  It's subtle and no one in the field has noticed it.  It crept
> in slowly and imperceptively.

Ah, and now we come to the heart of the matter -- people have been 
drawing tree-structures with the root at the top of the page for 
centuries, and Mark Janssen is the first person to have realised that 
they've got it all backwards.


>>> By inheriting from sets you get a lot of useful functionality for
>>> free.  That you don't know how you could use that functionality is a
>>> failure of your imagination, not of the general idea.
>>
>> No you don't. You get a bunch of ill-defined methods that don't make
>> sense on dicts.
> 
> They are not necessarily ill-defined.  Keep in mind Python already chose
> (way back in 1.x) to arbitrary overwrite the values in a key collision. 
> So this problem isn't new.  You've simply adapted to this limitation
> without knowing what you were missing.

No, Python didn't "arbitrarily" choose this behaviour. It is standard, 
normal behaviour for a key-value mapping, and it is the standard 
behaviour because it is the only behaviour that makes sense for a general 
purpose mapping.
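
A quick illustration of that behaviour (shown with Python 3, using throwaway names): assigning to an existing key replaces the old value, so each key maps to exactly one value.

```python
d = {"spam": 1}
d["spam"] = 2          # same key: the new value replaces the old one
assert d == {"spam": 2}

# The same rule applies to a literal with a repeated key, and to update().
e = {"spam": 1, "spam": 2}
f = {"spam": 1}
f.update({"spam": 3})
print(d, e, f)
```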

Python did not invent dicts (although it may have invented the choice of 
name "dict").

If you think of inheritance in the Liskov Substitution sense, then you 
might *consider* building dicts on top of sets. But it doesn't really 
work, because there is no way to sensibly keep set-behaviour for dicts.

For example, take intersection of two sets s and t. It is a basic 
principle of set intersection that s&t == t&s.

But now, take two dicts with the same keys but different values, d and e. 
What values should be used when calculating d&e compared to e&d? Since 
they are different values, we can either:

* prefer the values from the left argument over those from the right;
* prefer the values from the right argument over those from the left;
* refuse to choose and raise an exception;
* consider the intersection empty.

The first three choices will break the Liskov Substitution Principle, 
since now dicts *cannot* be substituted for sets. The fourth also breaks 
Liskov, but for a different reason:

# sets
(key in s) and (key in t) implies (key in s&t);

but

# dicts
(key in d) and (key in e) *does not* imply (key in d&e)


So whatever you do, you are screwed Liskov-wise. You cannot derive dicts 
from sets and still keep the Liskov Substitution Principle.
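
Here is a sketch of the first choice, assuming Python 3. The function intersect_prefer_left is hypothetical, since dicts have no & operator of their own; it only exists to show what "prefer the left argument" would do to commutativity:

```python
s = {"a", "b", "c"}
t = {"b", "c", "d"}
assert s & t == t & s          # set intersection is commutative


def intersect_prefer_left(left, right):
    """Hypothetical dict intersection keeping the left operand's values."""
    return {key: left[key] for key in left if key in right}


# Two dicts with the same keys but different values.
d = {"b": 1, "c": 2}
e = {"b": 10, "c": 20}

d_and_e = intersect_prefer_left(d, e)
e_and_d = intersect_prefer_left(e, d)
print(d_and_e, e_and_d)

# Same keys either way, but the results are no longer equal.
assert d_and_e.keys() == e_and_d.keys()
assert d_and_e != e_and_d

# The keys on their own do behave like sets.
assert d.keys() & e.keys() == e.keys() & d.keys()
```

Preferring the right argument fails in the mirror-image way.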

Of course, LSP is not the only way to design your inheritance 
hierarchies. An alternative is to design them in terms of delegating 
implementation to the superclass. As Raymond Hettinger puts it, your 
class tells its parent to do some of the work. But in this case, 
you're still screwed: you can derive sets from dicts, and in fact the 
first implementation of sets in Python did exactly that, but you cannot 
derive dicts from sets.
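
A rough sketch of that delegation style, assuming Python 3. The class DictSet is hypothetical and is not a reconstruction of the historical code; it only shows that a dict already does everything a set needs for storage and membership:

```python
class DictSet(dict):
    """Hypothetical set that delegates storage to its parent, dict.

    Elements are stored as keys; the values are a throwaway placeholder.
    """

    def __init__(self, iterable=()):
        super().__init__()
        for item in iterable:
            self.add(item)

    def add(self, item):
        # Tell the parent to do the work: a repeated key just overwrites.
        super().__setitem__(item, True)

    def __and__(self, other):
        # Intersection needs only the keys, so there is nothing to choose.
        return DictSet(item for item in self if item in other)


a = DictSet("abca")
b = DictSet("bcd")
print(sorted(a), sorted(a & b))
```

Going the other way does not work: a set has nowhere to store the values a dict needs.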

So either way, whether you are an OOP purist who designs your classes 
with type-theoretic purity and the Liskov Substitution Principle in mind, 
or a pragmatist who designs your classes with delegation of 
implementation in mind, you can't sensibly derive dicts from sets.


>>>>> 3) It used the set literal for dict, so that there's no obvious way
>>>>> to do it.  This didn't get changed in Py3k.
>>>>
>>>> No, it uses the dict literal for dicts.
>>>
>>> Right.  The dict literal should be {:} -- the one obvious way to do
>>> it.
>>
>> I don't agree it is obvious. It is as obvious as (,) being the empty
>> tuple or [,] being the empty list.
> 
> You're just being argumentative.  If there are sets as built-ins, then
> {:} is the obvious dict literal, because {} is the obvious one for set. 
> You don't need [,] to be the list literal because there is no simpler
> list-type.

The point is that the obvious way to write an empty collection is using a 
pair of delimiters, not to shove an arbitrary separator separating 
nothing at all in there:

[] is an empty list, not [,]
() is an empty tuple, not (,)
{} is an empty (dict|set), not {,} or {:}


We can't have {} be both an empty set and an empty dict, so one of the 
two has to miss out. Neither is more obvious than the other. dicts are 
more important data structures, and they have historical precedence, so 
they win. If Python were being re-invented from scratch now, or if sets 
had been around just as long as dicts, people might have chosen to give 
sets higher priority than dicts, but frankly I doubt it. It's much more 
common to want an empty dict than an empty set, at least in my experience.
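
For anyone who wants to check this at the interpreter, a small sketch (assuming Python 3; the helper is_valid_expression is mine, built on the built-in compile()):

```python
assert type({}) is dict        # empty braces mean an empty dict
assert type({1, 2}) is set     # a non-empty set literal
assert type(set()) is set      # the empty set is spelled by calling set()


def is_valid_expression(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


candidates = ["[]", "()", "{}", "[,]", "(,)", "{,}", "{:}"]
results = {src: is_valid_expression(src) for src in candidates}
print(results)
```

The bare delimiters compile; the versions with a lone separator are all syntax errors.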

 
>>>> And the obvious way to form an empty set is by calling set(), the
>>>> same as str(), int(), list(), float(), tuple(), dict(), ...
>>>
>>> Blah, blah.  Let me know when you got everyone migrated over to
>>> Python.v3.
>>
>> What does this have to do with Python 3? It works fine in Python 2.
> 
> I mean, you're suggestions are coming from a "believer", not someone
> wanting to understand the limitations of python or whether v3 has
> succeeded at achieving its potential.

"not someone wanting to understand the limitations of python..." -- are 
you aware that I started this thread?



-- 
Steven


