PEP on path module for standard library

Sat Jul 23 01:01:58 EDT 2005

George Sakkis wrote:
> Bringing up how C models files (or anything else other than primitive types
> for that matter) is not a particularly strong argument in a discussion on
> OO design ;-)

While I have worked with C libraries which had a well-developed
OO-like interface, I take your point.

Still, I think that the C model of a file system should be a
good fit since after all C and Unix were developed hand-in-hand.  If
there wasn't a good match then some of the C path APIs should be
confusing or complicated.  Since I don't see that it suggests that
the "path is-a string" is at least reasonable.

> Liskov substitution principle imposes a rather weak constraint

Agreed.  I used that as an example of the direction I wanted to
go.  What principles guide your intuition of what is a "is-a"
vs a "has-a"?

> Take for example the case where a PhoneNumber class is subclass
> of int. According to LSP, it is perfectly ok to add phone numbers
> together, subtract them, etc, but the result, even if it's a valid
> phone number, just doesn't make sense.

Mmm, I don't think an integer is a good model of a phone number.
For example, in the US
  00148762040828
will ring a mobile number in Sweden while
  148762040828
will give a "this isn't a valid phone number" message.

Yet both have the same base-10 representation.  (I'm not using
a syntax where leading '0' indicates an octal number. :)

> I wouldn't say more complicated, but perhaps less intuitive in a few cases, e.g.:
> 
>> path(r'C:\Documents and Settings\Guest\Local Settings').split()
> ['C:\\Documents', 'and', 'Settings\\Guest\\Local', 'Settings']
> instead of
> ['C:', 'Documents and Settings', 'Guest', 'Local Settings']

That is why the path module using a different method to split
on pathsep vs. whitespace.  I get what you are saying, I just think
it's roughly equivalent to appealing to LSP in terms of weight.

Mmm, then there's a question of the usefulness of ".lower()" and
".expandtabs()" and similar methods.  Hmmm....

> I just noted that conceptually a path is a composite object consisting of
> many properties (dirname, extension, etc.) and its string representation
> is just one of them. Still, I'm not suggesting that a 'pure' solution is
> better that a more practical that covers most usual cases.

For some reason I think that

  path.dirname()

is better than

  path.dirname

Python has properties now so the implementation of the latter is
trivial - put a @property on the line before the "def dirname(self):".

I think that the string representation of a path is so important that
it *is* the path.  The other things you call properties aren't quite
properties in my model of a path and are more like computable values.

I trust my intuition on this, I just don't know how to justify it, or
correct it if I'm wrong.

				Andrew
				dalke at dalkescientific.com