[Python-Dev] PEP 3144 review.

Guido van Rossum guido at python.org
Tue Sep 29 05:18:43 CEST 2009


On Mon, Sep 28, 2009 at 7:17 PM, R. David Murray <rdmurray at bitdance.com> wrote:
> On Mon, 28 Sep 2009 at 13:43, Guido van Rossum wrote:
>>
>> On Mon, Sep 28, 2009 at 1:36 PM, R. David Murray <rdmurray at bitdance.com>
>> wrote:
>> I would say that there certainly are precedents in other areas for
>> keeping the information about the input form around. For example,
>> occasionally it would be handy if parsing a hex integer returned an
>> object that was compatible with other integers but somehow kept a hint
>> that would cause printing it to use hex by default.
>>
>> I see keeping around the IP address used to create a network object as
>> a similar edge case. I would probably define the __eq__ method to
>> implement equivalency (ignoring the form of the input), just as I
>> would in the case of the (hypothetical) hex integers. If you wanted to
>> do a comparison that includes the input IP address, you could use (a,
>> a.ip) == (b, b.ip).
>
> Ignoring Antoine's excellent points for the moment, to my mind
> one difference between your integer-with-hex-hint and ipaddr's
> ip-with-netmask is that there is a one to one mapping between the
> canonical hex representations and the integer.
>
> If you want to argue that the _exact_ input string be preserved in the
> integer object, then there might be more of a direct analogy.  Except that
> ipaddr isn't preserving the input _string_, it's preserving a canonical
> representation of that string, so even there the analogy wouldn't be
> all that close.  The difference here is that ipaddr is preserving input
> information that is irrelevant to the object being returned (a network).

I disagree that recording the exact input string would be a better
analogy; as you say the hypothetical hex integer doesn't save the
exact input string -- it remembers that the base was 16. That happens
to be 1 bit of information if you're only considering hex vs. decimal,
but one could imagine a further version that supports any base between
2 and 36 (or however far you want to go).
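
To make the hypothetical concrete, here is a minimal sketch of such an
integer. The name HintedInt and its .base attribute are invented for
illustration and are not an existing type; the sketch assumes a base
between 2 and 36 and only changes how the value prints.

```python
class HintedInt(int):
    """Hypothetical int subclass that remembers the base it was parsed from."""

    _DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

    def __new__(cls, text, base=10):
        # Parse exactly as int() would, then attach the base as a hint.
        self = super().__new__(cls, text, base)
        self.base = base
        return self

    def __str__(self):
        # Render the value in the remembered base (assumed to be 2..36).
        n = abs(int(self))
        digits = ""
        while True:
            n, r = divmod(n, self.base)
            digits = self._DIGITS[r] + digits
            if n == 0:
                break
        return ("-" if self < 0 else "") + digits


x = HintedInt("ff", 16)
print(x)         # printed using the remembered base
print(x == 255)  # equality and hashing are inherited from int
```

Because __eq__ and __hash__ are inherited unchanged, the hint affects
printing only and never comparison.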

Whether I typed 01.01.01.01/024 or 1.1.1.1/24 is (presumably)
irrelevant for this case, since the byte values are the same; whether
I typed 1.1.1.1/24 or 1.1.1.0/24 *is* relevant (for Peter). There's
probably another form, 16843009/24, which is also equivalent.
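
The distinction can be illustrated with the standard library's
ipaddress module, which grew out of this PEP. Its API is used here as
an assumption for illustration and may differ from the ipaddr library
under review.

```python
import ipaddress

# Two spellings that name the same network but different host addresses.
a = ipaddress.ip_interface("1.1.1.1/24")
b = ipaddress.ip_interface("1.1.1.0/24")

same_network = a.network == b.network  # both normalize to 1.1.1.0/24
same_address = a.ip == b.ip            # 1.1.1.1 vs. 1.1.1.0

# The integer form of the address 1.1.1.1.
as_int = int(ipaddress.ip_address("1.1.1.1"))

print(same_network, same_address, as_int)
```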

> The fractions case is much closer.

Hardly -- I've never heard of someone who had a use case for
denormalized fractions, but I don't doubt that Peter has a use case
for denormalized IPNetwork objects. (Do you doubt that Peter has such
a use case? If so, we have a much more fundamental disagreement.)

> Or consider another analogy: the ipaddr case is like the mod function
> returning a single object that is supposed to be the remainder,
> but also has an extra field that records the original input number.
> This might be a useful operator and object type in certain contexts, but
> it is something that would be layered on top of the real mod function
> that returns a simple integer remainder.  You would never approve this
> remainder-with-saved-input as the data type returned by the mod operator.
> (I hope :)

Not a very strong argument, since that use case is purely
hypothetical. I brought up the hex integer example to show that it is
possible to conceive of use cases for objects that record some sort of
information about their creation history. Now you're bending my line
of reasoning by going from the assertion "Guido would never approve of
mod-returning-remainder-with-saved-input" (which happens to be true
until a real use case is found) to "Guido would never approve of *any*
operator that keeps some traces of its creation history" (which is
false -- dict being just one counter-example :-).

> Similarly, there should be a basic ipaddr parsing function (and I don't
> much care how you spell it, even though I have opinions) that returns
> an ip address and a network object when given an ipaddress-plus-netmask
> string. An additional object built upon those fundamental data types that
> remembers the input IP would be fine, but the data type that remembers
> the IP should _not_ be the fundamental data type in the system.  That
> strikes me as poor design, OO or otherwise.

Poor design is highly subjective, and I simply disagree that one
design is a priori better or worse than the other. This is Python,
where practicality beats purity, so things like expediency of
implementation, frequency of various uses, etc., matter.

Right now, without knowing more about Peter's use case, I'd say that
__eq__ should ignore the .ip attribute, but that's more based on
trying to get enough people to drop their opposition than on a full
understanding of the use case. I do note that if Peter's use case is
at all common, reducing the number of classes is a worthy goal, and
Python has a bit of a history of preferring a small number of
Swiss-army-knife classes over a multitude of simple classes.
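
A minimal sketch of that behaviour follows. The class name Net is
hypothetical and is not the ipaddr implementation; it leans on the
standard library's ipaddress module for parsing, which is an
assumption made for illustration.

```python
import ipaddress


class Net:
    """Hypothetical network object that also remembers the input address."""

    def __init__(self, text):
        iface = ipaddress.ip_interface(text)
        self.network = iface.network  # normalized network, e.g. 1.1.1.0/24
        self.ip = iface.ip            # the address as originally given

    def __eq__(self, other):
        if not isinstance(other, Net):
            return NotImplemented
        # Equivalence only: the remembered .ip is deliberately ignored.
        return self.network == other.network

    def __hash__(self):
        # Keep hashing consistent with __eq__.
        return hash(self.network)


a = Net("1.1.1.1/24")
b = Net("1.1.1.0/24")

equivalent = a == b                      # same network
identical = (a, a.ip) == (b, b.ip)       # also compares the input address
print(equivalent, identical)
```

Callers who care about the input address opt in with the tuple
comparison; everyone else gets plain network equivalence.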

> In fact, your "sometimes it would be useful if" phrasing indicates that
> your 'integer with hex hint' data type would also not be the fundamental
> data type, but instead a subclass of int.

Which proves nothing, see above. As a matter of fact, my "sometimes it
would be useful" was meant as a gentle nudge in the direction of
keeping the .ip attribute.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)