Comparisons and sorting of a numeric class....

Dave Angel davea at davea.name
Tue Jan 6 22:39:49 EST 2015


On 01/06/2015 09:01 PM, Andrew Robinson wrote:
>
> On 01/06/2015 06:02 AM, Dave Angel wrote:
>> On 01/06/2015 08:30 AM, Andrew Robinson wrote:
>>>
>>>>> So, I'm not sure I can subclass boolean either because that too is a
>>>>> built in class ...  but I'm not sure how else to make an object that
>>>>> acts as boolean False, but can be differentiated from false by the
>>>>> 'is'
>>>>> operator.  It's frustrating -- what good is subclassing, if one can't
>>>>> subclass all the base classes the language has?
>>
>> I said earlier that I don't think it's possible to do what you're
>> doing without your users code being somewhat aware of your changes.
> Aye. You did.  And I didn't disagree. :)
> The goal is merely to trip up those who don't know what I'm doing as
> little as possible and only break their code where the very notion of
> uncertainty is incompatible with what they are doing, or where they did
> something very stupid anyway... eg: to break it where there is a good
> reason for it to be broken.
> I may not achieve my goal, but I at least hope to come close...
>
>>
>> But as long as the user doesn't check for the subclass-ness of your
>> bool-like function, you should manage.  In Python, duck-typing is
>> encouraged, unlike java or C++, where the only substitutable classes
>> are subclasses.
>
> but if you can't subclass a built in type -- you can't duck type it --
> for I seem to recall that Python forbids duck typing any built in class
> but not subclasses.  So your two solutions are mutually damaged by
> Guido's decision;  And there seem to be a lot of classes that python
> simply won't allow anyone to subclass.  ( I still need to retry
> subclassing float, that might still be possible. )
>
> Removing both options in one blow is like hamstringing the object
> oriented re-useability principle completely.  You must always re-invent
> the wheel from near scratch in Python....
>>>
>>> --Guido van Rossum
>>>
>>> So, I think Guido may have done something so that there are only two
>>> instances of bool, ever.
>>> eg: False and True, which aren't truly singletons -- but follow the
>>> singleton restrictive idea of making a single instance of an object do
>>> the work for everyone; eg: of False being the only instance of bool
>>> returning False, and True being the only instance of bool returning
>>> True.
>>>
>>> Why this is so important to Guido, I don't know ... but it's making it
>>> VERY difficult to add named aliases of False which will still be
>>> detected as False and type-checkable as a bool.  If my objects don't
>>> type check right -- they will likely break some people's legacy code...
>>> and I really don't even care to create a new instance of the bool object
>>> in memory which is what Guido seems worried about, rather I'm really
>>> only after the ability to detect the subclass wrapper name as distinct
>>> from bool False or bool True with the 'is' operator.  If there were a
>>
>> There's already a contradiction in what you want.  You say you don't
>> want to create a new bool object (distinct from True and False), but
>> you have to create an instance of your class.  If it WERE a subclass
>> of bool, it'd be a bool, and break singleton.
> Yes there seems to be a contradiction but I'm not sure there is ... and
> it stems in part from too little sleep and familiarity with other
> languages...
>
> Guido mentioned subclassing in 'C' as part of his justification for not
> allowing subclassing bool in python.
> That's what caused me to digress a bit...  consider:
>
> In 'C++' I can define a subclass without ever instantiating it; and I
> can define static member functions of the subclass that operate even
> when there exists not a single instance of the class; and I can typecast
> an instance of the base class as being an instance of the subclass.  So
> -- (against what Guido seems to have considered) I can define a function
> anywhere which returns my new subclass object as its return value
> without ever instantiating the subclass -- because my new function can
> simply return a typecasting of a base class instance;  The user of my
> function would never need to know that the subclass itself was never
> instantiated... for they would only be allowed to call static member
> functions on the subclass anyway, but all the usual methods found in the
> superclass(es) would still be available to them.  All the benefits of
> subclassing still exist, without ever needing to violate the singleton
> character of the base class instance.

PYTHON IS NOT C++.  There is no typecasting, because Python "variables"
don't have types; only the object has a type, and you can't force it to
be something different.  Python has strict typing, but unlike C++ that
typing is on the data, NOT on the "variable."  A Python variable is just
a reference to the real data, and has no type of its own.

If you try to create a new instance of bool, it'd either be True or it'd 
be False (exactly two objects, singletons), and in no way would it 
belong to your hypothetical subclass.
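
A minimal sketch of those two points, written for Python 3 (the thread
itself uses Python 2 syntax).  The name MyBool is made up for
illustration, and the exact error text may differ between interpreters.

```python
# A name is only a binding; the type belongs to the object it refers to.
x = 3
assert type(x) is int
x = "three"               # same name, now bound to a str object
assert type(x) is str

# bool() never builds a new object; it hands back True or False.
assert bool(0) is False
assert bool([1]) is True

# Subclassing bool is refused when the class statement runs.
try:
    class MyBool(bool):   # hypothetical subclass, for illustration only
        pass
except TypeError as exc:
    subclass_error = str(exc)
else:
    subclass_error = None

print(subclass_error)
```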

As long as the user is not foolish enough to check the type, you can 
create a class that's equivalent to bool, and PRETEND it's a subclass. 
You really need to read up on duck-typing.

If you create an object that pretends to be bool, but isn't quite, then 
you have to decide just which characteristics of the original class 
you're going to violate, and convince your users not to use those 
characteristics.  That's true in any language, but Python makes it 
especially easy.
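
As a rough sketch of what "pretending" could look like (Python 3; the
class Maybe, the object UNKNOWN and the function describe are
hypothetical names, not anything from the original library): the object
is falsy, so code that only tests truthiness cannot tell it from False,
while identity and type checks are the characteristics it violates.

```python
class Maybe(object):
    """Hypothetical bool-like object: tests false, but is not False."""

    def __init__(self, name):
        self.name = name

    def __bool__(self):           # Python 3 truth-testing hook
        return False

    __nonzero__ = __bool__        # the name Python 2 looks for

    def __repr__(self):
        return self.name


UNKNOWN = Maybe("UNKNOWN")


def describe(flag):
    # A duck-typed caller: it only asks whether flag is true or false.
    return "yes" if flag else "no"


same_answer = describe(UNKNOWN) == describe(False)   # both give "no"
is_false_object = UNKNOWN is False                   # identity check fails
is_bool_instance = isinstance(UNKNOWN, bool)         # type check fails
```

Callers written like describe() keep working; callers that use "is
False" or isinstance(..., bool) are the ones the documentation would
have to warn.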




>   ...phical reason for what Guido wants that he
> hasn't fully articulated...?
> If I understood him better-- I wouldn't be making wild ass guesses and
> testing everything I can think of to work around what he chose...
>

Don't try to understand Guido, try to understand Python.  Names do not 
have types, objects do.

>
> Yep.  Python cuts off re-usability at the ankles...
> I have NO way to signal to my users that my object is compatible with
> bool.

Duck typing.

>  For that's what subclass typechecks are about...  If someone
> *needs* an object that does everything bool does (proto-type bool), the
> only portable test for compatibility is to check if the object is a bool...

Nonsense.  Duck typing.  If you're told it will behave in a certain way, 
you use it as though it is of that type.  If you check the type, you're 
probably thinking in some other language.

> That pretty much kills legacy support / compatibility... no one can know
> my object is compatible with bool in a portable fashion...

Documentation.

>
>>> I wouldn't mind making a totally
>>> separate class which returned the False instance; eg: something like an
>>> example I modified from searching on the web:
>>>
>>> class UBool():
>>>      def __nonzero__(self): return self.value
>>>      def __init__( self, default=False ): self.value = bool(default)
>>>      def default( self, default=False ): self.value = bool(default)
>>>
>>> but, saying:
>>>  >>> error=UBool(False)
>>>  >>> if error is False: print "type and value match"

Why use "is" here?  That's obviously contradicting the singleton nature
of the False object.

>>>
>>>  >>> if not error: print "yes it is false"
>>> ...
>>> yes it is false
>>
>> No, the object False is not referenced in the above expression. You're
>> checking the "falseness" of the expression.  Same as if you say
>>       if not 0
>>       if not mylist
>
> Hmm... you're a bit confusing / unclear?
> I think the object returned by 'not' is True or False.

You can think it.  But you created that object with UBool().  So it 
clearly isn't the False object.

>  So the
> expression as a whole does reference either False or True objects.

When you apply "not" to the object, the result is True.  But your
object is not False.  It's merely false (it tests as false).

> I
> didn't think that 'error' was the False object itself, just that the
> evaluation of any expression containing 'error' eventually called
> __nonzero__() which I defined to return the False object.
>
> What I was trying to figure out is order of precedence ; when does
> __nonzero__() get called, and is it called at all.  I think it was
> called, because the object reference itself is something that is a non
> null pointer

No pointers in Python.  Just bindings.  And they're never null.  A name 
is either bound to an object or the name doesn't exist.

Here we need help from someone who knows more about this particular
detail.  I don't know which special methods bool() or "not" will call,
but I'd think it would be something with a better name than __nonzero__.
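
For reference, a small sketch of how truth testing finds its hook.  In
Python 3 the method is __bool__, with __len__ as a fallback; Python 2
looks up __nonzero__ instead, which is why the quoted class defines
that name.  Traced and Sized are hypothetical classes used only to
record what gets called.

```python
calls = []


class Traced(object):
    def __bool__(self):
        calls.append("__bool__")
        return False

    __nonzero__ = __bool__   # alias so Python 2 finds the same hook


class Sized(object):
    # No __bool__ here, so truth testing falls back to __len__.
    def __len__(self):
        return 0


t = Traced()
negated = not t          # one call to the hook; the result is True
converted = bool(t)      # a second call; the result is False
if t:                    # a third call
    pass

empty_is_false = not Sized()
```

After this runs under Python 3, calls holds three "__bool__" entries,
one for each truth test.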


> ... and I would expect 'not (...something nonzero.,.) ' to
> evaluate as false and the print statement NOT to be executed.
>
> However, the print statement was executed -- so that possibility was
> eliminated.  So, I am pretty sure that at some point __nonzero__() was
> called, and error was replaced with whatever __nonzero__() returned;  In
> my test, that would be the 'False' object. Correct?
>
Still somebody else needed.

>>
>>>  >>> print error.__nonzero__()
>>> False
>>>  >>> if error==False: print "It compares to False properly"
>>> ...
>>
>> You control this one by your __eq__ method.
>
> Yes... now we're really getting somewhere.
> That's something I overlooked.
>
> Question:  If two different class's instances are being compared by
> '==', and both define an __eq__ method, which method gets called?  ( I
> don't know if that applied here... but I'm not familiar with order of
> operations )

The left object's methods are generally examined first.  So if you said
     mylist * 5

you'd be using the __mul__ method of the list class.
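
A sketch of that lookup order for "==" under Python 3, using three
hypothetical classes (Left, Right, Child) that log which __eq__ runs.
The left operand is asked first; if it returns NotImplemented, the
right operand gets its turn.  The one exception shown here is that a
right operand whose class is a subclass of the left operand's class is
asked first.

```python
log = []


class Left(object):
    def __eq__(self, other):
        log.append("Left.__eq__")
        return NotImplemented      # decline, so the other operand is asked


class Right(object):
    def __eq__(self, other):
        log.append("Right.__eq__")
        return True


class Child(Left):
    def __eq__(self, other):
        log.append("Child.__eq__")
        return True


first = Left() == Right()
unrelated_order = list(log)        # Left is tried, then Right

del log[:]
second = Left() == Child()
subclass_order = list(log)         # the subclass on the right goes first
```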

>
>> 1) read up more closely on special methods, and on the meanings of
>> id() and 'is'
>>
>> 2) And don't expect that any change you make at this level could be
>> transparent to all existing applications.  It's a contradiction in terms.
>>
>>
> 1) Yes -- I'll do that.  although -- my interpretation of 'is' was
> simply tiredness... I knew better and forgot.  You put my head back on
> right.  Thanks.
>
> 2) I never did have that expectation ; I just want to do the best I
> can... and not settle for third best...
> Thanks. :)
>
> To sum up:
> It's fairly clear that whenever my library returns actual True and False
> objects from magnitude comparison operators, there will be full backward
> compatibility with float.  So -- for cases where the two numeric types
> don't have any difference in meaning, there will still be full
> compatibility.  (That, thankfully, is the most typical use case...)
>
> For the remaining quasi 'False'  return values,  I know the 'is'
> operator must always fail.  So if anyone stupidly puts '(a>b) is False'
> in their legacy floating point code for no really good reason -- well,
> too bad; it breaks --  But that's the best I can do.
>
> However; I think I can still hope to be compatible with '(a>b) == False'

Just talk your users into saying:
       a <= b
   or  not (a > b)

instead of (a>b) == False
It's not clear what value you'd WANT that last comparison to give.

I.e., what value should:
      PartTrue == False
produce, with no other context?
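
To make the question concrete, here is a sketch (Python 3) of what
happens when no __eq__ is defined at all.  Quasi and this PartTrue are
stand-ins invented for the example; the real objects may well behave
differently.

```python
class Quasi(object):
    """Hypothetical falsy marker; an assumption, not the real class."""

    def __init__(self, name):
        self.name = name

    def __bool__(self):
        return False

    __nonzero__ = __bool__    # Python 2 spelling of the same hook

    def __repr__(self):
        return self.name


PartTrue = Quasi("PartTrue")

negated = not PartTrue             # truth test only: True
compared = (PartTrue == False)     # no __eq__ defined: compares identity
```

So "not PartTrue" is True, while "PartTrue == False" is False until an
__eq__ says otherwise, and whatever that __eq__ returns is a design
decision rather than something Python can infer.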

> by defining my own '==' operator ; and still allow users an explicit
> 'is' test for my singleton-like instances of PartTrue, Unknown, and
> PartFalse so that they can distinguish them from an actual 'False'
> object...
>
> I think I can also define a relative certainty magnitude operator for
> falseness so users can test if  PartTrue is more true than False. (
> PartTrue > False ) etc. and thereby allow them to make their own custom
> sorts based on relative true-ness when needed.
>
> And, finally, on the sort operation consistency you mentioned in an
> earlier email -- your comments, and another poster's, about that are well
> taken.  It's something I'll have to review again later... but in
> essence, I don't think it's a problem because whenever two variables are
> definitely '>' or '<' each other with an actual True or False object
> returned -- I already know the consistency with respect to a third
> variable will hold.  On the other hand, By definition, all quasi false
> values return false for both '>' and '<', so what ends up happening is
> that python treats any uncertainty as if the variables were equal, and
> so they are grouped together but left in the same order as they were
> originally sent to the sort (stable).

Without more specifics, I kinda doubt it.  But without more specifics I 
can't refute it either.

> But I don't think that stops me from treating the sort order of all
> items which are quasi-equal, as a second sort (hierarchical sort).
> eg: Sort first by definite magnitude, then among those deemed 'equal',
> sort them by average expectation value...  That's going to be good
> enough for me and my users, for the sort order of truly quasi equal
> things is arbitrary anyway... as long as it's not 'definitely wrong'
> they'll have no reason to complain.

You could write a sort that works that way, but I don't think you can 
assume that the built-in sort will.  sort assumes that the comparison 
works in a certain way, and if it doesn't, the symptoms could range from 
random ordering (including items drastically out of order) to a possibly 
never terminating sort.
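
A sketch of the underlying problem, and of one way around it, under
the assumption that an uncertain number can be modelled as a range.
Interval and its mid() method are invented for this example and are
not the class under discussion.  With "<" meaning "definitely less",
the implied equality is not transitive, which is exactly what the
built-in sort does not expect.  Sorting on a plain float key avoids
the custom comparison altogether.

```python
class Interval(object):
    """Hypothetical uncertain number known only to lie in [lo, hi]."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __lt__(self, other):
        # "Definitely less" only when the two ranges do not overlap.
        return self.hi < other.lo

    def mid(self):
        # Stand-in for the "average expectation value".
        return (self.lo + self.hi) / 2.0

    def __repr__(self):
        return "Interval(%r, %r)" % (self.lo, self.hi)


a, b, c = Interval(0, 1), Interval(2, 3), Interval(0.5, 2.5)

# c looks "equal" to both a and b, yet a is definitely less than b.
c_ties_a = not (a < c) and not (c < a)
c_ties_b = not (b < c) and not (c < b)
a_below_b = a < b

# Sorting on the midpoint uses ordinary float comparisons, and it never
# contradicts a definite "<": if x.hi < y.lo then x.mid() < y.mid().
ordered = sorted([b, c, a], key=Interval.mid)
```

Here ordered comes out as a, c, b whatever the input order, because
the key gives a total order for these three values.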

-- 
DaveA


