Behaviour of enumerated types

Bengt Richter bokr at oz.net
Fri Nov 18 23:17:54 EST 2005


On Sat, 19 Nov 2005 11:10:42 +1100 (EST), Ben Finney <bignose+hates-spam at benfinney.id.au> wrote:

>Bengt Richter <bokr at oz.net> wrote:
>> Ben Finney <bignose+hates-spam at benfinney.id.au> wrote:
>> >Getting a numeric index might be useful in a language such as
>> >Pascal, with no built-in dict or sequence types. In Python, where
>> >any immutable object can be a dict key, and any sequence can be
>> >iterated, it seems of no use.
>>
>> Does your concept of enumeration not have a fixed order of a set of
>> names?
>
>It does. The values are iterable in the same order they were specified
>when creating the Enum.
>
>> If it does, what is more natural than using their index values as
>> keys to other ordered info?
>
>I don't see why. If you want sequence numbers, use enumerate(). If
>not, the enum object itself can be used directly as an iterable.
I changed mine so the enum _class_ is iterable, but enum instances are not.
>
>> OTOH, the index values (and hence my enums) are[1] not very good as
>> unique dict keys, since they compare[2] promiscuously with each
>> other and other number types.
>
>Indeed, that's why (in my design) the values from the enum are only
>useful for comparing with each other. This, to me, seems to be the
>purpose of an enumerated type.
Have you tried yet to use instances of two different enums as keys in
the same dict? Then try to sort the keys (or the items, if the values are
miscellaneous different enums). I hit that, and changed __cmp__ to compare
(typename, <intvalue or other if not int subtype>) tuples. That sorts
items grouped by enum type if they're keys. I think you can always
pass a stricter cmp to sorted if you want to assert type equality.
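
A minimal sketch of that grouping idea in present-day Python 3 syntax (an
assumption; the code in this post is Python 2, where __cmp__ does the job).
A key function plays the role of the tuple-comparing __cmp__. Count and Fruit
are bare stand-in int subtypes and type_then_value is a hypothetical helper
name; none of this is the makeEnum code itself.

```python
# Python 3 sketch. Count and Fruit are stand-in int subtypes (assumptions),
# not the classes produced by makeEnum.
class Count(int):
    pass


class Fruit(int):
    pass


def type_then_value(e):
    """Sort key: group by type name first, then order by integer value."""
    return (type(e).__name__, int(e))


keys = [Fruit(1), Count(2), Fruit(0), Count(0)]
ordered = sorted(keys, key=type_then_value)
summary = [(type(e).__name__, int(e)) for e in ordered]
print(summary)
# [('Count', 0), ('Count', 2), ('Fruit', 0), ('Fruit', 1)]
```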

>
>> To me the int correspondence is as expectable and natural as
>> a,b,c=range(3) (at least as a default) though I think different
>> enumerations should be different types.
>
>That's a complete contradiction, then. If you want them to be
>different types, you don't want them to be integers.
No, it's not a contradiction. Different int _sub_types are different
types ;-)
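
To make that concrete, a small sketch (Python 3 syntax, an assumption; the
class names are stand-ins): two int subtypes are distinct types, yet their
instances still behave as ints. Note that without overriding comparison and
hashing they also still compare equal to each other, which is what the
__cmp__ and __hash__ in makeenum.py are there to change.

```python
# Python 3 sketch with hypothetical stand-in classes.
class Count(int):
    """An enum-like int subtype."""


class Fruit(int):
    """A second, unrelated int subtype."""


c, f = Count(1), Fruit(1)

print(type(c) is type(f))                       # False: different types
print(isinstance(c, int), isinstance(f, int))   # True True: both are ints
print(c == f)                                   # True: plain subtypes inherit int equality
```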

>
>> Note that the ordering of int values makes the instances nicely
>> sortable too, e.g.,
>
>That only means the enum values need to compare in the same sequence;
>it doesn't mean they need to correspond to integer values.
True.

>
>> But bottom line, I really think the int base type is more than an
>> implementation detail. I think it's natural for an _ordered_ set of
>> names ;-)
>
>I think I've addressed all your current concerns; I don't believe an
>inherent correlation to integers is necessary at all.
Necessary wasn't the question for me. It's whether it's desirable. YMMV ;-)

>
>It's no more necessary than saying that ["a", "b", "c"] requires that
>there be some specific correlation between the values of that list and
>the integers 0, 1, 2. If you *want* such a correlation, in some
>particular case, use enumerate() to get it; but there's nothing about
>the values themselves that requires that correspondence.
Yet it is comforting to know that ['a', 'b', 'c'][0] will interpret
the [0] to mean the first in the sequence (unless someone is doing
a list-like repr trick based on some other type ;-).

I haven't yet converted my generated enum classes to singletons,
but I did make it so you can define a named enumeration class and
iterate the class itself (its metaclass defines __getitem__). What
you get is the particular enum class instance, e.g. (... time passes)
never mind, I cached instances for an effective singleton set of named numbers.

The class methods are introduced via a metaclass in the makeEnum factory,
and it's a very hacky workaround for the fact that execution of a
class definition body only has local and module global namespace, so
you can't directly reference anything local to the factory function.
This of course goes for the methods of the main class being constructed too,
so they are sneaked in via the metaclass before instantiating the class ;-/
(Anyone have an idea how to do this cleanly and have the same functionality,
i.e. being able to iterate the _class_ object etc.?)
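
One possible answer, as a sketch only and in present-day Python 3 syntax (an
assumption; the post's code is Python 2): there, methods written in a class
body nested inside the factory can close over the factory's locals, so the
metaclass and the enum class can both live inside the function without a
module-global detour. The names make_enum and EnumMeta are hypothetical, and
the comparison behaviour is simplified to equality and hashing by
(type name, int value).

```python
def make_enum(ename, names):
    """Sketch of a makeEnum-like factory for Python 3 (hypothetical name)."""
    names = tuple(names.split())
    cache = {}  # index -> the single instance created for that index

    class EnumMeta(type):
        # Metaclass methods act on the enum class itself and close over
        # `names` directly.
        def __iter__(cls):
            return (cls(i) for i in range(len(names)))

        def __len__(cls):
            return len(names)

        def __getitem__(cls, key):
            try:
                return cls(key)
            except ValueError:
                raise IndexError('%r out of range for "%s"' % (key, cls.__name__))

        def __contains__(cls, other):
            return type(other) is cls

        def __getattr__(cls, name):
            # Only reached when normal attribute lookup on the class fails.
            if name in names:
                return cls(name)
            raise AttributeError(name)

    class Enum(int, metaclass=EnumMeta):
        def __new__(cls, key=names[0]):
            if isinstance(key, str) and key in names:
                index = names.index(key)
            elif isinstance(key, int) and 0 <= key < len(names):
                index = int(key)
            else:
                raise ValueError('illegal %s enum value %r' % (cls.__name__, key))
            if index not in cache:
                cache[index] = int.__new__(cls, index)
            return cache[index]

        def __repr__(self):
            return '%s(%r)' % (type(self).__name__, names[int(self)])

        def __eq__(self, other):
            return type(other) is type(self) and int(other) == int(self)

        def __ne__(self, other):
            # int defines its own __ne__, so it has to be overridden too.
            return not self == other

        def __hash__(self):
            return hash((type(self).__name__, int(self)))

    Enum.__name__ = Enum.__qualname__ = ename
    return Enum


Fruit = make_enum('Fruit', 'apple banana peach pear')
Count = make_enum('Count', 'eeny meeny miny moe')
print(list(Fruit))
print(Fruit.pear is Fruit[3], Fruit.pear == Count.moe, len(Fruit))
```

The class is iterable because the metaclass defines __iter__, and instances
are cached per index so that Fruit.pear, Fruit[3] and Fruit('pear') are the
same object.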

Anyway, this is my "feedback" ;-) What's missing in this? 

----< makeenum.py >----------------------------------------------------------
def makeEnum(ename, names):
    """
    Factory function that returns an enumeration class with name ename,
    enumerating the space-delimited names in the names string.
    The class is an int subtype, and instances have int values as
    well as corresponding names. Different enum instances with the
    same int value are distinct as dict keys, but equal when converted
    with int(); by default, different types sort grouped by type name.

    The returned class itself defines an iterator that will return
    all possible instances in order. The class's names property returns
    a tuple of the names, and len(TheClass) returns the number of
    names, i.e. the length of the iteration sequence.
    """
    global __global_for_mc_hack__
    class __global_for_mc_hack__(type):
        """
        metaclass to define methods on the enum class per se, and to pass
        through closure-dependent methods (which see names & top) to the
        returned class, as well as being target itself for closure-dependent
        methods of the class (which can't be defined in the class body
        since names in a class body are either local or module global).
        """
        def __new__(mcls, cname, cbases, cdict):
            cdict.update(mcls.edict)
            return type.__new__(mcls, cname, cbases, cdict)
        def __getattr__(cls, name):
            """make instances accessible by attribute name"""
            if isinstance(name, basestring):
                return cls(name) # make/retrieve-from-cache an instance
            raise AttributeError, '%r not a name in "%s"'%(name, cls.__name__)
        
    # the closure cell variables
    names = names.split()
    top = len(names)
    cache = {}
    
    # define method functions outside class so they'll be closures accessing nested names
    def __contains__(cls, other): return type(other)==cls and 0<=int(other)<top
    def __getitem__(cls, i):
        """make class iterable and indexable, returning the (cached) instance for a given name or index"""
        if isinstance(i, basestring) and i in names or isinstance(i, (int, long)) and (0<=i<top):
            return cls(i) # make an instance
        raise IndexError, '%r out of range for "%s"'%(i, cls.__name__)
    # stuff closure-type method functions into global metaclass to define methods
    # of the enum class per se
    __global_for_mc_hack__.__contains__ = __contains__
    __global_for_mc_hack__.__len__ = lambda cls: len(names)
    __global_for_mc_hack__.names = property(lambda cls: tuple(names))
    __global_for_mc_hack__.__getitem__ = __getitem__

    def __new__(cls, name=names[0]):
        try: return cache[name]
        except KeyError:
            try:
                i = names.index(name)
                e = int.__new__(cls, i)
                cache[name] = cache[i] = e
                return e
            except ValueError:
                if isinstance(name, int) and 0<= name < top:
                    e = int.__new__(cls, name)
                    cache[name] = cache[names[name]] = e
                    return e
                raise ValueError, 'illegal %s enum value %r'%(cls.__name__, name)
    def __repr__(self): return '%s(%r)' %(self.__class__.__name__, names[self])

    # pass closure-type method functions to global metaclass. Ick ;-/
    __global_for_mc_hack__.edict = dict(
        __new__ = __new__, __repr__=__repr__, __str__=__repr__)
    
    class enum(int):
        __metaclass__ = __global_for_mc_hack__
        def __cmp__(self, other):
            if isinstance(other, int): oval = int(other)
            else: oval = other
            # allow sort by type names on incompatible types XXX make optional??
            return cmp( (type(self).__name__, int(self)),
                        (type(other).__name__, oval))
                        
            #assert type(self)==type(other), (
            #    'incompatible cmp types: %s vs %s'%(type(self).__name__, type(other).__name__))
            #return cmp(int(self), int(other))
        def __hash__(self): return hash((int(self), type(self).__name__)) 
    enum.__name__ = ename
    del __global_for_mc_hack__
    return enum
-----------------------------------------------------------------------------
>
>> I'll go look at PyPI now ;-)
>
>Feedback appreciated :-)
>
The above in use looks like:

 >>> from makeenum import makeEnum
 >>> Count = makeEnum('Count', 'eeny meeny miny moe')
 >>> Count.names
 ('eeny', 'meeny', 'miny', 'moe')
 >>> Count[1]
 Count('meeny')
 >>> d = dict((c, int(c)) for c in Count)
 >>> d
 {Count('moe'): 3, Count('miny'): 2, Count('meeny'): 1, Count('eeny'): 0}
 >>> Fruit = makeEnum('Fruit', 'apple banana peach pear')
 >>> d.update((c, int(c)) for c in Fruit)
 >>> d
 {Count('eeny'): 0, Fruit('peach'): 2, Count('moe'): 3, Fruit('apple'): 0, Count('miny'): 2, Fruit('banana'): 1, Fruit('pear'): 3, Count('meeny'): 1}
 >>> for it in sorted(d.items()): print '%20s: %r'%it
 ...
        Count('eeny'): 0
       Count('meeny'): 1
        Count('miny'): 2
         Count('moe'): 3
       Fruit('apple'): 0
      Fruit('banana'): 1
       Fruit('peach'): 2
        Fruit('pear'): 3

 >>> Fruit('pear') in Count
 False
 >>> Fruit('pear') in Fruit
 True
 >>> Fruit('plum') in Fruit
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "makeenum.py", line 65, in __new__
     raise ValueError, 'illegal %s enum value %r'%(cls.__name__, name)
 ValueError: illegal Fruit enum value 'plum'
 >>> d[Fruit('pear')]
 3
 >>> d[3]
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 KeyError: 3
 >>> d[Fruit(3)]
 3
 >>> d[Fruit('pear')] = 'juicy'
 >>> d[Fruit['pear']]
 'juicy'
 >>> lf = list(Fruit)
 >>> lf
 [Fruit('apple'), Fruit('banana'), Fruit('peach'), Fruit('pear')]
 >>> lf2 = list(Fruit)
 >>> map(id, lf)
 [49320972, 49321068, 49321132, 49321164]
 >>> map(id, lf2)
 [49320972, 49321068, 49321132, 49321164]
 >>> len(Fruit)
 4
 >>> Fruit
 <class 'makeenum.Fruit'>
 >>> Fruit()
 Fruit('apple')
 >>> Fruit(0)
 Fruit('apple')
 >>> Fruit('apple')
 Fruit('apple')
 >>> type(Fruit('apple'))
 <class 'makeenum.Fruit'>
 >>> id(Fruit('apple')), id(lf[0])
 (49320972, 49320972)
 
Almost forgot, I added attribute style access:

 >>> Fruit.pear
 Fruit('pear')
 >>> d[Fruit.pear]
 'juicy'
 >>> d[Count(3)]
 3
 >>> Count[3]
 Count('moe')
 >>> d[Count.moe]
 3
 >>> d[Fruit.pear] += ', very'
 >>> d[Fruit.pear]
 'juicy, very'

Note,
 >>> int(Fruit.pear)
 3
 >>> d[3]
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 KeyError: 3


 >>> isinstance(Fruit.pear, int)
 True
 >>> type(Fruit.pear)
 <class 'makeenum.Fruit'>
 >>> type(Fruit.pear).mro()
 [<class 'makeenum.Fruit'>, <type 'int'>, <type 'object'>]

But Fruit itself is an instance of the metaclass, which is derived from type; that is how the tricky class-level methods work:
 >>> type(Fruit)
 <class 'makeenum.__global_for_mc_hack__'>
 >>> type(Fruit).mro(type(Fruit))
 [<class 'makeenum.__global_for_mc_hack__'>, <type 'type'>, <type 'object'>]

I guess an option could be passed to makeEnum to disallow inter-type comparisons.
Wouldn't be that hard. I guess I'll do it, but I don't want to re-do this post ;-)
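
A rough sketch of what such an option might look like, in present-day Python 3
syntax (an assumption; this is not the makeEnum code). The factory name
make_int_enum_type and the strict flag are hypothetical. Python 3 sorting only
needs __lt__, so that is the one ordering method shown; it assumes both
operands are int-like.

```python
def make_int_enum_type(ename, strict=False):
    """Hypothetical helper: an int subtype whose cross-type ordering is optional."""

    def sort_key(e):
        return (type(e).__name__, int(e))

    class Enum(int):
        def __lt__(self, other):
            if type(other) is type(self):
                return int(self) < int(other)
            if strict:
                raise TypeError('incompatible cmp types: %s vs %s'
                                % (type(self).__name__, type(other).__name__))
            # Lenient mode: order by type name first, then by int value.
            return sort_key(self) < sort_key(other)

    Enum.__name__ = Enum.__qualname__ = ename
    return Enum


Count = make_int_enum_type('Count')
Fruit = make_int_enum_type('Fruit')
mixed = sorted([Fruit(1), Count(2), Fruit(0)])
print([(type(e).__name__, int(e)) for e in mixed])
# [('Count', 2), ('Fruit', 0), ('Fruit', 1)]

StrictA = make_int_enum_type('StrictA', strict=True)
StrictB = make_int_enum_type('StrictB', strict=True)
try:
    sorted([StrictA(1), StrictB(0)])
    refused = False
except TypeError:
    refused = True
print(refused)  # True
```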

Regards,
Bengt Richter


