python 3's adoption

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Jan 29 06:47:26 EST 2010


On Fri, 29 Jan 2010 07:10:01 +0100, Alf P. Steinbach wrote:

>    >>> L = ["æ", "ø", "å"]   # This is in SORTED ORDER in Norwegian L

[...]

>    >>> L.sort( key = locale.strxfrm )
>    >>> L
>    ['å', 'æ', 'ø']
>    >>> locale.strcoll( "å", "æ" )
>    1
>    >>> locale.strcoll( "æ", "ø" )
>    -1
> 
> Note that strcoll correctly orders the strings as ["æ", "ø", "å"], that
> is, it would have if it could have been used as cmp function to sort (or
> better, to a separate routine named e.g. custom_sort).

This is in Python 2.5, so I'm explicitly specifying unicode strings:

>>> L = [u"æ", u"ø", u"å"]
>>> assert sorted(L) == [u'å', u'æ', u'ø']

Sorting L under the default C locale does not reproduce the order of L 
itself. Now let's change to the Norwegian locale:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'nb_NO')
'nb_NO'
>>> print u''.join(sorted(L, cmp=locale.strcoll))
æøå

So far so good: we've confirmed that in Python 2.5 we can sort in a 
locale-aware way. Now, can we do this with a key function? Yes, thanks 
to Raymond Hettinger's recipe here:

http://code.activestate.com/recipes/576653/


>>> print u''.join(sorted(L, key=CmpToKey(locale.strcoll)))
æøå


Success!

Let's try it in Python 3.1:

>>> L = ["æ", "ø", "å"]
>>> assert sorted(L) == ['å', 'æ', 'ø']
>>>
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'nb_NO')
'nb_NO'
>>> ''.join(sorted(L, key=CmpToKey(locale.strcoll)))
'æøå'
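
For readers on later Python versions: from Python 2.7 and 3.2 onward the 
standard library includes functools.cmp_to_key, which plays the same role 
as the CmpToKey recipe. A minimal sketch, assuming a made-up comparison 
function norwegian_cmp that stands in for locale.strcoll, so the example 
does not depend on the nb_NO locale being installed:

```python
import functools

# Hypothetical stand-in for locale.strcoll under a Norwegian locale:
# the last three letters of the Norwegian alphabet, in alphabet order.
NORWEGIAN_TAIL = "æøå"

def norwegian_cmp(a, b):
    """Return a negative, zero, or positive number, like locale.strcoll."""
    return NORWEGIAN_TAIL.index(a) - NORWEGIAN_TAIL.index(b)

L = ["å", "ø", "æ"]
result = sorted(L, key=functools.cmp_to_key(norwegian_cmp))
print("".join(result))
```

Note that norwegian_cmp can return 2 or -2, not only 1 or -1; any 
cmp-to-key helper has to cope with that.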


The definition of CmpToKey can be found at the URL above. It's not very 
exciting, but here it is:


def CmpToKey(mycmp):
    'Convert a cmp= function into a key= function'
    # Compare against 0: a cmp function such as locale.strcoll may return
    # any negative or positive number, not just -1 or 1.
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K


If that's too verbose for you, stick this as a helper function in your 
application:


def CmpToKey(mycmp):
    'Convert a cmp= function into a key= function'
    # Compare against 0: mycmp may return any negative or positive number.
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        __lt__ = lambda s, o: mycmp(s.obj, o.obj) < 0
        __gt__ = lambda s, o: mycmp(s.obj, o.obj) > 0
        __eq__ = lambda s, o: mycmp(s.obj, o.obj) == 0
        __le__ = lambda s, o: mycmp(s.obj, o.obj) <= 0
        __ge__ = lambda s, o: mycmp(s.obj, o.obj) >= 0
        __ne__ = lambda s, o: mycmp(s.obj, o.obj) != 0
    return K
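
If Python 2.7 or 3.2 and later is available, functools.total_ordering can 
shrink the helper further by deriving the remaining comparisons from 
__eq__ and __lt__. A sketch under that assumption; the name 
cmp_to_key_compact is made up for illustration, and the tests against 
zero assume the comparison function may return any negative or positive 
number:

```python
import functools

def cmp_to_key_compact(mycmp):
    """Convert a cmp= function into a key= function (hypothetical variant)."""
    @functools.total_ordering
    class K(object):
        def __init__(self, obj):
            self.obj = obj
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        # total_ordering fills in __le__, __gt__ and __ge__ from the two above.
    return K

def three_way(a, b):
    """Plain three-way comparison, standing in for any cmp function."""
    return (a > b) - (a < b)

K = cmp_to_key_compact(three_way)
ordered = sorted([3, 1, 2], key=K)
print(ordered)
```

The derived methods cost an extra call each, so this trades a little 
speed for brevity.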


[...]
> The above may just be a bug in the 3.x stxfrm. But it illustrates that
> sometimes you have your sort order defined by a comparision function.
> Transforming that into a key can be practically impossible (it can also
> be quite inefficient).

This might be true, but you haven't demonstrated it. With one little 
helper function, you should be able to convert any comparison function 
into a key function, with relatively little overhead.
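
One rough way to look at the overhead is to count how often the 
comparison function is actually called. The sketch below is only an 
illustration: counting_cmp_to_key and by_length_then_reverse_alpha are 
made-up names, and the wrapper defines only __lt__ on the assumption 
(documented for list.sort) that sorting uses nothing but < comparisons.

```python
import random

def counting_cmp_to_key(mycmp, counter):
    """Like CmpToKey, but tallies calls to mycmp in counter[0]."""
    class K(object):
        __slots__ = ("obj",)
        def __init__(self, obj):
            self.obj = obj
        def __lt__(self, other):
            counter[0] += 1
            return mycmp(self.obj, other.obj) < 0
    return K

def by_length_then_reverse_alpha(a, b):
    """Shorter strings first; ties broken in reverse alphabetical order."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a < b) - (a > b)   # three-way comparison with the sign flipped

random.seed(0)
words = ["".join(random.choice("abc") for _ in range(random.randint(1, 4)))
         for _ in range(200)]

calls = [0]
key = counting_cmp_to_key(by_length_then_reverse_alpha, calls)
ordered = sorted(words, key=key)
print(len(words), "items,", calls[0], "calls to the comparison function")
```

The comparison function is called once per comparison the sort makes, 
just as it would be with cmp=; the extra cost is one small wrapper object 
per item plus a method call per comparison.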


-- 
Steven


