Natural string sorting

Connelly Barnes connellybarnes at yahoo.com
Fri Jun 18 01:33:07 EDT 2004


Summary:

Sorts strings in an order that seems natural to humans: if the
strings contain integers, those integers are compared numerically
rather than character by character.  For example,
['Team 11', 'Team 3', 'Team 1'] is sorted into the order
['Team 1', 'Team 3', 'Team 11'].
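
For contrast, here is a minimal illustration of what an ordinary sort
does with the same list. It uses only the built-in sorted() and is not
part of natsort.py; the variable names are just for this example.

```python
# A plain sort compares strings character by character, so the '1' in
# 'Team 11' is compared against the '3' in 'Team 3' and wins.
teams = ['Team 11', 'Team 3', 'Team 1']
plain = sorted(teams)
print(plain)  # ['Team 1', 'Team 11', 'Team 3']
```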

Code:

#---------------------------------------------------------
# natsort.py: Natural string sorting.
#---------------------------------------------------------

# By Seo Sanghyeon.  Some changes by Connelly Barnes.

def try_int(s):
    "Convert to integer if possible."
    try: return int(s)
    except ValueError: return s

def natsort_key(s):
    "Used internally to get a tuple by which s is sorted."
    import re
    return map(try_int, re.findall(r'(\d+|\D+)', s))

def natcmp(a, b):
    "Natural string comparison, case sensitive."
    return cmp(natsort_key(a), natsort_key(b))

def natcasecmp(a, b):
    "Natural string comparison, ignores case."
    return natcmp(a.lower(), b.lower())

def natsort(seq, cmp=natcmp):
    "In-place natural string sort."
    seq.sort(cmp)
    
def natsorted(seq, cmp=natcmp):
    "Returns a copy of seq, sorted by natural string sort."
    import copy
    temp = copy.copy(seq)
    natsort(temp, cmp)
    return temp
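
The code above relies on cmp() and list.sort(cmp), which exist in
Python 2 but were removed in Python 3. Below is a rough sketch of how
the same idea might be expressed with a key function instead. The names
natural_key, natsorted3 and ignore_case are hypothetical and not part
of natsort.py. Each chunk is tagged so that a number is never compared
directly with a string (Python 3 raises TypeError for that); placing
numeric chunks before text chunks is an assumption made for this sketch.

```python
import re

# Hypothetical names throughout; this is not part of the original natsort.py.
_CHUNK_RE = re.compile(r'(\d+)|(\D+)')

def natural_key(s, ignore_case=False):
    """Build a sort key: digit runs become ints, everything else stays text."""
    if ignore_case:
        s = s.lower()
    key = []
    for digits, text in _CHUNK_RE.findall(s):
        if digits:
            # Tag 0: numeric chunk, compared by integer value.
            key.append((0, int(digits), ''))
        else:
            # Tag 1: text chunk, compared as a string.
            key.append((1, 0, text))
    return tuple(key)

def natsorted3(seq, ignore_case=False):
    """Return a new naturally sorted list; seq itself is left unchanged."""
    return sorted(seq, key=lambda s: natural_key(s, ignore_case))

print(natsorted3(['Team 101', 'Team 58', 'Team 30', 'Team 1']))
# ['Team 1', 'Team 30', 'Team 58', 'Team 101']
print(natsorted3(['a5', 'A7', 'a15', 'a9', 'A8'], ignore_case=True))
# ['a5', 'A7', 'A8', 'a9', 'a15']
```

An in-place variant would be seq.sort(key=natural_key).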

Examples:

You can use this code to sort tarball filenames:

>>> natsorted(['ver-1.3.12', 'ver-1.3.3', 'ver-1.2.5', 'ver-1.2.15',
...            'ver-1.2.3', 'ver-1.2.1'])
['ver-1.2.1', 'ver-1.2.3', 'ver-1.2.5', 'ver-1.2.15', 'ver-1.3.3',
 'ver-1.3.12']

Chemical formulas:

>>> natsorted(['C1H2', 'C1H4', 'C2H2', 'C2H6', 'C2N', 'C3H6'])
['C1H2', 'C1H4', 'C2H2', 'C2H6', 'C2N', 'C3H6']

Teams:

>>> natsorted(['Team 101', 'Team 58', 'Team 30', 'Team 1'])
['Team 1', 'Team 30', 'Team 58', 'Team 101']

Pass natcasecmp as a second argument for case-insensitive sorting:

>>> natsorted(['a5', 'A7', 'a15', 'a9', 'A8'], natcasecmp)
['a5', 'A7', 'A8', 'a9', 'a15']

Enjoy!


