sort by column a csv file case insensitive

Peter Otten __peter__ at web.de
Mon Apr 16 02:49:12 EDT 2012


Lee Chaplin wrote:

> Hi all,
> 
> I am trying to sort, in place, by column, a csv file AND sort it case
> insensitive.
> I was trying something like this, with no success:
> 
> import csv
> import operator
> 
> def sortcsvbyfield(csvfilename, columnnumber):
>   with open(csvfilename, 'rb') as f:
>     readit = csv.reader(f)
>     thedata = list(readit)
> 
>   thedata = sorted(thedata, key = lambda x:
> (operator.itemgetter(columnnumber) ,x[0].lower()))  #!!!
>   with open(csvfilename, 'wb') as f:
>     writeit = csv.writer(f)
>     writeit.writerows(thedata)
> 
> The line marked is the culprit.
> Any help is greatly appreciated.


Try out your sort key interactively:

>>> import csv
>>> import operator
>>> columnnumber = 0
>>> sortkey = lambda x: (operator.itemgetter(columnnumber), x[0].lower())
>>> sortkey(["an", "example", "row"])
(<operator.itemgetter object at 0x7feda34fdc90>, 'an')
>>> sortkey(["an", "example", "row"])
(<operator.itemgetter object at 0x7feda350b690>, 'an')
>>> sortkey(["an", "example", "row"])
(<operator.itemgetter object at 0x7feda34fdc90>, 'an')

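Your lambda creates a new itemgetter on every call but never calls it, so
the first item of each key is the itemgetter object itself. Effectively you
are sorting your data by the id (memory address) of the itemgetters you
create. You probably want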

>>> def sortkey(row):
...     column = row[columnnumber]
...     return column.lower(), column
... 
>>> sorted([[col] for col in "alpha ALPHA beta GAMMA gamma".split()],
...        key=sortkey)
[['ALPHA'], ['alpha'], ['beta'], ['GAMMA'], ['gamma']]
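
Put together with your original function, this could look roughly like the
sketch below. It is only a sketch under a few assumptions of mine: it is
written for Python 3 (text mode with newline="" instead of 'rb'/'wb'), the
names sort_csv_by_field and sort_key are made up for illustration, and every
row, including a header row if there is one, is sorted along with the rest.

```python
import csv


def sort_csv_by_field(csv_filename, column_number):
    """Sort a CSV file in place by one column, ignoring case."""
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(csv_filename, newline="") as f:
        rows = list(csv.reader(f))

    def sort_key(row):
        column = row[column_number]
        # Compare the lowercased value first; the original string only
        # breaks ties between values that differ in case alone.
        return column.lower(), column

    rows.sort(key=sort_key)

    # Overwrite the original file with the sorted rows.
    with open(csv_filename, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

The file is read completely before it is reopened for writing, so the whole
table has to fit into memory.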

Alternatively you can use

>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'de_DE.UTF-8'
>>> sorted([[col] for col in "alpha ALPHA beta GAMMA gamma".split()],
...        key=lambda row: locale.strxfrm(row[columnnumber]))
[['alpha'], ['ALPHA'], ['beta'], ['gamma'], ['GAMMA']]
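
The order shown above comes from the de_DE.UTF-8 locale on my machine, and
other locales may sort differently. As a rough sketch of the same idea in
script form (Python 3 assumed, and sort_rows_locale is a name invented for
this example), with a fallback in case the environment's locale is not
available:

```python
import locale


def sort_rows_locale(rows, column_number):
    """Return rows sorted by one column using the active locale's collation."""
    return sorted(rows, key=lambda row: locale.strxfrm(row[column_number]))


try:
    # Pick up the locale from the environment, as in the session above.
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    # The requested locale is not installed; keep whatever is active.
    pass

rows = [[col] for col in "alpha ALPHA beta GAMMA gamma".split()]
# The resulting order depends on the active locale.
print(sort_rows_locale(rows, 0))
```

Note that in the plain "C" locale strxfrm() compares by character code, so
uppercase sorts before lowercase and the result is not case insensitive.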




