sort by column a csv file case insensitive
Peter Otten
__peter__ at web.de
Mon Apr 16 02:49:12 EDT 2012
Lee Chaplin wrote:
> Hi all,
>
> I am trying to sort, in place, by column, a csv file AND sort it case
> insensitive.
> I was trying something like this, with no success:
>
> import csv
> import operator
>
> def sortcsvbyfield(csvfilename, columnnumber):
>     with open(csvfilename, 'rb') as f:
>         readit = csv.reader(f)
>         thedata = list(readit)
>
>     thedata = sorted(thedata, key = lambda x:
>         (operator.itemgetter(columnnumber) ,x[0].lower())) #!!!
>     with open(csvfilename, 'wb') as f:
>         writeit = csv.writer(f)
>         writeit.writerows(thedata)
>
> The line marked is the culprit.
> Any help is greatly appreciated.
Try out your sort key interactively:
>>> import csv
>>> import operator
>>> columnnumber = 0
>>> sortkey = lambda x: (operator.itemgetter(columnnumber), x[0].lower())
>>> sortkey(["an", "example", "row"])
(<operator.itemgetter object at 0x7feda34fdc90>, 'an')
>>> sortkey(["an", "example", "row"])
(<operator.itemgetter object at 0x7feda350b690>, 'an')
>>> sortkey(["an", "example", "row"])
(<operator.itemgetter object at 0x7feda34fdc90>, 'an')
Effectively you are sorting your data by the id (memory address) of the
itemgetters you create: the key function builds a new itemgetter for every
row but never calls it on the row. You probably want
>>> def sortkey(row):
...     column = row[columnnumber]
...     return column.lower(), column
...
>>> sorted([[col] for col in "alpha ALPHA beta GAMMA gamma".split()],
...        key=sortkey)
[['ALPHA'], ['alpha'], ['beta'], ['GAMMA'], ['gamma']]
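Put together with the file handling from the question, this could look roughly like the sketch below. It assumes Python 3 (text mode with newline="" instead of the 'rb'/'wb' modes in the original post), and the name sort_csv_by_field is only an illustrative respelling of the original function name.

```python
import csv


def sort_csv_by_field(csvfilename, columnnumber):
    """Sort a CSV file in place by one column, ignoring case."""
    # Assumption: Python 3, where the csv module expects text files
    # opened with newline="" so it can handle line endings itself.
    with open(csvfilename, newline="") as f:
        rows = list(csv.reader(f))

    def sortkey(row):
        column = row[columnnumber]
        # Lowercased value first; the original value breaks ties so that
        # "ALPHA" and "alpha" always end up in the same relative order.
        return column.lower(), column

    rows.sort(key=sortkey)

    with open(csvfilename, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

Note that every row is sorted, so a header row, if the file has one, would have to be split off before sorting and written back first.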
Alternatively you can use
>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'de_DE.UTF-8'
>>> sorted([[col] for col in "alpha ALPHA beta GAMMA gamma".split()],
...        key=lambda row: locale.strxfrm(row[columnnumber]))
[['alpha'], ['ALPHA'], ['beta'], ['gamma'], ['GAMMA']]
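The resulting order depends on the locale that is active on your machine, so the output above is specific to de_DE.UTF-8. As a minimal sketch, assuming Python 3, the locale-based variant could be wrapped in a helper like the one below; the name locale_sort_rows and its locale_name parameter are made up for this example.

```python
import locale


def locale_sort_rows(rows, columnnumber, locale_name=""):
    """Return rows sorted by one column using the locale's collation rules."""
    try:
        # "" selects the user's default locale; only collation is changed here.
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # The requested locale is not installed; keep whatever is active.
        pass
    return sorted(rows, key=lambda row: locale.strxfrm(row[columnnumber]))
```

In the "C" locale this falls back to plain code point order, which puts all uppercase entries before the lowercase ones, so it is only case insensitive when a suitable locale is available.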