sorting tuples...

Magnus Lycka lycka at carmen.se
Mon Sep 26 08:19:04 EDT 2005


idhog at gmail.com wrote:
> I edited my code earlier and came up with stringing the groups
> (200501202010, sender, message_string) into one string delimited by
> '%%%'.

Why? It seems you are trying to use a string as some kind of container,
and Python has containers built in. Just use a list of tuples rather
than a list of strings. That works fine with .sort(), and it's much more
convenient to access your data. The struct module, the usual tool for
extracting binary data from files or strings, gives you tuples by default.

 >>> import struct # Check this out in library ref.
 >>> # I'm inventing a simple binary format with everything
 >>> # as strings in fixed positions. There's just one string
 >>> # below, adjacent string literals are concatenated by
 >>> # Python. I split it over three lines for readability.
 >>> bin = (
"200501221530John    *** long string here ***        " 
"200504151625Clyde   *** clyde's long string here ***" 
"200503130935Jeremy  *** jeremy string here ****     ")
 >>> fmt="@12s8s32s" # imagined binary format.
 >>> l=52 # 12+8+32, from previous line
 >>> msgs = []
 >>> for i in range(3):
...     # struct.unpack will return a tuple. It works well
...     # with numeric data too.
...     msgs.append(struct.unpack(fmt, bin[i*l:(i+1)*l]))

 >>> msgs.sort()
 >>> for msg in msgs:
...     print msg
	
('200501221530', 'John    ', '*** long string here ***        ')
('200503130935', 'Jeremy  ', '*** jeremy string here ****     ')
('200504151625', 'Clyde   ', "*** clyde's long string here ***")
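
Once the records are tuples, picking out a field or sorting on something
other than the first field is straightforward. A minimal sketch, written
for Python 3 and assuming the same three records as above with the padding
already stripped (by_time and by_sender are names I made up for this
illustration):

```python
from operator import itemgetter

# Same three records as above, as (timestamp, sender, message) tuples.
# Padding is stripped here for brevity; that is an assumption of this sketch.
msgs = [
    ("200501221530", "John", "*** long string here ***"),
    ("200504151625", "Clyde", "*** clyde's long string here ***"),
    ("200503130935", "Jeremy", "*** jeremy string here ****"),
]

# Tuples compare element by element, so the timestamp decides first.
by_time = sorted(msgs)

# Sort on the sender field (index 1) instead.
by_sender = sorted(msgs, key=itemgetter(1))

# Fields unpack directly in the loop header, no string splitting needed.
for timestamp, sender, text in by_time:
    print(timestamp, sender, text)
```

Compare that with a '%%%'-delimited string, where every access means
splitting the string again.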

> I could then sort the messages with the date string at the beginning as
> the one being sorted with the big string in its "tail" being sorted
> too.

This works equally well with a list of tuples. Another benefit of
the list-of-tuples approach is that you don't need to convert everything
to strings. If part of your data is numeric, just let it be an int, a
long or a float in your struct, and sorting will work correctly without
any need to format the number so that string sorting matches numeric
sorting.
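
To see why that matters, here is a small sketch (Python 3, with values I
invented for illustration, not taken from your data) of how numbers stored
as unpadded strings sort compared with real ints:

```python
# Day-of-month values stored as unpadded strings vs. as ints.
as_strings = ["9", "10", "2"]
as_ints = [9, 10, 2]

# Strings compare character by character, so "10" lands before "2".
sorted_strings = sorted(as_strings)

# Ints compare by value.
sorted_ints = sorted(as_ints)

# The same holds inside tuples: the first differing field decides.
records = [(2005, 10, "b"), (2005, 9, "a")]
records.sort()

print(sorted_strings)
print(sorted_ints)
print(records)
```

With strings you would have to zero-pad every number to a fixed width
to get the right order. With ints there is nothing to do.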

Here's an example with numeric data:

 >>> b = (
'\x00\x00\x07\xd5\x00\x00\x00\x01\x00\x00\x00\x16\x00\x00\x00'
'\x0f\x00\x00\x00\x1eJohn\x00\x00\x00\x00*** long string here'
' ***\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x07\xd5\x00\x00'
'\x00\x03\x00\x00\x00\r\x00\x00\x00\t\x00\x00\x00#Jeremy\x00'
'\x00*** jeremy string here ****\x00\x00\x00\x00\x00\x00\x00'
'\x07\xd5\x00\x00\x00\x04\x00\x00\x00\x0f\x00\x00\x00\x10\x00'
'\x00\x00\x19Clyde\x00\x00\x00*** clyde\'s long string here ***')
 >>> fmt="!iiiii8s32s"
 >>> l = 60 # five ints (5*4) + 8 + 32
 >>> bin_msgs=[]
 >>> for i in range(3):
...     # Slice the numeric data in b, not the string bin from before.
...     bin_msgs.append(struct.unpack(fmt, b[i*l:(i+1)*l]))

 >>> bin_msgs.reverse() # unsort...
 >>> bin_msgs.sort()
 >>> for msg in bin_msgs:
...     print msg

(2005, 1, 22, 15, 30, 'John\x00\x00\x00\x00', '*** long string here ***\x00\x00\x00\x00\x00\x00\x00\x00')
(2005, 3, 13, 9, 35, 'Jeremy\x00\x00', '*** jeremy string here ****\x00\x00\x00\x00\x00')
(2005, 4, 15, 16, 25, 'Clyde\x00\x00\x00', "*** clyde's long string here ***")
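
For anyone on Python 3, where binary data is bytes rather than str, the
same idea might look roughly like this. It is only a sketch: it builds
the sample buffer with pack() instead of a hand-typed literal, and the
names record, rows and data are my own. struct.Struct and its
iter_unpack() method are part of the standard struct module.

```python
import struct

# Five big-endian ints, then two fixed-width byte strings.
record = struct.Struct("!iiiii8s32s")

# Sample rows in unsorted order; pack() pads 's' fields with null bytes.
rows = [
    (2005, 4, 15, 16, 25, b"Clyde", b"*** clyde's long string here ***"),
    (2005, 1, 22, 15, 30, b"John", b"*** long string here ***"),
    (2005, 3, 13, 9, 35, b"Jeremy", b"*** jeremy string here ****"),
]
data = b"".join(record.pack(*row) for row in rows)

# iter_unpack() walks the buffer in record.size steps, one tuple each.
bin_msgs = sorted(record.iter_unpack(data))

for *when, sender, text in bin_msgs:
    # Strip the null padding and decode for display.
    print(when, sender.rstrip(b"\x00").decode(), text.rstrip(b"\x00").decode())
```

Using record.size also avoids computing the record length by hand.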


