[Tutor] Sorting numbers

Isaac Hall hall@ouhep5.nhn.ou.edu
Tue Feb 4 15:21:17 2003


On Tue, 4 Feb 2003, Jeff Shannon wrote:

> 
> Adam Vardy wrote:
> 
> >If I'd like to sort a simple kinda list like:
> >
> >45K
> >100K
> >3.4Meg
> >17K
> >300K
> >9.3Meg
> >
> >How do you suppose I can approach it?
> >
> 
> Are your suffixes (K, Meg, etc) standardized?  If so, you can use that 
> to separate your list into sublists, sort each sublist, and then present 
> the sublists in appropriate order.  When sorting sublists, you'll have 
> to be careful, though -- you're dealing with strings, and you want them 
> sorted in numeric order instead of alphabetical order.  (In alphabetical 
> order,  '35' comes after '300'.)  
> 
> I'd split each string into a numeric part and a suffix.  Use the suffix 
> as a dictionary key, and add the numeric part to a list pointed to by 
> that key.  Then, when you sort each list, convert each numeric string to 
> a float for sorting comparisons -- but *only* use the float for sorting, 
> and then display the original string value.
> 
This is a good method, but it fails for a pair like [2000K, 1.5Meg].
In reality 2000K = 2Meg (or, in computer-speak, 2048K = 2Meg), so 2000K
is the larger value. Yet this method would place 2000K before 1.5Meg,
simply because every K entry is listed before every Meg entry.
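
To make the problem concrete, here is a rough sketch of the
group-by-suffix idea as I understand it. The function name
sort_by_suffix_groups and the suffix_order tuple are my own inventions
for illustration, not anything from Jeff's post, and I assume every
item is a number followed by an optional run of letters.

```python
import string
from collections import defaultdict

def sort_by_suffix_groups(items, suffix_order=("", "K", "Meg", "Gig")):
    """Group items by suffix, sort each group numerically, then
    concatenate the groups in suffix order (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        # Assumed format: numeric part followed by an alphabetic suffix.
        numpart = item.rstrip(string.ascii_letters)
        groups[item[len(numpart):]].append(item)
    result = []
    for suffix in suffix_order:
        # Compare as floats, but keep the original strings.
        # Items with a suffix not in suffix_order are silently dropped.
        result.extend(sorted(groups[suffix],
                             key=lambda s: float(s[:len(s) - len(suffix)])))
    return result

print(sort_by_suffix_groups(["1.5Meg", "2000K", "17K"]))
# ['17K', '2000K', '1.5Meg']
```

Each sublist is sorted correctly, but 2000K still ends up ahead of
1.5Meg because the K group as a whole is emitted first.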
 
> An alternative approach would be to write a function that converts, say, 
> '17K' to the integer value 17000, and '3.4Meg' to 3,400,000.  Then you 
> could sort your raw data based on the results of that function.
> 
> A good way to convert these values would be to make a dictionary that 
> links a given suffix to a multiplier.  Then you can separate the numeric 
> part from the suffix (we already know how to do that), use the suffix to 
> get the multiplier, do the math and return the result.  And once we have 
> a function to expand these numbers, we can simply write a comparison 
> function that uses the expanded numbers for sorting.
> 
>  >>> rawdata
> ['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg', '512', '23Meg']
>  >>> suff = { '':1, 'K':1000, 'Meg':1000000, 'Gig':1000000000 }
>  >>> def expand(item, suffixes = suff):
> ...     numpart, suffixpart = splitsuffix(item)
> ...     multiplier = suffixes[suffixpart]
> ...     return float(numpart) * multiplier
> ...
>  >>> def sortfunc(a, b):
> ...     return cmp(expand(a), expand(b))
> ...
>  >>> rawdata.sort(sortfunc)
>  >>> rawdata
> ['512', '17K', '45K', '100K', '300K', '3.4Meg', '9.3Meg', '23Meg']
>  >>>
> 
I like this method much better. We avoid stepping into pitfalls when we
get weird numbers to sort (say, 2000K or 0.5 Meg). And if you are using
computer-speak, just make each successive suffix a factor of 1024 larger
than the previous one (i.e. 1.2Meg = 1.2*1024*1024).
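
Here is a minimal sketch of the multiplier approach with a switchable
base. The splitsuffix below is only my guess at a stand-in for the
helper used in the quoted session (its definition is not shown there),
and make_suffixes is a name I made up. I sort with a key function
rather than a cmp-style comparison function, since Python 3 dropped cmp
and sort() no longer accepts a comparison function.

```python
import string

def make_suffixes(base=1000):
    """Suffix -> multiplier table; use base=1024 for computer-speak."""
    return {"": 1, "K": base, "Meg": base ** 2, "Gig": base ** 3}

def splitsuffix(item):
    """Hypothetical stand-in: split '3.4Meg' into ('3.4', 'Meg')."""
    item = item.strip()
    numpart = item.rstrip(string.ascii_letters)
    return numpart.strip(), item[len(numpart):]

def expand(item, suffixes=make_suffixes()):
    """Convert a string such as '17K' to its numeric value."""
    numpart, suffixpart = splitsuffix(item)
    return float(numpart) * suffixes[suffixpart]

rawdata = ['45K', '100K', '3.4Meg', '17K', '300K', '9.3Meg',
           '512', '23Meg', '2000K', '0.5 Meg']

# sorted() returns a new list and leaves rawdata untouched.
print(sorted(rawdata, key=expand))
# ['512', '17K', '45K', '100K', '300K', '0.5 Meg', '2000K',
#  '3.4Meg', '9.3Meg', '23Meg']

binary = make_suffixes(1024)
print(expand('2Meg', binary) == expand('2048K', binary))  # True
```

An unknown suffix raises a KeyError here, which seems preferable to
quietly sorting it into the wrong place.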

Ike



> Here I've sorted the data in-place.  If you need to leave the original 
> data alone for whatever reason, you can simply make a copy of the list ( 
> sortedlist = rawdata[:] ) and then sort the new list.
> 
> Jeff Shannon
> Technician/Programmer
> Credit International