sorting items in a table problematic because of scientific notation

John Machin sjmachin at lexicon.net
Tue Apr 28 20:37:03 EDT 2009


Davis, Amelie Y <aydavis <at> purdue.edu> writes:
> 
> Hi All,
> 
> I have a
> dbf table outputted by another program that I cannot (I’m pretty sure)
> change the format of.
> 
> I use a dbf reader code found online
> (http://code.activestate.com/recipes/362715/) to read the table in, and I
> need to sort it on a particular field. This field has scientific notation
> in it, and when I use the following command it seems to ignore the
> scientific notation, which is very problematic:
> 
>     outlist = sorted(records, key=itemgetter(2))
> 
> I read
> through the help and it says to use ‘%f’ % to change the format but
> I’m not sure where to put that in the code (pasted below) as I think I’m
> not fully grasping the table structure...
> 
> Here’s the data 
> 
> >>> db
> 
> [['IN_FID', 'NEAR_FID',
> 'NEAR_DIST'], [('N', 9, 0), ('N', 9, 0), ('F', 19, 11)], [53, 55, '

The data type code for the offending column is "F", which is not in the
bog-standard dBase III set of C, N, D, and L. The code that you have used merely
returns unchanged the character string that it finds in the database.

> 1.05646365517e+005'], [53, 6, ' 9.32599134016e+004'], [53, 0, '
> 8.97477154418e+004'], [53, 2, ' 8.96449127749e+004'], [53, 1, '
> 7.88170078501e+004'], [53, 5, ' 8.29281503631e+004'], [53, 4, '
> 
> >>> 
> 
> => The
> ‘second item’ should be NEAR_ID = 64 but because of the scientific
> notation it returns NEAR_ID = 70.  
> 
>  
> 
> AND THE CODE
> 
> ####################################
> 
> # code from http://code.activestate.com/recipes/362715/ 
> 
>             if typ == "N":
>                 value = value.replace('\0', '').lstrip()
>                 if value == '':
>                     value = 0
>                 elif deci:
>                     value = decimal.Decimal(value)
>                 else:
>                     value = int(value)
>             elif typ == 'D':
>                 y, m, d = int(value[:4]), int(value[4:6]), int(value[6:8])
>                 value = datetime.date(y, m, d)
>             elif typ == 'L':
>                 value = (value in 'YyTt' and 'T') or (value in 'NnFf' and 'F') or '?'

Missing code:

else:
    raise Exception("Unknown data type " + typ)

>             result.append(value)
>         yield result
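
For illustration only, here is a minimal sketch of how the conversion step
could treat an "F" field. The function name convert_field is hypothetical and
not part of the recipe, and returning 0.0 for an empty F field is my assumption
(it mirrors what the recipe does for empty N fields):

```python
import decimal

def convert_field(typ, deci, value):
    """Hypothetical helper: convert one raw field string by its type code."""
    if typ == "N":
        # Same logic as the recipe's N branch.
        value = value.replace('\0', '').lstrip()
        if value == '':
            return 0
        elif deci:
            return decimal.Decimal(value)
        return int(value)
    elif typ == "F":
        # Assumption: F fields hold text such as ' 1.05646365517e+005',
        # which float() parses directly once padding is stripped.
        value = value.replace('\0', '').strip()
        return float(value) if value else 0.0
    # Fail loudly instead of silently passing the raw string through.
    raise ValueError("Unknown data type " + repr(typ))
```

With a branch like this, the NEAR_DIST column arrives as floats, so
sorted(records, key=itemgetter(2)) compares numbers rather than strings.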


if this is a one-off exercise:
     go with the post-processing fix that Skip gave you.
elif you are interested in a module that cares, is chock-full of assertions,
supports several more data type codes (including F) and memo fields (of 3
different varieties), and is not slower:
   e-mail me.
else:
   raise Exception(":-)")
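
One possible form of post-processing (a sketch, and not necessarily what Skip
suggested) is to leave the reader alone and convert the string to a float in
the sort key. The rows below are a few of those quoted above:

```python
from operator import itemgetter

records = [
    [53, 55, ' 1.05646365517e+005'],
    [53, 6, ' 9.32599134016e+004'],
    [53, 0, ' 8.97477154418e+004'],
    [53, 2, ' 8.96449127749e+004'],
    [53, 1, ' 7.88170078501e+004'],
    [53, 5, ' 8.29281503631e+004'],
]

# String comparison: ' 1.05...e+005' sorts first although it is the largest.
by_text = sorted(records, key=itemgetter(2))

# Numeric comparison: float() accepts the leading space and the exponent.
by_value = sorted(records, key=lambda rec: float(rec[2]))

print([rec[1] for rec in by_text])
print([rec[1] for rec in by_value])
```

The first list starts with NEAR_FID 55 because the comparison is done
character by character; the second puts 55 last, where it belongs.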

Cheers,
John



