frequency of values in a field

Paul Rubin no.email at nospam.invalid
Tue Feb 8 16:20:26 EST 2011


noydb <jenn.duerr at gmail.com> writes:
> I am looking for ways to go about capturing the frequency of unique
> values in one field in a dbf table which contains ~50k records.  The
> values are numbers with at least 5 digits to the right of the decimal,
> but I want the frequency of values to only 2 decimal places.  I do
> have a method to do this courtesy of a provided tool in ArcGIS.  Was
> just curious about ways to do it without arcgis sw, using just python.

The Decimal module is pretty slow, but conceptually it is probably the
right way to do this.  With just 50k records it shouldn't be too bad;
with many more you might look for a faster way.

    from decimal import Decimal as D
    from collections import defaultdict

    records = ['3.14159', '2.71828', '3.142857']

    # count how many records fall into each two-decimal bucket
    td = defaultdict(int)
    for x in records:
        # quantize rounds the exact decimal value to two places
        td[D(x).quantize(D('0.01'))] += 1

    print(td)
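
If Decimal does turn out to be too slow, a faster (though less exact)
sketch is to skip Decimal and bucket by formatting the float directly.
Note that '%.2f' rounds the binary float approximation rather than the
exact decimal value, so the odd edge case can land in a different bucket
than Decimal.quantize would put it:

    from collections import defaultdict

    records = ['3.14159', '2.71828', '3.142857']

    td = defaultdict(int)
    for x in records:
        # format the float to two places and count the resulting string;
        # much cheaper than Decimal, but rounds the binary approximation
        td['%.2f' % float(x)] += 1

    print(td)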

> Saw this http://code.activestate.com/recipes/277600-one-liner-frequency-count/
> using itertools.

That is cute, but I cringe a bit at the temporary lists and the
O(n log n) sort it relies on.
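
I haven't pulled the recipe apart line by line, but the usual itertools
version is something along these lines (the '%.2f' formatting is my
stand-in for the two-decimal bucketing); a Counter, or the defaultdict
above, gets the same counts in one O(n) pass with no sort and no
temporary list:

    from itertools import groupby
    from collections import Counter   # Python 2.7+

    records = ['3.14159', '2.71828', '3.142857']

    # sort-then-group, roughly what the one-liner recipe does:
    # sorted() builds a temporary list and costs O(n log n)
    freq = dict((k, len(list(g)))
                for k, g in groupby(sorted('%.2f' % float(x) for x in records)))

    # single linear pass, no sorting, no intermediate list
    freq2 = Counter('%.2f' % float(x) for x in records)

    print(freq)
    print(freq2)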


