"Fuzzy" Counter?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Sep 26 02:45:06 EDT 2014


Ian Kelly wrote:

> On Tue, Sep 23, 2014 at 11:01 PM, Miki Tebeka <miki.tebeka at gmail.com>
> wrote:
>> On Tuesday, September 23, 2014 7:33:06 PM UTC+3, Rob Gaddi wrote:
>>
>>> While you're at it, think
>>> long and hard about that definition of fuzziness.  If you can make it
>>> closer to the concept of histogram "bins" you'll get much better
>>> performance.
>> The problem for me here is that I can't determine the number of bins in
>> advance. I'd like to get frequencies. I guess every "new" (don't have any
>> previous equal item) can be a bin.
> 
> Then your result depends on the order of your input, which is usually
> not a good thing.
> 
> Why would you need to determine the *number* of bins in advance? You
> just need to determine where they start and stop. If for example your
> epsilon is 0.5, you could determine the bins to be at [-0.5, 0.5);
> [0.5, 1.5); [1.5, 2.5); ad infinitum. Then for each actual value you
> encounter, you could calculate the appropriate bin, creating it first
> if it doesn't already exist.

That has the unfortunate implication that:

0.500000001 and 1.499999999 (delta = 0.999999998)

are considered equal, but:

1.500000001 and 1.499999999 (delta = 0.000000002)

are considered unequal, because they fall on opposite sides of the bin
boundary at 1.5.
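
A minimal sketch of the fixed-bin scheme, to make the boundary effect
concrete. The names bin_index, fuzzy_count and EPSILON are made up for
illustration and do not come from anyone's actual code in this thread; the
bins are assumed to be 2*epsilon wide and centred on multiples of 2*epsilon,
matching the [-0.5, 0.5); [0.5, 1.5); ... example:

```python
import math
from collections import Counter

EPSILON = 0.5  # assumed half-width of each bin, as in the quoted example


def bin_index(value, epsilon=EPSILON):
    """Return the integer index of the fixed bin containing value.

    Bin k covers [2*k*epsilon - epsilon, 2*k*epsilon + epsilon).
    """
    return math.floor((value + epsilon) / (2 * epsilon))


def fuzzy_count(values, epsilon=EPSILON):
    """Count values per bin; bins are created lazily as values arrive."""
    return Counter(bin_index(v, epsilon) for v in values)


a, b, c = 0.500000001, 1.499999999, 1.500000001

# a and b are almost a full bin width apart, yet share a bin.
print(bin_index(a), bin_index(b))

# b and c are nearly identical, yet straddle the boundary at 1.5.
print(bin_index(b), bin_index(c))

print(fuzzy_count([a, b, c]))
```

The result does not depend on input order, but whether two values are
grouped together depends on where they sit relative to the fixed boundaries,
not on how close they are to each other.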



-- 
Steven



