"Fuzzy" Counter?

Ian Kelly ian.g.kelly at gmail.com
Wed Sep 24 10:56:18 EDT 2014


On Tue, Sep 23, 2014 at 11:01 PM, Miki Tebeka <miki.tebeka at gmail.com> wrote:
> On Tuesday, September 23, 2014 7:33:06 PM UTC+3, Rob Gaddi wrote:
>
>> While you're at it, think
>> long and hard about that definition of fuzziness.  If you can make it
>> closer to the concept of histogram "bins" you'll get much better
>> performance.
> The problem for me here is that I can't determine the number of bins in advance. I'd like to get frequencies. I guess every "new" item (one with no previous equal item) can be a bin.

Then your result depends on the order of your input, which is usually
not a good thing.
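
To make the order dependence concrete, here is a minimal sketch of one
possible reading of that scheme: every value that is not within epsilon
of an existing bin's first item starts a new bin. The function name
greedy_count and the "within epsilon of the first item" rule are my
assumptions for illustration, not necessarily what you have in mind.

```python
def greedy_count(values, epsilon=0.5):
    """Count values, letting each unmatched value start a new bin.

    Hypothetical scheme: a bin is represented by the first value that
    created it, and a later value joins the first bin whose
    representative is within epsilon of it.
    """
    counts = {}  # bin representative -> number of values in that bin
    for value in values:
        for rep in counts:
            if abs(value - rep) <= epsilon:
                counts[rep] += 1
                break
        else:
            # No existing bin is close enough, so this value opens one.
            counts[value] = 1
    return counts


print(greedy_count([0, 0.4, 0.8]))  # {0: 2, 0.8: 1}
print(greedy_count([0.4, 0, 0.8]))  # {0.4: 3}
```

The same three values give two bins in one order and a single bin in the
other.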

Why would you need to determine the *number* of bins in advance? You
just need to determine where they start and stop. If, for example, your
epsilon is 0.5, you could define the bins as [-0.5, 0.5), [0.5, 1.5),
[1.5, 2.5), and so on, ad infinitum. Then for each actual value you
encounter, you calculate the appropriate bin, creating it first if it
doesn't already exist.
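
A minimal sketch of the fixed-bin idea, assuming bins that are 2 * epsilon
wide and centred on multiples of 2 * epsilon (so epsilon = 0.5 gives the
intervals above). The names bin_index and fuzzy_count are made up for
this example. collections.Counter creates each bin the first time it is
touched, so nothing has to be known in advance.

```python
import math
from collections import Counter


def bin_index(value, epsilon=0.5):
    """Return the integer index of the half-open bin containing value.

    Bin k covers [(2k - 1) * epsilon, (2k + 1) * epsilon), so with
    epsilon = 0.5 bin 0 is [-0.5, 0.5) and bin 1 is [0.5, 1.5).
    """
    return math.floor((value + epsilon) / (2 * epsilon))


def fuzzy_count(values, epsilon=0.5):
    """Count how many values fall into each bin."""
    counts = Counter()
    for value in values:
        # A missing key counts as zero, so new bins appear on demand.
        counts[bin_index(value, epsilon)] += 1
    return counts


data = [0.1, -0.2, 0.49, 0.5, 1.2, 2.6]
print(fuzzy_count(data))  # Counter({0: 3, 1: 2, 3: 1})
```

The result no longer depends on input order. The trade-off is that two
values closer together than epsilon can still land in different bins when
they straddle a boundary, as 0.49 and 0.5 do here.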


