[Numpy-discussion] efficient way to manage a set of floats?

Warren Weckesser warren.weckesser at enthought.com
Mon May 10 22:44:27 EDT 2010


Dr. Phillip M. Feldman wrote:
> Anne Archibald-2 wrote:
>   
>> on a 32-bit machine,
>> the space overhead is roughly a 32-bit object pointer or two for each
>> float, plus about twice the number of floats times 32-bit pointers for
>> the table.
>>
>>     
>
> Hello Anne,
>
> I'm a bit confused by the above.  It sounds as though the hash table
> approach might occupy 4 times as much storage as a single array.  Is that
> right?
>
> Also, I don't understand why hashing would be useful for the set
> application.
>
> It seems as though a red-black tree might be a good implementation for a set
> of floats if all that one wants to do is add, delete, and test membership. 
> (I will also need to do unions).
>
>   

A couple questions:

How many floats will you be storing?

When you test for membership, will you want to allow for a numerical 
tolerance, so that if the value 1 - 0.7 is added to the set, a test for 
the value 0.3 returns True?  (0.3 is actually 0.29999999999999999, while 
1-0.7 is 0.30000000000000004)


Warren

> Thanks for the help!
>
> Phillip
>   




More information about the NumPy-Discussion mailing list