finding items that occur more than once in a list

Tue Mar 18 21:11:32 EDT 2008

On Mar 18, 6:38 pm, Arnaud Delobelle <arno... at googlemail.com> wrote:
> On Mar 18, 3:59 pm, Ninereeds <stephenhorne... at aol.com> wrote:
>
> > Hrvoje Niksic wrote:
> > > This doesn't apply to Python, which implements dict storage as an
> > > open-addressed table and automatically (and exponentially) grows the
> > > table when the number of entries approaches 2/3 of the table size.
> > > Assuming a good hash function, filling the dict should yield amortized
> > > constant time for individual additions.
>
> Isn't this average constant time rather than amortized?
>
> > OK. I obviously need to look up open-addressed tables. I thought this
> > was just, in effect, implicit linked listing - ie it still needs a
> > linear search to handle collisions, it just avoids the need for
> > explicitly stored link fields. Perhaps I'm mistaken.
>
> I don't think you are mistaken, but if I'm wrong I'd be grateful for a
> link to further details.

Is there a known algorithm which for Key A, Key B s.t. h( KeyA ) =
h( KeyB ) for hash function h, h( KeyA, 1 ) != h( KeyB, 1 ), where
h( x, i+ 1 ) is searched when h( x, i ) is filled?  That is, the
search sequence is unique (or at least orthogonal to the h_value)?

Could you just scramble the bits and hash key -> hash table -> hash
table?

Is deletion - resize to small - a strong bottleneck?  Can you store an
object's search sequence, as it looks for "a home", and when one of
its earlier slots vacantizes, promote it?