Abuse of Big Oh notation

88888 Dihedral dihedral88888 at googlemail.com
Mon Aug 20 18:20:44 EDT 2012


Paul Rubin於 2012年8月21日星期二UTC+8上午3時29分12秒寫道:
> Ian Kelly <ian.g.kelly at gmail.com> writes:
> 
> > The difference between the two is that the former is bounded by a
> 
> > constant that is fundamental to the algorithm at hand,...  S is
> 
> > clearly bounded by the constraints of actual shoes, so we can safely
> 
> > treat S as a constant and call it O(N).
> 
> 
> 
> Thanks, that is a good explain of what I was trying to get at.  One
> 
> quibble is in the parsing example, the constant is inherent in the
> 
> distribution of input data one expects to normally encounter, rather
> 
> than in the algorithm itself.  It's sort of like saying dictionary
> 
> access (based on hashing) is O(1), based on the variety of input data
> 
> that is normally used.  There is such a thing as pathological
> 
> (e.g. malicious) input data that causes a lot of hash collisions, making
> 
> O(n) access in the size of the dictionary.  
> 
> 
> 
> I suppose abnormal input should also be taken into account in examples
> 
> involving parsing if one parses potentially hostile data.

OK, the hash key function has to be seeded with some randomization 
in the creation of a hash table from some data independent factors.

Also for those items hashed to the same key with a length >=16
I think an insertion sort and a heap sort are good in storing a sorted list of items hashed to the same key of a length 16 to 256 or even larger which indicates  the hash function should be changed from time to time in the
occasions of resizing the hash table when the table is under a lot operations
of  item insertions and deletions which  are operated  much frequently 
than the number of item searches in the same period.


 



More information about the Python-list mailing list