Does '#hash' mean anything in IDLE?

Paul Rubin http
Mon Mar 6 17:50:45 EST 2006


Scott David Daniels <scott.daniels at acm.org> writes:
> > Note that I sorted the dictionary items in order to get the max
> > element.  That is sort of bogus because it's an O(N log N) operation
> > while finding the maximum should only need O(N).  But it leads to
> > a convenient spelling.  It would be nice if "max" accepted a "key"
> > argument the way that the sorting functions do.
> 
> Using a variant of DSU (Decorate-Sort-Undecorate) with max for S,
> rather than sort:
> 
>      print max((len(words), words) for words in d.itervalues())
> or:
>      size, words = max((len(words), words) for words in d.itervalues())
>      print size, sorted(words)

That's nice and concise but it doesn't completely fix the running time
issue.  max(words, key=len) should run in O(N) time where N is the
number of words, but
  max((len(words), words) for words in d.itervalues())
might need time proportional to the total lengths of the words.  I.e.
suppose words=['aaaaaaaa','aaaaaaab','aaaaaaac'].  They are the same
length so the cmp builtin proceeds to the next component of the tuples
being compared.  That means the strings get compared character by
character.  That's maybe not too bad for dictionary words, but isn't
too great for arbitrary strings which might be millions of chars each.

In general that DSU pattern doesn't even always work: you might want
the max of objects which don't directly support any comparison methods
that the cmp builtin understands.  The "key" callable is needed to
extract something that can be compared, or else there could be a
user-supplied comparison function like .sort() and sorted support.



More information about the Python-list mailing list