Comparing strings from the back?

Oscar Benjamin oscar.j.benjamin at gmail.com
Tue Sep 11 06:40:42 EDT 2012


On 11 September 2012 10:51, Duncan Booth <duncan.booth at invalid.invalid>wrote:

> Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
>
> >> What interning buys you is that "s == t" is an O(1) pointer compare
> >> if they are equal. But if s and t differ in the last character,
> >> __eq__ will still inspect every character. There is no way to tell
> >> Python "all strings are interned, if s is not t then s != t as well".
> >>
> >
> > I thought that if *both* strings were interned then a pointer
> > comparison could decide if they were unequal without needing to check
> > the characters.
> >
> > Have I misunderstood how intern() works?
> >
>
> I don't think you've misunderstood how it work, but so far as I can see the
> code doesn't attempt to short circuit the "not equal but interned" case.
> The comparison code doesn't look at interning at all, it only looks for
> identity as a shortcut.


It also doesn't seem to check if the hash values have been set. I guess the
cached hash value is only used in contexts where the hash is explicitly
desired.

That makes two optimisations that can bring worst case string comparison
down to O(1) in many contexts that are available to cpython but unused. But
then if full string comparison is already on average O(1) then the cost of
checking the interned and hash flags for every string comparison would
outweigh the benefits of avoiding the rare worst case O(N) comparisons.

Oscar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20120911/7baf2ab8/attachment.html>


More information about the Python-list mailing list