is implemented with id ?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sat Nov 3 23:10:24 EDT 2012
On Sun, 04 Nov 2012 01:14:29 +0000, Oscar Benjamin wrote:
> On 3 November 2012 22:50, Chris Angelico <rosuav at gmail.com> wrote:
>> This one I haven't checked the source for, but ISTR discussions on this
>> list about comparison of two unequal interned strings not being
>> optimized, so they'll end up being compared char-for-char. Using 'is'
>> guarantees that the check stops with identity. This may or may not be
>> significant, and as you say, defending against an uninterned string
>> slipping through is potentially critical.
>
> The source is here (and it shows what you suggest):
> http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128
I don't think it does, although I could be wrong; I find reading C
quite difficult.
The unicode_compare function compares character by character, true, but
it doesn't get called directly. The public interface is
PyUnicode_Compare, which includes this test before calling
unicode_compare:
    /* Shortcut for empty or interned objects */
    if (v == u) {
        Py_DECREF(u);
        Py_DECREF(v);
        return 0;
    }
    result = unicode_compare(u, v);
where v and u are pointers to the two unicode objects.
So it appears that the test for the strings being of equal length has been
dropped, but the identity test is still present.
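To make the shortcut concrete, here is a rough pure-Python model of it. This is only a sketch of the idea, not CPython's code: compare_model is a made-up name, and the second element of the returned tuple exists only to show which path was taken.

```python
import sys

def compare_model(u, v):
    """Hypothetical model of an identity shortcut before a content compare."""
    # Same object: no need to look at any characters.
    if u is v:
        return 0, "identity"
    # Different objects: fall back to comparing contents.
    return (u > v) - (u < v), "content"

# Strings built at run time, so nothing is shared by accident.
a = sys.intern("".join(["py", "thon"]))
b = sys.intern("".join(["pyt", "hon"]))   # interning returns the same object as a
c = "".join(["pyth", "on"])               # equal contents, but not interned

print(compare_model(a, b))   # interned pair takes the identity path
print(compare_model(a, c))   # equal but distinct objects need a content compare
```

Only the first call can stop at the identity check; the second has to examine the contents even though the strings are equal.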
> Comparing strings char for char is really not that big a deal though.
Depends on how big the string is and where the first difference is.
> This has been discussed before: you don't need to compare very many
> characters to conclude that strings are unequal (if I remember correctly
> you were part of that discussion).
On average. Worst case, you have to look at every character.
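A small sketch of the difference between the typical and the worst case, counting how many character pairs a naive left-to-right scan has to inspect. The helper chars_examined is hypothetical and only models the scan; it says nothing about how CPython actually walks the buffers.

```python
def chars_examined(u, v):
    """Count the character pairs inspected before the scan can stop."""
    examined = 0
    for a, b in zip(u, v):
        examined += 1
        if a != b:
            break          # first difference found, stop scanning
    return examined

base = "x" * 1000
early = "y" + "x" * 999    # differs at the first character
late = "x" * 999 + "y"     # differs only at the last character
same = "x" * 1000          # equal contents

print(chars_examined(base, early))   # 1
print(chars_examined(base, late))    # 1000
print(chars_examined(base, same))    # 1000
```

Two equal strings that are not the same object are the worst case as well: every character has to be checked before they can be declared equal.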
--
Steven