is implemented with id ?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sat Nov 3 18:18:15 EDT 2012
On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:
> On 3/11/12 20:41:28, Aahz wrote:
>> [got some free time, catching up to threads two months old]
>>
>> In article <50475822$0$6867$e4fe514c at news2.news.xs4all.nl>, Hans Mulder
>> <hansmu at xs4all.nl> wrote:
>>> On 5/09/12 15:19:47, Franck Ditter wrote:
>>>>
>>>> - I should have said that I work with Python 3. Does that matter?
>>>> - May I reformulate the question: "a is b" and "id(a) == id(b)"
>>>>   both mean "a and b share the same physical address". Is that
>>>>   true?
>>>
>>> Yes.
>>>
>>> Keep in mind, though, that in some implementations (e.g. Jython), the
>>> physical address may change during the lifetime of an object.
>>>
>>> It's usually phrased as "a and b are the same object". If the object
>>> is mutable, then changing a will also change b. If a and b aren't
>>> mutable, then it doesn't really matter whether they share a physical
>>> address.
>>
>> That last sentence is not quite true. intern() is used to ensure that
>> strings share a physical address to save memory.
>
> That's a matter of perspective: in my book, the primary advantage of
> working with interned strings is that I can use 'is' rather than '==' to
> test for equality if I know my strings are interned. The space savings
> are minor; the time savings may be significant.
Actually, for many applications, the space "savings" may turn out to be
*costs*, since interning forces Python to hold onto strings even after
they would normally be garbage collected. CPython interns strings that
look like identifiers. It really wouldn't be a good idea for it to
automatically intern every string.

You can make your own intern system with a simple dict:

    interned_strings = {}

Then, for every string you care about, do:

    s = interned_strings.setdefault(s, s)

to ensure you are always working with a single string object for each
unique value. In some applications that will save time at the expense of
space.
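
A minimal sketch of that dict-based approach, wrapped in a helper. The name my_intern is made up for illustration, and the strings are built with join() at run time so that the two inputs start out as distinct objects:

```python
# Hypothetical helper around the dict-based intern table described above.
interned_strings = {}

def my_intern(s):
    # setdefault returns the object already stored under an equal key;
    # if there is none, it stores s under itself and returns s.
    return interned_strings.setdefault(s, s)

# Two equal strings constructed separately at run time.
a = my_intern("".join(["spam", " ", "eggs"]))
b = my_intern("".join(["spam", " ", "eggs"]))

assert a == b   # equal values
assert a is b   # and, after going through the table, the same object
```

Note that the table keeps every string alive for as long as the dict exists, which is exactly the space cost mentioned above.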
And there is no need to write "is" instead of "==", because string
equality already optimizes the "strings are identical" case. By using ==,
you don't get into bad habits, you defend against the odd un-interned
string sneaking in, and you still have high speed equality tests.
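
As a rough illustration, assuming Python 3 (where intern() lives in the sys module), the following shows why == is the safer habit. The identity shortcut inside string comparison is a CPython implementation detail, so treat the speed remark as an assumption about that implementation rather than a language guarantee:

```python
import sys

# Two interned strings with the same value are the same object.
x = sys.intern("".join(["hello", " ", "world"]))
y = sys.intern("".join(["hello", " ", "world"]))
assert x is y

# A string that never went through sys.intern() may be a different
# object, so an "is" test could fail even though the values match.
z = "".join(["hello", " ", "world"])

# == gives the right answer in both cases.
assert x == y
assert x == z
```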
--
Steven