Speed of string += string
Alex Martelli
aleax at aleax.it
Sun Apr 13 12:15:59 EDT 2003
Martin v. Löwis wrote:
> Alex Martelli <aleax at aleax.it> writes:
>
>> The crucial difference is that Python dicts are hashtables, while the
>> std::map in the standard library of C++ can't be, because it's based
>> on an order relationship, NOT on hashing. So, std::map has an order,
>> but its O(...) performance just can't be as good as a dict's.
>
> Nit-pickers (like myself) will point out that the complexity of a
> hash-table depends on the distribution of the hash function (in
> addition to depending on the complexity of the hash function). With a
"To a nitpicker, a nitpicker squared": nothing stops you (in C++) from
writing an operator< whose complexity is just as high as you can have
for a __hash__ method in Python. In this sense, dependency on the
complexity of some auxiliary function is just the same in the two
languages (actually, the dependency is stronger in C++, since for each
insertion or memebership test operator< will typically be called O(log N)
times, while, in Python, for such an operation hash is called O(1) time).
> carefully-chosen hash O(1) function, all dictionary operations will be
> O(n). For std::map, O(log n) is the worst-case performance.
Yes, it's technically true that (once some fairy godmother ensures
that the auxiliary function is O(1)) worst-case behavior is potentially
worse for a hash table than it is for a sufficently smart order-based
container (and the C++ standard library does constrain its container
to be sufficiently smart in this sense). I think Tim's post has already
been quite eloquent on the relative care Python places on
worst-case-guaranteed vs typical-case-expected behavior.
Alex
More information about the Python-list
mailing list