Comparing strings from the back?

Johannes Bauer dfnsonfsduifb at gmx.de
Thu Sep 6 09:43:25 EDT 2012


On 06.09.2012 10:33, Steven D'Aprano wrote:

> But you know, it really doesn't make a difference. Equality testing will 
> *still* be O(N) since the asymptomatic behaviour for sufficiently large 
> string comparisons will be bounded by O(N). Multiplicative constants 
> ("half the string" versus "0.001 of the string") do not matter.

Wrong, at least for randomized strings (i.e. every character with the
same probability). O(N) is worst-case, O(log N) is correct for
randomized strings.

Then again, since you were nitpicking about Landau notation earlier this
thread, every function bound by O(log N) is also bound by O(N), since,
as you correctly stated, it's only a upper bound. We should be talking
about the asymptotic sharp bound, i.e. capital Theta.

> I may have been overly-conservative earlier when I said that on average 
> string equality has to compare half the characters. I thought I had 
> remembered that from a computer science textbook, but I can't find that 
> reference now, so possibly I was thinking of something else. (String 
> searching perhaps?). In any case, the *worst* case for string equality 
> testing is certainly O(N) (every character must be looked at), and the 
> *best* case is O(1) obviously (the first character fails to match). But 
> I'm not so sure about the average case. Further thought is required.

Yes, worst-case is O(N), best case O(1). Average is O(n log n).

Best regards,
Johannes

-- 
>> Wo hattest Du das Beben nochmal GENAU vorhergesagt?
> Zumindest nicht öffentlich!
Ah, der neueste und bis heute genialste Streich unsere großen
Kosmologen: Die Geheim-Vorhersage.
 - Karl Kaos über Rüdiger Thomas in dsa <hidbv3$om2$1 at speranza.aioe.org>



More information about the Python-list mailing list