[Python-ideas] Deprecate str.find

Fri Jul 15 18:12:33 CEST 2011

On Fri, Jul 15, 2011 at 10:57 AM, Guido van Rossum <guido at python.org> wrote:
> However, in many cases absence of the string is not an error -- you
> just need to do something else. So in cases where *if* it's found you
> need the position, and *if* it isn't found you need to do something
> else, you'd have to use a try/except block to catch the non-error that
> is absence. All in all I don't see enough reason to start deprecating
> find. But perhaps popular lint-like programs could flag likely abuses
> of find?
>
> --Guido

It isn't necessarily an error if the substring is not in the string
(though it sometimes is), but it is an exceptional case. Python uses
exceptions pretty liberally most places -- it isn't necessarily an
error if an iterator is exhausted or if float("4.2 bad user input") is
called or if BdbQuit is raised. In these cases, an exception is a
perfectly expected way to signal that what happened cannot be
expressed as an ordinary return value.
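
To make the point concrete, here is a minimal sketch of two everyday
non-error uses of exceptions. The helper names parse_or_default and
first_or_none are made up for illustration; only float(), iter(), and
next() are real builtins.

```python
def parse_or_default(text, default=0.0):
    # float() raises ValueError on unparseable input; that is an
    # expected outcome for user-supplied text, not a program bug.
    try:
        return float(text)
    except ValueError:
        return default


def first_or_none(iterable):
    # next() raises StopIteration when the iterator is exhausted,
    # which is simply how "no more items" is reported.
    it = iter(iterable)
    try:
        return next(it)
    except StopIteration:
        return None
```

In both helpers the exceptional case is handled right where it occurs,
and no in-band sentinel value has to be invented.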

Making a Python user write a try/except block when she wants to handle
both the cases "substring is in s" and "substring isn't in s" seems
perfectly fine to me and, really, preferable to the if statement
required to handle these two cases.

The two basic cases really are about the same:

try:
    i = s.index(sub)
except ValueError:
    do_something()

vs.

i = s.find(sub)
if i == -1:
    do_something()

But what if I forgot to handle the special case?

i = s.index(sub) # An exception is raised right here and I can fix my code

vs.

i = s.find(sub) # No exception is raised

In this second case, I get the value of -1. Later I can use it as an
index, use it in a slice, or perform arithmetic on it. This can
introduce seemingly-unrelated values later on, making this especially
hard to track down. If the failure return code was at least None it
would behave more sanely, but at present the failure return code is a
perfectly valid value for almost any use.
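
A small sketch of how this plays out, assuming a hypothetical helper
that strips a trailing "#" comment from a line (the function names are
invented for this example):

```python
def strip_comment_find(line):
    # If "#" is absent, find() returns -1 and the slice becomes
    # line[:-1], which silently drops the last character.
    return line[:line.find("#")]


def strip_comment_index(line):
    # If "#" is absent, index() raises ValueError at this line, so the
    # missing case cannot go unnoticed.
    return line[:line.index("#")]


with_comment = strip_comment_find("x = 1  # set x")   # "x = 1  "
without_comment = strip_comment_find("x = 1")         # "x = ", not "x = 1"
```

The find() version returns a plausible-looking but wrong string, so the
bug may only surface far from where it was introduced.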

If a programmer is still averse to using try/except, we can instead write

if sub in s:
    i = s.index(sub)
else:
    do_something()

Now, we can dredge up some examples where -1 is the actual value
someone wants to use. These cases are so rare, and so subtle and
clever, that I don't really see their existence as an advantage.

Additionally, it is unfortunate that we currently have two methods to
do the same thing (which isn't even a super-common task) with
different APIs. Nothing about the names "find" and "index" really
makes clear which is which. This violates the "There should be one--
and preferably only one --obvious way to do it." principle and makes
the Python user need to memorize an unnecessary, arbitrary
distinction.

I would also point out that it was not a contrived case that I
mentioned where a beginner introduces a bug by trying "if
s.find(sub):" instead of "if sub in s:"; I have really seen people try
this several times. Obviously we cannot make many decisions based on
new Python programmers' mistakes, but it is worth recognizing them.
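
For reference, a short sketch of why that test misbehaves. The helper
naive_contains is hypothetical and exists only to show the truthiness
of what find() returns:

```python
def naive_contains(s, sub):
    # Buggy on purpose: treats the result of find() as a boolean.
    # find() returns 0 for a match at the start (falsy) and -1 for
    # no match (truthy), so both of those cases come out inverted.
    return bool(s.find(sub))


s = "spam and eggs"
at_start = naive_contains(s, "spam")   # False, although "spam" is in s
missing = naive_contains(s, "ham")     # True, although "ham" is not in s
in_middle = naive_contains(s, "eggs")  # True, correct only by accident
```

The "in" operator gives the intended answer in all three cases.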

I hope this additional discussion might be able to sway your opinion
here. The only advantage to using str.find is that you do not have to
use try/except blocks, but in fact you don't have to with str.index
either. On the other hand, there are numerous
disadvantages--practical, pedagogical, stylistic, and design--to
having and using str.find.

Mike