[Tutor] Advanced String Search using operators AND, OR etc..

Lie Ryan lie.1296 at gmail.com
Tue May 5 15:42:09 CEST 2009


Kent Johnson wrote:
> On Tue, May 5, 2009 at 8:11 AM, Lie Ryan <lie.1296 at gmail.com> wrote:
> 
>> Bring on your hardest searches...
> 
> Nice!
> 
>> The Suite class is only there to turn the NotFound sentinel from len(text)
>> to -1 (used len(text) since it simplifies the code a lot...)
> 
> How does this simplify the code? Why not use the 'in' operator and
> return True or False from the terms?

Using len(text) as the NotFound sentinel simplifies the code because I am 
searching for the lowest index that matches the expression, and I use 
min((a, b)) to do the comparison. If NotFound were -1, min((a, b)) would 
return -1 even when the other term matched. len(text) is always greater 
than or equal to the result of the other expression, so it does not 
affect the comparison.

in codespeak:

text = 'ABCD'
pat1 = 'C' # P('C') determined 2
pat2 = 'Z' # P('Z') determined 4/NotFound
expr = OR(P('C'), P('Z'))
# 2 or NotFound is True
# min(2, 4) is 2

pat3 = 'A' # P('A') determined 0
expr = OR(P('C'), P('A'))
# 2 or 0 is True
# min((2, 0)) is 0

pat4 = 'Y' # P('Y') determined 4/NotFound
expr = OR(P('Z'), P('Y'))
# NotFound or NotFound is False
# min((4, 4)) is 4, i.e. NotFound
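
A minimal runnable sketch of the three cases above. The find() method, 
and the way P and OR are written here, are assumptions made for 
illustration and are not the code from the earlier post; only the idea of 
using len(text) as the sentinel together with min() comes from this 
message.

```python
class P:
    """Leaf pattern (hypothetical sketch): lowest index of a substring."""

    def __init__(self, pat):
        self.pat = pat

    def find(self, text):
        idx = text.find(self.pat)
        # str.find() returns -1 on a miss; map that to len(text)
        return len(text) if idx == -1 else idx


class OR:
    """Lowest index matched by either operand (hypothetical sketch)."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def find(self, text):
        # len(text) can never be lower than a real match, so plain min() works
        return min((self.left.find(text), self.right.find(text)))


text = 'ABCD'
not_found = len(text)

r1 = OR(P('C'), P('Z')).find(text)   # one operand matches
r2 = OR(P('C'), P('A')).find(text)   # both operands match
r3 = OR(P('Z'), P('Y')).find(text)   # neither operand matches
print(r1, r2, r3, r3 == not_found)
```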

Also, the 'in' operator cannot be used to handle nested expressions. With 
nested expressions, I simply evaluate the two sub-expressions to indexes 
and take the lower one (if applicable).
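
To illustrate nesting, here is a small sketch in which every node 
returns an index, so operators can be combined freely. The find() method 
and the AND semantics (NotFound if either side is missing, otherwise the 
lower of the two indexes) are assumptions for this example; the actual 
implementation may differ.

```python
class P:
    """Leaf pattern (hypothetical sketch)."""

    def __init__(self, pat):
        self.pat = pat

    def find(self, text):
        idx = text.find(self.pat)
        return len(text) if idx == -1 else idx


class OR:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def find(self, text):
        return min((self.left.find(text), self.right.find(text)))


class AND:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def find(self, text):
        a, b = self.left.find(text), self.right.find(text)
        not_found = len(text)
        # Assumed semantics: both sides must match, then report the lower index
        if a == not_found or b == not_found:
            return not_found
        return min((a, b))


text = 'ABCD'
# AND(C, Z) misses because of 'Z'; AND(D, B) matches at indexes 3 and 1
nested = OR(AND(P('C'), P('Z')), AND(P('D'), P('B')))
result = nested.find(text)

# Every branch misses, so the sentinel propagates to the top
all_miss = OR(AND(P('C'), P('Z')), P('Y')).find(text)
print(result, all_miss)
```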

Another sentinel I considered was 0, which would simplify the if 
expressions, but it 1) suffers from the same problem as -1 and 2) is 
already taken by the index of the first character.

I decided that having pat1 != len(text) sprinkled around is simpler than 
rewriting min() to special-case -1 and still having pat != -1 sprinkled 
around.
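
For comparison, a rough sketch of the two approaches side by side. The 
helper names or_minus_one and or_len are made up for this example.

```python
def or_minus_one(a, b):
    """OR with -1 as NotFound: min() needs a special case."""
    found = [i for i in (a, b) if i != -1]
    return min(found) if found else -1


def or_len(a, b):
    """OR with len(text) as NotFound: plain min() is enough."""
    return min((a, b))


text = 'ABCD'
c, z = text.find('C'), text.find('Z')    # str.find() gives 2 and -1

naive = min((c, z))          # plain min() picks the -1 sentinel
fixed = or_minus_one(c, z)   # the special case recovers the real match

# Same search with the len(text) sentinel
n = len(text)
c2 = c if c != -1 else n
z2 = z if z != -1 else n
simple = or_len(c2, z2)
print(naive, fixed, simple)
```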

