"in" operator for strings

Magnus Lie Hetland mlh at idi.ntnu.no
Fri Feb 2 22:21:20 EST 2001


"Alex Martelli" <aleaxit at yahoo.com> wrote in message
news:95ep620rbt at news1.newsguy.com...
> "Magnus Lie Hetland" <mlh at idi.ntnu.no> wrote in message
> news:95ebcv$bn$1 at tyfon.itea.ntnu.no...

[...]

> > only allow contiguous subsequences, or any subsequence?
> > If we ever get built-in sets contiguity would be meningless,
> > as would it be if we get "in" tests for dictionaries.)
>
> Dictionaries aren't sequences, nor will sets be, so the
> issue doesn't apply to either.

Sure it does. It will then become the question of whether to
allow the same operator do check membership and subset-ness.

> Implementing __getitem__, to enumerate all
> subsequences (contiguous and non) is also
> of some interest -- a typical combinatorics
> problem.

[...]

> but, that's lazy -- surely some closed form
> exists (but there is no space in the margins
> of this post for me to write it down...).

[...]

> finding a closed-form expression
> of "N-th contiguous subsequence" sounds much more fun,
> and it might in fact have some use (e.g., "get a random
> contiguous subsequence with all contiguous subsequences
> having equal probability" -- if you can enumerate such
> subsequences in closed form then that's a solved problem
> too).

Interesting...

[...]

> I suspect a modified Bayer-Moore

(Boyer-Moore)

> could be very fast and
> effective for contiguous-subsequence checks -- sublinear
> just like "real" Bayer-Moore!-)

I'm sure :-)

What's the built-in string-matching algorithm in Python?
(Not very important to me, but...)

[...]

> with bisection (binary search) over the
> 'places' sequences, O(M log N) appears like an easy
> upperbound, and maybe it can be cut down further.

Well... If you could store an array of size N for each
letter you could easily do it in O(1) time... But then
you'd need O(M*N) time to initialize the stuff :-)

If you add a requirement that the matched letters must
be relatively close to each other (within a constant
distance limit) you can get a speedup. (A reasonable
requirement in some cases.) Actually, I know of some
harwdare that does this very quickly.

[...]

> > I think it's quite OK to use
> >
> >   from string import split
>
> In a small module, maybe.  Remember the status of local
> import statements is in doubt... they may be disallowed
> in Python 2.1 (hopefully they won't be, but, who knows),
> so it's not clear that you could do this inside a function.
> "doubtful" seems a fair assessment to me -- there ARE
> doubts, depending on context; it's not necessarily "quite
> OK" -- it may or may not be.

OK... I guess that's one way of interpreting the statement...
A rather mild one, IMO :-)

To me the module doesn't even have to be small. Just think
about all the other function and class names you define in
a module... Why should "split" specifically have to have
a module prefix? If I choose to import it that may be as
thought through as defining a function myself. Only if I
have another function called "split" would I feel the
need to add its module name. I do the same thing when
importing classes in Java, for instance. When I instantiate
a JFrame I find it more convenient to write

   mainFrame = new JFrame();

than

   mainFrame = new javax.swing.JFrame();

(OK... So I know you dislike the word "convenient". Well...
I find it prettier too ;-)


> Why not just use string.split and play it safe...?\
>
> Or, just as safe, the string-method...?

If in some project the other method seems unsafe, I certainly
will.

> > I mean -- I'm not against string methods. They're nice. I
> > just get a bit woozy seeing them called on literals. It
>
> I think we'll all get used to that in time.

I guess I'll have to. But I don't think they are all
logical...

For instance:

  " ".join(somesequence)

That means 'join the elements of somesequence by " "'

On the other hand,

  [1].extend(somesequence)

means 'extend [1] by somesequence'

Why not write somesequence.extend([1]) or
somesequence.join(" ") (not both, of course) so that they would
be consistent with each other? Oh, well. I'm sure it's just
that I don't see the Greatness of it all.

> Why would mutators be any more worthy of syntax-sugar
> support than accessors?  You've lost me here...

OK. Accessors are fine, but we hardly need those in Python
<wink>

The " ".join([1,2,3]) doesn't seem like an accessor to me.
It seems like a plain function with a weird syntax.

But, hey, since this is the way it *is*, I'm the one with
a problem here :-)

>
> Alex
>


--

  Magnus Lie Hetland      (magnus at hetland dot org)

 "Reality is what refuses to disappear when you stop
  believing in it"                 -- Philip K. Dick








More information about the Python-list mailing list