Strange regex problem
John Machin
sjmachin at lexicon.net
Mon Mar 3 17:15:27 EST 2003
Rene Pijlman <reply.in at the.newsgroup> wrote in message news:<obt46vosirrgbeuvbt0mv8gha2dp6p1im2 at 4ax.com>...
> Gary Herron:
> >I have just (last week) volunteered to take over maintenance of the
> >regular expression code, so I'll think about fixing this,
> >but it's not clear to me what a fix should entail.
> >
> > * Document the flag.
>
> Sure. But would it have helped in this case?
>
> > * Ignore the flag.
>
> I guess whatever it does is there for a reason.
>
> > * Raise an exception for any flag bit other than the documented flags.
>
> That sounds like a good idea (assuming you're also going to
> document all sensible bits of the flag :-) ), but it still won't
> catch all cases.
>
> > * Others?
>
> I don't think the undocumented flag is the real problem here. If
> my understanding is correct, the documentation of search() on
> compiled vs. uncompiled re's is somewhat confusing (see also
> news:ej646vod013oreo6bvj8ctgdmumdmjjg16 at 4ax.com). So I think
> that improving this documentation would probably be more
> effective.

IMO the doco should juxtapose the contenders (for each of search and
match), and explain that the two-step variety has more options:

    re_obj = re.compile(pattern[, flags])
    match_obj = re_obj.search(string[, pos[, endpos]])

versus

    match_obj = re.search(pattern, string[, flags])

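
To make the difference in argument positions concrete, here is a minimal
sketch in present-day Python 3. The sample strings and variable names are
made up for illustration; the point is only that the argument after the
string means pos for a compiled pattern's search() but flags for re.search().

```python
import re

text = "xxSpam and eggs"  # hypothetical sample text

# Two-step: flags go to compile(); search() accepts pos and endpos.
re_obj = re.compile(r"spam", re.IGNORECASE)
m1 = re_obj.search(text)       # scans the whole string
m2 = re_obj.search(text, 3)    # pos=3: scanning starts at "pam and eggs"

# One-step: flags is the third argument; there is no pos or endpos.
m3 = re.search(r"spam", text, re.IGNORECASE)

# Pitfall: a flag passed to a compiled pattern's search() is read as pos.
# re.IGNORECASE has the integer value 2, so scanning starts at index 2.
m4 = re.compile(r"spam").search("spam", re.IGNORECASE)

print(m1.start(), m2, m3.start(), m4)
```

No exception is raised for the last call, which is why the mistake is easy
to miss.
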
It may also be worthwhile stating (for sub, subn, etc. as well) that the
two-step option may be (or is, if that turns out to be true) faster than
the one-step option when the same pattern is used for multiple searches.
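
Whether the two-step form is actually faster is something to measure rather
than assume. The following is a rough sketch using timeit; the pattern, text
and repeat count are arbitrary choices for illustration. Current CPython
caches recently compiled patterns inside the module-level functions, so any
difference seen here is mostly per-call lookup overhead, and the numbers
will vary by machine and version.

```python
import re
import timeit

pattern = r"\d+"                      # arbitrary example pattern
text = "order 66 shipped in 3 days"   # arbitrary example text
re_obj = re.compile(pattern)

# Time the same search repeatedly via each route.
one_step = timeit.timeit(lambda: re.search(pattern, text), number=20000)
two_step = timeit.timeit(lambda: re_obj.search(text), number=20000)

print(f"one-step: {one_step:.4f}s  two-step: {two_step:.4f}s")
```
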
I also think that the module-level functions should be described in
terms of the two-step variety, not the other way around ... "As a
convenience for one-off and/or casual usage, result = re.xxxx(...) is
provided as (mostly) equivalent to re_obj = re.compile(...); result =
re_obj.xxxx(...)"
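
A small sketch of that "(mostly) equivalent" relationship, using findall
and sub as stand-ins for re.xxxx. The pattern and text are invented for
the example, and it only checks that the results agree; it does not cover
the pos/endpos options that exist only on the compiled object.

```python
import re

pattern = r"\d+"                      # hypothetical pattern
text = "order 66 shipped in 3 days"   # hypothetical input

# One-step convenience form.
found_one = re.findall(pattern, text)
subbed_one = re.sub(pattern, "#", text)

# Two-step form: compile once, then call the method on the object.
re_obj = re.compile(pattern)
found_two = re_obj.findall(text)
subbed_two = re_obj.sub("#", text)

print(found_one == found_two, subbed_one == subbed_two)
```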