Strange regex problem

John Machin sjmachin at lexicon.net
Mon Mar 3 17:15:27 EST 2003


Rene Pijlman <reply.in at the.newsgroup> wrote in message news:<obt46vosirrgbeuvbt0mv8gha2dp6p1im2 at 4ax.com>...
> Gary Herron:
> >I have just (last week) volunteered to take over maintenance of the
> >regular expression code, so I'll think about fixing this, 
> >but it's not clear to me what a fix should entail.
> >
> > * Document the flag.
> 
> Sure. But would it have helped in this case?
> 
> > * Ignore the flag.
> 
> I guess whatever it does is there for a reason.
> 
> > * Raise an exception for any flag bit other than the documented flags.
> 
> That sounds like a good idea (assuming you're also going to
> document all sensible bits of the flag :-) ), but it still won't
> catch all cases.
> 
> > * Others?
> 
> I don't think the undocumented flag is the real problem here. If
> my understanding is correct, the documentation of search() on
> compiled vs. uncompiled re's is somewhat confusing (see also
> news:ej646vod013oreo6bvj8ctgdmumdmjjg16 at 4ax.com). So I think
> that improving this documentation would probably be more
> effective.

IMO the doco should juxtapose the contenders (for each of search and
match), and explain that the two-step variety has more options:

   re_obj = re.compile(pattern[, flags]) 
   match_obj = re_obj.search(string[, pos[, endpos]]) 

versus

   match_obj = re.search(pattern, string[, flags]) 
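A minimal sketch of the two call shapes side by side, assuming a made-up sample string and pattern (neither comes from the original report). The last part shows one way the two signatures can be confused: on a compiled pattern the second positional argument of search() is pos, so a flag passed there is silently read as a start offset.

```python
import re

text = "Spam, spam, SPAM"  # sample data, purely illustrative

# Two-step: flags go to compile(); search() accepts pos and endpos.
re_obj = re.compile(r"spam", re.IGNORECASE)
m_two_step = re_obj.search(text, 1)   # start scanning at index 1

# One-step: flags go to search(); there is no pos/endpos here.
m_one_step = re.search(r"spam", text, re.IGNORECASE)

print(m_two_step.span())
print(m_one_step.span())

# Possible pitfall: the flag lands in the pos slot. re.IGNORECASE has the
# integer value 2, so this is a case-sensitive search starting at index 2.
pitfall = re.compile(r"spam").search("SPAM and eggs", re.IGNORECASE)
print(pitfall)
```

No exception is raised in the last call; the search simply fails to match, which is what makes the mistake hard to spot.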

It may also be worth stating (for sub, subn, etc. as well) that the
two-step option may be (or is, if true) faster than the one-step option
when the same pattern is used for many searches.
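Whether the two-step form is actually faster is easy to measure rather than assert. The following is a rough timing sketch, assuming an arbitrary pattern and synthetic input lines; the numbers will vary by machine and Python version. Note that current CPython caches recently compiled patterns inside the re module, so the gap may be small and come mostly from the per-call cache lookup.

```python
import re
import timeit

# Illustrative pattern and data, not taken from the thread.
pattern = r"\b\w+@\w+\.\w+\b"
lines = ["contact: user%d@example.com" % i for i in range(200)]

def one_step():
    # The pattern string is handed to the module-level function every time.
    return [re.search(pattern, line) is not None for line in lines]

re_obj = re.compile(pattern)

def two_step():
    # The pattern is compiled once, outside the loop.
    return [re_obj.search(line) is not None for line in lines]

t_one = timeit.timeit(one_step, number=200)
t_two = timeit.timeit(two_step, number=200)
print("one-step: %.4fs  two-step: %.4fs" % (t_one, t_two))
```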

I also think that the module-level functions should be described in
terms of the two-step variety, not the other way around ... "As a
convenience for one-off and/or casual usage, result = re.xxxx(...) is
provided as (mostly) equivalent to re_obj = re.compile(...); result =
re_obj.xxxx(...)"
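The "(mostly) equivalent" wording can be illustrated with a small helper. This is only a sketch: one_step() is a hypothetical name invented here, not part of the re module, and it ignores details such as the module's internal pattern cache.

```python
import re

def one_step(name, pattern, *args, flags=0):
    """Hypothetical helper: emulate re.<name>(pattern, *args, flags=flags)
    by compiling first and then calling the method of the same name."""
    re_obj = re.compile(pattern, flags)
    return getattr(re_obj, name)(*args)

# Compare against the real module-level functions on sample input.
sub_same = one_step("sub", r"\d+", "#", "a1b22") == re.sub(r"\d+", "#", "a1b22")
subn_same = one_step("subn", r"\d+", "#", "a1b22") == re.subn(r"\d+", "#", "a1b22")
findall_same = (one_step("findall", r"[a-z]", "A1b2", flags=re.IGNORECASE)
                == re.findall(r"[a-z]", "A1b2", re.IGNORECASE))
print(sub_same, subn_same, findall_same)
```

Describing the module-level functions this way makes it explicit that flags belong to the compile step, while the remaining arguments belong to the method call.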



