[docs] possible doc bug: 6.2.1. Regular Expression Syntax, *?, +?, ??

Georg Brandl georg at python.org
Tue Apr 12 01:53:08 EDT 2016


On 04/11/2016 09:34 PM, John Gordon wrote:
> I'm a Python newbie and am very reluctant to submit a doc bug since I don't know
> what I'm doing yet.
> 
> But I'm trying to learn to use re.search and am using your doc.
> 
> I can't figure out how the text I colored red below can be accurate. Besides its
> logic not making sense to me, I tried running it and it doesn't work: .*?
> doesn't match <H1>.
> 
> If I'm wrong I need to understand why.
> 
> ============
> 
> DOC BUG (maybe):
> https://docs.python.org/3/library/re.html
> 
>     6.2.1. Regular Expression Syntax
> 
>     *?, +?, ??
>     The '*', '+', and '?' qualifiers are all greedy; they match as much text as
>     possible. Sometimes this behaviour isn’t desired; if the RE <.*> is matched
>     against '<H1>title</H1>', it will match the entire string, and not just
>     '<H1>'. Adding '?' after the qualifier makes it perform the match in
>     non-greedy or minimal fashion; as few characters as possible will be
>     matched. *Using .*? in the previous expression will match only '<H1>'.*
> 
> 
> I tried the following and none returned <H1>:
> 
> >>> ttt = re.search('.*?','<H1>title</H1>')
> >>> ttt = re.search('(.*?)','<H1>title</H1>')
> >>> ttt = re.search(r'(.*?)','<H1>title</H1>')

Hi John,

this is indeed confusingly worded; it should give the whole expression,
i.e. ``<.*?>``.
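
To make the difference concrete, here is a minimal sketch comparing the three
patterns on the string from the docs. The variable names are only for
illustration. A bare ``.*?`` has nothing after it that forces it to consume
characters, so it is satisfied by the empty string at position 0; only with the
surrounding angle brackets does the non-greedy form stop at the first ``>``.

```python
import re

text = "<H1>title</H1>"

# Greedy: '.*' grabs as much as possible, so the match runs to the last '>'.
greedy = re.search(r"<.*>", text)

# Non-greedy, whole expression: '.*?' expands only until the first '>' is found.
lazy = re.search(r"<.*?>", text)

# Non-greedy on its own: zero characters already satisfy the pattern.
bare = re.search(r".*?", text)

print(greedy.group())      # <H1>title</H1>
print(lazy.group())        # <H1>
print(repr(bare.group()))  # ''
```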

Also, it should probably refrain from using HTML in a regex example altogether,
as parsing HTML/XML with regexes is one of the classic "I thought I'd use regex,
now I have two problems" cases.
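
For comparison, a rough sketch of pulling the heading text out with the
standard library's ``html.parser`` instead of a regex. The class name
``HeadingText`` and its attributes are made up for this example, and it only
handles the simple case of a non-nested ``<h1>``.

```python
from html.parser import HTMLParser


class HeadingText(HTMLParser):
    """Collect the text found inside <h1> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case, so '<H1>' arrives as 'h1'.
        if tag == "h1":
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.chunks.append(data)


parser = HeadingText()
parser.feed("<H1>title</H1>")
parser.close()
h1_text = "".join(parser.chunks)
print(h1_text)  # title
```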

I've changed the example now to be a bit less specific, and clarified the
replacement regex.

Thanks for the report!
Georg

