Something weird about re.finditer()

Peter Otten __peter__ at web.de
Wed Apr 15 05:09:00 EDT 2009


Gilles Ganault wrote:

> I stumbled upon something funny while downloading web pages and
> trying to extract one or more blocks from a page: Even though Python
> seems to return at least one block, it doesn't actually enter the for
> loop:
> 
> ======
> re_block = re.compile('before (.+?) after',re.I|re.S|re.M)
> 
> #Here, get web page and put it into "response"
> 
> blocks = None
> blocks = re_block.finditer(response)
> if blocks == None:
>         print "No block found"
> else:
>         print "Before blocks"
>         for block in blocks:
>                 #Never displayed!
>                 print "In blocks"
> ======
> 
> Since "blocks" is no longer set to None after calling finditer()...
> but doesn't contain a single block... what does it contain then?

This is by design. When there are no matches, re.finditer() returns an empty
iterator, not None, so the "blocks == None" test is never true and the for
loop simply runs zero times.
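
A minimal sketch of that behaviour, reusing the pattern from the question. The
input string is made up for illustration, and the code uses Python 3 syntax
rather than the Python 2 print statements seen elsewhere in this thread:

```python
import re

re_block = re.compile('before (.+?) after', re.I | re.S | re.M)

# Hypothetical input that contains no "before ... after" block.
blocks = re_block.finditer("nothing to see here")

# finditer() hands back an iterator object even when nothing matched.
is_none = blocks is None
# Draining the iterator shows that it yields no match objects at all.
matches = list(blocks)

print(is_none)
print(matches)
```

An iterator does not know in advance whether it will yield anything, which is
why the only way to find out is to start iterating over it.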

Change your code to something like this:

has_matches = False
for match in re_block.finditer(response):
    if not has_matches:
        has_matches = True
        print "before blocks"
    print "in blocks"
if not has_matches:
    print "no block found"

or

match = None
for match in re_block.finditer(response):
    print "in blocks"
if match is None:
    print "no block found"
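
If the page is small enough to hold all matches in memory at once, a third
option is to collect them in a list first, because an empty list is falsy and
can be tested directly. This is only a sketch in Python 3 syntax;
extract_blocks() and the sample string are hypothetical names and data, not
part of the original code:

```python
import re

re_block = re.compile('before (.+?) after', re.I | re.S | re.M)

def extract_blocks(response):
    """Return the captured text of every block found in response."""
    return [m.group(1) for m in re_block.finditer(response)]

# Made-up stand-in for the downloaded page.
blocks = extract_blocks("Before one after, BEFORE two\nlines AFTER")

if not blocks:
    print("no block found")
else:
    print("before blocks")
    for block in blocks:
        print("in blocks:", block)
```

The trade-off is that the list is built before the loop starts, so the lazy
behaviour of finditer() is lost.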

Peter


