re question (perhaps a stupid misunderstanding of regex-logic)

Jonathan Hogg jonathan at onegoodidea.com
Tue Jul 30 08:34:57 EDT 2002


On 30/7/2002 13:53, in article
mailman.1028026365.30666.python-list at python.org, "Stefan Antoni"
<sasoft at gmx.de> wrote:

> thats why i ask the following question:
> 
> i made the following regex:
> m  = re.compile("[^M]*")
> 
> i think, it would be read as: "find everything except the occurrences of
> the char 'M'".

Think of it instead as "0 or more sequential occurrences of a character that
isn't an 'M'".

> i wrote the following code:
> 
> all = string.letters + string.digits + string.hexdigits + \
> string.whitespace
> 
> M  = re.compile("[^M]*")
> M.findall(all)
> ['abcdefghijklmnopqrstuvwxyzABCDEFGHIJKL', '', \
> 'NOPQRSTUVWXYZ01234567890123456789abcdefABCDEF\t\n\x0b\x0c\r ', '']
> 
> this gives me a list with an empty item at [1] of the list.
> the documentation says: "findall: Find all occurrences of a
> pattern in a string."

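The empty string, '', is a perfectly valid match for your regex, as it is
precisely 0 sequential occurrences of a character that isn't an 'M'. findall
finds one such match at the 'M' itself, where no non-'M' character can be
consumed, and another at the very end of the string.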
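
To see the zero-length match directly, here is a minimal sketch, assuming
Python 3 and a short made-up test string ("abcMdef" is only an illustration,
not your original data). It anchors the pattern at the index of the 'M' and
looks at the span of the result:

```python
import re

pattern = re.compile("[^M]*")
text = "abcMdef"  # hypothetical sample string; the 'M' sits at index 3

# Starting at the 'M', the character class cannot consume anything, so the
# '*' settles for zero repetitions and the match succeeds with an empty span.
match_at_m = pattern.match(text, 3)
span_at_m = match_at_m.span()

# findall reports that empty match too, plus one more at the end of the string.
matches = pattern.findall(text)

print(span_at_m)
print(matches)
```

A span whose start and end are equal is exactly the "0 occurrences" case.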

Try doing your experiment again with the regex "[^M]+" instead, which means
"1 or more sequential...".
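
Something along these lines should do it. This is only a sketch, assuming
Python 3, where string.letters no longer exists and string.ascii_letters is
used in its place; the variable names are my own:

```python
import re
import string

# Same test data as in your experiment, with string.ascii_letters standing in
# for string.letters (an assumption for Python 3).
chars = (string.ascii_letters + string.digits +
         string.hexdigits + string.whitespace)

# '+' requires at least one non-'M' character, so a zero-length match is
# impossible and no empty strings appear in the result.
not_m = re.compile("[^M]+")
pieces = not_m.findall(chars)

print(pieces)
```

The result should contain just the two runs on either side of the 'M', with
no '' entries.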

Jonathan



