re question (perhaps a stupid misunderstanding of regex-logic)
Jonathan Hogg
jonathan at onegoodidea.com
Tue Jul 30 08:34:57 EDT 2002
On 30/7/2002 13:53, in article
mailman.1028026365.30666.python-list at python.org, "Stefan Antoni"
<sasoft at gmx.de> wrote:
> thats why i ask the following question:
>
> i made the following regex:
> m = re.compile("[^M]*")
>
> i think, it would be read as: "find everything except the occurrences of
> the char 'M'".
Think of it instead as "0 or more sequential occurrences of a character that
isn't an 'M'".
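A minimal sketch of that reading, using short made-up strings rather than the
ones from your post. It anchors the pattern at the start of each string with
match() and prints what was consumed:

```python
import re

pattern = re.compile("[^M]*")

# Starting at a non-'M' character, the pattern consumes the whole run
# of non-'M' characters and stops just before the 'M'.
run = pattern.match("abcMdef")
print(repr(run.group()), run.span())

# Starting at an 'M', zero characters qualify, yet the match still
# succeeds: it is simply zero characters long.
empty = pattern.match("Mdef")
print(repr(empty.group()), empty.span())
```

The second match is not None; it is a successful match of length 0.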
> i wrote the following code:
>
> all = string.letters + string.digits + string.hexdigits + \
> string.whitespace
>
> M = re.compile("[^M]*")
> M.findall(all)
> ['abcdefghijklmnopqrstuvwxyzABCDEFGHIJKL', '', \
> 'NOPQRSTUVWXYZ01234567890123456789abcdefABCDEF\t\n\x0b\x0c\r ', '']
>
> this gives me a list with an empty item at [1] of the list.
> the documentation says: "findall: Find all occurrences of a
> pattern in a string."
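The empty string, '', represents a perfectly valid match for your regex as
it is precisely 0 sequential occurrences of a character that isn't an 'M'.
It turns up at the position of the 'M' itself, and again at the end of the
string, because no non-'M' characters are left to consume there.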
Try doing your experiment again with the regex "[^M]+" instead, which means
"1 or more sequential...".
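Something like the following should show the difference side by side. The
sample string "abMcd" is only an illustration, chosen to keep the output
short; it is not the string from your post:

```python
import re

text = "abMcd"  # illustrative sample, assumed for this sketch

# '*' allows zero-length matches, so findall also reports the empty
# match at the 'M' and the one at the end of the string.
star_matches = re.findall("[^M]*", text)

# '+' requires at least one character, so no empty matches appear.
plus_matches = re.findall("[^M]+", text)

print(star_matches)
print(plus_matches)
```

With "[^M]*" the list contains '' entries between and after the real runs;
with "[^M]+" only 'ab' and 'cd' remain.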
Jonathan