Question about metacharacter '*'

Roy Smith roy at panix.com
Sun Jul 6 12:47:38 EDT 2014


In article <d8f8d76d-0a47-4f59-8f09-da2a44cc1d2e at googlegroups.com>,
 Rick Johnson <rantingrickjohnson at gmail.com> wrote:

> As an aside i prefer to only utilize a "character set" when
> nothing else will suffice. And in this case r"[0-9][0-9]*"
> can be expressed just as correctly  (and less noisy IMHO) as
> r"\d\d*".

Even better, r"\d+"

>>> import re
>>> re.search(r'(\d\d*)', '111aaa222').groups()
('111',)
>>> re.search(r'(\d+)', '111aaa222').groups()
('111',)
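
For anyone who wants to convince themselves that the three spellings really are interchangeable, here is a minimal sketch. The sample strings and the first_match helper are my own inventions for illustration, and I've stuck to plain ASCII input:

```python
import re

# The three spellings discussed in this thread.
patterns = [r"[0-9][0-9]*", r"\d\d*", r"\d+"]

# Hypothetical sample inputs, ASCII only.
samples = ["111aaa222", "aaa", "a7b", "2014-07-06"]

def first_match(pattern, text):
    """Return the first substring matched by pattern, or None if no match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Map each sample to the result of every pattern, in the order above.
results = {s: [first_match(p, s) for p in patterns] for s in samples}

for s, found in results.items():
    print(s, found)
```

On these inputs each row comes out with three identical entries, which is the whole point: the patterns differ only in how much typing they take.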

Oddly enough, I prefer character sets to the backslash notation, but I 
suppose that's largely because when I first learned regexes, that 
new-fangled backslash stuff hadn't been invented yet. :-)

I know I've said this before, but people should put more effort into 
learning regex.  There are lots of good tools in Python (startswith, 
endswith, split, in, etc.) which handle many of the most common regex use 
cases.  Regex is also not as easy to use in Python as it is in a 
language like Perl where it's baked into the syntax.  As a result, 
pythonistas tend to shy away from regex, and either never learn the full 
power, or let their skills grow rusty.  Which is a shame, because for 
many tasks, there's no better tool.
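
To make that concrete, here is a rough side-by-side of the string methods and their regex counterparts, followed by one job where regex earns its keep. The sample line and variable names are made up for the example:

```python
import re

line = "From: roy at panix.com"  # hypothetical sample input

# The built-in string tools cover the simple cases.
starts_plain = line.startswith("From:")
ends_plain = line.endswith(".com")
has_plain = "panix" in line
words_plain = line.split()

# The same checks written as regexes.
starts_re = re.match(r"From:", line) is not None
ends_re = re.search(r"\.com$", line) is not None
has_re = re.search(r"panix", line) is not None
words_re = re.split(r"\s+", line)

# A task the string methods don't handle in one step:
# pull out every run of digits.
digit_runs = re.findall(r"\d+", "111aaa222")

print(words_plain == words_re)
print(digit_runs)
```

For the first four checks the string methods are arguably the clearer choice. The findall call at the end is the sort of thing that gets awkward quickly without a regex.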
