Question about metacharacter '*'

Rick Johnson rantingrickjohnson at gmail.com
Sun Jul 6 13:38:23 EDT 2014


On Sunday, July 6, 2014 11:47:38 AM UTC-5, Roy Smith wrote:
> Even better, r"\d+"
> >>> re.search(r'(\d\d*)', '111aaa222').groups()
> ('111',)
> >>> re.search(r'(\d+)', '111aaa222').groups()
> ('111',)

Yes, good catch! I had failed to reduce your original
pattern down to its most fundamental aspects for the sake
of completeness and instead opted to modify it in a manner
that mirrored your example.

> Oddly enough, I prefer character sets to the backslash
> notation, but I suppose that's largely because when I
> first learned regexes, that new-fangled backslash stuff
> hadn't been invented yet. :-) 

Ha, point taken! :-)

Character sets really shine when you need a fixed range of
letters or numbers which is NOT covered by one of the
"special sequences" such as \d, \D, \w, or \W.

Say you want to match any letters between "c" and "m" or the
digits between "3" and "6". Defining that pattern using OR'd
"char literals" would be a massive undertaking!

Another great use of character sets is skipping chars that
don't match a "target". For instance, a Python comment will
start with one hash char and proceed to the end of the
line, which, when accounting for leading white-space,
could be defined by the pattern:

    r'\s*#[^\n]*'
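
A quick sketch of how such a pattern behaves when applied line
by line (the `lines` list is invented for the example). It
uses the negated set with a trailing `*` quantifier, since a
bare `[^\n]` consumes only one character after the hash:

```python
import re

# Optional leading white-space, one hash, then everything up to the newline.
comment = re.compile(r'\s*#[^\n]*')

# Hypothetical source lines, used only for illustration.
lines = [
    "x = 1",
    "    # indented comment",
    "# top-level comment",
    "y = 2  # trailing",
]

# re.match anchors at the start of each string, so only lines that
# begin with (optional white-space and) a hash are kept.
comment_lines = [line for line in lines if comment.match(line)]
print(comment_lines)

# Without the quantifier, only one char after the hash is consumed.
short = re.match(r'\s*#[^\n]', "    # indented comment").group()
print(repr(short))  # '    # '
```

Note that `match` skips the trailing comment on the last line;
finding those would need `search` plus some care about hashes
inside string literals.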
    
> Regex is also not as easy to use in Python as it is in a
> language like Perl where it's baked into the syntax.  As a
> result, pythonistas tend to shy away from regex, and
> either never learn the full power, or let their skills
> grow rusty. Which is a shame, because for many tasks,
> there's no better tool.

Agreed, but unfortunately, like many other languages, Python
has decided to import all the illogical aspects of regex
syntax from other languages instead of creating a "new" regex
syntax that is consistent and logical. They did the same
thing with Tkinter, and what a nightmare!

And don't misunderstand my statements: I don't intend that
we should create a syntax of verbosity. NO, we *CAN* keep
the syntax succinct whilst eliminating the illogical and
inconsistent aspects that plague our patterns.

Will regex ever be easy to learn? Probably not, but it can
be easier to use if only we put on our "big boy" pants and
decide to do something about it!



