regular expressions and internationalization (WAS: permuting letters...)
Steven Bethard
steven.bethard at gmail.com
Wed Nov 17 17:47:45 EST 2004
Dieter Maurer wrote:
> Steven Bethard <steven.bethard at gmail.com> writes on Fri, 12 Nov 2004 20:15:28 +0000 (UTC):
> >
> > Is there any way to match \w but not \d?
>
> It is: r'(?!\d)\w'
Yeah, I guess you could use negative lookahead assertions too. My
proposed solution to the problem discussed in this thread, which finds
runs of four or more letters:
>>> re.findall(r'[^\W\d_]{4,}', 'asdg1dfs _asfd s adfsa')
['asdg', 'asfd', 'adfsa']
A solution using a negative lookahead assertion (with the underscore
excluded as well, since \w matches it):
>>> re.findall(r'(?:(?![\d_])\w){4,}', 'asdg1dfs _asfd s adfsa')
['asdg', 'asfd', 'adfsa']
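For anyone who wants to check that the two patterns really agree, here is a
minimal sketch. It assumes Python 3, where str patterns are Unicode-aware by
default (on the Python 2 releases current when this was posted you would need
the re.UNICODE flag to get the same behaviour for non-ASCII letters). The
names LETTERS_CLASS, LETTERS_LOOKAHEAD and samples, and the second sample
string, are made up for illustration.

```python
import re

# Character class with a double negation:
# "anything that is not a non-word character, a digit, or an underscore".
LETTERS_CLASS = re.compile(r'[^\W\d_]{4,}')

# Negative lookahead: a word character that is not a digit or an underscore.
LETTERS_LOOKAHEAD = re.compile(r'(?:(?![\d_])\w){4,}')

samples = [
    'asdg1dfs _asfd s adfsa',
    # Non-ASCII letters written as escapes: i-diaeresis, e-acute, sharp s.
    'na\u00efve caf\u00e9_1234 stra\u00dfe',
]

for text in samples:
    by_class = LETTERS_CLASS.findall(text)
    by_lookahead = LETTERS_LOOKAHEAD.findall(text)
    # Both spellings should pick out the same runs of letters.
    assert by_class == by_lookahead
    print(by_class)
```

Both patterns give the same lists on these inputs; the character-class version
simply says it in fewer characters.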
This seems a fair bit more verbose (and IMHO harder to read) than the
solution I proposed, but perhaps you had a clearer version in mind?
I tend to shy away from lookahead assertions because IMHO there's
usually an easier way. They are occasionally useful though...
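As one illustration of a case where a lookahead does seem to earn its keep,
consider excluding words by prefix, which a character class alone cannot
express. This is only a sketch; the prefix "un", the sample text and the name
NOT_UN_WORD are hypothetical and not from the thread.

```python
import re

# A letters-only word that does not begin with the (hypothetical) prefix "un".
# The lookahead inspects the upcoming text without consuming any of it.
NOT_UN_WORD = re.compile(r'\b(?!un)[^\W\d_]+\b')

words = NOT_UN_WORD.findall('undo redo unit test')
print(words)  # ['redo', 'test']
```

The \b anchors keep the pattern from matching the tail of a rejected word
(for example 'ndo' inside 'undo').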
Steve