regular expression unicode character class trouble
Steven Bethard
steven.bethard at gmail.com
Sun Sep 4 15:08:36 EDT 2005
Diez B. Roggisch wrote:
> Hi,
>
> In a Unicode environment I need the character class
>
> set("\w") - set("[0-9]")
>
> or alpha w/o num. Any ideas how to create that?
I'd use something like r"[^_\d\W]", that is, everything that is not an
underscore, a digit, or a non-word character (\W). In action:
py> import re
py> re.findall(r'[^_\d\W]+', '42badger100x__xxA1BC')
['badger', 'x', 'xxA', 'BC']
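Since the question was about a Unicode environment, here is a minimal sketch of the same pattern on non-ASCII input. It assumes Python 3, where str patterns treat \w and \d as Unicode-aware by default (on Python 2 you would pass re.UNICODE and use unicode strings). The name find_alpha_runs and the sample strings are only illustrative:

```python
import re

# Negated class: not an underscore, not a digit, not a non-word character.
# What is left is the set of \w characters minus digits and "_".
ALPHA_RUN = re.compile(r'[^_\d\W]+')

def find_alpha_runs(text):
    """Return the maximal runs of letter-like characters in text."""
    return ALPHA_RUN.findall(text)

# ASCII input from the example above.
print(find_alpha_runs('42badger100x__xxA1BC'))

# Accented letters are kept: "na\u00efve" and "caf\u00e9".
print(find_alpha_runs('na\u00efve42_caf\u00e9'))

# Non-ASCII decimal digits (here ARABIC-INDIC DIGIT THREE) are dropped too.
print(find_alpha_runs('abc\u0663def'))
```

Compiling the same pattern with re.ASCII would restrict \w and \d to ASCII, so accented letters would count as \W and words would be split at them.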
HTH,
STeVe