regex walkthrough

MRAB python at mrabarnett.plus.com
Sat Dec 8 19:56:37 EST 2012


On 2012-12-08 23:27, Hans Mulder wrote:
> On 8/12/12 23:19:40, rh wrote:
>> I reduced the expression too. Now I wonder why re.DEBUG doesn't unroll
>> category_word. Some other re flag?
>
> The category word consists of the '_' character and the
> characters for which .isalnum() returns True.
>
> On my system there are 102158 characters matching '\w':
>
That would be because you're using Python 3, where strings are Unicode.
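
To make the Python 3 point concrete, here is a rough sketch that counts the codepoints matching '\w' with and without the re.ASCII flag. The helper name count_word_chars is made up for illustration; re.ASCII itself is a standard flag that restricts '\w' to [a-zA-Z0-9_] for str patterns.

```python
import re
import sys

def count_word_chars(flags=0):
    """Count codepoints whose one-character string matches \\w."""
    pattern = re.compile(r'\w', flags)
    return sum(1 for i in range(sys.maxunicode + 1) if pattern.match(chr(i)))

# With re.ASCII only [a-zA-Z0-9_] qualifies: 26 + 26 + 10 + 1 characters.
ascii_count = count_word_chars(re.ASCII)

# The default for str patterns in Python 3 is Unicode matching, so this
# count is far larger and varies with the interpreter's Unicode database.
unicode_count = count_word_chars()

print(ascii_count)                  # 63
print(unicode_count > ascii_count)  # True
```

The Unicode count is deliberately not hard-coded here, since it differs between Python builds.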

> >>> import re, sys
> >>> sum(1 for i in range(sys.maxunicode+1)
> ...     if re.match(r'\w', chr(i)))
> 102158
> >>>
>
> You wouldn't want to see the complete list.
>
The number of such codepoints depends on which version of Unicode is
being supported (Unicode is evolving all the time).
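
If you want to check this on your own build, something like the following should do. It is only a sketch: is_word_char and mismatches are names invented here, and it assumes CPython, where '\w' for str patterns is documented as the alphanumerics per str.isalnum() plus the underscore. unicodedata.unidata_version reports which version of the Unicode database the interpreter was built with.

```python
import re
import sys
import unicodedata

word = re.compile(r'\w')

def is_word_char(ch):
    # Hans's description of category_word: '_' or anything alphanumeric.
    return ch == '_' or ch.isalnum()

# Codepoints where the regex and the description disagree (expected: none).
mismatches = [i for i in range(sys.maxunicode + 1)
              if bool(word.match(chr(i))) != is_word_char(chr(i))]

# Total number of codepoints matching \w on this interpreter.
count = sum(1 for i in range(sys.maxunicode + 1) if word.match(chr(i)))

print("Unicode database version:", unicodedata.unidata_version)
print("codepoints matching \\w:", count)
print("disagreements with isalnum()/'_':", len(mismatches))
```

Running it under two Python releases built against different Unicode versions should give two different counts.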


