[Python-bugs-list] [ python-Bugs-635595 ] Misleading description of \w in regexs
noreply@sourceforge.net
Tue, 12 Nov 2002 15:14:40 -0800
Bugs item #635595, was opened at 2002-11-08 12:29
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=635595&group_id=5470
Category: Documentation
Group: Python 2.2
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Misleading description of \w in regexs
Initial Comment:
In the Regular Expression Syntax doc page
(http://www.python.org/dev/doc/devel/lib/re-syntax.html), the
description for \w is misleading (the same goes for \W).
The description indicates that, with the locale flag in effect,
\w includes "characters defined as letters" for the current
locale. In reading that, I took "letters" to mean characters
for which isalpha returns true, but, in fact, all characters
defined as alphanumerics for the current locale are
included (so \w works pretty much the same way with the locale
flag as with the unicode flag). For example (using '\xb2',
the superscript two):
Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'English_United States.1252'
>>> import re
>>> re.match(r'\w', '\xb2', re.L).group()
'\xb2'
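The same distinction between "letter" and "alphanumeric" can be sketched without depending on the machine's locale. The snippet below is only an illustration, and it assumes a current Python 3 interpreter rather than the 2.2.2 session above: there, str patterns use Unicode character classes by default, which the report describes as behaving much like the locale flag. The helper name classify is hypothetical and not part of the re module.

```python
import re

# The superscript two, the same character as '\xb2' in the session above.
SUPERSCRIPT_TWO = "\u00b2"


def classify(ch):
    """Return (isalpha, isalnum, matches_w) for a single character.

    Hypothetical helper for illustration only.
    """
    return (ch.isalpha(), ch.isalnum(), re.match(r"\w", ch) is not None)


if __name__ == "__main__":
    # Unicode matching (the Python 3 default for str patterns).
    print(classify(SUPERSCRIPT_TWO))
    # With re.ASCII, \w is restricted to [a-zA-Z0-9_].
    print(re.match(r"\w", SUPERSCRIPT_TWO, re.ASCII))
```

If \w covered only letters, the first and third values for the superscript two would agree. On a current CPython 3 they are expected to differ: the character is not alphabetic, but it is alphanumeric and \w matches it.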
----------------------------------------------------------------------
>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-11-12 18:14
Message:
Fixed in Doc/lib/libre.tex revisions 1.91 and 1.73.6.11.
----------------------------------------------------------------------