[issue14258] Better explain re.LOCALE and re.UNICODE for \S and \W

Senthil Kumaran report at bugs.python.org
Mon Mar 12 04:22:06 CET 2012


New submission from Senthil Kumaran <senthil at uthcode.com>:

Opening this bug following this discussion: http://mail.python.org/pipermail/docs/2012-March/007829.html

library/re.html

\S

When the LOCALE and UNICODE flags are not specified, matches any non-whitespace character; this is equivalent to the set [^ \t\n\r\f\v] With LOCALE, it will match any character not in this set, and not defined as space in the current locale. If UNICODE is set, this will match anything other than [ \t\n\r\f\v] and characters marked as space in the Unicode character properties database.

This is wrong. With LOCALE set, it should be [^ \t\n\r\f\v] plus any non-space character in that locale.
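
For reference, here is a small sketch of how the flags change what \S matches. It assumes Python 3, where re.ASCII is used as a stand-in for the Python 2.7 "no flags" behaviour quoted above and Unicode matching is the default for str patterns; the sample string is made up for illustration. The LOCALE case is left out because its result depends on the locale installed on the machine.

```python
import re

# U+00A0 (NO-BREAK SPACE) is whitespace per Unicode but is not in [ \t\n\r\f\v].
sample = "a\u00a0b\tc"

# Assumption: re.ASCII stands in for the Python 2.7 "no flags" case.
ascii_non_space = re.findall(r"\S", sample, re.ASCII)
# Unicode matching (already the default for str patterns in Python 3).
unicode_non_space = re.findall(r"\S", sample, re.UNICODE)

print(ascii_non_space)    # ['a', '\xa0', 'b', 'c']
print(unicode_non_space)  # ['a', 'b', 'c']

# Under either flag, \S is the exact complement of \s:
# every character matches exactly one of the two.
for flag in (re.ASCII, re.UNICODE):
    for ch in sample:
        is_space = re.match(r"\s", ch, flag) is not None
        is_non_space = re.match(r"\S", ch, flag) is not None
        assert is_space != is_non_space
```

The no-break space counts as a non-whitespace character only in the ASCII case, which suggests the docs could describe \S simply as the complement of whatever \s matches under the active flag.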

----------
assignee: orsenthil
components: Documentation
messages: 155434
nosy: orsenthil
priority: low
severity: normal
status: open
title: Better explain re.LOCALE and re.UNICODE for \S and \W
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14258>
_______________________________________

