[docs] copy&waste problem

Hauke Rehr homo_laber at yahoo.de
Fri Mar 9 15:12:22 CET 2012


Hello again,

I can’t agree with your rewrite either, sorry - my suggestion based on yours:

+   When the :const:`LOCALE` and :const:`UNICODE` flags are not specified,
+   matches any non-whitespace character; this is equivalent to the set
+   ``[^ \t\n\r\f\v]``. With :const:`LOCALE`, it will match those elements of
+   the above set not defined as space in the current locale. If
+   :const:`UNICODE` is set, those elements of ``[^ \t\n\r\f\v]`` not marked as
+   space in the Unicode character properties database will be matched.

Unless I have misunderstood the meaning of \S (that is: anything but \s), this should be correct.
The same applies to \W:

+   this will match anything other than ``[0-9_]`` that is not classified as
+   alphanumeric in the Unicode character properties database.
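
The complement relationship can be checked with a minimal sketch. It assumes a current Python 3 interpreter rather than the 2.7 ``re`` module this patch targets: there, str patterns are Unicode-aware by default, and ``re.ASCII`` gives roughly the behaviour the docs describe for "no flags". The helper name ``matches`` and the sample characters are only illustrative.

```python
import re

def matches(pattern, ch, flags=0):
    """Return True if `pattern` matches the single character `ch`."""
    return re.fullmatch(pattern, ch, flags) is not None

# U+00A0 is NO-BREAK SPACE, U+00E9 is LATIN SMALL LETTER E WITH ACUTE.
samples = [" ", "\t", "a", "5", "_", "!", "\u00a0", "\u00e9"]

for flags, label in ((re.ASCII, "ASCII"), (0, "Unicode")):
    for ch in samples:
        # \S is exactly "anything but \s", and \W is "anything but \w".
        assert matches(r"\S", ch, flags) != matches(r"\s", ch, flags)
        assert matches(r"\W", ch, flags) != matches(r"\w", ch, flags)
        print(label, repr(ch),
              "\\S" if matches(r"\S", ch, flags) else "\\s",
              "\\W" if matches(r"\W", ch, flags) else "\\w")
```

With ``re.ASCII`` the no-break space counts as \S and the accented letter as \W; in Unicode mode both flip, which is the narrowing of the ASCII sets described above.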


For the additional sentence, I’d prefer:

+   If both ``re.LOCALE`` and ``re.UNICODE`` are specified together,
+   these character classes will behave as if the union was given.

since that is the logic behind it.
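
As a side note for readers on Python 3 (assuming 3.6 or later), the combination discussed here is not accepted at all, so the question only arises for the 2.x ``re`` module. A small sketch; ``try_compile`` is a hypothetical helper, not part of ``re``:

```python
import re

def try_compile(pattern, flags):
    """Hypothetical helper: report whether re.compile accepts the flags."""
    try:
        re.compile(pattern, flags)
    except ValueError as exc:
        return "ValueError: %s" % exc
    return "ok"

both = re.LOCALE | re.UNICODE
print(try_compile(r"\S", both))   # str pattern: LOCALE is not allowed
print(try_compile(rb"\S", both))  # bytes pattern: UNICODE is not allowed
```

Each flag on its own is still accepted with the matching pattern type (``re.UNICODE`` with str, ``re.LOCALE`` with bytes).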

Hauke

--- Senthil Kumaran <senthil at uthcode.com> wrote on Fri, 9 Mar 2012:

From: Senthil Kumaran <senthil at uthcode.com>
Subject: Re: [docs] copy&waste problem
To: "Hauke Rehr" <homo_laber at yahoo.de>
CC: docs at python.org
Date: Friday, 9 March 2012, 09:18

Hello Hauke,

Yeah, it was pretty confusing. Thanks for catching this. How does this
change sound?

-   When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches
-   any non-whitespace character; this is equivalent to the set ``[^ \t\n\r\f\v]``
-   With :const:`LOCALE`, it will match any character not in this set, and not
-   defined as space in the current locale. If :const:`UNICODE` is set, this will
-   match anything other than ``[ \t\n\r\f\v]`` and characters marked as space in
-   the Unicode character properties database.
+   When the :const:`LOCALE` and :const:`UNICODE` flags are not specified,
+   matches any non-whitespace character; this is equivalent to the set ``[^
+   \t\n\r\f\v]`` With :const:`LOCALE`, it will match the above set and any
+   non-space character in the current locale. If :const:`UNICODE` is set, the
+   above set ``[^ \t\n\r\f\v]`` and characters not marked as space in the
+   Unicode character properties database.

 ``\w``
    When the :const:`LOCALE` and :const:`UNICODE` flags are not specified, matches
@@ -381,8 +381,8 @@
    any non-alphanumeric character; this is equivalent to the set ``[^a-zA-Z0-9_]``.
    With :const:`LOCALE`, it will match any character not in the set ``[0-9_]``, and
    not defined as alphanumeric for the current locale. If :const:`UNICODE` is set,
-   this will match anything other than ``[0-9_]`` and characters marked as
-   alphanumeric in the Unicode character properties database.
+   this will match anything other than ``[0-9_]`` plus characters classified as
+   not alphanumeric in the Unicode character properties database.


Hope the rewrite is less confusing.

We can also include this sentence somewhere.

If both re.LOCALE and re.UNICODE are specified together, re.LOCALE
would be matched first and then re.UNICODE.


-- 
Senthil
