[Python-bugs-list] [ python-Bugs-603930 ] string.punctuation

noreply@sourceforge.net noreply@sourceforge.net
Tue, 03 Sep 2002 14:19:10 -0700


Bugs item #603930, was opened at 2002-09-03 13:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=603930&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Ignacio Dosil Lago (do_sil)
Assigned to: Nobody/Anonymous (nobody)
Summary: string.punctuation

Initial Comment:
string.punctuation doesn't include the characters ¡ and
¿ used in Spanish and Galician.
When I start a Python interactive session, import the
string module and "print string.punctuation", these two
characters never appear (independently of the
interpreter version).
Zope uses this module to implement structured text.
When somebody tries to write structured text, in
Spanish, Galician or ..., that includes either of these
two characters, it doesn't work. For instance, **¡this
should be bold text if it were structured text in
Galician!**
Is this a shortcoming of the Python library or does it
depend on a third party?

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-09-03 23:19

Message:
Logged In: YES 
user_id=21627

[...]g objects, which would be locale-aware.

However, it might be better for Zope to use a truly
locale-independent determination of punctuation, namely the
Unicode database. Invoking unicodedata.category(u"\xa1")
gives "Po". The Unicode database recognizes the
following punctuation categories:

Pc  Punctuation, Connector
Pd  Punctuation, Dash
Ps  Punctuation, Open
Pe  Punctuation, Close
Pi  Punctuation, Initial quote (may behave like Ps or Pe
    depending on usage)
Pf  Punctuation, Final quote (may behave like Ps or Pe
    depending on usage)
Po  Punctuation, Other

So I would recommend that Zope use the Unicode database.
They should check for categories starting with "P".
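
A minimal sketch of such a check, assuming Python 3 (where every str
is Unicode, so the u"" prefix is not needed). The helper name
is_punctuation is hypothetical; it is not part of Zope or of the
standard library:

```python
import unicodedata


def is_punctuation(ch):
    """Return True if the single character ch has a Unicode
    general category starting with "P" (Pc, Pd, Ps, Pe, Pi, Pf, Po)."""
    return unicodedata.category(ch).startswith("P")


# U+00A1 (inverted exclamation mark) and U+00BF (inverted question
# mark) are both classified as "Po" in the Unicode database.
print(unicodedata.category("\xa1"))
print(is_punctuation("\xa1"), is_punctuation("\xbf"), is_punctuation("a"))
```

Because the lookup goes through the Unicode database, the result
does not depend on the current locale setting.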

It might be worth noting that string.punctuation contains
characters that are not classified as punctuators in Unicode:

$ Sc
+ Sm
< Sm
= Sm
> Sm
^ Sk
` Sk
| Sm
~ Sm

(Sm: Symbol, Math; Sc: Symbol, Currency; Sk: Symbol, Modifier)

So it might be that Zope is also interested in symbols
(categories starting with "S").
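
The list above can be reproduced with a short sketch, again assuming
Python 3. The names non_punctuation and is_punct_or_symbol are
hypothetical helpers chosen for illustration:

```python
import string
import unicodedata


def non_punctuation(chars=string.punctuation):
    """Return (char, category) pairs for the characters in chars whose
    Unicode general category does not start with "P"."""
    return [(c, unicodedata.category(c)) for c in chars
            if not unicodedata.category(c).startswith("P")]


def is_punct_or_symbol(ch):
    """Return True if ch is in a punctuation ("P") or symbol ("S")
    category."""
    return unicodedata.category(ch)[0] in "PS"


# Print each character of string.punctuation that Unicode classifies
# as something other than punctuation, with its category.
for c, cat in non_punctuation():
    print(c, cat)
```

Accepting both "P" and "S" categories, as is_punct_or_symbol does,
would cover every character in string.punctuation as well as ¡ and ¿.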


----------------------------------------------------------------------
