[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

Vlastimil Brom report at bugs.python.org
Mon Sep 13 00:01:06 CEST 2010


Vlastimil Brom <vlastimil.brom at gmail.com> added the comment:

Thank you both for the explanations; I suspected there would be strong reasons for the conservative approach to backward compatibility.
Thanks for the block() and script() offer, Matthew, but I believe this might clutter the interface of the module, while the functionality belongs somewhere else.
(Personally, I just solved this need by directly grabbing 
http://www.unicode.org/Public/UNIDATA/Scripts.txt using regex :-)
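
A minimal sketch of that approach, assuming the usual Scripts.txt line layout (a code point or a "start..end" range, a semicolon, the script name, then a "#" comment). The names SAMPLE, LINE_RE, parse_scripts and script_of are hypothetical, and the sample lines are only an illustrative excerpt, not the real file, which would have to be downloaded from the URL above:

```python
import re

# Illustrative excerpt in the Scripts.txt layout (assumption: the real
# file follows the same "range ; Script # comment" pattern throughout).
SAMPLE = """\
# Scripts-style sample (illustrative excerpt only)
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR
0370..0373    ; Greek # L&   [4] GREEK CAPITAL LETTER HETA..GREEK SMALL LETTER ARCHAIC SAMPI
"""

# One code point, optionally followed by "..end", then "; ScriptName".
LINE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_scripts(text):
    """Return a list of (start, end, script) tuples; skip comments and blanks."""
    ranges = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # comment line, blank line, or unexpected layout
        start = int(m.group(1), 16)
        end = int(m.group(2), 16) if m.group(2) else start
        ranges.append((start, end, m.group(3)))
    return ranges


def script_of(char, ranges):
    """Linear lookup of the script for a single character, or None if unlisted."""
    cp = ord(char)
    for start, end, name in ranges:
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    table = parse_scripts(SAMPLE)
    print(script_of("Q", table))
```

The linear scan is good enough for occasional lookups; for heavier use the ranges could be sorted and searched with bisect.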

Part of the problem for unicodedata might be that this is a different data file from UnicodeData.txt (which is the only one currently used, IIRC).

On the other hand, it might be worthwhile to synchronise these features with such updates in unicodedata (block, script, Unicode range; maybe the full names of the character properties could be added too).
As Unicode 6.0 is due at the end of September, this might also reduce the effort of upgrading regex for it.

Do you think it would be appropriate/realistic to create a feature request in the bug tracker for enhancing unicodedata?
(Unfortunately, I must confess I am unable to contribute code in this area; without C knowledge I always failed to find any useful data in the optimised sources of unicodedata, hence I rather scanned the online data files directly.)

vbr

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________

