[issue35549] Add partial_match: bool = False argument to unicodedata.lookup
Steven D'Aprano
report at bugs.python.org
Fri Dec 21 19:37:08 EST 2018
Steven D'Aprano <steve+python at pearwood.info> added the comment:
I love the idea, but dislike the proposed interface.
As a general rule of thumb, Guido dislikes "constant bool parameters", where you pass a literal True or False as an argument to change a function's behaviour. Obviously this is not a hard rule, and there are functions in the stdlib that do this, but like Guido I think we should avoid them in general.
Instead, I think we should allow the name to include globbing symbols such as * and ? (I think full-blown re syntax is overkill). I have an implementation which I use:
lookup(name) -> single character # the current behaviour
lookup(name_with_glob_symbols) -> list of characters
For example lookup('latin * Z') returns the characters with these names:
['LATIN CAPITAL LETTER Z', 'LATIN SMALL LETTER Z', 'LATIN CAPITAL LETTER D WITH SMALL LETTER Z', 'LATIN LETTER SMALL CAPITAL Z', 'LATIN CAPITAL LETTER VISIGOTHIC Z', 'LATIN SMALL LETTER VISIGOTHIC Z']
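The implementation mentioned above is not shown, so here is only a minimal sketch of how such an interface might look. The function name glob_lookup, the GLOB_CHARS constant, and the use of fnmatch are all assumptions made for illustration; none of them is part of unicodedata. It scans every code point, so it is slow but simple.

```python
import fnmatch
import sys
import unicodedata

# Assumption: these are the symbols that switch on glob matching.
GLOB_CHARS = frozenset("*?[]")


def glob_lookup(name):
    """Hypothetical wrapper around unicodedata.lookup.

    Without glob symbols, behave exactly like unicodedata.lookup and
    return a single character. With glob symbols, return a list of all
    characters whose name matches the pattern.
    """
    if not GLOB_CHARS.intersection(name):
        return unicodedata.lookup(name)  # the current behaviour

    # Unicode names are upper case, so upper-casing the pattern gives
    # case-insensitive matching, as unicodedata.lookup already does.
    pattern = name.upper()
    matches = []
    for codepoint in range(sys.maxunicode + 1):
        char = chr(codepoint)
        char_name = unicodedata.name(char, None)  # None for unnamed code points
        if char_name is not None and fnmatch.fnmatchcase(char_name, pattern):
            matches.append(char)
    return matches
```

With this sketch, glob_lookup('LATIN SMALL LETTER Z') returns a single character, while glob_lookup('latin * Z') returns a list; mapping unicodedata.name over that list gives the names of the matched characters.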
A straight substring match takes at worst twelve extra characters:
lookup('*' + name + '*')
and only two if the name is a literal:
lookup('*spam*')
This is shorter than `partial_match=True` (18 characters), and more flexible and powerful. There's no ambiguity between the two styles of call because the globbing symbols * ? and [] are never legal in Unicode names. See section 4.8 of
http://www.unicode.org/versions/Unicode11.0.0/ch04.pdf
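The claim that glob symbols never appear in names can be spot-checked against whatever Unicode database your Python build ships with. This is only a quick sketch, not a substitute for the standard; the names GLOB_SYMBOLS and offenders are made up for the example.

```python
import sys
import unicodedata

# Assumption: these four symbols are the ones used for globbing.
GLOB_SYMBOLS = set("*?[]")

# Collect every named code point whose name contains a glob symbol.
offenders = []
for codepoint in range(sys.maxunicode + 1):
    char_name = unicodedata.name(chr(codepoint), "")  # "" for unnamed code points
    if GLOB_SYMBOLS.intersection(char_name):
        offenders.append((codepoint, char_name))

# An empty list means no character name clashes with the glob syntax.
print(offenders)
```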
----------
nosy: +steven.daprano
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35549>
_______________________________________