[Python-Dev] Python and the Unicode Character Database

Mon Nov 29 22:04:03 CET 2010

On 29.11.2010 at 19:33, Antoine Pitrou wrote:
> On Mon, 29 Nov 2010 08:22:46 +0100
> "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> The former ensures that literals in code are always readable; the latter
>>> allows users to enter numbers in their own number system. How could that
>>> be a bad thing?
>>
>> It's YAGNI, feature bloat. It gives the illusion of supporting something
>> that actually isn't supported very well (namely, parsing local number
>> strings). I claim that there is no meaningful application
>> of this feature.
> 
> Still, if it's not detrimental and it's not difficult to support,
> then why do you care? You aren't even maintaining that part of the code.

I sure do maintain the Unicode database implementation in Python - the
one that is being used (IMO incorrectly) to implement the conversion in
question (and also the one that triggered this thread).

> I don't think "remove feature bloat" is part of our development goals
> or practices. Given the diversity of our user base, such removal should
> be done carefully and only for serious reasons.

I think it's a serious reason that the intuitive expectation of many
people (including committers) deviates from the actual implementation -
so much that they clarify the documentation in a way that makes the
difference explicit.

Having a mismatch between the expected behavior and the actual behavior
is a serious problem because it could lead to security issues, e.g. when
someone relies on float() to perform certain syntactic checking, making
it then possible to sneak in values that cause corruption later on
(speaking theoretically, of course - I'm not aware of an application
that is vulnerable in this manner).
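
To illustrate the mismatch, here is a minimal sketch, assuming a
current Python 3 interpreter. The helper strict_float and its regular
expression are hypothetical names made up for this example, not part of
the standard library; the grammar is deliberately narrower than what
float() itself accepts (no surrounding whitespace, no "inf" or "nan").

```python
import re

# Arabic-Indic digits for "123" (U+0661, U+0662, U+0663).
arabic_indic = "\u0661\u0662\u0663"

# float() accepts any Unicode decimal digit, not only ASCII 0-9,
# so this call succeeds instead of raising ValueError.
print(float(arabic_indic))

# Hypothetical ASCII-only grammar for a decimal float literal.
# The digit classes are spelled out as [0-9] so that no Unicode
# digit can match, regardless of regex flags.
_ASCII_FLOAT = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)


def strict_float(text):
    """Convert text to float, rejecting anything outside the ASCII grammar."""
    if _ASCII_FLOAT.fullmatch(text) is None:
        raise ValueError("not an ASCII float literal: %r" % (text,))
    return float(text)


print(strict_float("1.5e3"))

try:
    strict_float(arabic_indic)
except ValueError as exc:
    print("rejected:", exc)
```

Code that uses float() alone as a syntax check would let the first
string through; a later stage that assumes ASCII digits would then see
input it does not expect.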

Regards,
Martin