[Python-ideas] Add "namereplace" error handler

Steven D'Aprano steve at pearwood.info
Tue Jun 11 17:47:54 CEST 2013


On 11/06/13 23:32, Victor Stinner wrote:
> 2013/6/11 Steven D'Aprano <steve at pearwood.info>:
>> +1 on namereplace. unicodenamereplace is unnecessary, since strings in
>> Python 3 are Unicode. Might as well say "stringnamereplace" :-)
>
> Names come from the *Unicode* standard and the *unicode*data module.

Well I should hope so. Where else would they come from? The TRON[1] standard? *wink*

My point is that there is no need for a long, verbose name in this instance. Take, for example, the backslashreplace handler. We don't make it explicit that it uses *Python* backslash escapes (rather than, say, Java backslash escapes[2]), because that is the obvious and expected system to use. If another handler is added some day that uses Java escapes, it should get the longer, more explicit name: javabackslashreplace.

Perl's Larry Wall is fond of talking about "Huffman coding" language features. Common features should be short: len rather than length. Uncommon features can be longer. Since we're more likely to want Python backslash escapes than Java ones, the Python system gets the shorter name.

Contrariwise, we have a single character reference replacement handler. In this case, it isn't obvious which system of character references will be used: XML, HTML, TeX, something else? So it needs to be specified explicitly: xmlcharrefreplace.
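
For anyone who hasn't compared the two existing handlers side by side, here is a quick illustration. The sample string is just my own example; the handler names are the ones you pass to str.encode().

```python
# Encode a non-ASCII string to ASCII with the two existing handlers.
text = "caf\u00e9"  # "café"

# Python-style backslash escapes: no need to say *whose* escapes.
escaped = text.encode("ascii", "backslashreplace")

# Numeric character references: the "xml" prefix says which system.
xmlref = text.encode("ascii", "xmlcharrefreplace")

print(escaped)  # b'caf\\xe9'
print(xmlref)   # b'caf&#233;'
```

The first name gets away with being short because nobody has to guess which escape syntax is meant; the second has to spell out which reference syntax it produces.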

I believe that the use of *Unicode* names is sufficiently obvious that it does not need to be stated explicitly in the handler name. If, some day, another set of name replacements (say, HTML character entity names) is added, that one can be given the more verbose name.

So I'm +1 on calling it simply "namereplace".

But regardless of the handler name, this is a great suggestion and I will definitely find it useful.
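
For concreteness, here is a rough sketch of how I imagine such a handler behaving, written with codecs.register_error and unicodedata.name. This is only my guess at the semantics, not the proposed implementation: the registered name "namereplace_sketch" and the fallback for characters without a Unicode name are my own assumptions.

```python
import codecs
import unicodedata


def namereplace_sketch(exc):
    """Replace unencodable characters with \\N{NAME} escapes (sketch only)."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    pieces = []
    for ch in exc.object[exc.start:exc.end]:
        try:
            pieces.append("\\N{%s}" % unicodedata.name(ch))
        except ValueError:
            # Assumption: characters with no Unicode name fall back
            # to a numeric escape.
            pieces.append("\\U%08x" % ord(ch))
    # Return the replacement text and the position to resume encoding at.
    return "".join(pieces), exc.end


# Hypothetical handler name, chosen so it cannot clash with a real one.
codecs.register_error("namereplace_sketch", namereplace_sketch)

result = "caf\u00e9".encode("ascii", "namereplace_sketch")
print(result)  # b'caf\\N{LATIN SMALL LETTER E WITH ACUTE}'
```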



[1] https://en.wikipedia.org/wiki/TRON_(encoding)

[2] I believe Java does not support \a or \v, for example.


-- 
Steven

