[Python-Dev] PEP 467: last round (?)

Sun Sep 4 23:06:58 EDT 2016

On 5 September 2016 at 06:42, Koos Zevenhoven <k7hoven at gmail.com> wrote:
> On Sun, Sep 4, 2016 at 6:38 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> There are two self-consistent sets of names:
>>
>
> Let me add a few. I wonder if this is really used so much that
> bytes.chr is too long to type (and you can do bchr = bytes.chr if you
> want to)
>
> bytes.chr (or bchr in builtins)

The main problem with class method based spellings is that we need to
either duplicate it on bytearray or else break the bytearray/bytes
symmetry and propose "bytearray(bytes.chr(x))" as the replacement for
current cryptic "bytearray([x])"

Consider:

    bytearray([x])
    bytearray(bchr(x))
    bytearray(byte(x))
    bytearray(bytes.chr(x))

Folks that care about maintainability are generally willing to trade a
few extra characters at development time for ease of reading later,
but there are limits to how large a trade-off they can be asked to
make if we expect the alternative to actually be used (since overly
verbose code can be a readability problem in its own right).

> bytes.chr_at, bytearray.chr_at
> bytes.iterchr, bytearray.iterchr

These don't work for me because I'd expect iterchr to take encoding
and errors arguments and produce length 1 strings.

You also run into a searchability problem as "chr" will get hits for
both the chr builtin and bytes.chr, similar to the afalg problem that
recently came up in another thread. While namespaces are a honking
great idea, the fact that search is non-hierarchical means they still
don't give API designers complete freedom to reuse names at will.

> bytes.chr (or bchr in builtins)
> bytes.chrview, bytearray.chrview (sequence views)
>
> bytes.char    (or bytes.chr or bchr in builtins)
> bytes.chars, bytearray.chars (sequence views)

The views are already available via memoryview.cast if folks really
want them, but encouraging their use in general isn't a great idea, as
it means more function developers now need to ask themselves "What if
someone passes me a char view rather than a normal bytes object?".

>> Strings are not going to become atomic objects, no matter how many
>> times people suggest it.
>
> You consider all non-iterable objects atomic? If str.__iter__ raises
> an exception, it does not turn str somehow atomic.

"atomic" is an overloaded word in software design, but it's still the
right one for pointing out that something people want strings to be
atomic, and sometimes they don't - it depends on what they're doing.

In particular, you can look up the many, many, many discussions of
providing a generic flatten operation for iterables, and how it always
founders on the question of types like str and bytes, which can both
be usefully viewed as an atomic unit of information, *and* as
containers of smaller units of information (NumPy arrays are another
excellent example of this problem).

> I wouldn't be
> surprised by breaking changes of this nature to python at some point.

I would, and you should be to:
http://www.curiousefficiency.org/posts/2014/08/python-4000.html

> The breakage will be quite significant, but easy to fix.

Please keep in mind that we're already 10 years into a breaking change
to Python's text handling model, with another decade or so still to go
before the legacy Python 2 text model is spoken of largely in terms
similar to the way COBOL is spoken of today. There is no such thing as
a "significant, but easy to fix" change when it comes to adjusting how
a programming language handles text data, as text handling is a
fundamental part of defining how a language is used to communicate
with people.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia