[Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

Raymond Hettinger raymond.hettinger at gmail.com
Sun Aug 17 23:19:17 CEST 2014


On Aug 17, 2014, at 11:33 AM, Ethan Furman <ethan at stoneleaf.us> wrote:

> I've had many of the problems Nick states and I'm also +1.

There are two code snippets below which were taken from the standard library.
Are you saying that:
1) you don't understand the code (as the PEP suggests)
2) you are willing to break that code and everything like it
3) and it would be more elegantly expressed as:  
        charmap = bytearray.zeros(256)
    and
        mapping = bytearray.zeros(256)

At work, I have network engineers creating IPv4 headers and other structures
with bytearrays initialized to zeros.  Do you really want to break all their code?
Nowhere else in Python do we create buffers that way.  Code like
"msg, who = s.recvfrom(256)" is the norm.

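For illustration, a minimal sketch of the zero-initialized buffer pattern
described above.  The field values are hypothetical and only the first few
IPv4 header fields are filled in; it is not taken from any real code base.

```python
import struct

# Zero-filled 20-byte buffer for a minimal IPv4 header (no options).
header = bytearray(20)

# Hypothetical field values, chosen only for illustration.
version_ihl = (4 << 4) | 5    # IPv4, header length of five 32-bit words
total_length = 20             # header only, no payload
ttl = 64
protocol = 17                 # UDP

# Write the fields in place, in network byte order.
struct.pack_into("!BBH", header, 0, version_ihl, 0, total_length)
struct.pack_into("!BB", header, 8, ttl, protocol)

# Every field that was not written (checksum, addresses) is still zero.
```
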
Also, it is unclear if you're saying that you have an actual use case for this
part of the proposal?

   ba = bytearray.byte(65)

And that the code would be better, clearer, and faster than the currently working form?

   ba = bytearray([65])

Does there really need to be a special case for constructing a single byte?
To me, that is akin to proposing "list.from_int(65)" as an important special
case to replace "[65]".
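
A short sketch of the currently working forms being compared here, using
only constructors that exist today.  The variable names are illustrative.

```python
# A one-element bytearray holding the byte value 65 (ASCII "A").
ba = bytearray([65])

# The same spelling is used for an ordinary one-element list.
items = [65]

# Passing a bare integer means something different: a zero-filled
# buffer of that length.
zeros = bytearray(65)

print(ba, items, len(zeros))
```

Both bytearray([65]) and bytearray(65) behave the same way in Python 2.6+
and Python 3.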

If you must muck with the ever-changing bytes() API, then please
leave the bytearray() API alone.  I think we should show some respect
for code that is currently working and is cleanly expressible in both
Python 2 and Python 3.  We aren't winning users with API churn.

FWIW, I'm guessing that the differing viewpoints in the thread stem
mainly from the proponents' experiences with bytes() rather than
from experience with bytearray() which doesn't seem to have any
usage problems in the wild.  I've never seen a developer say they
didn't understand what "buf = bytearray(1024)" means.   That is
not an actual problem that needs solving (or breaking).

What may be an actual problem is code like "char = bytes(1024)"
though I'm unclear what a user might have actually been trying
to do with code like that.
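
As a sketch of why that spelling can surprise, here is what it produces on
Python 3 next to two things a user might plausibly have wanted.  The two
"intended" results are guesses made for illustration, not anything reported
in this thread.

```python
# Python 3: an integer argument yields that many zero bytes.
char = bytes(1024)

# Guess 1: a single byte with a given value.
single = bytes([65])

# Guess 2: the decimal text of the number, as bytes.
text = str(1024).encode("ascii")

print(len(char), single, text)
```

On Python 2, where bytes is an alias for str, bytes(1024) returns the
string '1024' instead, so the same line means different things in the two
versions.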


Raymond


----------- excerpts from Lib/sre_compile.py ---------------

    charmap = bytearray(256)
    for op, av in charset:
        while True:
            try:
                if op is LITERAL:
                    charmap[fixup(av)] = 1
                elif op is RANGE:
                    for i in range(fixup(av[0]), fixup(av[1])+1):
                        charmap[i] = 1
                elif op is NEGATE:
                    out.append((op, av))
                else:
                    tail.append((op, av))

    ...

    charmap = bytes(charmap) # should be hashable
    comps = {}
    mapping = bytearray(256)
    block = 0
    data = bytearray()
    for i in range(0, 65536, 256):
        chunk = charmap[i: i + 256]
        if chunk in comps:
            mapping[i // 256] = comps[chunk]
        else:
            mapping[i // 256] = comps[chunk] = block
            block += 1
            data += chunk
    data = _mk_bitmap(data)
    data[0:0] = [block] + _bytes_to_codes(mapping)
    out.append((BIGCHARSET, data))
    out += tail
    return out                    