[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Barry Warsaw barry at python.org
Wed Feb 15 00:51:48 CET 2006


On Tue, 2006-02-14 at 15:13 -0800, Guido van Rossum wrote:

> So I'm taking that the specific properties you want to model are the
> overflow behavior, right? N-bit unsigned is defined as arithmetic mod
> 2**N; N-bit signed is a bit more tricky to define but similar. These
> never overflow but instead just throw away bits in an exactly
> specified manner (2's complement arithmetic).

That would be my use case, yep.
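
For readers following along, here is a minimal sketch of the wrapping
behavior described in the quote above, written for present-day Python 3.
The helper names wrap_unsigned and wrap_signed are hypothetical and not
part of any proposal in this thread.

```python
def wrap_unsigned(value, bits):
    """Reduce value modulo 2**bits, giving a result in range(0, 2**bits)."""
    return value & ((1 << bits) - 1)


def wrap_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value = wrap_unsigned(value, bits)
    sign_bit = 1 << (bits - 1)
    # A set sign bit means the bit pattern stands for a negative number.
    return value - (1 << bits) if value & sign_bit else value


x, y = 0xFFFF, 1
print(wrap_unsigned(x + y, 16))      # same as (x + y) & 0xFFFF
print(wrap_signed(0x7FFF + 1, 16))   # wraps to the most negative 16-bit value
```

Neither helper raises on overflow; excess high bits are simply discarded,
which is the property under discussion.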

> While I personally am comfortable with writing (x+y) & 0xFFFF (for
> 16-bit unsigned), I can see that someone who spends a lot of time
> doing arithmetic in this field might want specialized types.

I'd put it in the "annoying, although there exists a workaround that
might confound newbies" category.  Which means it's definitely not
urgent enough to address for 2.5 -- if ever -- especially given your
current stance on bytes(bunch_of_ints)[0].  The two are of course
separate issues, but thinking about one led to the other.

> But I'm not sure that that's what the Numeric folks want -- I believe
> they're more interested in saving space, not in the mod 2**N
> properties. 

Could be.  I don't care about space savings.  And I definitely have no
clue what the Numeric folks want. ;)

> There's certainly a point to treating bytes as ints; I don't know if
> it's more compelling than treating them as unit bytes. But if we
> decide that the bytes type contains ints, b[0] should return a plain
> int (whose value necessarily is in range(0, 256)), not some new
> unsigned-8-bit type. And creating a bytes object from a list of ints
> should accept any input values as long as their __index__ value is in
> that same range.
> 
> I.e. bytes([1, 2L]) should be the same as bytes([1L, 2]); and
> bytes([-1]) should raise a ValueError.

That seems fine to me.
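
As a point of comparison, the bytes type in present-day Python 3 behaves
along these lines. The sketch below is an illustration, not a statement
of what was decided in this thread; the class Two is a hypothetical
stand-in for any object that provides __index__, and Python 3 has no
separate long type, so the 2L spelling does not apply.

```python
class Two:
    """Hypothetical integer-like object that exposes __index__."""

    def __index__(self):
        return 2


b = bytes([1, Two()])    # items are converted via __index__
first = b[0]             # indexing yields a plain int

try:
    bytes([-1])          # outside range(0, 256)
except ValueError:
    rejected = True
else:
    rejected = False

print(type(first), b == bytes([1, 2]), rejected)
```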

> I agree it's icky, and I'd rather not design APIs like that -- but I
> can't help it that others continue to want to use that idiom. I also
> agree that most likely we'll want to treat bytes the same as strings
> here. But no basestring (bytes are mutable and don't behave like
> sequences of characters).

That's interesting.  So bytes really behave a lot more like some weird
string/list hybrid, then?  It makes some sense.  You read 801 bytes from
a binary file, twiddle bytes 223 and 741 and then write those bytes back
out to a different binary file.
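
A minimal sketch of that read/twiddle/write pattern in present-day
Python 3, where the mutable type is spelled bytearray (an assumption
relative to this thread, which only talks about a mutable "bytes").
In-memory io.BytesIO objects stand in for the two binary files, and the
replacement values are arbitrary.

```python
import io

source = io.BytesIO(bytes(801))       # stand-in for the input file: 801 zero bytes
data = bytearray(source.read(801))    # mutable copy of what was read

data[223] ^= 0xFF                     # flip every bit of byte 223
data[741] = 0x2A                      # overwrite byte 741 with a plain int

sink = io.BytesIO()                   # stand-in for the output file
sink.write(data)
```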

If we don't inherit from basestring, what I'm worried about is that for
those who do continue to use the idiom described previously, we'll have
to extend our isinstance() to include both basestring and bytes.  Which
definitely gets ickier.  But if bytes are mutable, as makes sense, then
it also makes sense that they don't inherit from basestring.

BTW, using that idiom is a bit of a hedge against such APIs (which you
may not control).  It allows us to say "okay, at /this/ point I don't
know whether I have a scalar or a sequence, but from this point forward,
I know I have something I can safely iterate over."
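
The idiom is not spelled out in this message; assuming it is the usual
isinstance() check that treats a string as a scalar, a sketch in
present-day Python 3 might look like this. The function name as_sequence
is hypothetical, and str stands in for basestring, which Python 3 does
not have.

```python
def as_sequence(arg):
    """Return a list that is always safe to iterate over.

    String-like objects are treated as single scalar values rather than
    as sequences of characters or byte values.
    """
    if isinstance(arg, (str, bytes, bytearray)):
        return [arg]
    return list(arg)


for item in as_sequence("spam"):
    print(item)
for item in as_sequence(["spam", "eggs"]):
    print(item)
```

Note how the tuple passed to isinstance() has to name the bytes types
explicitly once they do not share a base class with str, which is the
ickiness mentioned above.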

I wonder if it makes sense to add a more fundamental abstract base class
that can be used as a marker for "photonic behavior".  I don't know what
that class would be called, but you'd then have a hierarchy like this:

photonic
    basestring
        str
        unicode
    bytes

OTOH, it seems like a lot to add for a specialized (and some would say
dubious) use case.
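
For illustration only, here is how such a marker could be sketched with
the abc module of present-day Python 3, which did not exist when this
was written. The class name Photonic is hypothetical, and because
Python 3 has no basestring or unicode, the hierarchy is flattened to str
plus the two bytes types.

```python
from abc import ABC


class Photonic(ABC):
    """Hypothetical marker for string-like types treated as scalars."""


# Register existing built-in types as virtual subclasses of the marker;
# registration does not change the types themselves.
Photonic.register(str)
Photonic.register(bytes)
Photonic.register(bytearray)

print(isinstance("spam", Photonic))
print(isinstance(b"spam", Photonic))
print(isinstance(["spam"], Photonic))
```

With a marker like this, a single isinstance(x, Photonic) check would
cover both text and bytes without bytes having to inherit from a string
base class.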

-Barry
