[Python-Dev] [Python-checkins] r64424 - in python/trunk: Include/object.h Lib/test/test_sys.py Misc/NEWS Objects/intobject.c Objects/longobject.c Objects/typeobject.c Python/bltinmodule.c

Thu Jun 26 22:41:50 CEST 2008

On Thu, Jun 26, 2008 at 12:52 PM, Raymond Hettinger <python at rcn.com> wrote:
> From: "Guido van Rossum" <guido at python.org>
>>
>> Let's step back and discuss the API some more.
>>
>> - Do we need all three?
>
> I think so -- see the reasons below.

Sounds like Mark Dickinson only cares about bin and hex.

> Of course, my first choice was not
> on your list.  To me, the one obvious way to convert a number to an eval-able
> string in a different base is to use bin(), oct(), or hex().  But that
> appears to be off the table for reasons that I've read but don't make any
> sense to me.
> It seems simple enough, extendable enough, and clean enough
> for bin/oct/hex to use __index__ if present and __float__ if not.

That's not extendable to types that aren't int or float though. And it
would accept Decimal instances which seems a really odd thing to do.

>> - If so, why not .tobase(N)? (Even if N is restricted to 2, 8 and 16.)
>
> I don't think it's user-friendly to have the float-to-bin API
> fail to parallel the int-to-bin API.  IMO, it should be done
> the same way in both places.

Consistency only goes so far. We have 0b, 0o and 0x notations for
integers, and the bin/oct/hex builtins are meant to invert those. We
don't have base-{2,8,16} literals for floats.

> I don't find it attractive in appearance.  Any use case I can
> imagine involves multiple calls using the same base and I would likely
> end-up using functools.partial or somesuch
> to factor-out the repeated use of the same variable.  In particular,
> it's less usable with a series of numbers at the interactive prompt. That is
> one of the primary use cases since it allows you to see
> exactly what is happening with float arithmetic:
>
> >>> .6 + .7
> 1.2999999999999998
> >>> bin(.6)
> '0b10011001100110011001100110011001100110011001100110011 * 2.0 ** -53'
> >>> bin(.7)
> '0b1011001100110011001100110011001100110011001100110011 * 2.0 ** -52'
> >>> bin(.6 + .7)
> '0b101001100110011001100110011001100110011001100110011 * 2.0 ** -50'
> >>> bin(1.3)
> '0b10100110011001100110011001100110011001100110011001101 * 2.0 ** -52'
>
> Or checking whether a number is exactly representable:
>
> >>> bin(3.375)
> '0b11011 * 2.0 ** -3'
>
> Both of those bits of analysis become awkward with the tobase() method:
>
> >>> (.6).tobase(2)

You don't need the parentheses around .6.

I think far fewer than 0.01% of Python users will ever need this.
It's a one-liner helper function if you prefer to say bin(x) instead
of x.bin().
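
For illustration only, here is a rough sketch of what such a helper might look like if it produced the eval-able format quoted above. The name float_bin is hypothetical and this is not the code from the checkin; it assumes float.as_integer_ratio() is available and that the value is finite.

```python
def float_bin(x):
    """Return an eval-able binary string for a finite float (hypothetical helper)."""
    # as_integer_ratio() returns the exact value as n/d in lowest terms;
    # for a float, d is always a power of two.
    n, d = float(x).as_integer_ratio()
    exp = -(d.bit_length() - 1)
    # For integral values d == 1, so strip trailing zero bits by hand
    # to keep the last bit of the mantissa nonzero.
    while n and n % 2 == 0:
        n //= 2
        exp += 1
    return '%s * 2.0 ** %d' % (bin(n), exp)
```

With this sketch, float_bin(3.375) gives '0b11011 * 2.0 ** -3', and eval() of the result gives back the original float. Infinities and NaNs raise an exception from as_integer_ratio() and are not handled.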

>> - What should the output format be? I know you originally favored
>> 0b10101.010101 etc. Now that it's not overloaded on the bin/oct/hex
>> builtins, the constraint that it needs to be an eval() able expression
>> may be dropped (unless you see a use case for that too).
>
> The other guys convinced me that round tripping was important
> and that there is a good use case for being able to read/write
> precisely specified floats in a platform independent manner.

Can you summarize those reasons? Who are the users of that feature?
I'm still baffled why a feature whose only users are extreme experts
needs to have such a prominent treatment. Surely there are a lot more
Python users who call urlopen() or urlparse() all day long. Should
these be built-in functions then?

> Also, my original idea didn't scale well without exponential
> notation -- i.e.  bin(125E-100) would have a heckofa lot
> of leading zeroes.   Terry and Mark also pointed-out that
> the hex with exponential notation was the normal notation
> used in papers on floating point arithmetic.  Lastly, once I
> changed over to the new way, it dramatically simplified the
> implementation.

I agree that you need to have a notation using an exponent. If it
weren't for the roundtripping, I'd probably have preferred something
which simply showed me the bits of the IEEE floating point number
broken out into mantissa and exponent -- that seems more educational
to me than normalizing things so that the last bit is nonzero.
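
To make the alternative concrete, something like the following would show the raw fields. This is only a sketch: ieee_fields is a hypothetical name, and it assumes the platform float is an IEEE 754 double, which the struct 'd' format provides.

```python
import struct

def ieee_fields(x):
    """Split a float into (sign, biased exponent, mantissa) bit fields."""
    # Reinterpret the 8 bytes of the IEEE 754 double as an unsigned 64-bit int.
    (bits,) = struct.unpack('>Q', struct.pack('>d', x))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, biased by 1023
    mantissa = bits & ((1 << 52) - 1)  # 52 bits, implicit leading 1 not stored
    return sign, exponent, mantissa
```

For example, ieee_fields(1.0) is (0, 1023, 0), and bin() of each field shows the stored bits directly, without any normalization.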

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)