[Python-Dev] trunc()

Sun Jan 27 19:53:43 CET 2008

On Jan 27, 2008 10:39 AM, Michael Urman <murman at gmail.com> wrote:
> Is this a valid summary of the arguments so far?
>
> I see two arguments for the change:
>
>   1) The semantics of trunc() are clear: it maps R -> Z in a specific fashion
>   2) The semantics of int() are fuzzy; even non-numeric types
>      (strings) are handled
>
> Yet there will be a __trunc__ that will allow any chosen mapping to be
> implemented, so long as it results in an integer, so (1) is only
> guaranteed true for the builtin types.

We can easily add docs to the Real ABC indicating that __trunc__
*should* implement specific semantics, just as we do (or should do)
for __add__ and everything else.

While this doesn't provide a hard guarantee in the presence of
non-conforming implementations, it's as good as it gets anywhere in
Python. Given that __int__ may be implemented for things that aren't
reals (like dates), it's much harder to prescribe what it *should* do.
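
A minimal sketch of that distinction, written for Python 3. The classes
Celsius and Day are hypothetical and exist only for illustration: the
first is real-valued, so its __trunc__ can follow a documented contract
(drop the fractional part, rounding toward zero), while the second is not
a real number, so its __int__ is an arbitrary choice.

```python
import math


class Celsius:
    """Hypothetical real-valued type, used only for illustration."""

    def __init__(self, degrees):
        self.degrees = float(degrees)

    def __trunc__(self):
        # The contract one would document: discard the fractional part,
        # rounding toward zero.
        return math.trunc(self.degrees)


class Day:
    """Hypothetical non-real type (a calendar day), for illustration."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __int__(self):
        # No obviously "right" mapping exists here; returning the ordinal
        # is just one possible choice.
        return self.ordinal


truncated_positive = math.trunc(Celsius(21.7))
truncated_negative = math.trunc(Celsius(-21.7))
day_number = int(Day(733068))
```

Under these assumptions, math.trunc() has one expected answer for each
Celsius value, whereas int(Day(...)) means whatever the author of Day
decided.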

> This leaves us with (2) which
> seems strongly tied to string parsing (as __index__ resolved the other
> common X -> integer case).
>
> I see one main argument against:
>
>   *) trunc() results in duplication at best, zealous deprecation at worst
>
> Given that the deprecation or removal of int(2.3) has been dropped,
> the argument is about pointless duplication.

To some it's pointless. To others there's a fine point to it.

> What problem is trunc() supposed to solve if it does to floats what
> int() does now? I've done some initial code searches for: lang:python
> "int(", and I've seen three primary uses for calling int(x):
>
>   a) parsing strings, such as from a user, a config file, or other
>      serialized format
>   b) limiting the input to a function to an integer, such as in a
>      calendar trying to ensure it has integer months
>   c) truncation, such as throwing away sub-seconds from time.time(),
>      or ensuring integer results from division
>
> It's unclear to me whether (b) could be better served by more
> type-specific operations that would prevent passing in strings, or
> whether uses like (c) often have latent bugs due to truncation instead
> of rounding.

Case (b) should be using operator.index() instead. Most likely the code
predates index(), or needs to be compatible with Python versions that
don't have it, or the programmer wasn't aware of index(), which hasn't
received a lot of publicity.
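
A short sketch of case (b) using operator.index(), written for Python 3.
The function month_name and its table of names are hypothetical. Unlike
int(), operator.index() accepts only objects that define __index__, so
floats and strings are rejected instead of being silently converted.

```python
import operator

MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_name(month):
    """Hypothetical calendar helper that insists on an integer month."""
    # Raises TypeError for 3.7 or "3"; int() would have accepted both.
    month = operator.index(month)
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return MONTH_NAMES[month - 1]


march = month_name(3)
```

With int(month) in place of operator.index(month), month_name(3.7) would
quietly return the name for month 3.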

> If trunc() is to clarify round vs. integer-portion, it's something
> people learn -- the general lack of comments in (c) usages indicates
> nobody considers it special behavior. If it's to clarify the
> argument's type (the parsing of strings vs. getting integers from
> other numeric types), would separating parsing from the int (and
> float) constructors also solve this?

But there's a long-standing tradition that all numeric types in Python
accept a string as argument. This was just added to decimal too.
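
For illustration, a small Python 3 sketch of that tradition using only
standard-library numeric types; the variable names are arbitrary.

```python
from decimal import Decimal
from fractions import Fraction

text = "42"

# Each numeric constructor parses the same string on its own terms.
parsed = {
    "int": int(text),
    "float": float(text),
    "complex": complex(text),
    "Decimal": Decimal(text),
    "Fraction": Fraction(text),
}
```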

> Is the aim to "clean up" the following fake example? (Real world uses
> of map(int, ...) seem almost uniformly related to string parsing.)
>
> >>> map(int, ("42", 6.022, 2**32))
> [42, 6, 4294967296L]

That's an artificial example, so it is impossible to infer the
programmer's intent. Heterogeneous lists are pretty rare.
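
For reference, a Python 3 rendering of the quoted session, alongside what
math.trunc() does with the same values. The helper trunc_or_none is
hypothetical and only there to keep the sketch from stopping at the first
error: math.trunc() refuses the string because str defines no __trunc__.

```python
import math

values = ("42", 6.022, 2**32)

# Python 3 spelling of the quoted session; map() is lazy, hence list().
converted = list(map(int, values))


def trunc_or_none(value):
    """Hypothetical helper: math.trunc(), or None for non-numeric input."""
    try:
        return math.trunc(value)
    except TypeError:
        return None


truncated = [trunc_or_none(value) for value in values]
```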

Let me give another artificial example.

Suppose I have a need to implement floats and ints that print
themselves in hex. I can make it so that this property is maintained
across addition etc. without having to change the code that *uses*
these numbers. But code that uses int() to convert a float to an int
will lose the property. If that code used trunc() instead I can
provide a __trunc__ on my hexfloat that returns a hexint. QED.
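
One way the idea above might look in Python 3. This is a sketch under
assumptions, not code from the thread: the class names HexInt and
HexFloat, and the choice to override only __add__, are illustrative.

```python
import math


class HexInt(int):
    """Hypothetical int that prints itself in hex."""

    def __repr__(self):
        return hex(int(self))

    def __add__(self, other):
        result = int.__add__(self, other)
        # Keep the hex-printing property across addition.
        return result if result is NotImplemented else HexInt(result)


class HexFloat(float):
    """Hypothetical float that prints itself in hex."""

    def __repr__(self):
        return float.hex(self)

    def __add__(self, other):
        result = float.__add__(self, other)
        return result if result is NotImplemented else HexFloat(result)

    def __trunc__(self):
        # The hook that lets truncation preserve the property.
        return HexInt(float.__trunc__(self))


total = HexFloat(10.5) + HexFloat(5.25)
via_trunc = math.trunc(total)  # goes through HexFloat.__trunc__
via_int = int(total)           # goes through the inherited float.__int__
```

Here via_trunc is a HexInt and still prints in hex, while via_int is a
plain int, so the property is lost exactly as described.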

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)