This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Fix __index__() clipping really big numbers
Type: Stage:
Components: Interpreter Core Versions: Python 2.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: ncoghlan, nnorwitz, teoliphant, tim.peters
Priority: normal Keywords: patch

Created on 2006-07-29 03:47 by ncoghlan, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
index_overflow.diff ncoghlan, 2006-07-29 03:47 Patch to raise OverflowError for big numbers passed to __index__()
pep357-fixed.diff ncoghlan, 2006-07-29 13:19 Much better solution that revises the PEP 357 C API
pep357-fix-v.2.diff ncoghlan, 2006-07-29 16:43 Further reduction in code duplication
pep357-fix-v.3.diff ncoghlan, 2006-08-01 15:28 Update to address comments from Travis on python-dev
pep357-fix-v.4.diff ncoghlan, 2006-08-05 15:10 Update to address comments from Neal
pep357-fix-v.5.diff ncoghlan, 2006-08-10 14:10 Fast track builtin longs as well as ints
Messages (11)
msg50782 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-07-29 03:47
Patch attached (index_overflow.diff) that causes
__index__() to raise OverflowError for really big
numbers instead of silently clipping them.

The approach in the patch is a "minimal fix" that is as
ugly as hell (3 different error return codes!), so I'm
going to try for a cleaner version that changes
nb_index to return a PyObject* (then the client code
can decide whether to convert to Py_ssize_t or not, and
whether to clip or raise OverflowError when doing so).
msg50783 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-07-29 04:05
Logged In: YES 
user_id=31435

Since I don't think this is a sane approach, I'm not going
to spend time reviewing it :-)  Suggest working out what's
/wanted/ on python-dev first, including beefing up PEP 357
so it spells out the revised intents.
msg50784 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-07-29 13:19
Logged In: YES 
user_id=1038590

Attaching the patch that approaches the problem from the
ground up by redesigning the PEP 357 C API to meet the needs
of the actual use cases in the standard library.
msg50785 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-07-29 16:43
Logged In: YES 
user_id=1038590

Revised the patch to further reduce code duplication in the
implementation, and to reduce the divergence from PEP 357.

The functions in the C API now have the names PyNumber_Index
(raises IndexError), PyNumber_AsSsize_t (raises
OverflowError) and PyNumber_AsClippedSsize_t (clamps to
PY_SSIZE_T_MIN/MAX), and are no longer exposed through the
operator module. Python code should either use __index__()
directly (if not constrained to concrete containers) or else
use operator.index().
msg50786 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-07-29 18:20
Logged In: YES 
user_id=33168

The description is no longer valid, this is not a minimal patch.

I haven't thought about the actual design of the patch, but
I have a bunch of issues.  In no particular, order:
 * tests should not use the assert statement, but self.assert*
 * There are a bunch of formatting changes (one place with a
couple of extra blank lines, other places where tabs became
spaces or vica versa)
 * I think there might be behaviour changes wrt checking for
sq_ass_item and sq_get_item and raising type_error()s.  I
didn't think about this hard, but just from a quick review
there was a red flag.
 * PyNumber_Index(o, is_index) is a bogus naming.  Either
the API should change or is_index should change.  It doesn't
make sense to me that you would call an API to get the index
that is not an index.
 * int_index became int_int, same with long, but not in
other places.  I don't understand the name change.

I need to think more about the structure of this patch. 
There's something I don't like, but I'm not sure what it is
without further review.

More eyes would definitely help.
msg50787 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-08-01 15:28
Logged In: YES 
user_id=1038590

The minimal fix is still attached (the index_overflow one).
That's the patch Tim didn't like. The latter half of the
description covers the pep357-fix patch versions.

The API's in the more comprehensive patches (_Index,
_AsSsize_t, _AsClippedSsize_t) were based on minimising the
boilerplate associated with actually *using* those functions
to implement the existing operations in the standard library.

Without the is_index flag, the mp_subscript implementations
all need to perform their own check for whether or not the
object provides __index__ or not, since they don't want to
raise a TypeError unless the object is *also* not a slice. .

However, I'm open to inverting the sense of is_index and
changing the name of the output variable to typeerror (an
early version of the code actually had it that way around).
The only impact on the patch is a bunch of name changes to
different variables.

In terms of error messages, the patch definitely changes
some of the exact wording of raised exceptions. It doesn't
change the types, though.

The specific change to PyObject_GetItem and friends is that
the error message becomes "TypeError: 'foo' object is
unsubscriptable" instead of the old "TypeError: 'foo' object
is unindexable". The old error message can be restored if
you prefer by moving the type check back after the check for
whether or not the key is a valid index. However, it really
didn't make a lot of sense to me to check the key before
we'd even confirmed that the object actually supported
subscripting.

For the unit tests, I eliminated the use of assert
statements in v3 of the patch (as well as updating the tests
to cover a bunch of problems with the v2 patch that Travis
noted, and to avoid running the same tests multiple times).

int_index and long_index went away because the signature
change to the nb_index slot (returning PyObject *) meant
that the existing int_int and long_long functions were
sufficient - there was no need to write new functions
specifically for the nb_index slot. This actually applies to
any extension type that implements nb_index - it should be
populated with the same value as either their nb_int slot or
their nb_long slot (depending on whether or not the value
will always fit in a Python int)

In terms of style (and what may be bothering you), I'm not
aware of any existing public CPython API that supports the
use of an output flag to indicate errors instead of
requiring that callers check PyErr_Occurred. The
straightforward approach of providing a PyNumber_IndexCheck
API is certainly feasible (and has an identical line count
in the mp_subscript functions), but comes at the cost of
doing the check twice in every mp_subscript function (once
directly and once inside PyNumber_Index).

The old code didn't have this problem with doing the same
check in two different places because it was merrily calling
the nb_index slot directly all over the place instead of
going through the abstract API.
msg50788 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-08-05 15:10
Logged In: YES 
user_id=1038590

Added version 4 of the patch. Changes:
  - is_index is now type_err, with the logical sense
inverted and corresponding changes made to other parts of
the patch (I didn't use type_error because that would have
caused a name clash inside abstract.c)
  - for consistency in terminology, the internal function
used for long conversion is now _PyLong_AsClippedSsize_t to
match the PyNumber function, and the output variable used to
report clipping is named clipped
  - an error message from _PyEval_SliceIndex had changed
gratuitously, so I restored the original message.
msg50789 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-08-06 00:21
Logged In: YES 
user_id=1038590

Put the priority back to 9 (I accidentally changed it back
to 7 when I attached the new patch)
msg50790 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-08-10 14:10
Logged In: YES 
user_id=1038590

Version 5 fast tracks builtin longs as well as ints
msg50791 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-08-11 10:44
Logged In: YES 
user_id=5296

I'm attaching a patch that fixes the problem using a more
traditional C-API.  

Their are still 3 new C-API calls --- 1 is actually a macro 

* PyIndex_Check(obj)

* PyNumber_Index(obj)  ---  is to nb_index as PyNumber_Int
is to nb_int

* PyNumber_AsSsize_t(obj, err)  --- the second argument
allows specifying the error when conversion to Pyssize_t
raises an Overflow.  If err == NULL, the value is clipped
instead of raising any error.  OverflowError and IndexError
are the two errors used in Python itself and the standard
library. 

* I kept the same unit-tests from Nick except that
subclasses for integers are allowed as slice indices as this
does not cause recursion. 

msg50792 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-08-11 11:42
Logged In: YES 
user_id=1038590

I'm withdrawing this in favour of Travis's patch #1538606 -
it's much more in line with the idioms used in other parts
of the C API. It isn't worth messing up the API just to
avoid checking for the existing of the nb_index slot twice.

With Travis's patch, it's still straightforward for
extension modules to do their own overflow handling, too -
call PyNumber_AsSsize_t, and if an overflow error gets
triggered, call PyNumber_Index to find out whether it was
positive or negative overflow.
History
Date User Action Args
2022-04-11 14:56:19adminsetgithub: 43742
2006-07-29 03:47:49ncoghlancreate