[Pandas-dev] Pandas loc function including upper limit

Stephan Hoyer shoyer at gmail.com
Mon Oct 1 09:51:10 EDT 2018


This has been reported a number of times as unintuitive by xarray users as
well (xarray uses pandas internally for label-based slicing).

I agree that there is some logic to including the last value, since at
least for some datatypes (e.g., categorical) it isn't always obvious what the
next value will be.
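
To make the point concrete, here is a minimal sketch using string labels
rather than a true categorical index (the Series and its labels are made up
for illustration). With labels like these there is no natural "next label"
to pass as an exclusive upper bound, so the inclusive stop of .loc is the
only way to name the last element you want:

```python
import pandas as pd

# Hypothetical data: string labels with no obvious "successor" value.
s = pd.Series([1, 2, 3, 4], index=["apple", "banana", "cherry", "date"])

# Label-based slicing includes both endpoints.
by_label = s.loc["banana":"cherry"]

# Positional slicing is half-open, so the stop must be one past the end.
by_position = s.iloc[1:3]

print(by_label.index.tolist())
print(by_label.equals(by_position))
```

Both slices select the same two rows, but only the positional one needs
knowledge of what comes after "cherry".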

I do tend to think this was a mistake, because it means you can't break up
an object into non-overlapping but contiguous slices with only a single
list of "dividing" indexes. It's particularly awkward with general
numeric/float indexes, where calculating the next float is not obvious --
it would be nice if .loc[0:10] and .loc[10:20] did not both include the
value 10. Datetime indexes would also be pretty awkward if pandas did
not have the shortcut where times omitted from the end of a slice are taken
to be the last possible value.

That said, it does feel rather late to change this behavior. I don't see
any way to do a graceful deprecation cycle for this without a new indexing
attribute.

On Mon, Oct 1, 2018 at 12:17 PM Joris Van den Bossche <
jorisvandenbossche at gmail.com> wrote:

> Hi Hritik,
>
> You can question the logic of this, but this is such an ingrained /
> fundamental functionality of pandas, that I don't think this is something
> that we can change (even if we would like to).
>
> But it is true this can be counterintuitive / confusing if you are used
> to how integer positional indexing works in python / numpy / iloc. This is
> certainly one of the "gotchas" in pandas indexing where we differentiate
> between label-based and positional indexing, that you need to be aware of.
>
> That said, I personally think there is also some logic in including the
> last value for label-based indexing. For example, when having not a simple
> range of integers as index, the values are often irregular and given a
> specific label, you don't necessarily know what the value is just before
> that label. So it might be more predictable if the passed label (the
> upper limit) is included.
>
> Best,
> Joris
>
>
> Op zo 30 sep. 2018 om 10:02 schreef Hritik Vijay <hritikxx8 at gmail.com>:
>
>> The `loc` function includes the upper limit, which is very
>> counterintuitive.
>> Shouldn't it follow iloc and other indexing methods and exclude the upper
>> limit (at least for integral slices)?
>>
>> --
>> Regards
>> Hritik Vijay
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>