[C++-sig] Re: New slice implementation

Thu Jan 8 21:51:22 CET 2004

On Thu, 2004-01-08 at 07:22, Raoul Gough wrote:
> Jonathan Brandmeyer <jbrandmeyer at earthlink.net> writes:
> 
> > On Tue, 2004-01-06 at 11:09, Raoul Gough wrote:
> >> Jonathan Brandmeyer <jbrandmeyer at earthlink.net> writes:
> >> 
> >> > I have written a new implementation for free-standing slice objects. 
> >> > This allows you to create and query slice objects that include a
> >> > step-size, as well as export a form of __getitem__ and/or __setitem__
> >> > that can receive slice arguments and tuples of slice arguments.
> >> 
> >> Not sure how much overlap there is, 
> >
> > It turns out that there isn't much.  slice::get_indicies() overlaps with
> > the indexing suite's slicing support, but in a different way such that
> > it might be OK to keep, anyway.  My object really just provides an
> > object manager for PySliceObject, rather than providing a container of
> > objects initialized by a slice (such as the indexing_suite does).
> 
> It sounds like the interfaces may be quite different, but isn't the
> functionality the same? Once you have a Python slice object, you can
> use this to access portions of a container via __getitem__ from the
> indexing suite. On the other hand, if the slice object is actually one
> of your python::slice objects, you could use its get_indices member
> function to access parts of a container as well. I guess the main
> difference is whether this is returned via a separate container, or
> via iterators into the existing container. Note the potential problems
> from the Python side, though, if the existing container disappears
> while those iterators still exist.

That bit about container lifetime is a very good point, but what I have
in mind is performing modifying operations in place.  That is, I want to
be able to write a function that uses a Python slice object to select
which elements of the container to operate on, such as this:

double
partial_sum( std::vector<double>* Foo, slice index)
{
    slice::range<std::vector<double> > bounds;
    try {
        bounds = index.get_indicies( Foo->begin(), Foo->end());
    } catch (const std::invalid_argument&) {
        // A slice that cannot be mapped onto the container sums to zero.
        return 0.0;
    }
    double ret = 0.0;
    while (bounds.start != bounds.stop) {
        ret += *bounds.start;
        std::advance( bounds.start, bounds.step);
    }
    // bounds.stop refers to the last selected element, so add it as well.
    ret += *bounds.start;
    return ret;
}

> BTW, wouldn't it be a good idea to have a slice constructor that takes
> a PySliceObject * as parameter?

I try to avoid raw PyObject*'s whenever possible, but I think that the
answer is "no".  The reason is that you have no idea how to properly
manage it.  That is partially what the detail::new_reference<>,
detail::borrowed_reference<>, and detail::new_non_null_reference<> are
for, right?  Feel free to correct me if I'm wrong.

> > ---crash_test.py---
> > # Uses the existing vector_indexing_suite_ext.cpp test module
> > from vector_indexing_suite_ext import *
> > foo = FloatVec()
> > # Weird initialization, why not supported by a constructor?
> 
> That's a good question, but it isn't necessarily that easy to
> answer. At least, not if you want to use the container's
> iterator-based constructor template. e.g. std::vector has a
> constructor

I don't think you reasonably can use those iterator-based constructors
unless you have a way of creating a [begin,end) pair of generic
iterators descended from boost::python::object.  Something that, when
dereferenced, automatically calls extract<value_type>.  The 'begin'
iterator would also have to trap IndexError and StopIteration and
compare equal to the 'end' iterator afterwards.  I smell another code
contribution coming in a day or so for something just like this.

> template <class InputIterator> 
> vector(InputIterator f, InputIterator l, const Allocator& a = Allocator())
> 
> which would be the best one to use. I still haven't figured out the
> details of providing this.

Well, I don't know much about the metaprogramming guts of either suite,
but I wrote this simple template to create a preallocated vector that I
initialize with extract<>(), and exported it using "injected
constructors" support:

template<typename Container>
boost::shared_ptr<Container>
create_from_pysequence( object seq)
{
    boost::shared_ptr<Container> ret(
        new Container(extract<int>(seq.attr("__len__")())));
    object seq_i = seq.attr("__iter__")();
    for ( typename Container::iterator i = ret->begin(); i != ret->end(); ++i) {
        *i = extract< typename Container::value_type>(
            seq_i.attr("next")());
    }
    return ret;
}

> An easier way:
> 
> def print_vec(foo):
>      print [x for x in foo]

Oooh.  Nice.

> >
> > # Should raise IndexError, or print backwards; actually prints the 
> > # original
> > print_vector(foo[::-1])
> 
> That would be because the step value is ignored, right? 

Yes.

> In any case,
> it's very useful to try this kind of test with a real Python list to
> see what "should" happen:
> 
> >>> v = [1,2,3,4,5]
> >>> print v[::-1]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: sequence index must be integer
> 
> Would have to look that one up in the Python reference to see if it's
> documented! The indexing_v2 suite prints [5,4,3,2] which also can't
> really be right.

Try it with Python 2.3.  In Python 2.3, slicing support was extended for
builtin sequences to support __getitem__(slice(start,stop,step)),
whereas in Python 2.2 you only have the __getslice__(start,stop) form.
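
A quick way to see what a __getitem__ that accepts slice arguments is
actually handed is a throwaway class that returns its argument.  This is
only a sketch; Probe is a made-up name and not part of either suite:

```python
class Probe(object):
    """Hypothetical helper: returns whatever index object __getitem__ is handed."""
    def __getitem__(self, index):
        return index

p = Probe()
print(p[::-1])          # a single slice object carrying the step
print(p[1:5:2, ::3])    # a tuple of two slice objects
```

The second call is the "tuple of slice arguments" case that an exported
__getitem__ has to be prepared for.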

I think that the right way to handle negative step sizes (when you can
choose the algorithm) is to use reverse iterators.  The reason is that
when provided a negative step, the stop value defaults to "one before
the beginning" and the start value defaults to the last element.  So
long as you are using the [begin,end) ranges for iterators, the only way
to make that work safely is with a reverse iterator and algorithm that
carefully accounts for the effects of a non-unit step size.
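
Those defaults can be checked from the Python side with the slice
object's own indices() method.  A minimal sketch, where resolve is just
an illustrative wrapper name:

```python
def resolve(s, length):
    """Hypothetical helper: concrete (start, stop, step) for a sequence of `length` items."""
    return s.indices(length)

# Negative step: start defaults to the last element, stop to one before the beginning.
print(resolve(slice(None, None, -1), 5))    # (4, -1, -1)
# Positive step, for comparison: the usual [begin, end) bounds.
print(resolve(slice(None, None, 2), 5))     # (0, 5, 2)
```

The stop of -1 in the first result has no forward-iterator equivalent,
which is why a reverse iterator looks like the natural fit.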

> >
> > # I think this should raise IndexError; crashes.
> > foo[-1:0] = [7, 8, 9]
> 
> With a real python list, it inserts 7, 8, 9 before the last element:
> 
> >>> v = [1,2,3,4]
> >>> v[-1:0] = [7,8,9]
> >>> print v
> [1, 2, 3, 7, 8, 9, 4]

Yes, that is what happens: performing an insertion before the provided
start value.  However, I think that it should be an undefined operation
since the expression is utter nonsense.  I've looked at the source code
for PyListObject, and I think that this behavior is the result of bounds
limiting rather than real error checking.
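
To illustrate what I mean by bounds limiting, here is a rough model of
step-less list slice assignment.  It is my reading of the behavior, not
the actual list code, and assign_slice_like_list is a made-up name:

```python
def assign_slice_like_list(v, start, stop, values):
    """Hypothetical model of simple (step-less) list slice assignment."""
    lo, hi, _ = slice(start, stop).indices(len(v))  # clamp both bounds into [0, len]
    if hi < lo:
        hi = lo          # an inverted range collapses to an empty one at lo
    return v[:lo] + list(values) + v[hi:]

print(assign_slice_like_list([1, 2, 3, 4], -1, 0, [7, 8, 9]))
# [1, 2, 3, 7, 8, 9, 4]
```

Nothing in this model ever rejects the inverted range; it is simply
squeezed into an empty one, so the assignment turns into an insertion.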

Furthermore if you try it with Numeric (the original source of rich
slices), you will find that it is a no-op, which is what I would rather
see in Boost now that I think a little more about it.

See Python bug #873305 at http://sourceforge.net/tracker/?group_id=5470

-Jonathan Brandmeyer