[C++-sig] Re: Adding __len__ to range objects
David Abrahams
dave at boost-consulting.com
Mon Aug 11 17:00:09 CEST 2003
Raoul Gough <RaoulGough at yahoo.co.uk> writes:
> David Abrahams <dave at boost-consulting.com> writes:
>
>> Raoul Gough <RaoulGough at yahoo.co.uk> writes:
> [snip]
>>> On the other hand, the following should work OK, unless I'm
>>> mistaken:
>>>
>>> iterator_copy = some_iterator
>>> for x in some_iterator: print x
>>> for x in iterator_copy: print x # Prints same
>>
>> You're mistaken. iterator_copy and some_iterator both refer to the
>> same object. Try it:
>>
>> >>> i = iter(range(5))
>> >>> j = i
>> >>> for x in i: print x
>> ...
>> 0
>> 1
>> 2
>> 3
>> 4
>> >>> for x in j: print x
>> ...
>
> Of course - silly of me, but I still get confused sometimes by
> Python's reference copying.
>
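The aliasing point above can be sketched as follows. This is a minimal illustration in modern Python 3 syntax (the thread itself uses Python 2 `print`); the variable names are illustrative only, and `itertools.tee` is shown as one possible way to get two independent passes, not something proposed in the thread.

```python
import itertools

# Plain assignment binds a second name to the same iterator object.
i = iter(range(5))
j = i
same_object = j is i

first_pass = list(i)    # consumes the shared iterator
second_pass = list(j)   # nothing is left for the second name

# itertools.tee returns independent iterators over the same data.
a, b = itertools.tee(iter(range(5)))
tee_first = list(a)
tee_second = list(b)
```

Iterating over the original iterable twice (rather than over one iterator twice) avoids the problem without any copying.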
>>> One real drawback
>>> that I thought of is that adding this might break existing code. A
>>> (broken) user-defined iterator might not provide enough smarts for
>>> std::distance to work, and yet still be compatible with the existing
>>> range support.
>>
>> It would always supply enough smarts. However, I am nervous about
>> supplying __len__ for anything below random-access, especially for
>> input iterators where it is probably destructive.
>
> Well, I'm not so sure - std::distance almost certainly requires
> iterator_traits<>::iterator_category which (AFAIK) range doesn't at
> the moment.

Technically, range() does require iterator_category, since it requires
real iterators. But no, it does not enforce that requirement.

> Since range currently only needs next() functionality, it
> wouldn't need to know what the iterator's category is, but adding
> len() wants more information for efficiency reasons.

Yes.

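As a rough Python analogy for the concern about input iterators: std::distance has to step through anything below random-access, and for a single-pass source that traversal uses the data up. The helper name `count_by_traversal` below is hypothetical and only illustrates the effect; it is not part of Boost.Python.

```python
def count_by_traversal(it):
    """Count items by stepping through `it` one element at a time."""
    return sum(1 for _ in it)

# A generator is single-pass, much like a C++ input iterator.
single_pass = (n * n for n in range(4))
length = count_by_traversal(single_pass)
leftover = list(single_pass)   # the generator is already exhausted

# A list is random-access: its length is known without traversal.
random_access = [0, 1, 4, 9]
cheap_length = len(random_access)
```

After counting, the single-pass source has nothing left to iterate over, which is why a `__len__` on such a range would be destructive.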
>>> Adding len would break this, since the distance function would get
>>> generated even if it is never used or wanted.
>>
>> I don't think that's a problem.
>
> Well, it certainly wouldn't be hard to fix code that does break.

No, it wouldn't. It was wrong anyway ;->

>>> I can think of two ways around this: add a new range_with_len type
>>> (seems excessive), or provide some way for the client code to access
>>> the generated class_ object to add their own extensions (probably
>>> difficult?)
>>
>> I think you're barking up the wrong tree. Maybe we ought to simply
>> change our tune and say that range() produces an iterable-returning
>> function rather than an iterator-returning function. If we did that
>> we could always generate __len__, and for that matter we could also
>> generate __getitem__/__setitem__ for random-access ranges.
>
> Sounds good to me. I think Andreas was also suggesting something in
> this direction. I was almost going to suggest adding a _new_ C++
> sequence wrapper (e.g. called view or sequence_view) that provides the
> extra stuff as well as __iter__ support via the existing range
> code. The only real benefit would be that the client code could then
> choose whether to include the more sophisticated support.

Too many options; not enough benefit.

> Maybe you could achieve this via an optional iterator_category
> anyway (as you pointed out, __len__ is probably a bad idea for
> input_iterators).

Don't know what this means.

Anyway, I'm not going to implement this one myself. Patches
(including docs and tests) will be gratefully considered.

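Seen from the Python side, the iterable-returning idea discussed above might look roughly like the class below. This is only a sketch under assumptions: `RangeView` is a hypothetical name, a Python list stands in for a random-access C++ range, and nothing here reflects an actual Boost.Python implementation.

```python
class RangeView:
    """Hypothetical stand-in for an iterable wrapping a random-access range."""

    def __init__(self, data):
        self._data = data  # underlying random-access storage

    def __iter__(self):
        # Each call hands out a fresh iterator, so the view can be
        # traversed any number of times.
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]

    def __setitem__(self, index, value):
        self._data[index] = value


view = RangeView([10, 20, 30])
view[1] = 25
first = list(view)
second = list(view)  # a second pass still sees every element
```

Because the object is an iterable rather than an iterator, asking for its length or indexing into it does not disturb later iteration.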
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com