[C++-sig] Re: Adding __len__ to range objects

David Abrahams dave at boost-consulting.com
Mon Aug 11 17:00:09 CEST 2003


Raoul Gough <RaoulGough at yahoo.co.uk> writes:

> David Abrahams <dave at boost-consulting.com> writes:
>
>> Raoul Gough <RaoulGough at yahoo.co.uk> writes:
> [snip]
>>> On the other hand, the following should work OK, unless I'm
>>> mistaken:
>>>
>>> iterator_copy = some_iterator
>>> for x in some_iterator: print x
>>> for x in iterator_copy: print x    # Prints same
>>
>> You're mistaken.  iterator_copy and some_iterator both refer to the
>> same object.  Try it:
>>
>>     >>> i = iter(range(5))
>>     >>> j = i
>>     >>> for x in i: print x
>>     ...
>>     0
>>     1
>>     2
>>     3
>>     4
>>     >>> for x in j: print x
>>     ...
>
> Of course - silly of me, but I still get confused sometimes by
> Python's reference copying.
>
>>> One real drawback
>>> that I thought of is that adding this might break existing code. A
>>> (broken) user-defined iterator might not provide enough smarts for
>>> std::distance to work, and yet still be compatible with the existing
>>> range support. 
>>
>> It would always supply enough smarts.  However, I am nervous about
>> supplying __len__ for anything below random-access, especially for
>> input iterators where it is probably destructive.
>
> Well, I'm not so sure - std::distance almost certainly requires
> iterator_traits<>::iterator_category which (AFAIK) range doesn't at
> the moment. 

Technically, range() does require iterator_category, since it requires
real iterators.  But no, it does not enforce that requirement.

> Since range currently only needs next() functionality, it
> wouldn't need to know what the iterator's category is, but adding
> len() wants more information for efficiency reasons.

Yes.
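
A minimal sketch of the efficiency and destructiveness concern, written in
Python 3 syntax rather than the Python 2 of the session quoted above. The
`distance` helper is hypothetical: it only mimics what std::distance has to
do for anything below random-access, namely advance to the end and count.

```python
def distance(it):
    # Hypothetical stand-in for std::distance on a non-random-access
    # iterator: the only way to count is to advance to the end.
    return sum(1 for _ in it)

single_pass = iter([10, 20, 30])   # behaves like an input iterator
n = distance(single_pass)          # counting consumes every element
leftover = list(single_pass)       # nothing is left to iterate over

random_access = [10, 20, 30]
m = len(random_access)             # constant time, non-destructive
```

After this runs, `n` is 3 but `leftover` is empty, which is why a __len__
built on counting is probably unsafe for input iterators.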

>>> Adding len would break this, since the distance function would get
>>> generated even if it is never used or wanted.
>>
>> I don't think that's a problem.
>
> Well, it certainly wouldn't be hard to fix code that does break.

No, it wouldn't.  That code was wrong anyway ;->

>>> I can think of two ways around this: add a new range_with_len type
>>> (seems excessive), or provide some way for the client code to access
>>> the generated class_ object to add their own extensions (probably
>>> difficult?)
>>
>> I think you're barking up the wrong tree.  Maybe we ought to simply
>> change our tune and say that range() produces an iterable-returning
>> function rather than an iterator-returning function.  If we did that
>> we could always generate __len__, and for that matter we could also
>> generate __getitem__/__setitem__ for random-access ranges.
>
> Sounds good to me. I think Andreas was also suggesting something in
> this direction. I was almost going to suggest adding a _new_ C++
> sequence wrapper (e.g. called view or sequence_view) that provides the
> extra stuff as well as __iter__ support via the existing range
> code. The only real benefit would be that the client code could then
> choose whether to include the more sophisticated support. 

Too many options; not enough benefit.
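
For illustration only, here is roughly what the Python side of an
"iterable-returning" range() might look like, in Python 3 syntax. The
`RangeView` class and its attribute names are hypothetical and are not part
of Boost.Python; the sketch assumes the underlying range is random-access.

```python
class RangeView:
    # Hypothetical iterable wrapper; not an existing Boost.Python type.
    def __init__(self, items):
        self._items = items

    def __iter__(self):
        # An iterable hands out a fresh iterator on every call,
        # so it can be traversed more than once.
        return iter(self._items)

    def __len__(self):
        # Safe here because the underlying range is random-access.
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

view = RangeView([1, 2, 3])
first_pass = list(view)
second_pass = list(view)   # same contents: iteration did not exhaust it
view[0] = 99
```

Unlike an iterator, the object can be looped over twice, and len(view)
does not consume anything.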

> Maybe you could achieve this via an optional iterator_category
> anyway (as you pointed out, __len__ is probably a bad idea for
> input_iterators).

Don't know what this means.

Anyway, I'm not going to implement this one myself.  Patches
(including docs and tests) will be gratefully considered.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com




