Standard lib version of something like enumerate() that takes a max count iteration parameter?

Andre Müller gbs.deadeye at gmail.com
Fri Jun 16 07:49:39 EDT 2017


On 15.06.2017 at 07:09, Jussi Piitulainen wrote:
> Andre Müller writes:
>
>> I'm a fan of infinite sequences. Try out itertools.islice.
>> You should not underestimate this very important module.
>>
>> Please read also the documentation:
>> https://docs.python.org/3.6/library/itertools.html
>>
>> from itertools import islice
>>
>> iterable = range(10000000000)
>> # since Python 3, range is a lazily evaluated object
>> # using this just as a dummy
>> # if you're using legacy Python (2.x), then use the xrange function for it
>> # or you'll get a memory error
>>
>> max_count = 10
>> step = 1
>>
>> for i, element in enumerate(islice(iterable, 0, max_count, step), start=1):
>>     print(i, element)
> I like to test this kind of thing with iter("abracadabra") and look at
> the remaining elements, just to be sure that they are still there.
>
> from itertools import islice
>
> s = iter("abracadabra")
> for i, element in enumerate(islice(s, 3)):
>     print(i, element)
>
> print(''.join(s))
>
> Prints this:
>
> 0 a
> 1 b
> 2 r
> acadabra
>
> One can do a similar check with iter(range(1000)). The range object
> itself does not change when its elements are accessed.

Very interesting. Normally this should not work.
The behavior is unexpected to me, so you have taught me what can happen.

Thank You :-)

Normally you don't see iter() very often. If you have short sequences
such as str, you can just use index access. My example is for something
that is bigger than memory.
There are also objects that don't support index access, such as sets
or generators.
Then you can use this nifty trick (islice).
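
A small sketch of the generator case. The squares() generator is made up
for illustration and is not part of the original example:

```python
from itertools import islice

def squares():
    """Hypothetical endless generator, used only for illustration."""
    n = 0
    while True:
        yield n * n
        n += 1

# Generators have no index access, so squares()[0:3] would raise TypeError.
# islice takes the first three items lazily instead.
first_three = list(islice(squares(), 3))
print(first_three)  # [0, 1, 4]

# Combined with enumerate, this gives a counted loop with an upper bound.
limited = list(enumerate(islice(squares(), 3), start=1))
print(limited)  # [(1, 0), (2, 1), (3, 4)]
```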

Instead of using:

s = iter('abracadabra') # no direct access to the str object

You should use:

s = 'abracadabra' # direct access to object
iterator = iter(s) # makes an iterator which is accessing s. The str object does not change.
# s is still 'abracadabra'

# also you can make more than one iterator of the same object
iterator2 = iter(s)
iterator3 = iter(s)
iterator4 = iter(s)
# they only rely on s, but they are independent iterators
# s won't change
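
A quick check of that independence, as a minimal sketch (the variable
names first and second are only for illustration):

```python
s = 'abracadabra'
first = iter(s)
second = iter(s)

# Advance only the first iterator by three items.
consumed = [next(first) for _ in range(3)]
print(consumed)      # ['a', 'b', 'r']

# The second iterator is unaffected and still starts at the beginning.
second_start = next(second)
print(second_start)  # a

# The first iterator continues where it stopped.
rest = ''.join(first)
print(rest)          # acadabra

# The str object itself is unchanged.
print(s)             # abracadabra
```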

Another trick is:

# incomplete chunks are left out
list(zip(*[iter(s)]*4))
# -> [('a', 'b', 'r', 'a'), ('c', 'a', 'd', 'a')]

import itertools

# incomplete chunks are padded with None
list(itertools.zip_longest(*[iter(s)]*4))
# -> [('a', 'b', 'r', 'a'), ('c', 'a', 'd', 'a'), ('b', 'r', 'a', None)]

# to impress your friends you can do
for chunk in itertools.zip_longest(*[iter(s)]*4):
    # generator expression inside join; the condition drops the None padding
    chunked_str = ''.join(c for c in chunk if c)
    print(chunked_str)

It took me a long time to understand this.
Someone posted this nifty trick in the Python IRC channel.
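
As far as I understand it, the trick works because [iter(s)]*4 holds the
same iterator four times, not four different iterators. A sketch that
makes this visible (the names it, same_four and separate are only for
illustration):

```python
s = 'abracadabra'

# [it] * 4 is a list that holds the SAME iterator object four times.
it = iter(s)
same_four = [it] * 4
print(all(x is it for x in same_four))  # True

# zip takes one item from each argument per round. All four arguments are
# the one shared iterator, so each round consumes four consecutive chars.
chunks = list(zip(*same_four))
print(chunks)  # [('a', 'b', 'r', 'a'), ('c', 'a', 'd', 'a')]

# With four separate iterators every tuple repeats one character instead.
separate = list(zip(*[iter(s) for _ in range(4)]))
print(separate[0])  # ('a', 'a', 'a', 'a')
```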

You should create a new iterator with iter() every time you iterate
over something again.
In your example you're iterating twice over the same iterator.

When you use a for loop, for example, it internally creates an
iterator of your object, if the object supports it.
It fetches element by element and assigns each one to one_char.

# creates iterator for s and iterates over it
for one_char in s:
    print(one_char)

# creates again a new iterator for s and iterates over it
for one_char in s:
    print(one_char)
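
Roughly, and only as a sketch of what the for statement does (not the
interpreter's actual implementation), the loop above can be written by
hand. The second part shows why reusing one iterator gives nothing on
the second pass:

```python
s = 'abracadabra'

# Hand-written equivalent of: for one_char in s: ...
collected = []
iterator = iter(s)                 # the for statement calls iter() first
while True:
    try:
        one_char = next(iterator)  # then next() until StopIteration
    except StopIteration:
        break
    collected.append(one_char)
print(''.join(collected))  # abracadabra

# Reusing one iterator: the second pass finds it already exhausted.
shared = iter(s)
first_pass = list(shared)
second_pass = list(shared)
print(len(first_pass), len(second_pass))  # 11 0
```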


