Using enumerate to get line-numbers with itertools grouper?

Victor Hooi victorhooi at gmail.com
Wed Sep 2 08:03:17 EDT 2015


Hi Peter,

Hmm, are you sure that will work?

The indexes returned by enumerate will start from zero.

Also, I've realised line_number is a bit of a misnomer here - it's actually the index for the chunks that grouper() is returning.

So say I had a 10-line textfile, and I was using a _BATCH_SIZE of 50.

If I do:

    print(line_number * _BATCH_SIZE)

I'd just get (0 * 50) = 0 for all 10 lines, since they all fall into the first chunk.

Even if I add one:

    print((line_number + 1) * _BATCH_SIZE)

I'd just get 50 for all 10 lines.

My understanding is that the file handle f is passed to grouper, which in turn passes another iterable to enumerate. I'm just not sure of the best way to get the line numbers from the original iterable f and pass them through the chain.

On Wednesday, 2 September 2015 20:37:01 UTC+10, Peter Otten wrote:
> Victor Hooi wrote:
> 
> > I'm using grouper() to iterate over a textfile in groups of lines:
> > 
> > from itertools import zip_longest
> > 
> > def grouper(iterable, n, fillvalue=None):
> >     "Collect data into fixed-length chunks or blocks"
> >     # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
> >     args = [iter(iterable)] * n
> >     return zip_longest(fillvalue=fillvalue, *args)
> > 
> > However, I'd also like to know the line-number that I'm up to, for
> > printing out in informational or error messages.
> > 
> > Is there a way to use enumerate with grouper to achieve this?
> > 
> > The below won't work, as enumerate will give me the index of the group,
> > rather than of the lines themselves:
> > 
> > _BATCH_SIZE = 50
> > 
> > with open(args.input_file, 'r') as f:
> >     for line_number, chunk in enumerate(grouper(f, _BATCH_SIZE)):
> >         print(line_number)
> > 
> > I'm thinking I could do something to modify grouper, maybe, but I'm sure
> > there's an easier way?
> 
> print(line_number * _BATCH_SIZE)
> 
> Eureka ;)


