itertools.groupby

Steve Howell showell30 at yahoo.com
Mon May 28 11:14:24 EDT 2007


--- Paul Rubin <"http://phr.cx"@NOSPAM.invalid> wrote:
> [...]
> Here's yet another example that came up in something I was working on:
> you are indexing a book and you want to print a list of page numbers
> for pages that refer to George Washington.  If Washington occurs on
> several consecutive pages you want to print those numbers as a
> hyphenated range, e.g.
> 
>    Washington, George: 5, 19, 37-45, 82-91, 103
> 
> This is easy with groupby (this version not tested but it's pretty close
> to what I wrote in the real program).  Again it works by Bates numbering,
> but a little more subtly (enumerate generates the Bates numbers):
> 
>    snd = operator.itemgetter(1)   # as before
> 
>    def page_ranges():
>       pages = sorted(filter(contains_washington, all_page_numbers))
>       for d,g in groupby(enumerate(pages), lambda (i,p): i-p):
>         h = map(snd, g)
>         if len(h) > 1:
>            yield '%d-%d'% (h[0], h[-1])
>         else:
>            yield '%d'% h[0]
>    print ', '.join(page_ranges())
>
> [...]

Cool.
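
For anyone trying the quoted idea on Python 3, here is a minimal sketch
(tuple-unpacking lambdas no longer exist there, so the key function indexes
the pair instead). Passing the pages in as an argument, the set() call to
drop duplicates, and the sample page list are my assumptions for
illustration, not part of Paul's program:

```python
from itertools import groupby

def page_ranges(pages):
    """Yield 'a-b' for runs of consecutive pages and 'a' for single pages."""
    # Assumption: duplicates are dropped, since they would break the
    # "index minus page" trick used below.
    pages = sorted(set(pages))
    # Within a run of consecutive pages, index - page stays constant,
    # so groupby puts each run into its own group.
    for _, group in groupby(enumerate(pages), key=lambda pair: pair[0] - pair[1]):
        run = [page for _, page in group]
        if len(run) > 1:
            yield '%d-%d' % (run[0], run[-1])
        else:
            yield '%d' % run[0]

sample_pages = [5, 19, 37, 38, 39, 40, 82, 83, 103]  # made-up sample data
result = ', '.join(page_ranges(sample_pages))
print(result)
```

The key is what Paul calls the Bates number trick: enumerate supplies a
counter that advances in step with the page numbers only while the pages
are consecutive.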

Here's another use of itertools.groupby, which joins the
lines of each paragraph into a single line:

import itertools
lines = [line.strip() for line in '''
This is the
first paragraph.

This is the second.
'''.split('\n')]

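# Group consecutive lines by whether they are non-empty;
# each run of non-empty lines is one paragraph.
for has_chars, frags in itertools.groupby(lines, lambda x: len(x) > 0):
    if has_chars:
        print(' '.join(frags))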
# prints this:
#
# This is the first paragraph.
# This is the second.


I put the above example here:

http://wiki.python.org/moin/SimplePrograms
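
If the paragraph grouping needs to be reused, one possible packaging is a
small generator. This is only a sketch: the function name paragraphs and
the use of bool as the key (an empty string is false) are my choices here,
and it is written for Python 3:

```python
from itertools import groupby

def paragraphs(text):
    """Yield each blank-line-separated paragraph of text as one line."""
    # Strip each line so whitespace-only lines count as blank.
    lines = (line.strip() for line in text.splitlines())
    # bool('') is False, so groupby alternates between runs of
    # blank lines and runs of non-blank lines.
    for has_chars, frags in groupby(lines, key=bool):
        if has_chars:
            yield ' '.join(frags)

sample = '''
This is the
first paragraph.

This is the second.
'''
result = list(paragraphs(sample))
for para in result:
    print(para)
```

Because groupby only groups adjacent items, several blank lines in a row
collapse into one skipped group and do not produce empty paragraphs.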



