itertools.groupby
Steve Howell
showell30 at yahoo.com
Mon May 28 11:14:24 EDT 2007
--- Paul Rubin <"http://phr.cx"@NOSPAM.invalid> wrote:
> [...]
> Here's yet another example that came up in something I was working on:
> you are indexing a book and you want to print a list of page numbers
> for pages that refer to George Washington. If Washington occurs on
> several consecutive pages you want to print those numbers as a
> hyphenated range, e.g.
>
> Washington, George: 5, 19, 37-45, 82-91, 103
>
> This is easy with groupby (this version not tested but it's pretty close
> to what I wrote in the real program). Again it works by Bates numbering,
> but a little more subtly (enumerate generates the Bates numbers):
>
> snd = operator.itemgetter(1) # as before
>
> def page_ranges():
>     pages = sorted(filter(contains_washington, all_page_numbers))
>     for d,g in groupby(enumerate(pages), lambda (i,p): i-p):
>         h = map(snd, g)
>         if len(h) > 1:
>             yield '%d-%d'% (h[0], h[-1])
>         else:
>             yield '%d'% h[0]
> print ', '.join(page_ranges())
>
> [...]
Cool.
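For anyone reading this on Python 3, here is a rough sketch of the same page-range idea. It is not Paul's code: the tuple-unpacking lambda and the list-returning map in the quoted version are Python 2 only, so this version indexes the pair explicitly and builds a list. The function takes the page list as an argument, and the sample page list is made up to reproduce the index entry quoted above. The set() call is my own assumption that duplicate page numbers should collapse.

```python
from itertools import groupby

def page_ranges(pages):
    """Yield 'a-b' for runs of consecutive pages and 'a' for single pages."""
    # Assumption: duplicates are dropped, since a repeated page would break a run.
    pages = sorted(set(pages))
    # Within a run of consecutive pages, (page - index) stays constant,
    # so that difference works as the grouping key.
    for _, group in groupby(enumerate(pages), key=lambda pair: pair[1] - pair[0]):
        run = [page for _, page in group]
        if len(run) > 1:
            yield '%d-%d' % (run[0], run[-1])
        else:
            yield '%d' % run[0]

# Hypothetical page list, chosen to match the index entry in the quoted message.
washington_pages = [5, 19] + list(range(37, 46)) + list(range(82, 92)) + [103]
entry = 'Washington, George: ' + ', '.join(page_ranges(washington_pages))
print(entry)
```

The key here is page minus index rather than index minus page as in the quoted code; either difference is constant inside a run, so the grouping is the same.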
Here's another variation on itertools.groupby, which
joins the wrapped lines of each paragraph into a single line:
import itertools

lines = [line.strip() for line in '''
    This is the
    first paragraph.

    This is the second.
    '''.split('\n')]

# Blank lines separate paragraphs; groupby collects each run of
# non-blank lines into one group.
for has_chars, frags in itertools.groupby(lines, lambda x: len(x) > 0):
    if has_chars:
        print ' '.join(list(frags))

# prints this:
#
# This is the first paragraph.
# This is the second.
I put the above example here:
http://wiki.python.org/moin/SimplePrograms
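The paragraph-joining loop above uses the Python 2 print statement. A minimal Python 3 sketch of the same technique follows; the helper name join_paragraphs is mine, not something from the wiki page, and it uses bool as the key, which is equivalent to the len(x) > 0 test for strings.

```python
from itertools import groupby

def join_paragraphs(text):
    """Return one string per paragraph, with wrapped lines joined by spaces."""
    lines = [line.strip() for line in text.split('\n')]
    # bool('') is False, so blank lines form separator groups that are skipped.
    return [' '.join(frags)
            for has_chars, frags in groupby(lines, key=bool)
            if has_chars]

sample = '''
    This is the
    first paragraph.

    This is the second.
'''

for paragraph in join_paragraphs(sample):
    print(paragraph)
```

Returning a list instead of printing inside the loop makes the result easy to check or reuse, at the cost of holding all paragraphs in memory.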