Dict to "flat" list of (key,value)

Scott David Daniels Scott.Daniels at Acm.Org
Sat Aug 2 23:42:32 EDT 2003


John Machin wrote:

> On Sat, 02 Aug 2003 12:41:19 GMT, "Raymond Hettinger"
> <vze4rx4y at verizon.net> wrote:
>> 
	[a perfectly fine chunk of code]
>>it-all-starts-with-a-good-data-structure-ly yours,
> 
> I liked the elegant code example for building a book index. However in
> practice the user requirement would be not to have duplicated page
> numbers when a word occurs more than once on the same page. If you can
> achieve that elegantly, please post it!

OK, I'll bite:
def genindex(pages):
     index = {}
     for pagenum in range(len(pages)):
          page = pages[pagenum]
          for word in page:
                index.setdefault(word, {})[pagenum] = None
     return index

or with 2.3:
def genindex(pages):
     index = {}
     for pagenum, page in enumerate(pages):
          for word in page:
                index.setdefault(word, {})[pagenum] = None
     return index

With either one, flattening looks like:
def flatten(index):
     flattened = [(word, list(pages)) for word, pages
                                      in index.items()]
     for word, pages in flattened:
          pages.sort()
     flattened.sort()
     return flattened

For fun, try:

book = ['this is a test'.split(),
	'this is only a test'.split(),
	'had this been a real function, you care a bit'.split()]
for word, pages in flatten(genindex(book)):
     print word, pages
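
For comparison, here is a rough sketch of the same idea for later Python versions, where a built-in set can replace the dict-with-None-values trick. This is not part of the code above; it assumes Python 3 (print as a function, built-in set and sorted), and the name pagenums is only illustrative.

```python
def genindex(pages):
    """Map each word to the set of page numbers it appears on."""
    index = {}
    for pagenum, page in enumerate(pages):
        for word in page:
            # A set ignores repeats, so a word that occurs twice
            # on one page still records that page only once.
            index.setdefault(word, set()).add(pagenum)
    return index


def flatten(index):
    """Return a sorted list of (word, sorted page numbers) pairs."""
    return sorted((word, sorted(pagenums))
                  for word, pagenums in index.items())


book = ['this is a test'.split(),
        'this is only a test'.split(),
        'had this been a real function, you care a bit'.split()]

for word, pagenums in flatten(genindex(book)):
    print(word, pagenums)
```

The set plays the same role as the inner dict: membership is all that matters, so duplicates on a page collapse on their own.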

-Scott David Daniels
Scott.Daniels at Acm.Org




