toy list processing problem: collect similar terms

Gary Herron gherron at digipen.edu
Sun Sep 26 01:21:48 EDT 2010


On 09/25/2010 09:05 PM, Xah Lee wrote:
> here's a interesting toy list processing problem.
>
> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing
> the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
>
> a Mathematica solution is here:
> http://xahlee.org/UnixResource_dir/writ/notations_mma.html
>
> R5RS Scheme lisp solution:
> http://xahlee.org/UnixResource_dir/writ/Sourav_Mukherjee_sourav.work_gmail.scm
> by Sourav Mukherjee
>
> also, a Common Lisp solution can be found here:
> http://groups.google.com/group/comp.lang.lisp/browse_frm/thread/5d1ded8824bc750b?
>
> anyone care to give a solution in Python, Perl, javascript, or other
> lang? am guessing the scheme solution can be much improved... perhaps
> using some lib but that seems to show scheme is pretty weak if the lib
> is non-standard.
>
>   Xah ∑ xahlee.org ☄
>    


Python 3:  (I have not tried to match the exact format of your output, 
but I get the right things in the right order.)

from collections import OrderedDict

data = ((0,'a','b'), (1,'c','d'), (2,'e','f'), (3,'g','h'),
         (1,'i','j'), (2,'k','l'), (4,'m','n'), (2,'o','p'),
         (4,'q','r'), (5,'s','t'))

# OrderedDict keeps the labels in the order they are first seen.
r = OrderedDict()
for label,*rest in data:
     # Start an empty list for a new label, then append this sublist's items.
     r.setdefault(label, []).extend(rest)
print(list(r.values()))

produces:

[['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'], ['g', 
'h'], ['m', 'n', 'q', 'r'], ['s', 't']]
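
If you want the same idea as a reusable function, here is a minimal sketch. 
It assumes Python 3.7 or later, where a plain dict keeps insertion order, 
so OrderedDict is not needed. The name collect_by_label is only an 
illustrative choice, not anything standard.

```python
def collect_by_label(rows):
    """Merge the tails of all rows that share a first element (the label).

    Groups come back in the order each label is first seen.
    """
    groups = {}  # assumption: Python 3.7+, where dict preserves insertion order
    for label, *rest in rows:
        # Create the list on first sight of a label, then append the items.
        groups.setdefault(label, []).extend(rest)
    return list(groups.values())


data = ((0, 'a', 'b'), (1, 'c', 'd'), (2, 'e', 'f'), (3, 'g', 'h'),
        (1, 'i', 'j'), (2, 'k', 'l'), (4, 'm', 'n'), (2, 'o', 'p'),
        (4, 'q', 'r'), (5, 's', 't'))

result = collect_by_label(data)
print(result)
```

This makes one pass over the input, and the groups follow the order of 
first appearance rather than sorted label order. For this data the two 
happen to coincide.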


-- 
Gary Herron, PhD.
Department of Computer Science
DigiPen Institute of Technology
(425) 895-4418




