[perl-python] exercise: partition a list by equivalence

Fri Feb 18 11:20:51 EST 2005

On Thu, Feb 17, 2005 at 03:46:20PM -0800, Xah Lee wrote:
> here's another interesting algorithmic exercise, again from part of a
> larger program in the previous series.
> 
> Here's the original Perl documentation:
> 
> =pod
> 
> merge($pairings) takes a list of pairs, each pair indicates the
> sameness
> of the two indexes. Returns a partitioned list of same indexes.
> 
> For example, if the input is
> merge( [ [1,2], [2,4], [5,6] ] );
> 
> that means 1 and 2 are the same. 2 and 4 are the same. Therefore
> 1==2==4. The result returned is
> 
> [[4,2,1],[6,5]];
> 
> (ordering of the returned list and sublists are not specified.)
> 
> =cut

in Python:

    def merge(pairings):
        rev = {}
        res = []
        for pairing in pairings:
            first, second = pairing
            has_first = first in rev
            has_second = second in rev
            if has_first and has_second:
                rev[first].extend(rev[second])
                rev[second][:] = []
                rev[second] = rev[first]
            elif has_first:
                rev[first].append(second)
                rev[second] = rev[first]
            elif has_second:
                rev[second].append(first)
                rev[first] = rev[second]
            else:
                copy = [first, second]
                res.append(copy)
                rev[first] = rev[second] = copy
        return filter(None, res)

and in Perl:

    sub merge($)
    {
        my %rev = ();
        my @res = ();
        my ($pairing, $first, $second, $has_first, $has_second);
        foreach $pairing (@{$_[0]})
        {
            ($first, $second) = @$pairing;
            $has_first = exists $rev{$first};
            $has_second = exists $rev{$second};
            if ($has_first and $has_second)
            {
                push @{$rev{$first}}, @{$rev{$second}};
                @{$rev{$second}} = ();
                $rev{$second} = $rev{$first};
            }
            elsif (exists $rev{$first})
            {
                push @{$rev{$first}}, $second;
                $rev{$second} = $rev{$first};
            }
            elsif (exists $rev{$second})
            {
                push @{$rev{$second}}, $first;
                $rev{$first} = $rev{$second};
            }
            else
            {
                my @copy = ($first, $second);
                push @res, \@copy;
                $rev{$first} = $rev{$second} = \@copy;
            }
        }
        return [grep @$_, @res];
    }

although in Perl you wouldn't define it to take a reference to a list
(as you did), but rather a list, and you wouldn't define it to return
a reference to a list, but rather a list in list context (and possibly
a reference in scalar context).

Both versions are (IMHO) pretty clear (when the docstring/pod is
included), and O(n) because dict/hash lookup and list
appending/pushing is O(1) in both languages. Both versions can
probably be tweaked for speed quite a bit, but I don't *think* there's
a better-than-O(n) algorithm for this.

Note that the Python version assumes that the pairs' elements are
hashable; your example used numbers, so I thought it was pretty safe
assumption. The Perl version has no such restriction.

-- 
John Lenton (john at grulic.org.ar) -- Random fortune:
Noble cosa es, aún para un anciano, el aprender.
		-- Sófocles. (497-406 a.C.) Filósofo griego. 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: Digital signature
URL: <http://mail.python.org/pipermail/python-list/attachments/20050218/9b4c01b3/attachment.sig>