Enumerating k-segmentations of a sequence

bullockbefriending bard kinch1967 at gmail.com
Tue Nov 25 11:56:24 EST 2008


I'm not sure if my terminology is precise enough, but what I want to
do is:

Given an ordered sequence of n items, enumerate all its possible k-
segmentations.

This is *not* the same as enumerating the k set partitions of the n
items because I am only interested in those set partitions which
preserve the order of the items. i.e. if the input sequence is (1 2 3
4 5), then ((1 4) (2 3) (5)) is unacceptable, whereas ((1 2) (3) (4
5)) is acceptable. Hence use of term 'segmentations'.

e.g., for input sequence (1 2 3 4 5) and k = 3, I need a function
which enumerates the following 3-segmentations:

((1 2 3) (4) (5))
((1 2) (3 4) (5))
((1 2) (3) (4 5))
((1) (2 3 4) (5))
((1) (2 3) (4 5))
((1) (2) (3 4 5))

The reason I want to do this is to use it in some simple one-
dimensional clustering, i.e. each set item (won't be just integers as
above) will have a value of interest, and i'll select the particular
set partition which minimises Sigma SSD (sum of squared differences
from mean) of these values.

It seems overkill to generate the full list of set partitions
[including e.g. ((1 4) (2) (3 5))] because I intend to begin by
sorting the input sequence such that element 1 < element 2 < ... <
element n.

Structural Recursion not being my strong point, any ideas on how to go
about this would be much appreciated!




More information about the Python-list mailing list