Enumerating k-segmentations of a sequence

marek.rocki at wp.pl marek.rocki at wp.pl
Tue Nov 25 12:15:49 EST 2008


bullockbefriending bard napisał(a):
> I'm not sure if my terminology is precise enough, but what I want to
> do is:
>
> Given an ordered sequence of n items, enumerate all its possible k-
> segmentations.
>
> This is *not* the same as enumerating the k set partitions of the n
> items because I am only interested in those set partitions which
> preserve the order of the items. i.e. if the input sequence is (1 2 3
> 4 5), then ((1 4) (2 3) (5)) is unacceptable, whereas ((1 2) (3) (4
> 5)) is acceptable. Hence use of term 'segmentations'.
>
> e.g., for input sequence (1 2 3 4 5) and k = 3, I need a function
> which enumerates the following 3-segmentations:
>
> ((1 2 3) (4) (5))
> ((1 2) (3 4) (5))
> ((1 2) (3) (4 5))
> ((1) (2 3 4) (5))
> ((1) (2 3) (4 5))
> ((1) (2) (3 4 5))
>
> The reason I want to do this is to use it in some simple one-
> dimensional clustering, i.e. each set item (won't be just integers as
> above) will have a value of interest, and i'll select the particular
> set partition which minimises Sigma SSD (sum of squared differences
> from mean) of these values.
>
> It seems overkill to generate the full list of set partitions
> [including e.g. ((1 4) (2) (3 5))] because I intend to begin by
> sorting the input sequence such that element 1 < element 2 < ... <
> element n.
>
> Structural Recursion not being my strong point, any ideas on how to go
> about this would be much appreciated!

I'm not sure if I correctly understood the semantics of what you need.
Anyway it looks like a nice exercise in nested generators:

def segmentations(sequence, k):
	assert 1 <= k <= len(sequence)
	if k == 1:
		yield [sequence]
	else:
		for headlen in xrange(1, len(sequence) - k + 2):
			head, tail = sequence[:headlen], sequence[headlen:]
			for tail_seg in segmentations(tail, k - 1):
				yield [head] + tail_seg

for segmentation in segmentations((1, 2, 3, 4, 5), 3):
	print segmentation

Regards,
Marek



More information about the Python-list mailing list