Best strategy for finding a pattern in a sequence of integers

Slaunger Slaunger at gmail.com
Fri Nov 21 10:13:43 EST 2008


Hi all,

I am a Python novice, and I have run into a problem in a project I am
working on, which boils down to identifying the patterns in a sequence
of integers, for example

.... 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 ...

I want to process this such that I get out two patterns, like:
(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0)
and
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)

I am pretty sure I can figure out how to do that, but I would like to
have some guidance on the most pythonic approach to this.

Two paths I have considered is:
1. Convert the sequence of integers to a hex string, i.e., "...
16616616616616619330330330330A66166..." and use the re module to find
the patterns. Use the string positions to go back to the sequence
2. Put them in a list or an array and manually look for the patterns
by iterating and filtering the elements compare with sets.

I am not looking for a "solution" to this specific problem, just some
guidance

The rules for the sequence is:
1. The sequence may start in the middle of a pattern
2. There are one or two patterns, Pattern A and Pattern B in the
sequence
3. Pattern A only consists of the numbers 0, 3, and 9. 3, 3 is always
followed by 0
4. Pattern B only consists of the numbers 1, 6, and 10. 6, 6, is
always followed by 1
5. There may be other numbers interspersed within the sequence, but
they can be ignored
6. The relative position of 9 or 10 in the patterns varies from case
to case, but is consistent throughout a sequence.
7. There is always one 9 or one 10 in a pattern
7. The beginning of a pattern is marked by the transision from oner
pattern to the other.
8. If there is only one pattern in the sequence, the pattern beginning
is marked by the first occurance of either 9 or 10
9. The pattern is repetitive in the sequence,
e.g., ...ABABABAB..., ...AAA..., or ...BBB...

Thank you,
-- Slaunger




More information about the Python-list mailing list