make sublists of a list broken at nth certain list items

Mon Jul 8 16:52:40 EDT 2013

I'm looking for a Pythonic way to do the following:

I have data in the form of a long list of tuples. I would like to break that list into four sub-lists. The break points would be based on every nth occurrence of a particular kind of tuple. (The list represents behavioral data trials; the particular tuple marks the break between trials. I want to collect 20 trials at a time, so at every 20th break between trials, a new sublist should start.)

So say I have this data:  

data_list = [(0.0, 1.0), (1.0, 24.0), (24.0, 9.0), (9.0, 17.0), (17.0, 5.0), (5.0, 0.0), (5.0, 0.0), (5.0, 24.0), (24.0, 13.0), (13.0, 0.0), (13.0, 21.0), (21.0, 0.0), (21.0, 0.0), (21.0, 23.0), (23.0, 24.0), (24.0, 10.0), (10.0, 18.0), (18.0, 4.0), (4.0, 22.0), (22.0, 1.0), (1.0, 0.0), (1.0, 24.0), (24.0, 6.0), (6.0, 14.0), (14.0, 5.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 0.0), (5.0, 24.0), (24.0, 6.0), (6.0, 14.0), (14.0, 4.0), (4.0, 0.0), (4.0, 22.0), (22.0, 1.0), (1.0, 0.0), (1.0, 24.0), (24.0, 9.0), (9.0, 17.0), (17.0, 4.0), (4.0, 0.0), (4.0, 22.0), (22.0, 1.0), (1.0, 0.0), (1.0, 0.0), (1.0, 24.0), (24.0, 12.0), (12.0, 4.0), (4.0, 0.0), (4.0, 22.0)]  #rest of data truncated...

I'd like to break the list into sublists at the 20th, 40th, and 60th occurrences of any tuple that begins with 1.0, for example (1.0, 0.0). This will produce four sub-lists, for trials 1-20, 21-40, 41-60, and 61-80.
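To pin down the intended behaviour on a small scale, here is a rough sketch on made-up toy data, breaking at every 2nd matching tuple instead of every 20th. The names `split_every_nth`, `is_break`, and `toy` are hypothetical and only for illustration. It also assumes that the nth matching tuple itself becomes the first item of the new sublist, which is one reading of the description above.

```python
def split_every_nth(items, is_break, n):
    """Split items into sublists, starting a new sublist at every
    n-th item for which is_break(item) is true (assumed reading)."""
    sublists = [[]]
    seen = 0  # how many break tuples have been seen so far
    for item in items:
        if is_break(item):
            seen += 1
            if seen % n == 0:
                # the n-th, 2n-th, ... break tuple opens a new sublist
                sublists.append([])
        sublists[-1].append(item)
    return sublists


# Made-up toy data, not taken from the real data set.
toy = [(0.0, 1.0), (1.0, 2.0), (2.0, 0.0), (1.0, 3.0), (3.0, 0.0),
       (1.0, 0.0), (1.0, 4.0), (4.0, 0.0)]

chunks = split_every_nth(toy, lambda tup: tup[0] == 1.0, 2)
# chunks[0] == [(0.0, 1.0), (1.0, 2.0), (2.0, 0.0)]
# chunks[1] == [(1.0, 3.0), (3.0, 0.0), (1.0, 0.0)]
# chunks[2] == [(1.0, 4.0), (4.0, 0.0)]
```

With the real data the call would presumably use n = 20; whether the break tuple belongs at the end of the old sublist or the start of the new one is a choice this sketch makes arbitrarily.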

Here is what I have so far. It is only meant to find the break points within data_list, and it is not working:

trial_break_indexes_list = []  #holds the indexes in data_list where the sublists for trials 1-20, 21-40, 41-60, and 61-80 start
trial_count = 0  #keep count of which trial we're on

for tup in data_list:
    if tup[0] == 1.0: #Therefore the start of a new trial

        #We have a match!  Therefore get the index in the data_list
        data_list_index = data_list.index(tup)

        trial_count += 1  #update the trial count.

        if trial_count % 20 == 0:  #this will match on trial counts 20, 40, 60, 80
            trial_break_indexes_list.append(data_list_index)

print 'This is trial_break_indexes_list: ', trial_break_indexes_list

Unfortunately, the final output here is:

>>> 
This is trial_break_indexes_list:  [1, 20, 20, 20, 20, 1, 20, 1]
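A likely explanation for the repeated values: `list.index(x)` returns the position of the first item equal to `x`, not the position the loop is currently at. In the data shown, index 1 is the first (1.0, 24.0) and index 20 is the first (1.0, 0.0), which matches the 1s and 20s in the output. A minimal sketch of one way around this, using `enumerate` so the loop carries its own position; the name `break_indexes` and the toy list are made up for illustration, and n is 2 here rather than 20.

```python
# Made-up toy data with duplicate tuples, not taken from the real data set.
data = [(5.0, 0.0), (1.0, 0.0), (5.0, 0.0), (1.0, 0.0)]

# list.index always reports the first equal item, so this is 1
# even though an identical tuple also sits at position 3.
first_match = data.index((1.0, 0.0))


def break_indexes(items, n):
    """Return the positions of every n-th tuple whose first element is 1.0."""
    indexes = []
    count = 0
    for i, tup in enumerate(items):  # i is the true position in items
        if tup[0] == 1.0:
            count += 1
            if count % n == 0:
                indexes.append(i)
    return indexes


every_second = break_indexes(data, 2)  # [3]
```

The resulting indexes could then be used as slice boundaries on the original list.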

I sense there is a way more elegant/simpler/Pythonic way to approach this, let alone one that is actually correct, but I don't know of it.  Suggestions appreciated!

Thanks.