Building Time Based Bins
Michael Spencer
mahs at telcopartners.com
Sat Mar 19 22:41:39 EST 2005
MCD wrote:
> Hello, I'm new to python and this group and am trying to build some
> bins and was wondering if any of you could kindly help me out. I'm a
> bit lost on how to begin.
>
> I have some text files that have a time field along with 2 other fields
> formatted like this >>
>
> 1231 23 56
> 1232 25 79
> 1234 26 88
> 1235 22 34
> 1237 31 85
> 1239 35 94
>
> This goes on throughout a 12hr. period. I'd like to be able to place
> the low and high values of the additional fields in a single line
> divided into 5min intervals. So it would look something like this >>
>
> 1235 22 88
> 1240 31 94
>
> I hope that makes sense. Should I be using a module like numarray for
> this, or is it possible to just use the native functions? Any ideas
> would help me very much.
>
> Thank you - Marcus
>
This sort of thing would do it:
from itertools import groupby

def splitter(iterable):
    """Takes a line-based iterator, yields a list of values per line
    edit this for more sophisticated line-based parsing if required"""
    for line in iterable:
        yield [int(item) for item in line.split()]

def groupkey(data):
    """Groups times by 5 min resolution. Note this version doesn't work
    exactly like the example - so fix if necessary"""
    time = data[0]  # HHMM as an integer, e.g. 1234
    # Floor division keeps the key an integer: 1234 -> 1200 + 30 = 1230
    return time // 100 * 100 + (time % 100) // 5 * 5

def grouper(iterable):
    """Groups and summarizes the lines"""
    for time, data in groupby(iterable, groupkey):
        data_x = list(zip(*data))  # transpose the rows into columns
        print(time, min(data_x[1]), max(data_x[2]))

# Exercise it:
source = """1231 23 56
1232 25 79
1234 26 88
1235 22 34
1237 31 85
1239 35 94
"""
>>> grouper(splitter(source.splitlines()))
1230 23 88
1235 22 94
>>>
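Note that groupkey labels each bin by the time at the beginning of each
5 mins (1230, 1235), rather than the end as in your example (1235, 1240),
so a reading at exactly 1235 falls into the later bin. If this needs
changing, fix groupkey.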
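
One way that change might look, as a rough sketch only: it assumes the
times are HHMM integers and that a reading exactly on a 5-minute mark
belongs to the bin ending at that mark (which is how the 1235 row is
treated in your example). The names groupkey_end and summarize are made
up for this illustration, and summarize returns the rows instead of
printing them.

```python
from itertools import groupby

def groupkey_end(data):
    """Label a row with the END of its 5-minute bin (1231..1235 -> 1235).
    Hypothetical variant of groupkey; assumes data[0] is an HHMM integer."""
    hours, minutes = divmod(data[0], 100)
    total = hours * 60 + minutes        # minutes since midnight
    end = -(-total // 5) * 5            # round up to the next multiple of 5
    return end // 60 * 100 + end % 60   # convert back to HHMM

def summarize(rows, key=groupkey_end):
    """Return (bin label, low of field 2, high of field 3) for each bin.
    Rows must already be in time order, as groupby only merges neighbours."""
    result = []
    for label, group in groupby(rows, key):
        cols = list(zip(*group))        # transpose the rows into columns
        result.append((label, min(cols[1]), max(cols[2])))
    return result

rows = [[1231, 23, 56], [1232, 25, 79], [1234, 26, 88],
        [1235, 22, 34], [1237, 31, 85], [1239, 35, 94]]
for label, low, high in summarize(rows):
    print(label, low, high)
```

Working in minutes since midnight means a bin that crosses the hour is
labelled correctly (1258 ends up under 1300 rather than 1260).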
HTH
Michael