[SciPy-User] Sum duplicate dates in a series
josef.pktd at gmail.com
Fri Jan 29 14:36:42 EST 2010
On Fri, Jan 29, 2010 at 2:13 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
> On Jan 29, 2010, at 2:00 PM, John Hunter wrote:
>> On Fri, Jan 29, 2010 at 12:42 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>>> On Jan 29, 2010, at 1:16 PM, Robert Ferrell wrote:
>>>> How can I sum data for duplicate dates in a time series? I can do it
>>>> with a loop, but I wonder if there is some tricky magic I might use.
>>
>> If you can put your data in a record array, you can use
>> matplotlib.mlab.rec_groupby
>>
>> http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.rec_groupby
>>
>> http://matplotlib.sourceforge.net/examples/misc/rec_groupby_demo.html
>
> John,
> Could you have a look at numpy.lib.recfunctions? That's an attempt to homogenize what you did for matplotlib, and it'd be great if you could help.
>
I just wanted to show that there will be some advantages once it is
possible to move easily between packages:
>>> import scikits.timeseries as ts
>>> import la
>>> s = ts.time_series([1,2,3,4,5],dates=ts.date_array(["2001-01","2001-01","2001-02","2001-03","2001-03"],freq="M"))
>>> dta = la.larry(s.data, label=[range(len(s.data))])
>>> dat = la.larry(s.dates.tolist(), label=[range(len(s.data))])
>>> s2 = ts.time_series(dta.group_mean(dat).x,dates=ts.date_array(dat.x,freq="M"))
>>> s
timeseries([1 2 3 4 5],
dates = [Jan-2001 Jan-2001 Feb-2001 Mar-2001 Mar-2001],
freq = M)
>>> s2
timeseries([ 1.5 1.5 3. 4.5 4.5],
dates = [Jan-2001 Jan-2001 Feb-2001 Mar-2001 Mar-2001],
freq = M)
>>> s2u = ts.remove_duplicated_dates(s2)
>>> s2u
timeseries([ 1.5 3. 4.5],
dates = [Jan-2001 ... Mar-2001],
freq = M)
>>> s2u.dates
DateArray([Jan-2001, Feb-2001, Mar-2001],
freq='M')
It's not so easy yet, but it would be nice if we could use timeseries,
pandas and la for different things, depending on which representation
is more convenient.
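
For comparison, here is a minimal sketch of the per-date sum using plain
NumPy only. It assumes the dates are available as sortable strings; the
names dates, values, unique_dates, sums and means are made up for
illustration and are not part of timeseries or la. Note that the session
above computes the group mean, not the sum; the sketch shows both.

```python
import numpy as np

# Hypothetical data mirroring the session above: monthly dates with duplicates.
dates = np.array(["2001-01", "2001-01", "2001-02", "2001-03", "2001-03"])
values = np.array([1, 2, 3, 4, 5], dtype=float)

# unique_dates comes back sorted; inverse holds, for every original entry,
# the index of its date within unique_dates (i.e. its group number).
unique_dates, inverse = np.unique(dates, return_inverse=True)

# bincount with weights adds up the values that share a group number.
sums = np.bincount(inverse, weights=values)

# Dividing by the group sizes gives the group means.
means = sums / np.bincount(inverse)

for d, s, m in zip(unique_dates, sums, means):
    print(d, s, m)
```

This yields one entry per unique date, so no separate step to remove
duplicated dates is needed afterwards.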
Josef