[SciPy-User] scipy.signal.resample muffs my timestamps?

Ralf Gommers ralf.gommers at gmail.com
Tue Jan 20 14:34:47 EST 2015


On Tue, Jan 20, 2015 at 6:14 PM, Skip Montanaro <skip at pobox.com> wrote:

> I want to resample a large (400k+) dataset where x are datetime
> objects and y are floats. The x data are epoch seconds from the past
> week. For the purposes of this example, I've crudely downsampled them,
> taking every 10th element (Python prompt changed to "... " to fool
> Gmane).
>
> ... len(t)
> 43051
> ... len(x)
> 43051
> ... pprint([datetime.datetime.fromtimestamp(_) for _ in t[:10]])
> [datetime.datetime(2015, 1, 12, 0, 0),
>  datetime.datetime(2015, 1, 12, 0, 0, 46, 742044),
>  datetime.datetime(2015, 1, 12, 0, 1, 3, 320089),
>  datetime.datetime(2015, 1, 12, 0, 1, 23, 700560),
>  datetime.datetime(2015, 1, 12, 0, 1, 44, 583401),
>  datetime.datetime(2015, 1, 12, 0, 1, 57, 733937),
>  datetime.datetime(2015, 1, 12, 0, 2, 38, 30245),
>  datetime.datetime(2015, 1, 12, 0, 3, 35, 336342),
>  datetime.datetime(2015, 1, 12, 0, 4, 23, 833251),
>  datetime.datetime(2015, 1, 12, 0, 4, 48, 272131)]
> ... pprint([datetime.datetime.fromtimestamp(_) for _ in t[-10:]])
> [datetime.datetime(2015, 1, 19, 23, 56, 9, 996926),
>  datetime.datetime(2015, 1, 19, 23, 56, 12, 104080),
>  datetime.datetime(2015, 1, 19, 23, 56, 12, 158963),
>  datetime.datetime(2015, 1, 19, 23, 56, 12, 280701),
>  datetime.datetime(2015, 1, 19, 23, 56, 12, 337853),
>  datetime.datetime(2015, 1, 19, 23, 56, 22, 169709),
>  datetime.datetime(2015, 1, 19, 23, 56, 29, 676865),
>  datetime.datetime(2015, 1, 19, 23, 57, 14, 570601),
>  datetime.datetime(2015, 1, 19, 23, 58, 56, 394975),
>  datetime.datetime(2015, 1, 19, 23, 59, 37, 707367)]
>
> So, let's get started, downsampling our 43k points to 250:
>
> ... res_x, res_t = signal.resample(x, 250, t)
> (Final Jeopardy tune plays...)
> ...
>
> If I understand correctly, signal.resample should generate 250 evenly
> spaced points from each of the inputs.
>
> ... len(res_x)
> 250
> ... len(res_t)
> 250
>
> So far, so good. Now, look at the range of res_t:
>
> ... pprint([datetime.datetime.fromtimestamp(_) for _ in res_t[:10]])
> [datetime.datetime(2015, 1, 12, 0, 0),
>  datetime.datetime(2015, 1, 12, 2, 14, 9, 166940),
>  datetime.datetime(2015, 1, 12, 4, 28, 18, 333880),
>  datetime.datetime(2015, 1, 12, 6, 42, 27, 500820),
>  datetime.datetime(2015, 1, 12, 8, 56, 36, 667761),
>  datetime.datetime(2015, 1, 12, 11, 10, 45, 834701),
>  datetime.datetime(2015, 1, 12, 13, 24, 55, 1641),
>  datetime.datetime(2015, 1, 12, 15, 39, 4, 168581),
>  datetime.datetime(2015, 1, 12, 17, 53, 13, 335521),
>  datetime.datetime(2015, 1, 12, 20, 7, 22, 502461)]
> ... pprint([datetime.datetime.fromtimestamp(_) for _ in res_t[-10:]])
> [datetime.datetime(2015, 2, 3, 8, 36, 40, 65638),
>  datetime.datetime(2015, 2, 3, 10, 50, 49, 232578),
>  datetime.datetime(2015, 2, 3, 13, 4, 58, 399518),
>  datetime.datetime(2015, 2, 3, 15, 19, 7, 566458),
>  datetime.datetime(2015, 2, 3, 17, 33, 16, 733398),
>  datetime.datetime(2015, 2, 3, 19, 47, 25, 900338),
>  datetime.datetime(2015, 2, 3, 22, 1, 35, 67279),
>  datetime.datetime(2015, 2, 4, 0, 15, 44, 234219),
>  datetime.datetime(2015, 2, 4, 2, 29, 53, 401159),
>  datetime.datetime(2015, 2, 4, 4, 44, 2, 568099)]
>
> That doesn't look right at all.
>
> I'm sure I'm using an outdated version of scipy:
>
> ... scipy.version.version
> '0.9.0'
>
> but it's what I have available (it's a long story).
>
> If this is a bug requiring upgrade, I'll beat on the powers that be to
> get a newer version of scipy. I'm happy to provide my data to anyone
> who would be willing to try this exercise out using a more recent
> version.
>
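
One observation on the numbers above: they are consistent with resample
building the new time axis from the first two timestamps alone, that is,
treating t as equally spaced with step t[1] - t[0]. A quick check of that
arithmetic, using only values copied from the quoted output (the variable
names are just for illustration):

```python
import datetime

# First two timestamps and the lengths, copied from the quoted session
t0 = datetime.datetime(2015, 1, 12, 0, 0)
t1 = datetime.datetime(2015, 1, 12, 0, 0, 46, 742044)
n_in, n_out = 43051, 250

# Output step if the input is assumed equally spaced at (t1 - t0)
step = (t1 - t0).total_seconds() * n_in / n_out

# Second and last points of a time axis built with that step
second = t0 + datetime.timedelta(seconds=step)
last = t0 + datetime.timedelta(seconds=(n_out - 1) * step)

print(step)
print(second)
print(last)
```

The step comes out at roughly 2 h 14 min and the last point falls early on
2015-02-04, which agrees with res_t[1] and res_t[-1] above to within
rounding, even though the input data end on 2015-01-19.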

I doubt that an upgrade will fix your issue; I don't see any bug fixes to
signal.resample since 0.9.0 that look relevant. I don't understand how
this works for you at all; a quick test with ``t = list_of_datetimes``
gives me:

    TypeError: unsupported operand type(s) for /: 'datetime.timedelta' and 'float'

If you can provide a reproducible example on a generated set of data, that
would be the easiest (we can use that as a regression test). Otherwise
providing your code with your actual dataset is also OK - if you send me a
link or email it to me I'll have a look.
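
As a starting point for such a generated example, here is a minimal sketch,
assuming that irregularly spaced float timestamps are the relevant
ingredient. The exponential gaps, the sine signal and all variable names are
made up for illustration and are not the original data:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
n_in, n_out = 2000, 250

# Irregularly spaced sample times in seconds (hypothetical stand-in data)
gaps = rng.exponential(scale=14.0, size=n_in - 1)
t = np.concatenate(([0.0], np.cumsum(gaps)))

# Arbitrary float signal sampled at those times
x = np.sin(2 * np.pi * t / t[-1])

res_x, res_t = signal.resample(x, n_out, t)

# Step the output would have if t were equally spaced at t[1] - t[0]
expected_step = (t[1] - t[0]) * n_in / n_out

print(len(res_x), len(res_t))
print(np.allclose(np.diff(res_t), expected_step))
print(t[-1], res_t[-1])
```

The documentation for the t argument says it is assumed to hold equally
spaced sample positions, so with irregular input the last value of res_t
need not be anywhere near t[-1]; comparing the two printed end points shows
how far off it is for a given first gap.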

Cheers,
Ralf