Fast forward-backward (write-read)

David Hutto dwightdhutto at gmail.com
Tue Oct 23 22:29:09 EDT 2012


On Tue, Oct 23, 2012 at 8:06 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com> wrote:
> On 23 October 2012 15:31, Virgil Stokes <vs at it.uu.se> wrote:
>> I am working with some rather large data files (>100GB) that contain time
>> series data. The data (t_k,y(t_k)), k = 0,1,...,N are stored in ASCII
>> format. I perform various types of processing on these data (e.g. moving
>> median, moving average, and Kalman-filter, Kalman-smoother) in a sequential
>> manner and only a small number of these data need be stored in RAM when
>> being processed. When performing Kalman-filtering (forward in time pass, k =
>> 0,1,...,N) I need to save to an external file several variables (e.g. 11*32
>> bytes) for each (t_k, y(t_k)). These are inputs to the Kalman-smoother
>> (backward in time pass, k = N,N-1,...,0). Thus, I will need to input these
>> variables saved to an external file from the forward pass, in reverse order
>> --- from last written to first written.
>>
>> Finally, to my question --- What is a fast way to write these variables to
>> an external file and then read them in backwards?
>
> You mentioned elsewhere that you are using numpy. I'll assume that the
> data you want to read/write are numpy arrays.

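If that is the case, always measure with timeit. The following example
times three snippets. Each one is run 100000 times, and timeit reports the
total elapsed time in seconds for all runs (divide by the run count to get
an average per run):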

import timeit
#3 dimensional matrix
x_dim = -1
y_dim = -1
z_dim = -1
s = """\

x_dim = -1
y_dim = -1
z_dim = -1
dict_1 = {}

for i in range(0, 6):
	x_dim = 1
	y_dim = 1
	z_dim = 1
	dict_1['%s' % (i) ] = ['x = %i' % (x_dim), 'y = %i' % (y_dim), 'z = %i' % (z_dim)]

"""

t = """\
import numpy
numpy.array([[ 1.,  0.,  0.],
       [ 0.,  1.,  2.]])
"""

u = """\
list_count = 0
an_array = []
for i in range(0,10):

	if list_count > 3:
		break
	
	if i % 3 != 0:
		an_array.append(i)

	if i % 3 == 0:
		list_count += 1

"""
# Each call returns the total time in seconds for all 100000 runs.
print(timeit.timeit(stmt=s, number=100000))
print(timeit.timeit(stmt=t, number=100000))
print(timeit.timeit(stmt=u, number=100000))


-- 
Best Regards,
David Hutto
CEO: http://www.hitwebdevelopment.com


