[Numpy-discussion] np.savez not multi-processing safe, alternatives?

Pauli Virtanen pav at iki.fi
Mon Mar 30 15:14:56 EDT 2009


Mon, 30 Mar 2009 09:03:56 -0400, Wes McKinney wrote:
> I have a process that stores a number of sets of 3 arrays output which
> can either be stored as a few .npy files or an .npz file with the same
> keys in each file (let's say, writing roughly 10,000 npz files, all
> containing the same keys 'a', 'b', 'c'). If I run multiple processes on
> the same machine (desirable, since they are heavily database-IO-bound), over
> a period of hours some of the npz-writes will collide and fail due to
> the use of tempfile and tempfile.gettempdir() (either one of the .npy
> subfiles will be locked for writing or will get os.remove'd while the
> zip file is being written).

This is bug #852, which is fixed in trunk. As a workaround for now,
you may want to grab the `savez` function from

	http://projects.scipy.org/numpy/browser/trunk/numpy/lib/io.py#L243

and use a copy of it in your code temporarily. The function is fairly 
small.
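
For illustration only, here is a minimal sketch of the general idea behind
avoiding such collisions: each writer gets a uniquely named temporary file
from `tempfile.mkstemp` instead of a fixed name under
`tempfile.gettempdir()`, and the finished archive is renamed into place.
This is not the code from trunk; the function name `write_npz_like` and the
`members` argument are hypothetical, and `os.replace` assumes Python 3.3 or
later.

```python
import os
import tempfile
import zipfile


def write_npz_like(path, members):
    """Write a zip archive at `path` without sharing temp-file names.

    `members` (hypothetical argument) maps archive member names such as
    'a.npy' to already-serialized bytes payloads.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates and opens a uniquely named file, so two processes
    # writing at the same time never pick the same temporary name.
    fd, tmp_path = tempfile.mkstemp(suffix=".npz.tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            with zipfile.ZipFile(tmp_file, mode="w") as archive:
                for name, payload in members.items():
                    archive.writestr(name, payload)
        # Move the archive into place only after it is complete.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With NumPy, each payload could presumably be produced by calling `np.save`
on an in-memory `io.BytesIO` buffer and passing `buffer.getvalue()` under
the key `'a.npy'`, `'b.npy'`, and so on, so that no per-array temporary
files are needed at all.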

-- 
Pauli Virtanen



