[Numpy-discussion] bad CRC errors when using np.savez, only sometimes though!

Benjamin Root ben.v.root at gmail.com
Fri May 14 11:37:38 EDT 2021


Isaac,

What I mean is that your bug might be similar to the savemat() bug that was
fixed in scipy in 2019. They are completely different functions, but both
have to interact correctly with zlib in order to work.
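
Until the cause is found, it may help to catch a bad file at write time
rather than hours later at read time. Below is a minimal sketch, not a fix
for the underlying bug. The helper name savez_verified is hypothetical (it is
not part of NumPy). It assumes the file can be written to a temporary name in
the same directory first. The CRC check uses zipfile.ZipFile.testzip() from
the standard library, which returns the name of the first corrupt member, or
None if all members pass.

```python
import os
import tempfile
import zipfile

import numpy as np


def savez_verified(path, **arrays):
    """Write an .npz to a temporary file, check its CRCs, then rename it.

    Hypothetical helper, not a NumPy API. Raises OSError if any member of
    the archive fails the zip CRC check right after writing.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp name already ends in .npz, so np.savez will not append one.
    fd, tmp_path = tempfile.mkstemp(suffix=".npz", dir=directory)
    os.close(fd)
    try:
        np.savez(tmp_path, **arrays)
        # Re-read every member and compare against the stored CRC-32.
        with zipfile.ZipFile(tmp_path) as zf:
            bad_member = zf.testzip()
        if bad_member is not None:
            raise OSError(f"CRC check failed for member {bad_member!r}")
        # Only move the file into place once it has been verified.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "result.npz")
    savez_verified(out, a=np.arange(10), b=np.zeros((3, 3)))
    with np.load(out) as data:
        print(sorted(data.files))
```

Note that testzip() reads the whole archive back, so for a ~900MB file this
adds one extra read pass after each save. If the check fails right after
writing, the arrays are still in memory and the save can be retried.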

On Fri, May 14, 2021 at 10:22 AM Isaac Gerg <isaac.gerg at gergltd.com> wrote:

> Hi Ben,  I am not sure.  However, in looking at the dates, it looks like
> that was fixed in scipy as of 2019.
>
> Would you recommend using the scipy save interface as opposed to the numpy
> one?
>
> On Fri, May 14, 2021 at 10:16 AM Benjamin Root <ben.v.root at gmail.com>
> wrote:
>
>> Perhaps it is a bug similar to this one?
>> https://github.com/scipy/scipy/issues/6999
>>
>> Basically, it turned out that the CRC was getting computed on an
>> unflushed buffer, or something like that.
>>
>> On Fri, May 14, 2021 at 10:05 AM Isaac Gerg <isaac.gerg at gergltd.com>
>> wrote:
>>
>>> I am using NumPy 1.19.5 on Windows 10 with Python 3.8.6
>>> (tags/v3.8.6:db45529, Sep 23 2020, 15:52:53) [MSC v.1927 64 bit (AMD64)].
>>>
>>> I have two Python processes running (i.e. no threads) that do
>>> independent processing jobs and do NOT write to the same directories.  Each
>>> process runs for 5-10 hours and then writes out a ~900MB npz file
>>> containing 4 arrays.
>>>
>>> When I go back to read in the npz files, I sporadically get bad CRC
>>> errors, which are related to npz using zipfile.  I cannot figure out why
>>> this is happening.  Looking through online forums, other folks have had
>>> CRC problems, but they seem to be isolated to using zipfile directly, not
>>> numpy.  I have found a few mentions, though, of zipfile causing headaches
>>> if the same file pointer is used across calls when one uses the file
>>> handle interface to zipfile as opposed to passing in a filename.
>>>
>>> I have verified with 7zip that the files do in fact have a CRC error, so
>>> it's not an artifact of zipfile.  I have also used the file handle
>>> interface to np.load and still get the error.
>>>
>>> Aside from writing my own numpy storage file container, I am stumped as
>>> to how to fix this, or reproduce this in a consistent manner.
>>> Any suggestions would be greatly appreciated!
>>>
>>> Thank you,
>>> Isaac
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>