[Ironpython-users] Fwd: weakref random "SystemError" and "ValueError" exception (affecting _weakrefset.py and abc.py)

Mon Apr 13 07:39:13 CEST 2015

Dear IronPython gurus!

I hope you can help me with a seemingly random bug (unexpected
SystemError and ValueError exceptions) in _weakrefset.py. It would be
great if you could reproduce it by running my script below; please let me
know.

The error is random, but I managed to reproduce it by running the
randomly offending code 100.000 times inside a loop (script attached).
The test takes only a few seconds, and on my PC it regularly throws
between 5 and 30 exceptions over all those cycles. I see that other people
are suffering from the same problem, with no solution or workaround yet
(
https://mail.python.org/pipermail/ironpython-users/2014-November/017332.html
and https://github.com/IronLanguages/main/issues/1187).

In my case, the weakref error was related to intensive use of isinstance()
inside the "Pulp" library (a Python optimization library). Just for your
reference: isinstance() internally uses a WeakSet as a cache for the class
types, which in turn uses weakref (see ABCMeta.__instancecheck__() in the
standard file "abc.py").
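
To illustrate the mechanism just described, here is a minimal sketch of how
such a cache lookup ends up in WeakSet.__contains__(). It is not the code from
abc.py: the class CachedTypeCheck and its method check() are hypothetical names
made up for this illustration, and only WeakSet itself and the idea of a cache
like ABCMeta's _abc_cache come from the standard library. It should run
unchanged on Python 2.7 and Python 3.

```python
from weakref import WeakSet


class CachedTypeCheck(object):
    """Hypothetical, simplified stand-in for the cache used by ABCMeta."""

    def __init__(self, base):
        self.base = base
        self.cache = WeakSet()  # plays the role of cls._abc_cache in abc.py

    def check(self, instance):
        subclass = type(instance)
        # Fast path: an "x in WeakSet" test, i.e. WeakSet.__contains__()
        if subclass in self.cache:
            return True
        # Slow path: do the real check and remember a positive answer
        if issubclass(subclass, self.base):
            self.cache.add(subclass)
            return True
        return False


class Shape(object):
    pass


class Circle(Shape):
    pass


checker = CachedTypeCheck(Shape)
first = checker.check(Circle())   # slow path, fills the cache
second = checker.check(Circle())  # fast path, goes through WeakSet.__contains__()
cached = Circle in checker.cache  # the class is now held weakly in the cache
```

Every call after the first one takes the fast path, which is why code that
calls isinstance() very often also runs the WeakSet membership test very often.
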
In my test script I have isolated the problem to WeakSet alone (to rule
out the "abc.py" and "Pulp" library code). The
exception happens inside the *WeakSet.__contains__()* function (from the
_weakrefset.py file). As stated, it happens randomly, but apparently only
in connection with a GC collect cycle in a memory-hungry Python script. I
ran the test script on two PCs with similar results: Windows 7 64-bit and
Windows 7 32-bit, both using the 32-bit ipy.exe, version 2.7.4 (2.7.0.40). The
.NET version is 4.0.30319.18444 (32-bit). The test script does the following:

   1. It simulates "memory intensive" Python code by creating a weakly
   referenced object of 2kb to 12kb in each loop (with smaller objects,
   e.g. 0.1kb, the bug doesn't show up).
   2. It manually runs gc.collect() every 1.000 cycles (which should collect
   those weakly referenced objects).
   3. ... and it repeats (100.000 times) the following "offending" boolean
   test:

       test = item in WeakSetObject   #<- Repeated 100.000 times.
                                      #-> It fails between 10 and 20 times
                                      #   with an unexpected exception

*NOTE 1:* The "item" is an object added to the WeakSet at the beginning of
the script, but "item" should not be garbage collected, because the script
also keeps a normal (not weak) reference to it alive.
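
As a small illustration of NOTE 1, the following sketch shows the difference
between an entry that still has a normal reference and one that does not.
The class Payload is a hypothetical stand-in for the BigObjClass of the
attached script. On CPython the temporary object is dropped as soon as its
last reference goes away; on IronPython that only happens after a .NET
garbage collection, which is why gc.collect() is called explicitly.

```python
import gc
from weakref import WeakSet


class Payload(object):  # hypothetical stand-in for BigObjClass
    def __init__(self, tag):
        self.tag = tag


ws = WeakSet()

item = Payload("AA")    # normal (strong) reference kept in the variable "item"
ws.add(item)

ws.add(Payload("tmp"))  # no normal reference survives this statement

gc.collect()            # give the runtime a chance to drop the dead entry

still_there = item in ws  # "item" is still strongly referenced, so it must stay
remaining = len(ws)       # number of entries that are still alive
```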

*NOTE 2:* The boolean test should always give True, which is the case more
than 99.9% of the time. In roughly 0.005% to 0.03% of the cases (5 to 30
times in 100.000 test cycles), the boolean test fails by raising a
"ValueError" exception *(Description: "Index was out of range")* or, a bit
more frequently, a "SystemError" *(Description: "Handle is not initialized")*.
That seems very small, but it is enough to prevent the practical
construction of a medium-size optimization problem with the "Pulp" library.

I tracked the error down: the "ValueError" exception is raised at line 70 and
the "SystemError" exception at line 73 of "_weakrefset.py".

    In "Lib\_weakrefset.py":
    35 :class WeakSet(object):
    ....
    68 :    def __contains__(self, item):
    69 :        try:
    70 :            wr = ref(item)      # <- "ValueError" ("Index was out of range") is raised here
    71 :        except TypeError:
    72 :            return False
    73 :        return wr in self.data  # <- "SystemError" ("Handle is not initialized") is raised here

If execution continues after the exception and the same boolean test is
executed again, it works fine (same item, same WeakSetObject, same execution,
same local and global context; the script was not restarted!). I.e. if you
catch the exception and continue the execution, it is as if the exception had
never happened (like a momentary runtime lapse!).
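
Since repeating the test succeeds, a possible stopgap is to retry the
membership test when one of the two exceptions shows up. The sketch below is
only that: contains_with_retry() and its "retries" parameter are hypothetical
names, not part of IronPython or the standard library, and it assumes that the
failure is transient, as observed above. It hides the symptom and does not fix
the cause.

```python
def contains_with_retry(container, item, retries=3):
    """Return "item in container", retrying on the spurious exceptions.

    Hypothetical helper; assumes the failure is transient.
    """
    for attempt in range(retries):
        try:
            return item in container
        except (SystemError, ValueError):
            # Give up and re-raise once the last attempt has failed too
            if attempt == retries - 1:
                raise
```

In the test script this would be used as
"test = contains_with_retry(weakSetObj, item)". It only helps where the call
site is under your control; the isinstance() calls that go through abc.py
would not pass through such a helper.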

I believe that, to fix the source of the problem, IronPython should trap or
avoid the error in the weakref module itself (C# code). I don't know how...
Can someone kindly help me fix this?
Cheers,
Andres Sommerhoff
-------------- next part --------------
#
# file: testWeakSetError.py
#

#EXPLANATION ABOUT THE TEST SCRIPT:
#    This test script is loosely based on the way abc.py uses
#  a WeakSet as a cache for speeding up ABCMeta.__instancecheck__(),
#  which overrides the builtin isinstance(). abc.py checks the cache by
#  testing "test = classX in WeakSetObjAsCache", which fails when
#  the rare exception is raised inside WeakSet.__contains__().

#    The boolean test is a simple "test = item in WeakSetObject", which performs
#  as expected 99.9% of the time, with True as the result. "item" is held by a
#  normal (not weak) reference, so it should not be collected
#  by gc.collect(), and "item" was added to WeakSetObject beforehand. This
#  boolean test is inside a loop that repeats it 100.000 times. On every
#  cycle of the loop, a new instance of
#  a big (RAM) object (just 2kb to 12kb each) is also added, which
#  stays as a weakref inside the WeakSet. Every 1000 cycles a gc.collect()
#  is performed manually, which should clean up some of these big objects. The
#  error (exception) starts to happen around the 12.000th cycle (sometimes sooner).
#  The error consists of an untrapped exception of the type:
#       (ValueError) or (SystemError)
#  After 100.000 cycles I count around 10 to 20 errors. For a real Python
#  application, that is frequent enough to stop the execution of a medium-size
#  optimization problem (the Pulp library makes intensive use of "isinstance()",
#  and that builtin function uses a WeakSet under the hood as a cache).

import gc, sys, traceback 
from _weakrefset import WeakSet

#Big Object -> takes 2 to 12kb of RAM!
class BigObjClass(object):
    def __init__(self,id):
        self.useRAM = str(id)*1000 # With less than 100 or more than 10000 here, no error is produced (I don't know why)

item       = BigObjClass(id="AA")
weakSetObj = WeakSet()
weakSetObj.add(item) # adding an item instance 

errorcount=0
for i in xrange(100000): #<- Loop many times until the error shows up
    #Simulating a RAM-intensive Python script (this must be inside the loop, otherwise no error is expected)
    t = BigObjClass(id=i) # A new instance, takes 2 to 12kb of RAM!
    weakSetObj.add(t)  #<- Adding it to the WeakSet. It is a weakref, so the 2 to 12kb of RAM should be reclaimed by gc.collect()

    #The boolean Test
    try:
        test = item in weakSetObj  #<- OFFENDING CODE: THIS TEST IS WHERE THE ERROR SOMETIMES OCCURS!!!
        assert test==True
    except Exception as e: 
        errorcount += 1
        (extype, exvalue, extraceback) = sys.exc_info()

        print "\n####################################################"
        print "Error in Loop",i,"coming from WeakSet.__contains__:"
        print type(e), e
        #print "------------- TRACEBACK -------------"
        traceback.print_exception(extype,exvalue,extraceback,limit=10,file=sys.stdout)
        print "####################################################\n"

    # Garbage collection from time to time (necessary to get the error)
    if i%1000==0:  # If the interval is too small (i.e. <100) the error does not show up! If too big -> out-of-memory exception
        howmuch = gc.collect() #should collect the BigObjs created in the loop, but not "item", as it is normally referenced
        print "loop",i, " doing gc.collect(): ",howmuch

print
print "In total:", errorcount, "errors in 100000 tests =", errorcount/100000.0*100, "%"
print "End"