[ python-Bugs-868896 ] Multiple problems with GC in 2.3.3
SourceForge.net
noreply at sourceforge.net
Tue Jan 13 10:07:38 EST 2004
Bugs item #868896, was opened at 2004-01-01 19:00
Message generated for change (Comment added) made by kjetilja
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=868896&group_id=5470
Category: Python Interpreter Core
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: washington irving (washirv)
Assigned to: Nobody/Anonymous (nobody)
Summary: Multiple problems with GC in 2.3.3
Initial Comment:
Hi. We're running a multithreaded application that
spiders some web pages and parses them. We've had 2
types of problems: one where the python process
segfaults, and another where python spins in an infinite
loop. We are running on FreeBSD 4.8-RELEASE. We have
not had this problem with 2.2. We have this problem
with both 2.3.2 and 2.3.3. This is repeatable, and
we're willing to help in every way to fix this. I've
attached the gdb stack traces for the process that
segfaulted and for the process that spins in an infinite
loop. For the latter, we attached to it in gdb and
ctrl-c'ed to check the status. There are 2 separate gdb
traces in the attached file.
Thanks
----------------------------------------------------------------------
Comment By: Kjetil Jacobsen (kjetilja)
Date: 2004-01-13 15:07
Message:
Logged In: YES
user_id=5685
This is a known pycurl issue. The problem is avoided by
turning off the gc tracking code in pycurl, which is now the
default behaviour in the current cvs version and will stay
that way until the problems related to the gc tracking have
been resolved properly.
So, as an interim solution, use the current cvs version
of pycurl.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2004-01-02 03:15
Message:
Logged In: YES
user_id=31435
I doubt you can guess this easily. When an object that
participates in cyclic gc is first created, its gc_refs member is
set to constant _PyGC_REFS_UNTRACKED. This tells gc to
leave this object alone: it's still (mostly) uninitialized memory
at this point, so it's not safe for gc to try to do anything with
it.
After its creator has initialized the object's memory to a sane
state, the creator must call PyObject_GC_Track() to tell the
memory system that the object is no longer insane, and
specifically that it's now safe to call this object's tp_traverse
method. That's where the new error message can happen: if
the object's gc_refs member is *not*
_PyGC_REFS_UNTRACKED when PyObject_GC_Track() is
called, Python aborts with a "GC object already tracked" fatal
error.
The most obvious way for this to happen is to call
PyObject_GC_Track() more than once on the same object,
without an intervening call to PyObject_GC_UnTrack().
That's not the most *likely* way for this to happen, though.
The most likely is for a wild store to overwrite the gc_refs
member by mistake. I've seen that happen in core Python
during development, but never (so far) in a released Python.
I've also seen it happen during development of the C
extension modules used in Zope.
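
The tracked/untracked distinction described above can also be observed
from Python code. This is only a minimal sketch: it assumes a later
Python release that provides gc.is_tracked() (it is not available in
2.3), and the Holder class is a made-up example, not anything from
pycurl or the core.

```python
import gc


class Holder:
    """Hypothetical container type: it can hold references to other objects."""

    def __init__(self):
        self.items = []


def tracking_report():
    """Report which sample objects the cyclic gc is currently tracking."""
    return {
        # Atomic objects cannot form cycles, so gc never tracks them.
        "int": gc.is_tracked(42),
        "str": gc.is_tracked("text"),
        # Containers are tracked once their creator has initialized them.
        "list": gc.is_tracked([]),
        "instance": gc.is_tracked(Holder()),
    }


if __name__ == "__main__":
    for name, tracked in tracking_report().items():
        print(name, tracked)
```

This only shows the state gc sees after an object has been fully
created. The double-track check itself lives at the C level and cannot
be triggered from pure Python.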
Random hint: if you do
    import gc
    gc.set_threshold(1)
that will greatly increase the frequency with which gc runs.
While gc is almost never at fault when something blows up
while gc is running (that's just historical fact -- that code is
solid), as I said before, the true cause of the blowup typically
happened long ago. Making gc run much more frequently can
often help provoke the blowup into happening much closer to
the time the real damage was done. It's still unlikely to show
up in the stack trace at the time of the blowup, though.
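
One way to apply the hint above to a single suspect piece of code is to
lower the threshold only around that code and restore it afterwards. A
minimal sketch, where with_aggressive_gc is a hypothetical helper name
and not part of the gc module:

```python
import gc


def with_aggressive_gc(func, *args, **kwargs):
    """Call func with gc threshold0 set to 1, then restore the old thresholds."""
    saved = gc.get_threshold()   # (threshold0, threshold1, threshold2)
    gc.set_threshold(1)          # make generation-0 collections run very often
    try:
        return func(*args, **kwargs)
    finally:
        gc.set_threshold(*saved)  # put the previous settings back


if __name__ == "__main__":
    # Build some container objects while gc is running very frequently.
    data = with_aggressive_gc(lambda: [[i] for i in range(1000)])
    print(len(data), gc.get_threshold())
```

Expect the wrapped code to run noticeably slower. The point is only to
move a crash closer to the code that did the damage.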
----------------------------------------------------------------------
Comment By: washington irving (washirv)
Date: 2004-01-02 02:29
Message:
Logged In: YES
user_id=941550
> That's a new check in 2.3, added to try to catch
> one incorrect usage of the Python C API. That symptom has
> also been reported only by pycurl users.
I'm wondering what this incorrect usage is so I can poke
into the pycurl code myself and take a look and see what's
going on there.
Thanks
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2004-01-01 21:16
Message:
Logged In: YES
user_id=33168
I also don't have enough time to help. Another way to try
to find the problem is by using valgrind, dmalloc, Electric
Fence and/or Purify, or any other memory debugger.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2004-01-01 20:56
Message:
Logged In: YES
user_id=31435
Sorry, I can't offer time to help debug this. Maybe someone
else can, but since the evidence so strongly points at pycurl
(see below), it would be best to get someone from that
project to volunteer.
FYI, *many* incorrect usages of the Python C API first show
up when cyclic gc is running, simply because gc traverses
every container object in existence. gc is thus extremely
sensitive to coding errors like uninitialized memory, wild
stores, thread races, and incorrect usage of the C-level GC
API. By the time gc suffers the effects of such an error, it's
*typical* that the code causing the error is long gone, having
screwed up millions or billions of cycles before gc ran (gc
doesn't run all that often).
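
To make the "gc traverses container objects" point concrete, here is a
small sketch that builds a reference cycle and forces a collection with
gc.collect(). The Node class and collect_cycle function are made up for
illustration. Automatic collection is switched off during the demo so
that the forced run is the one that finds the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical container object that refers to another Node."""

    def __init__(self):
        self.other = None


def collect_cycle():
    """Create an unreachable cycle, force a gc run, and report what happened."""
    was_enabled = gc.isenabled()
    gc.disable()                      # keep automatic gc out of the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a       # a and b now keep each other alive
        probe = weakref.ref(a)        # lets us check whether a still exists
        del a, b                      # unreachable, but refcounts are nonzero
        alive_before = probe() is not None
        found = gc.collect()          # gc traverses containers, finds the cycle
        return alive_before, found, probe() is None
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(collect_cycle())
```

Reference counting alone cannot free the two objects. They go away only
once gc has walked them, which is why gc is the first thing to trip
over an object whose memory was damaged earlier.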
It's also typical that such bugs are eventually traced to
coding errors in extension modules -- the Python core is too
heavily exercised by too many users on too many platforms
for such fundamental bugs to survive long there.
That doesn't mean the Python core can't be at fault, but
does mean it's unlikely to be in the core. Add to that that
the symptoms you report have been reported by, and only by,
people using pycurl, and the evidence pointing to pycurl is
simply overwhelming.
At least two earlier reports from pycurl users said Python 2.3
died with a
    GC object already tracked
fatal error. That's a new check in 2.3, added to try to catch
one incorrect usage of the Python C API. That symptom has
also been reported only by pycurl users.
BTW, if pycurl also has some sort of debug-mode build option
(don't know -- haven't used pycurl), it would be good to build
with that too.
----------------------------------------------------------------------
Comment By: washington irving (washirv)
Date: 2004-01-01 20:30
Message:
Logged In: YES
user_id=941550
We are using pycurl, and we have reported it there as well,
just in case. The reason I'm reporting it here is that the
stack trace does not involve pycurl in any way. It seems to
be python all the way. I will build with pydebug and report back.
The only test case I have is the program we're running now.
We haven't yet managed to reduce it to a simple test case.
We're working on it. We would be open to figuring out a way
to give you access to the code itself...
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2004-01-01 19:31
Message:
Logged In: YES
user_id=31435
Are you using the pycurl extension module? If so, you should
report your problems to the pycurl project too. I don't know
whether the problem *is* in pycurl, but the only reports of
this type ever seen before have come from people using
pycurl. If you're not using pycurl, it would be good to know
that too.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2004-01-01 19:14
Message:
Logged In: YES
user_id=33168
Have you tried building python --with-pydebug? That may
lead to an assert or some other indication of the problem.
Do you have a test case to reproduce this problem?
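
After rebuilding, it can be worth confirming that the interpreter being
run really is the --with-pydebug build. A minimal sketch, relying on
the fact that sys.gettotalrefcount() exists only in debug builds of
CPython (is_debug_build is a hypothetical helper name):

```python
import sys


def is_debug_build():
    """Return True if this interpreter looks like a --with-pydebug build."""
    # sys.gettotalrefcount is only defined in debug builds of CPython.
    return hasattr(sys, "gettotalrefcount")


if __name__ == "__main__":
    print("debug build:", is_debug_build())
```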
----------------------------------------------------------------------