[Tutor] Multi-thread environments

Liam Clarke ml.cyresse at gmail.com
Fri Mar 31 15:02:33 CEST 2006


Thanks very much for that Kent, works fine and dandy now. >_< This is
one to chalk up to experience. I copied the dicts as you said.

Regards,

Liam


On 3/31/06, Kent Johnson <kent37 at tds.net> wrote:
> Liam Clarke wrote:
> > Hi all,
> >
> > I'm working in my first multi-threaded environment, and I think I
> > might have just been bitten by it.
> >
> > class Parser:
> >     def __init__(self, Q):
> >         self.Q = Q
> >         self.players = {}
> >         self.teams = {}
> >
> >     def sendData(self):
> >         if not self.players or not self.teams: return
> >         self.Q.put((self.players, self.teams))
> >         self.resetStats()
> >
> >     def resetStats(self):
> >         for key in self.players:
> >             self.players[key] = 0
> >         for key in self.teams:
> >             self.teams[key] = 0
> >
>
> > What I'm finding is that a lot more sets of zeroed data are being
> > sent to the DAO than should occur.
> >
> > If the resetStats() call is commented out, data is sent correctly. I
> > need to reset the variables after each send so as to not try and
> > co-ordinate state with a database, otherwise I'd be away laughing.
> >
> > My speculation is that because the Queue is shared between two
> > threads, one of which is looping on it, that a data write to the Queue
> > may actually occur after the next method call, the resetStats()
> > method, has occurred.
> >
> > So, the call to Queue.put() is made, but the actual data is accessed in
> > memory by the Queue after resetStats has changed it.
>
> You're close. The call to Queue.put() is synchronous - it will finish
> before the call to resetStats() is made - but the *data* is still shared.
>
> What is in the Queue are references to the same dicts that are also
> referenced by self.players and self.teams. The dicts are not copied! This is
> normal Python function call and assignment semantics, but in this case
> it's not what you want. You have a race condition - if the data in the
> Queue is processed before the call to resetStats() is made, it will work
> fine; if resetStats() is called first, it will be a problem. Actually
> there are many possible failures: since resetStats() loops over the
> dicts, the consumer could be interleaving its reads with the writes in
> resetStats().
>
> What you need to do is copy the data, either before you put it in the
> queue or as part of the reset. I suggest rewriting resetStats() to
> create new dicts because dict.fromkeys() will do just what you want:
>    def resetStats(self):
>      self.players = dict.fromkeys(self.players.keys(), 0)
>      self.teams = dict.fromkeys(self.teams.keys(), 0)
>
> This way you won't change the data seen by the consumer thread.
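
A minimal sketch of the difference between the two resets, for illustration only. It is written for Python 3, where the module is named queue (the Python 2 of this thread calls it Queue). The player names and the reset_in_place helper are made up. No second thread is needed to see the effect: the item is simply read back from the queue after the reset, which is what a slow consumer would do.

```python
import queue


def reset_in_place(stats):
    # Hypothetical helper: zeroes the values of the dict it is given,
    # which is the same object the queue still refers to.
    for key in stats:
        stats[key] = 0


q = queue.Queue()

# In-place reset: the queued item and `players` are one object,
# so the consumer sees the zeroed values.
players = {"alice": 3, "bob": 5}
q.put(players)
reset_in_place(players)
seen_after_in_place = q.get()

# Rebinding reset: dict.fromkeys() builds a new dict, so the
# object already in the queue is left untouched.
players = {"alice": 3, "bob": 5}
q.put(players)
players = dict.fromkeys(players, 0)
seen_after_rebind = q.get()

print(seen_after_in_place)  # {'alice': 0, 'bob': 0}
print(seen_after_rebind)    # {'alice': 3, 'bob': 5}
```

Putting a copy in the queue instead (for example q.put(dict(players))) should protect the consumer in the same way; which of the two reads better is a matter of taste.
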
>
> > I've spent about eight hours so far trying to debug this; I've never
> > been this frustrated in a Python project before to be honest... I've
> > reached my next skill level bump, so to speak.
>
> Yes, threads can be mind-bending until you learn to spot the gotchas
> like this.
>
> By the way you also have a race condition here:
> >             if self.dump:
> >                 self.parser.sendData()
> >                 self.dump = False
>
> Possibly the thread that sets self.dump will set it again between the
> time you test it and when you reset it. If the setting thread is on a
> timer and the time is long enough, it won't be a problem, but it is a
> potential bug.
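
One possible way to close that window, sketched under assumptions: the test and the reset of the flag are done as a single step under a lock, before the data is sent. DumpFlag, request() and take() are hypothetical names standing in for self.dump; they are not from the original code.

```python
import threading


class DumpFlag:
    """Hypothetical stand-in for self.dump with an atomic test-and-clear."""

    def __init__(self):
        self._lock = threading.Lock()
        self._requested = False

    def request(self):
        # Called by the thread that asks for a dump (e.g. the timer thread).
        with self._lock:
            self._requested = True

    def take(self):
        # Return whether a dump was requested and clear the request
        # while still holding the lock, so no request can slip in
        # between the test and the reset.
        with self._lock:
            was_requested = self._requested
            self._requested = False
            return was_requested


flag = DumpFlag()
flag.request()
sent = []
if flag.take():
    # Stands in for self.parser.sendData(); a request() arriving while
    # this runs stays set and is picked up on the next pass.
    sent.append("dump")
```

Because the flag is cleared before the send rather than after it, a request made during the send is not overwritten by the reset.
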
>
> Kent
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>

