[Web-SIG] Threading and client-side support

John J Lee jjl at pobox.com
Tue Oct 28 05:35:33 EST 2003


On Mon, 27 Oct 2003 amk at amk.ca wrote:

> On Mon, Oct 27, 2003 at 02:45:16PM +0000, John J Lee wrote:
> > I suppose I should ask on python-dev if there's a policy / tradition here.
>
> The rough tradition would be: Thread-safety is good, and library modules
> shouldn't be non-threadsafe unless there's a very good reason.

Thanks.  So, in particular, httplib, urllib and urllib2 are thread-safe
(except for problems noted in the source: FTP connection caching in
urllib2, FTP content caching in urllib)?


> > changing?  I suppose I'd also need to just label the .cookies attribute as
> > non-threadsafe (or get rid of it, or add a __getattr__ to allow locking it
> > -- yuck).
>
> Assuming .cookies is a Python dictionary (I haven't looked at the CookieJar
> code), there's no locking needed.  Locking is necessary when a data
> structure is temporarily inconsistent, or some invariant is temporarily
> broken.

Yes, I realise that.  .cookies is a nested dict (currently documented as
publicly readable, though FWIW that will probably have to change soon,
for non-thread-related reasons):

self.cookies[domain][path][name]


So my set_cookie method certainly needs locking, because there are tests
like this:

 c = self.cookies
 if not c.has_key(cookie.domain): c[cookie.domain] = {}
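
A minimal sketch of how that check-then-set might be guarded, assuming a per-instance lock. The `LockedJar` class, the `Cookie` tuple and the `_lock` attribute are hypothetical names for illustration and are not taken from the actual cookie-handling code; `setdefault` is used here in place of the `has_key` test:

```python
import threading
from collections import namedtuple

# Hypothetical stand-in for a cookie object; only the fields used below.
Cookie = namedtuple("Cookie", "domain path name value")


class LockedJar:
    """Illustrative jar storing cookies as cookies[domain][path][name]."""

    def __init__(self):
        self.cookies = {}
        # RLock so a method holding the lock can call another locked method.
        self._lock = threading.RLock()

    def set_cookie(self, cookie):
        # Looking up (or creating) each level and then storing the cookie
        # are separate steps, so the whole sequence runs under the lock.
        with self._lock:
            by_path = self.cookies.setdefault(cookie.domain, {})
            by_name = by_path.setdefault(cookie.path, {})
            by_name[cookie.name] = cookie


jar = LockedJar()
threads = [
    threading.Thread(
        target=jar.set_cookie,
        args=(Cookie("example.com", "/", "c%d" % i, str(i)),),
    )
    for i in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock only protects callers that go through `set_cookie`; code that reaches into `.cookies` directly bypasses it, which is one argument for not exposing the attribute.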


I guess what I was really worrying about, though (without fully realizing
it), was higher-level integrity issues over and above mere thread-safety.
For example, if one thread is iterating over cookies and reading their
values, and halfway through, another thread calls extract_cookies to
extract the cookies from an HTTP response, causing some cookies to be
added and/or removed, that might cause trouble, but isn't a thread-safety
issue (and is the application's problem, not mine).  I guess the methods I
have for loading / saving to a file also fall into this category, but I'm
still a little confused.
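
One way an application might deal with the iteration case is to copy the structure while holding a lock and then iterate over the copy. This is only a sketch under assumptions: the module-level `lock`, the `snapshot` helper and the sample data are hypothetical, and it presumes that writers take the same lock.

```python
import threading

cookies = {"example.com": {"/": {"a": "1", "b": "2"}}}
lock = threading.Lock()  # hypothetical lock shared by readers and writers


def snapshot(nested):
    """Copy the three dict levels under the lock; the values are shared."""
    with lock:
        return {
            domain: {path: dict(names) for path, names in paths.items()}
            for domain, paths in nested.items()
        }


# Changing a dict's size while iterating over it fails even without threads.
scratch = {"a": "1", "b": "2"}
try:
    for name in scratch:
        scratch["new-" + name] = "x"
    changed_during_iteration = False
except RuntimeError:
    changed_during_iteration = True

# Iterating over a snapshot is unaffected by later changes to the original.
view = snapshot(cookies)
cookies["example.com"]["/"].clear()
seen = sorted(view["example.com"]["/"])
```

The snapshot gives the reader a consistent view as of one moment; whether a possibly stale view is acceptable remains the application's decision.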

Since the relevant level of granularity is the bytecode instruction
(right?), am I right in assuming you may have to start thinking about what
your code looks like in bytecode form?  I guess you play with the compiler
module until you get to know which operations are single bytecode
instructions and which are not?
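
The standard library's `dis` module is one way to look at this. The sketch below disassembles a function shaped like the test in set_cookie; the function name `check_then_set` is made up for the example. Bytecode-level behaviour is a CPython implementation detail, and opcode names differ between versions, so the exact listing is not assumed here.

```python
import dis


def check_then_set(c, domain):
    # Same shape as the test in set_cookie, written with `in`.
    if domain not in c:
        c[domain] = {}


# Collect the opcode names; the membership test and the store show up as
# separate instructions, so another thread could run between them.
opnames = [ins.opname for ins in dis.get_instructions(check_then_set)]
print(len(opnames), opnames)
```

Seeing several instructions between the test and the store is what makes the sequence a candidate for locking.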

[...]
> But if you're assigning to a single attribute (self.filename = 'foo'),
> there's no point in time where the attribute is inconsistent, a mix of the
> old and new names; instead it's first the old value, and then it's set to
> 'foo'.  So no lock is needed.

OK.  I wasn't sure whether that was a single bytecode or not, but I
suppose that makes sense given Python's semantics.  I saw masses of
'synchronized' blocks on strings in a Java implementation of cookie
handling (jCookie), and I'm far from sure what they're all there for...
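
As a small illustration of the single-attribute case, the sketch below has one thread rebinding an attribute while others read it, without any lock. The `Settings` class and the values are hypothetical, and this assumes CPython, where an attribute rebinding replaces one reference with another.

```python
import threading


class Settings:
    filename = "old"


settings = Settings()
results = [set() for _ in range(3)]  # one private set per reader thread


def reader(seen):
    for _ in range(10000):
        seen.add(settings.filename)


def writer():
    for i in range(10000):
        settings.filename = "foo" if i % 2 else "old"


threads = [threading.Thread(target=reader, args=(s,)) for s in results]
threads.append(threading.Thread(target=writer))
for t in threads:
    t.start()
for t in threads:
    t.join()

# Readers only ever observe one complete value or the other.
observed = set().union(*results)
```

This says nothing about invariants that span several attributes; those still need a lock.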


John


