[Tutor] sockets, files, threads
Marilyn Davis
marilyn at deliberate.com
Wed Jan 19 02:29:17 CET 2005
Thank you so much Danny. I know how hard it is to look at and comment
on other people's code. You know I teach C and Python and I have to
say, though, that reading students' Python is 100 times easier than
reading their C.
And, I hope you're feeling better. I hate to think of you struggling
through my code when you should be resting.
>
> Hi Marilyn,
>
>
> [Long program comments ahead. Please forgive me if some of the comments
> are overbearing; I'm trying to get over a cold, and I'm very grumpy.
> *grin*]
>
>
> Some comments on the code follow. I'll be focusing on:
>
> > http://www.maildance.com/python/doorman/py_daemon.py
>
> One of the import statements:
>
> ###
> from signal import *
> ###
>
>
> may not be safe. According to:
>
> http://docs.python.org/tut/node8.html#SECTION008410000000000000000
>
> """Note that in general the practice of importing * from a module or
> package is frowned upon, since it often causes poorly readable code.
Note that it says it is frowned upon, and I almost never do it,
because it 'often causes poorly readable code' -- but not in this
case. There is only signal.signal and signal.alarm and the signal
numbers, none of which are easy to mistake for something else. I have
never seen 'unsafe' --unless you cause yourself a bug by hiding
something from yourself, which would be hard to do in this case.
But, ok, to encourage you to make suggestions, I used the regular
import on signals. No change to the bugs.
> However, it is okay to use it to save typing in interactive sessions, and
> certain modules are designed to export only names that follow certain
> patterns."""
>
>
> There's a block of code in Server.run() that looks problematic:
Darn it! I managed to put up a hybrid version. Here's what I thought
I should have, to wrap a lock around the creation of the file
descriptor for the client socket:
self.server_socket.setblocking(0)
while 1:
if log.level & log.calls:
log.it("fd%d:py_daemon.py: Waiting ...", self.descriptor)
try:
file_lock.acquire()
try:
client_socket, client_addr = self.server_socket.accept()
finally:
file_lock.release()
except socket.error, msg:
if str(msg).find('Resource temporarily unavailable') > -1:
time.sleep(.2)
continue
except (EOFError, KeyboardInterrupt):
self.close_up()
Spawn(client_socket).start()
Not that it matters to your suggestions. But without locks, this
reduces to:
while 1:
if log.level & log.calls:
log.it("fd%d:py_daemon.py: Waiting ...", self.descriptor)
try:
client_socket, client_addr = self.server_socket.accept()
except (EOFError, KeyboardInterrupt):
self.close_up()
Spawn(client_socket).start()
But, I take it, you want me to put Spawn(client_socket).start() in that
try: clause.
>
> The problem is that, as part of program flow, it appears to run after the
> try block. But in one particular case of program flow, 'client_socket'
> will not be set to a valid value. It is better to put that statement a
I don't understand. Which particular case of program flow will
'client_socket' be invalid and yet make it through the socket.accept
call?
> few lines up, right where 'client_socket' is initialized. Like this:
>
> ###
> try:
> client_socket, client_addr = self.server_socket.accept()
> Spawn(client_socket).start()
> except socket.error, msg:
> time.sleep(.5)
> except (EOFError, KeyboardInterrupt):
> self.close_up()
> ###
>
> Not only does this make it more clear where 'client_socket' is being used,
> but it ends up making the code shorter. I've dropped the 'continue'
> statement, as it becomes superfluous when the Spawn() moves into the try's
> body.
But but but, that wraps the whole thread in one try/except clause.
That Spawn() call starts the thread. If a client_socket generates an
error, we don't want the main thread to sleep. All the socket
operations in Spawn.start() are also wrapped in their own (hopefully)
intelligent try/excepts. I thought it was good practice to -not- put
extra stuff in a try/except.
>
> In fact, the placement of that statement may account partially for the
> error message that you were running into earlier. The earlier error
> message:
>
> ###
> close failed: [Errno 9] Bad file descriptor
> ###
>
> can easliy occur if there's an EOFError or KeyboardInterrupt under the
> original code. And although KeyboardInterrupt might be unusual, I'm not
> so sure if EOFError is.
I thought I listed exactly the two errors that come from pounding on
the keyboard. I try to be careful not to mask any tracebacks.
And, the original way, this error only gets triggered while we are
waiting for a client to connect, not for anything else.
>
> Furthermore, the 'except' blocks can emit more debugging information.
> You mentioned earlier that you're not getting good error tracebacks when
> exceptions occur. Let's fix that.
>
> Use the 'traceback' module here for all except blocks in your program.
> This will ensure that you get a good stack trace that we can inspect.
>
> I'm using Python 2.4's format_exc() function to get the stack trace as a
> string. If this version is not available to you, you can use this
Yes. I'm using 2.4.
> workaround format_exc():
>
> Once you've made the correction and augmented all the except blocks with
ALL of them? That seems a similar practice to putting locks
everywhere? Is there a sort of except clause that I'm writing that
you think might be problematic?
> traceback logging, try running your program again. We should then be
> better able to trace down the root cause of those errors.
>
>
> Let me just look through a little bit more, just to see what else we can
> pick out.
>
> >From what I read of the code so far, I believe that a lot of the locking
> that the code is doing is also superfluous. You do not need to lock local
> variables, nor do you have to usually lock things during object
Where did I lock local variables? The things I lock are all calls to
operations that create file descriptors with the operating system, I
think. Except I think the experiment is in there where I tried
locking os.listdir, but it didn't help.
> initialization. For example, FileReader.__init__() has the following
> code:
>
> ###
> class FileReader(TokenReader):
> def __init__(self, file_socket):
> self.local_name = file_socket.local_name
> self.freader_name = '/tmp/fsf%s' % self.local_name
> file_lock.acquire()
> self.freader = open(self.freader_name, "w+")
> file_lock.release()
> ###
>
> The locks around the open() calls are unnecessary: you do not need to
> synchronize file opening here, as there's no way for another thread to get
> into the same initializer of the same instance.
But I think that I'm only wrapping the calls that create file
descriptors, because those get trampled on.
>
> In fact, as far as I can tell, none of the Spawn() threads are
> communicating with each other. As long as your threads are working
Yes. None communicate with each other.
So, why was I doing this? I thought it was the obvious architecture
for what I was doing. I didn't know it would be such a bear.
> independently of each other --- and as long as they are not writing to
> global variables --- you do not need locks.
I took it all out and nothing seemed to change, like you said.
>
> In summary, a lot of that locking code isn't doing anything, and may
> actually be a source of problems if we're not careful.
>
>
>
> The other classes are complex enough that I actually don't trust either of
> them. I am a very paranoid person. *grin*
>
> FileReader appears to reimplement a temporary-file implementation: I'd
> strongly recommend using tempfile.TemporaryFile instead of reimplementing
> its functionality. Either that, or use StringIO() to suck the whole file
> into memory. How large do you expect a single email message to be?
As big as an email can be. Huge and more huge. We don't want to
think about this program when big emails come in.
When I first made my program to read pop-servers, I used the poplib
library and Charlie immediately tested my code with some big messages
and it brought the machine to its knees. So I rewrote it only using
_socket for the real work, and keeping nothing in memory, and: boom!
it is now so fast and takes almost no extra memory in the python part.
And since then, please don't shoot me, but I don't immediately trust
the modules. I read them and see how many times they loop through the
data, and how many copies of the data they put into memory -- and
usually decide to write the simple things I need myself, looping zero
times and keeping only one block in memory.
>
> There appears to be missing symmetry between the open() and the close()
> calls in the code, and that's a sign of possible trouble. I try to
> structure most file-handling code with the following structure:
>
> ###
> f = open(somefile)
> try:
> ... do some stuff with f
> finally:
> f.close()
> ###
>
> Code that doesn't follow this idiom might be susceptable to resource
> leakage: we might forget to close() something. This idiom's probably not
> as important in Python, since we have garbage collection, but I still try
> to follow it to prevent things like running out of file handles too
> quickly. *grin*
I try to put the closes with the opens too, but it's not often
practical. I do log my openings and closings, as you can see, and
count them for leakage.
>
> The eval() architecture that Spawn uses to dispatch to either
> calls.acl_rcpt, calls.route_mail, calls.route_errors, calls.doorman, or
> calls.doorman_errors is dangerous. I know that in prior email, you
> mentioned that you were using eval().
>
> I had misgivings then, and I definitely have them now: the program calls
> eval() based on stuff that's coming from a socket, and that's just asking
> for trouble.
Well, remember that it's an internal socket and the calls are
hardcoded in the exim configuration file.
>
> I strongly recommend modifying Spawn.self() to something like this:
>
> ######
> def tokenToDispatchFunction(self, prog):
> """Given a command, returns the correct dispatch function. If prog
> is incorrect, throws a LookupError."""
> dispatchTable = { 'acl_rept' : calls.acl_rept,
> 'route_errors' : calls.route_errors,
> 'doorman' : calls.doorman,
> 'doorman_errors' : calls.doorman_errors'
> }
> return dispatchTable[token].main
>
>
> def run(self):
> """Executes one of the calls program."""
> argv = []
> dispatch = self.tokenToDispatchFunction(self.exim_io.call)
> dispatch(argv)
> ######
>
> I've oversimplified ("broken") the code here; you'll need to reinsert the
> argument-construction stuff and the logging. But other than that, this
> should have the same external-command calling functionality as the
> original code, with additional safety.
>
> The danger is that the tokens that we are reading from the socket can be
> arbitrarily formed. One potential token that can be written might be:
>
> ###
> evil = '__name__[__import__("os").system("ls"+chr(32)+"/")].py'
> ###
>
> 'evil' is nonsensical, but according to the tokenization code, this is a
> single token, since it does not contain spaces.
>
>
> Imagine that that weird string is the first token that
> FileSocket.find_call() returns to us. Back in Spawn.run(), under the
> original code, we end up trying to evaluate a string that looks like:
>
> 'call.__name__[__import__("os").system("ls"+chr(32)+"/")].main(argv)'
>
> which in fact will break with a runtime error, but not before doing
> mischief. This is exactly the kind of devious exploit that eval() opens
> up. It's hard to close the bottle after the genie's out.
>
I'm sorry that you didn't realize that the socket input is hard-coded
on the other side. To get to my socket, you must break into our
machine, and by then, wrecking havoc will be much easier than this.
>
> The revised code that uses a dispatchTable method is much more resiliant
> to this kind of attack. If someone tries to send malformed commands, the
> worse that can happen is a KeyError. More than that, if we look back at
> that tokenToDispatchFunction() again:
>
> ###
> def tokenToDispatchFunction(self, prog):
> dispatchTable = { 'acl_rept' : calls.acl_rept,
> 'route_errors' : calls.route_errors,
> 'doorman' : calls.doorman,
> 'doorman_errors' : calls.doorman_errors'
> }
> return dispatchTable[token].main
> ###
>
> it's much more clear what possible calls can happen.
>
Yes. I like it, but not because of security concerns, but because of
the clarity. I also simplified my protocol so now the Spawn.start()
is just this:
def start(self):
'''Given the command, provides the function to call.'''
global TESTING
function = { 'acl_rcpt.py' : calls.acl_rcpt,
'route_mail.py' : calls.route_mail,
'route_errors.py' : calls.route_errors,
'doorman.py' : calls.doorman,
'doorman_errors.py' : calls.doorman_errors,
'is_known_to.py' : is_known_to
}
if log.level & log.calls:
log.it('fd%d: %s: %s', self.descriptor, self.exim_io.call, self.exim_io.collector_name)
function[self.exim_io.call].main(self.exim_io)
So nice! Thank you.
But, I did take out threading and the big error went away. I'm done
with threading, unless I see a big need one day. I don't know what
I'll tell students from here on.
>
> Anyway, my eyes are going dim, so I'd better stop right now. *grin* Sorry
> for taking so much space; I hope this helps!
>
>
> I thought about this a little bit more: there is one place where you do
> need locks. Locking needs to be done around Spawn's setting of its 'no'
> class attribute:
>
> ###
> class Spawn(threading.Thread):
> no = 1
> def __init__(self, client_socket):
> Spawn.no = Spawn.no + 2 % 30000
> ## some code later...
> self.local_name = str(Spawn.no)
> ####
>
> The problem is that it's possible that two Spawn threads will be created
> with the same local_name, because they are both using a shared resource:
> the class attribute 'no'. Two thread executions can interleave. Let's
> draw it out.
>
But, notice that the call to:
threading.Thread.__init__(self, name = self.local_name)
came after, so the Spawn.no manipulation always happens in the main
thread.
OK?
> Anyway, I hope this helps!
Of course. Code review is always a huge help, and you gave me some
good ideas. I'm very grateful.
One problem remains after removing the threading stuff. I still get
those pesky:
close failed: [Errno 9] Bad file descriptor
even though my logged openings and closings match up one-to-one. Now
then, I haven't added that line of code to each except clause yet
because 1) it doesn't seem to cause any problem 2) it's a bunch of
busy-work and 3) I'm hoping that, when your illness clears, you'll
think of something less brute-force for me to do.
What about when I do an explicit call to a close inside a __del__. Is
that a bad idea?
Again, thank you thank you thank you. When I meet you (we are both in
the Bay Area) I'll give you a huge hand-shake for sure.
Marilyn
More information about the Tutor
mailing list