[Tutor] threading mind set

carlo locci locci.carlo.1985 at gmail.com
Sat May 12 22:43:58 CEST 2012


Hello All,
I started studying Python a couple of months ago (and I truly love it :)).
However, I'm having trouble understanding how to turn a sequential script
into a multithreaded one (I think it's because I'm not used to thinking
that way), and when it's best to do so (some say that because of the GIL I
won't get any real benefit from threading my script).
It's my understanding that threading a Python program can be useful when
some I/O is involved, so here is my case. I wrote a fairly simple script
that reads the first column of a CSV file and collects the values in a
list. Then I wrote a function that returns the total size of a given
path/folder, and I loop over the list so that it prints the size of each
folder in it. Here's the code:

import csv
import os


def read():
    # Collect the first column of every row in the CSV file.
    # 'rb' is the mode the Python 2 csv module expects.
    with open('C:\\test\\VDB.csv', 'rb') as somefile:
        paths = []
        for row in csv.reader(somefile):
            if row:  # skip blank lines
                paths.append(row[0])
        return paths


def DirGetSize(cartella):
    # Sum the size in bytes of every file under cartella.
    cartella_size = 0
    for (path, dirs, files) in os.walk(cartella):
        for x in files:
            filename = os.path.join(path, x)
            cartella_size += os.path.getsize(filename)
    return cartella_size


for x in read():
    if not os.path.exists(x):
        print x, 'DOES NOT EXIST'
    else:
        S = DirGetSize(x)
        print 'the folder size of', x, 'is', S

The script works quite well (at least it does what I want), but my real
question is: will I gain any speed if I multithread it? The CSV file
contains a list of server/path/folder entries, so I thought that if I
multithreaded it, it would become much faster, since it would run the
DirGetSize function on several folders almost concurrently. I'm quite
confused by the subject, though, so I'm not really sure. I would really
appreciate it if anyone could help me understand when it's useful to make
a script multithreaded and when it's not, and why :) (maybe I'm asking too
much), as well as any good resources I can study from. Thank you in
advance to anyone who replies, and thank you for having such a mailing
list (I discovered it while watching a Google I/O talk on YouTube).
Thank you guys.
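
P.S. To make the question concrete, here is a rough sketch of what I
imagine the threaded version could look like. It is only an assumption on
my part, not something I have measured: it uses ThreadPoolExecutor from
concurrent.futures, which needs Python 3.2 or later (the script above is
Python 2), and the names dir_get_size, sizes_concurrently and the
max_workers=8 default are placeholders I made up. As far as I understand,
threads could only help here because walking folders on servers is mostly
waiting on I/O, during which the GIL is released.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def dir_get_size(folder):
    """Return the total size in bytes of all files under folder."""
    total = 0
    for path, dirs, files in os.walk(folder):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(path, name))
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total


def sizes_concurrently(folders, max_workers=8):
    """Map each folder to its size, walking several folders at once.

    Folders that do not exist are mapped to None.
    max_workers=8 is an arbitrary guess, not a tuned value.
    """
    existing = [f for f in folders if os.path.exists(f)]
    results = dict.fromkeys(folders)  # every folder starts out as None
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them up.
        for folder, size in zip(existing, pool.map(dir_get_size, existing)):
            results[folder] = size
    return results
```

Calling sizes_concurrently(read()) would then replace the final loop, with
None marking a path that does not exist.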