threading a 10 lines out of a file

alex goretoy aleksandr.goretoy at gmail.com
Mon Jan 5 16:43:00 EST 2009


>
> So we need to know a bit more about your 4.5-hour program before we can
> determine whether threads can help. There is light at the end of the
> tunnel, however, since even if threads don't work it's possible that the
> multiprocessing module will (assuming you have multi-processor hardware
> at your disposal).

My program sends each line to a function that processes it via pycurl
(with a urllib fallback) and MySQLdb (with a _mysql fallback). It checks
the MySQL database to see if the line already exists. If it doesn't, it
sends the line either via a MySQL query or via pycurl, depending on the
option set in the function. Some sections of the function have
time.sleep(6) in them; otherwise things won't work. This slows things
down considerably. If I thread the lines, more of them will be processed
at the same time: there would be 10 (or some set number of) threads
running, each doing all the steps in the function: posting forms,
performing queries, waiting for form postings to be processed on the
server, etc. I hope this adds more light at the end of that tunnel.

It currently works under my Ubuntu install of Python (2.5.x) and bt's
Python (2.4.3). The reason I added fallbacks for MySQLdb and pycurl is so
that a person can install this on a server hosted elsewhere, where you
can't install Python modules due to permissions and such. I want it to
work everywhere.

There's a lot more to this application that I'm not sure I can disclose
at the moment, seeing as it can be used for good or bad. I don't want it
to get into the wrong hands if it's public. OTOH, I think I'll make it
public. That's all up in the air at the moment. One thing is that it does
make life easier for me. A lot easier. Although I haven't made money with
it. Yet. Plus, I want to make a pyGTK frontend for it; I'm looking into
that too. I wouldn't be against a private team assembling to create this,
though, as long as I can get money out of it somehow. Cuz I'm broke, and
I live with my mom. Not sure how anyone can help me there, but I'll throw
it up in the air for all to see. Maybe something comes of it. This
program is an idea I've been building inside my garage (my room) for
about a year and a half. Built in PHP and Python, now.
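
For anyone following along, the "use pycurl if it's there, otherwise
urllib" fallback mentioned above can be sketched roughly like this. The
helper name import_first is made up for illustration and is not part of
the actual program; it only handles simple, non-dotted module names.

```python
def import_first(*names):
    """Return the first module in `names` that can be imported.

    Hypothetical helper: tries each name in order and skips the ones
    that are not installed.
    """
    for name in names:
        try:
            return __import__(name)
        except ImportError:
            continue
    raise ImportError("none of these modules are available: %s"
                      % ", ".join(names))


# Prefer pycurl; fall back to urllib from the standard library.
http_backend = import_first("pycurl", "urllib")

# The database side would follow the same pattern, e.g.
#   db_backend = import_first("MySQLdb", "_mysql")
# which raises ImportError if neither module is installed.
```

The rest of the code would then have to check which module it got, since
pycurl and urllib have different interfaces (and so do MySQLdb and _mysql).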

Would something that uses pycurl and MySQL be a good fit for threading? It
doesn't run on SMP hardware, but maybe one day.
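
To make the question concrete, here is a minimal sketch of the "10 or so
threads working through the lines" idea, using a queue of lines and a
fixed pool of worker threads. It is written for Python 3 (on 2.x the
queue module is named Queue). process_line is a hypothetical stand-in for
the real function, and the pool size of 10 is just the figure mentioned
above.

```python
import queue
import threading
import time


def process_line(line):
    """Hypothetical stand-in for the real work (DB check, form post, sleep)."""
    time.sleep(0.01)  # placeholder for the network wait / time.sleep(6)
    return line.strip().upper()


def worker(tasks, results):
    """Pull lines off the task queue until a None sentinel arrives."""
    while True:
        line = tasks.get()
        if line is None:
            break
        try:
            results.put((line, process_line(line), None))
        except Exception as exc:
            # Record the failure instead of letting the thread die silently.
            results.put((line, None, exc))


def process_lines(lines, num_workers=10):
    """Process `lines` with a pool of threads.

    Returns a list of (line, result, error) tuples in completion order,
    which is not necessarily the input order.
    """
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    count = 0
    for line in lines:
        tasks.put(line)
        count += 1
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    return [results.get() for _ in range(count)]
```

With a real file this would be called as process_lines(open(path)). While
one thread sits in time.sleep() or waits on the network, the others keep
going, which is where the speed-up would come from. One assumption to
check: each worker should probably open its own database connection
rather than share one across threads.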

I also need to look into how to make a Python package out of it. I
researched some stuff a while ago, but I didn't quite need it then. I just
wanted to see what I'm getting into. Any other info about this would be
appreciated too, although it's off topic. Sorry.

By the way, I wanted to really thank everyone for all your help. It means a
lot to me.

-Alex Goretoy
http://www.alexgoretoy.com



On Mon, Jan 5, 2009 at 9:17 PM, Steve Holden <steve at holdenweb.com> wrote:

> I did, once upon a time, write code
> that used several hundred threads to send emails, and gave a dramatic
> speed-up (because of the network-bound nature of the task). Can I
> presume that your original inquiry was a toy, and that your real problem
> is also IO-bound? Otherwise I am unsure how you will benefit by
> threading - if your line-processing tasks don't contain any IO then
> using a threaded approach will not yield any speed-up at all.
>
> The example you quoted achieved its speed-up because a thread releases
> the GIL while waiting for a network response, allowing other threads to
> process. Thus it effectively ran all the pings in parallel.
>
> So we need to know a bit more about your 4.5-hour program before we can
> determine whether threads can help. There is light at the end of the
> tunnel, however, since even if threads don't work it's possible that the
> multiprocessing module will (assuming you have multi-processor hardware
> at your disposal).
>