Parallel insert to postgresql with thread

Erik Jones erik at myemma.com
Thu Oct 25 10:46:54 EDT 2007


On Oct 25, 2007, at 7:28 AM, Scott David Daniels wrote:

> Diez B. Roggisch wrote:
>> Abandoned wrote:
>>
>>> Hi..
>>> I use the threading module for the fast operation. But ....
> [in each thread]
>>> def save(a,b,c):
>>>             cursor.execute("INSERT INTO ...
>>>             conn.commit()
>>>             cursor.execute(...)
>>> How can i insert data to postgresql the same moment ?...
>>
>> DB modules aren't necessarily thread-safe. Most of the time, a
>> connection (and of course its cursors) can't be shared between
>> threads.
>>
>> So open a connection for each thread.
>
> Note that your DB server will have to "serialize" your inserts, so
> unless there is some other reason for the threads, a single thread
> through a single connection to the DB is the way to go.  Of course
> it may be clever enough to behave "as if" they are serialized, but
> most of your work parallelizing at your end simply creates new
> work at the DB server end.

Fortunately, in his case, that's not necessarily true.  If the threads
do all their work over the same connection then, yes, but that has
other problems, as mentioned, with respect to thread safety and
psycopg2.  If he goes the recommended route with a separate connection
for each thread, then Postgres will not serialize multiple inserts
coming from separate connections unless something like an ALTER TABLE
or REINDEX is running concurrently on the table.  The whole
serialized-inserts thing is strictly something popularized by MySQL
and is by no means necessary or standard (as with a lot of MySQL).
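
For what it's worth, here is a minimal sketch of the
one-connection-per-thread route.  It is only an illustration, not the
original poster's code: the names insert_rows and parallel_insert, the
"items" table, and the DSN in the usage comment are all made up.  The
connect argument is assumed to be any callable that returns a fresh
DB-API connection, such as a lambda wrapping psycopg2.connect.

```python
import threading


def insert_rows(connect, sql, rows):
    """Insert rows over a connection owned solely by the calling thread."""
    conn = connect()  # each thread opens its own connection; none is shared
    try:
        cursor = conn.cursor()
        for row in rows:
            cursor.execute(sql, row)
        conn.commit()  # one commit per batch rather than per row
    finally:
        conn.close()


def parallel_insert(connect, sql, batches):
    """Start one worker thread per batch and wait for all of them."""
    threads = [
        threading.Thread(target=insert_rows, args=(connect, sql, batch))
        for batch in batches
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Hypothetical usage with psycopg2 (table name and DSN are placeholders):
#
#   import psycopg2
#   parallel_insert(
#       lambda: psycopg2.connect("dbname=test"),
#       "INSERT INTO items (a, b, c) VALUES (%s, %s, %s)",
#       [[(1, 2, 3), (4, 5, 6)], [(7, 8, 9)]],
#   )
```

Note that an exception raised inside a worker thread is not re-raised
by join(), so real code would want to collect and report failures
from each thread.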

Erik Jones
