Parallel insert to postgresql with thread
Scott David Daniels
Scott.Daniels at Acm.Org
Thu Oct 25 22:46:59 EDT 2007
Erik Jones wrote:
>
> On Oct 25, 2007, at 7:28 AM, Scott David Daniels wrote:
>> Diez B. Roggisch wrote:
>>> Abandoned wrote:
>>>> Hi..
>>>> I use the threading module for the fast operation. But ....
>> [in each thread]
>>>> def save(a,b,c):
>>>> cursor.execute("INSERT INTO ...
>>>> conn.commit()
>>>> cursor.execute(...)
>>>> How can i insert data to postgresql the same moment ?...
>>>
>>> DB modules aren't necessarily thread-safe. Most of the times, a
>>> connection (and ... cursor) can't be shared between threads.
>>> So open a connection for each thread.
>>
>> Note that your DB server will have to "serialize" your inserts, so
>> ... a single thread through a single connection to the DB is the way
>> to go. Of course it (the DB server) may be clever enough to behave
>> "as if" they are serialized, but most of your work parallelizing at
>> your end simply creates new work at the DB server end.
>
> Fortunately, in his case, that's not necessarily true.... If he
> goes the recommended route with a separate connection for each thread,
> then Postgres will not serialize multiple inserts coming from separate
> connections unless there is something like and ALTER TABLE or REINDEX
> concurrently happening on the table.
> The whole serialized inserts thing is strictly something popularized
> by MySQL and is by no means necessary or standard (as with a lot of
> MySQL).
But he commits after every insert, which _does_ force serialization (if
only to provide safe transaction boundaries). I understand you can get
clever at how to do it, _but_ preserving ACID properties is exactly what
I mean by "serialize," and while I like to bash MySQL as well as the
next person, I most certainly am not under the evil sway of the vile
MySQL cabal.
The server will have to be able to abort each transaction
_independently_ of the others, and so must serialize any index
updates that share a page by, for example, landing in the same node
of a B-Tree.
-Scott David Daniels
Scott.Daniels at Acm.Org
More information about the Python-list
mailing list