Psycopg; How to detect row locking?

Jorge Godoy godoy at ieee.org
Mon Sep 27 09:48:30 EDT 2004


Alban Hertroys <alban at magproductions.nl> writes:

> Jorge Godoy wrote:
>> Are you manually locking those rows?  If so, you can maintain some
>> structure to keep track of locked rows.
>
> No, I don't. That may be a solution, though. Then at least I *know* when 
> a record is locked... But it's going to complicate things, if the 
> transaction is split among multiple threads... I'm not looking forward 
> to that.

:-)  I don't think that locking is a solution...  The MVCC approach
sounds much better to me.

> I think the rows are locked because the inserts haven't finished 
> inserting yet. The select takes place in the same session AFAIK, but not 
> in the same thread of my Python application. I'm probably looking at a 
> race condition here... (Ain't multithreading fun...)

If there's no data then a SELECT would return nothing at all, without
any error after all you're querying the database for some information
that doesn't exist.

> I'm also not sure whether I'm actually looking at the same transaction. 
> Is there a way to verify such?

You have to look at psycopg's docs.  I use pyPgSQL here.

> I do know for certain that all transactions use the same database 
> connection (I pass it along in a context object, together with config 
> settings and debugging methods). And I'm also quite sure that it doesn't 
> commit in between.

You would benefit a lot of transactions if you are inserting a lot of
data or if there's some relationship between data (and constraints and
triggers and ... at the database).  The INSERT isn't commited until you
issue a COMMIT.  Any SELECT before that will return nothing.  If you're
not using transactions, then you are hitting the disk for each and every
command.  I've made some tests here and the difference goes from some
seconds (with transactions) to several minutes (without transactions)
for a 65k rows insert on an old project we did. 

>> If you are not locking, PostgreSQL uses MVCC where it locks as little as
>> possible and you are able to select the new data inside the same
>> transaction and old data outside of it (until it is commited).
>
> I suppose there must be a short while where the row is locked during the 
> insert, where I may already be trying to select it. If this is indeed 
> the case, I would expect to receive a "row is locked" type of error.

There's no such need.  There's no data there so the SELECT returns
nothing.  If there's data and you're updating it, then until you commit
the transaction you get the old values.

> Alternatively, the select may be waiting (w/in psql) until the insert 
> finished (which should be pretty soon in all cases[*]), but that depends 
> on implementations beyond my reach. Not that that matters, I shouldn't 
> have this problem in that case.
>
>
> [*] Unless you break your database with triggers that lock up or 
> something similar. That could be a reason for the PostgreSQL team to not 
>  let select wait until an insert on the same row finished, but to 
> return an error instead.

They use MVCC: Multi-Version Concurrency Control.  You might want to
read about it:

        http://www.linuxgazette.com/issue68/mitchell.html
        http://www.developer.com/open/article.php/877181


Be seeing you,
-- 
Godoy.     <godoy at ieee.org>



More information about the Python-list mailing list