psycopg2 & large result set

Paul Boddie paul at boddie.org.uk
Fri May 25 06:39:54 EDT 2007


On 25 May, 11:16, Jon Clements <jon... at googlemail.com> wrote:
>
> I'm using psycopg2 to retrieve results from a rather large query (it
> returns 22m records); unsurprisingly this doesn't fit in memory all at
> once. What I'd like to achieve is something similar to a .NET data
> provider I have which allows you to set a 'FetchSize' property; it
> then retrieves 'n' many rows at a time, and fetches the next 'chunk'
> after you read past the end of the current chunk. I suppose I could
> use Python for .NET or IronPython but I'd rather stick with CPython
> 2.5 if possible.
>
> I'm not 100% sure if it's an interface or a server thing. Any ideas
> are most welcome.

It's an interface thing. The DB-API has fetchone, fetchmany and
(optionally) iteration methods on cursor objects; PostgreSQL supports
what you have in mind; pyPgSQL supports it at the interface level, but
psycopg2 only supports it if you use "named cursors", which is not
part of the DB-API specification as far as I recall, and not
particularly convenient if you're thinking of targeting more than one
database system with the same code. See this bug filed against
psycopg2 and the resulting discussion:

http://www.initd.org/tracker/psycopg/ticket/158

I've been running a patched version of psycopg2, but haven't developed
the patch further since it may be more convenient for me to switch
back to pyPgSQL eventually.

Paul




More information about the Python-list mailing list