Read from database, write to another database, simultaneously

Paul Boddie paul at boddie.org.uk
Thu Jan 11 13:36:22 EST 2007


Dennis Lee Bieber wrote:
>
>  "need"? No... the "copy ... from ..." statement "needs" such, but do
> you really "need" to use "copy ... from ..." to load the data -- or is
> this just a concept that caught your fancy?

In chapter 13 of the PostgreSQL 8.1 documentation ("Performance Tips"),
descending into section 13.4 ("Populating a Database"), whose
introduction may not admittedly coincide precisely with the inquirer's
use case ("first populating a database" doesn't strictly apply, but it
may yet describe a situation that is close enough given the kind of
data involved), and within section 13.4.2 ("Use COPY"), the text reads
as follows:

"Use COPY to load all the rows in one command, instead of using a
series of INSERT commands."

http://www.postgresql.org/docs/8.1/static/populate.html#POPULATE-COPY-FROM

>  If  you can get Oracle to dump the data in the form of SQL insert
> statements, maybe with some clean-up done by a script, and feed them to
> PostgreSQL the entire job starts to look something like:
>
> odump theDatabase | python cleanup.py | psql theNewDatabase

The problem with this approach (as the manual notes) is that COPY is a
lot faster than lots of INSERT statements:

"Note that loading a large number of rows using COPY is almost always
faster than using INSERT, even if PREPARE is used and multiple
insertions are batched into a single transaction."

My experience with inserting large numbers of records into PostgreSQL
suggests that this advice should not be readily ignored.

Paul




More information about the Python-list mailing list