[Tutor] Optimal solution in dealing with huge databases in Python

Alan Gauld alan.gauld at btinternet.com
Fri Jan 26 01:46:38 CET 2007


"Eric Walstad" <eric at ericwalstad.com> wrote

>> Loadfile.sql would in turn contain lots of SQL commands like:
>>
>> INSERT into CUSTOMER
>> VALUES ( 'fred', 'bloggs', 45, '01773-987321');

> I think PostgreSQL will wrap each of these INSERT statements into a
> transaction, which will add a lot of overhead and slow the process.
Actually Oracle would do that too, but I didn't know about...

> COPY <table_name> (<list of field names>)
> FROM stdin;
> <rows of tab delimited field data>

...Because in Oracle there is a separate utility (SQL*Loader) that is
specifically intended for large database loads, so you rarely use the
technique I described unless the schema is very complex. (There is a
COPY command in Oracle, but it's for copying data from one database to
another across a network...)
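
For Postgres the COPY route can also be driven straight from Python.
Here is a minimal sketch using the psycopg2 driver's copy_from();
the connection string, file name and column names are only
illustrative, not taken from Eric's setup:

import psycopg2

conn = psycopg2.connect("dbname=mydb user=me")
cur = conn.cursor()

# COPY ... FROM STDIN; copy_from() expects tab-delimited rows by default
f = open("customers.tab")
cur.copy_from(f, "customer",
              columns=("first_name", "last_name", "age", "phone"))
f.close()

conn.commit()
cur.close()
conn.close()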

> My SQL files are about 350MB, fwiw.  I haven't tried breaking them down
> to smaller files because I haven't read in the PostgreSQL docs, or
> forums, that doing so would be helpful.

In that case big files are probably best. The small file trick is
important if the database tries to cache tables in memory, or
if it creates large memory buffers for bulk writing to disk (via DMA).
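
If you did want to try the small-file trick without physically
splitting the file, a batching loop approximates it. This is only a
rough sketch: it assumes the load file has been rewritten with one
complete statement per line, uses psycopg2 again, and the batch size
is arbitrary:

import psycopg2

BATCH = 10000   # arbitrary chunk size

conn = psycopg2.connect("dbname=mydb user=me")
cur = conn.cursor()

count = 0
for line in open("loadfile.sql"):
    stmt = line.strip()
    if not stmt:
        continue
    cur.execute(stmt)
    count += 1
    if count % BATCH == 0:
        conn.commit()    # flush each batch rather than one huge transaction

conn.commit()            # commit whatever is left in the final batch
cur.close()
conn.close()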

Thanks for posting the COPY stuff, I hadn't seen it before. Although,
not being heavily into Postgres, that's not exactly a shock! :-)

Alan G., 



