How can I reduce the number of queries to my PostgreSQL database?

Alan Morgan amorgan at xenon.Stanford.EDU
Mon Apr 10 14:21:39 EDT 2006


In article <1144489087.794505.277930 at v46g2000cwv.googlegroups.com>,
SR <shay at shay-riggs.fsnet.co.uk> wrote:
>As a starter project for learning Python/PostgreSQL, I am building a
>Books database that stores information on the books on my bookshelf.
>
>Say I have three tables.
>
>Table "books" contains rows for book_id, title, subtitle, ISBN.
>
>Table "authors" contains rows for author_id, author surname, author
>first names, biographical notes.
>
>Table "bookauthors" contains two rows: book_id, author_id.
>
>The bookauthors table links the books and authors tables.
>
>Scenario: I have a python script which creates web page listing all
>books in the database, and all authors for each book. My python script
>does essentially three things:
>
>1.  retrieve a list of all book_ids and book_titles.
>
>2.  for each book_id, query the bookauthors table and retrieve all
>author names for that book_id.
>
>3.  display it all out as an html table on a web page.
>
>The script works fine, if a little slow. I think that's because if I
>have 50 books in my database, my script performs 51 database queries (1
>for all book names; then 1 for each book). A colleague of mine
>suggested that I could get away with two queries, 1 to read the book
>ids and titles, and 1 to read the bookauthors table to pull in *all*
>relations, and then do all the work in Python.
>
>I think I know where he's coming from, but I don't know where to begin.
>Any clues? Is there a specific name for this technique?

Yup.  The technique is called "using a relational database".  This is
precisely the sort of thing SQL does well.  Let's say you want to find
out who wrote 'The Hitchhikers Guide to the Galaxy'.  You could do the
following (all sql untested and, let's face it, probably not understood
by author):

1. Query for that book to get the book_id
SELECT id FROM books WHERE title='The Hitchhikers Guide To The Galaxy'

2. Look up that author id in the bookauthor database
SELECT author_id FROM bookauthors WHERE book_id=<book id>

3. Look up that author in the author database
SELECT name FROM authors WHERE id=<author id>

or do

SELECT name FROM authors, books, bookauthors
 WHERE books.id=bookauthors.book_id
   AND authors.id=bookauthors.author_id
   AND title='The Hitchhikers Guide To The Galaxy'

Slick, no?  You want something like:

SELECT title, name, book_id FROM authors, books, bookauthors
 WHERE books.id=bookauthors.book_id
   AND authors.id=bookauthors.author_id

If you have more than one author for a book then the book will
appear in the table multiple times.  You'll have to combine
those yourself (the book_id row can help here.  I don't know
if you can leverage more SQL for that job).

You can optimize some of these SQL queries if you like.
Optimizing JOINs, which is what these are) is a serious
business, but for piddly databases of this size it really
isn't necessary.

Alan
-- 
Defendit numerus



More information about the Python-list mailing list