DB API

Aahz Maruch aahz at netcom.com
Wed Jan 19 11:11:53 EST 2000


In article <84qqs4$dn7$1 at news1.xs4all.nl>,
Boudewijn Rempt <boud at rempt.xs4all.nl> wrote:
>Aahz Maruch <aahz at netcom.com> wrote:
>>
>> Oh, okay.  Let's suppose, for example, that you have a customer
>> database, and the client wants to view the entire order history for a
>> customer.  Would you say that it's inappropriate for the database to
>> return thousands of rows for an active customer?
>
>Well, it is if the customer is one of many who expects lightning
>performance. If the customer wants a lot of data he will have to wait,
>and he will simultaneously tie up the server for other customers. 

Uh, if it's a properly constructed database, it shouldn't get tied up
like that.

>If
>he's seriously going to look at thousands of records, well, so be it
>then. But it remains a fact that what a database is best at is selectively
>presenting a bit of data out of a large mass. If your relational database
>is mostly employed serving up most of its contents every query, something
>is wrong, and you will get the blame for its lacklustre performance.

If you have millions of records, a few thousand is very far from "most
of its contents".  Also, from my POV, the client is usually a Python
script, so it *is* interested in every single record.
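[To make that concrete, here is a sketch of a Python client that consumes every row of a large result set through the DB API. The `sqlite3` module stands in for whatever DB-API 2.0 driver you use, and the `orders` table and its columns are invented for illustration; `fetchmany()` keeps memory flat even when the query returns thousands of rows.]

```python
import sqlite3

# In-memory database standing in for any DB-API 2.0 connection;
# the "orders" table and its contents are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "cust%d" % (i % 5), i * 1.5) for i in range(10000)],
)

# A Python client that really is interested in every record can
# stream them in batches with fetchmany() instead of one fetchall().
cur = conn.cursor()
cur.execute("SELECT id, customer, total FROM orders")
count = 0
while True:
    batch = cur.fetchmany(500)
    if not batch:
        break
    count += len(batch)  # stand-in for real per-row processing
print(count)  # 10000
```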

>Even so, in most cases I've seen, when thousands of rows are retrieved
>there's designer or developer laziness or ignorance behind it - and in
>most of the other cases the customer can be 'educated' in wanting a more
>selective result. Nobody is going to eyeball more than a few hundred
>records, and if they are going to post-process the result it is often
>better to do the post-processing on the server and send out the result.

Often, yes, but far from always.  For example, if one has twenty groups
of records, each group with roughly two hundred items, there's no
efficient SQL to pick up the first thirty items of each group.  Yes, one
could do twenty queries, but it might be more efficient to post-process
a single query.
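[A sketch of that post-processing, with an invented schema: pull all the groups in one ordered query and keep only the first thirty items of each group on the Python side. Note that modern databases do offer window functions such as ROW_NUMBER() for per-group limits, but the single-query-plus-Python approach works with any backend.]

```python
import sqlite3
from collections import defaultdict

# Invented schema: twenty groups of two hundred items each,
# matching the example in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (grp INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(g, v) for g in range(20) for v in range(200)],
)

# One query, ordered by group; Python keeps the first 30 per group
# instead of issuing twenty separate LIMIT queries.
cur = conn.execute("SELECT grp, val FROM items ORDER BY grp, val")
first_thirty = defaultdict(list)
for grp, val in cur:
    if len(first_thirty[grp]) < 30:
        first_thirty[grp].append(val)

print(len(first_thirty))     # 20 groups
print(len(first_thirty[0]))  # 30 items kept for each
```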

>> Similarly, suppose you have millions of BLObs that change regularly.
>> How would you organize them on an HTTP server?  Incidentally, HTTP is
>> *not* an efficient transmission mechanism (less true if you're using
>> HTTP 1.1).
>
>If you have millions of large blobs you will have problems no matter
>what - nothing I can say or do can worsen or lighten those ;-). I'm not
>sure that I know the right solution to problems like that. I know of a
>few sites where they regularly work with data like that, but they don't
>store it in a relational database, they use custom solutions, and still
>often store the blobs on the filesystem.

Really?  How do they map the filesystem to the database?
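[For concreteness, one plausible mapping -- purely a sketch, not a description of any particular site, with every name invented: the database row holds metadata plus a path derived from the blob's content hash, and the bytes themselves live on the filesystem.]

```python
import hashlib
import os
import sqlite3
import tempfile

# Hypothetical blob store: rows carry name, hash, and path;
# the payload bytes live outside the database on disk.
store = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (name TEXT, sha1 TEXT, path TEXT)")

def put_blob(name, data):
    digest = hashlib.sha1(data).hexdigest()
    # Fan out into subdirectories so no single directory grows huge.
    subdir = os.path.join(store, digest[:2])
    os.makedirs(subdir, exist_ok=True)
    path = os.path.join(subdir, digest)
    with open(path, "wb") as f:
        f.write(data)
    conn.execute("INSERT INTO blobs VALUES (?, ?, ?)", (name, digest, path))
    return path

def get_blob(name):
    row = conn.execute(
        "SELECT path FROM blobs WHERE name = ?", (name,)
    ).fetchone()
    with open(row[0], "rb") as f:
        return f.read()

put_blob("demo", b"some binary payload")
print(get_blob("demo") == b"some binary payload")  # True
```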

>I know http isn't that efficient, but it allows you to a) offload the
>demand to other servers very easily, and  b) present the textual data
>already while the customer is still waiting for the picture. 

a) is only true if you're replicating the file system or have a really,
really good network file server.  b) doesn't apply if the blob is part
of the data stream being used by the client (the blob is not an image).

>Anyway, while interesting, this is getting away from Python more
>and more...

Sure, but as you can see from the postings here, many people use Python
to access databases, so it isn't completely off-topic -- particularly
when the discussion revolves around whether it's better to do processing
in Python or in the database.
--
                      --- Aahz (@netcom.com)

Androgynous poly kinky vanilla queer het    <*>     http://www.rahul.net/aahz/
Hugs and backrubs -- I break Rule 6

Have a *HAPPY* day!!!!!!!!!!


