Fast lookup of bulky "table"

Thomas Passin list1 at tompassin.net
Sun Jan 15 18:06:36 EST 2023


On 1/15/2023 4:49 PM, Stefan Ram wrote:
> dn <PythonList at DancesWithMice.info> writes:
>> Some programmers don't realise that SQL can also be used for
>> calculations, eg the eponymous COUNT(), which saves (CPU-time and
>> coding-effort) over post-processing in Python.
> 
>    Yes, I second that! Sometimes, people only re-invent things
>    in Python because they don't know SQL well enough, or they
>    do not normalize their tables because they have not properly
>    learned how to do this.
> 
>    I'd always start out with normalized tables and do as many
>    operations in SQL as possible. I would then hesitate to
>    de-normalize anything or transfer data operations into
>    the programming language unless I am very sure that this
>    is really advantageous.

Yes, if you get the indexes and joins right, sometimes you can get a 
very large speed-up.  It takes some experimenting and use of EXPLAIN, 
but it's worth doing.  You especially want to avoid letting the database 
engine do full-table scans over and over. And you never want to send a 
lot of rows to Python and do post-filtering on them if you can avoid it.

Use WHERE instead of HAVING if possible (HAVING works post-scan, WHERE 
works during row retrieval).

>    Once I had the task of writing VBA code to query and analyze
>    data from a Jet engine (i.e., Microsoft Access). I ended up
>    writing 90 % of the code in SQL and a thin layer of 10 % in VBA.
>    And it was fast.
> 
> 



More information about the Python-list mailing list