[Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database

avi.e.gross at gmail.com avi.e.gross at gmail.com
Sun Jul 3 23:09:30 EDT 2022


Once explained, the request makes some sense and I withdraw my earlier
suggestions as not responding to what you want.

You do NOT want to use Python almost at all except as a way to test
manipulating some database.  Fine, do that!

As has been pointed out, many versions of SQL come pre-built with functions
you can cll from within your SQL directly and also remotely that do things
like calculate means and perhaps even standard deviations using queries
like:

mysql> SELECT STDDEV_SAMP (salary) FROM employee;  

mysql> SELECT STD(salary) FROM employee;  

mysql> SELECT STDDEV(salary) FROM employee;  

mysql> SELECT STDDEV_POP(salary) FROM employee;  

Depending on which one you want.

So if the above are SUPPORTED then your issue of not using much memory in
Python is quite irrelevant. 

If you want to learn to use a particular function that probably sends the
above command, or something similar, fine. Figure it out but my impression
is the function you are using may not be using a local Python function.

I know this group is for learning but I seem to not appreciate being asked
to think about a problem in ways that keep turning out to be very different
than what is wanted as it is usually a waste of time for all involved.
People with more focused and understandable needs may be a better use of any
time I devote here. I have negligible interest personally in continuing to
work on manipulating a database remotely at this time. 

When things get frustrating volunteers are mobile.


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Manprit Singh
Sent: Sunday, July 3, 2022 9:01 AM
To: tutor at python.org
Subject: Re: [Tutor] toy program to find standard deviation of 2 columns of
a sqlite3 database

Sir,
I am just going through all the functionalities available in sqlite3 module
, just to see if I can use sqlite3 as a good data analysis tool or not .

Upto this point I have figured out that and sqlite data base file can be an
excellent replacement for data stored in files .

You can preserve data in a structured form, email to someone who need it etc
etc .

But for good data analysis ....I found pandas is superior . I use pandas for
data analysis and visualization .


Btw ....this is true . You should use right tool for your task .



Regards
Manprit Singh

On Sun, 3 Jul, 2022, 18:02 Alan Gauld via Tutor, <tutor at python.org> wrote:

> On 03/07/2022 03:49, Manprit Singh wrote:
>
> > con.create_aggregate("stddev", 1, StdDev) cur.execute("select 
> > stddev(X1), stddev(X2) from table1")
>
> I just wanted to say thanks for posting this. I have never used, nor 
> seen anyone else use, the ability to create a user defined aggregate 
> function in SQLite - usually I just extract the data into python and 
> use python to do the aggregation. But your question made me read up on 
> how that all worked so it has taught me something new.
> (It also makes me appreciate how the Pyhon API is much easier to use 
> than the raw C API to SQLite!)
>
> > My question is, as you can see i have used list inside the class 
> > StdDev,
> which
> > I think is an inefficient way to do this kind of problem because 
> > there
> may be
> > a large number of values in a column and it can take a huge amount 
> > of
> memory.
> > Can this problem be solved with the use of iterators ? What would be 
> > the
> best
> > approach to do it ?
>
> If I'm working with so much data that this would be a problem I'd use 
> the database itself to store the intermediate data. That would be much 
> slower but much less memory dependant. But as others have said, with 
> aggregate functions you don't usually need to store data from all rows 
> you just store a few inermediate results which you combine at the end.
>
> If you are trying to use an in-memory function - like the stddev 
> function here - then you need to fit all the data in memory anyway so 
> the function will simply not work if you can't store the data in RAM. 
> In that case you need to find(or write) another function that doesn't 
> use memory for storage or uses less storage.
>
> It is also worth pointing out that most industrial strength SQL 
> databases come with a far richer set of aggregate functions than 
> SQLite. So if you do have to work with large volumes of data you 
> should probably switch to someting like Oracle, DB2, SQLServer(*), etc 
> and just use the functions built into the server. If they don't have 
> such a function they also have amuch simpler way of defining stored 
> procedures. As ever, choose the appropriate tool for the job.
>
> (*)These are just the ones I know, I assume MySql, Postgres etc have 
> similarly broad libraries.
>
> --
> Alan G
> Author of the Learn to Program web site http://www.alan-g.me.uk/ 
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor



More information about the Tutor mailing list