[Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database

Manprit Singh manpritsinghece at gmail.com
Sat Jul 2 22:49:43 EDT 2022


Dear sir,

I have written a program that calculates the population standard
deviation of two columns, X1 and X2, of a table in an sqlite3
in-memory database:
import sqlite3
import statistics

class StdDev:
    """Aggregate class: sqlite3 calls step() once per row,
    then finalize() to get the result."""
    def __init__(self):
        self.lst = []

    def step(self, value):
        self.lst.append(value)

    def finalize(self):
        return statistics.pstdev(self.lst)


con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table table1(X1 int, X2 int)")
ls = [(2, 4),
      (3, 5),
      (4, 7),
      (5, 8)]
cur.executemany("insert into table1 values(?, ?)", ls)
con.commit()
con.create_aggregate("stddev", 1, StdDev)
cur.execute("select stddev(X1), stddev(X2) from table1")
print(cur.fetchone())
cur.close()
con.close()

This prints:

(1.118033988749895, 1.5811388300841898)

which is correct.

My question is: as you can see, I have used a list inside the class
StdDev, which I think is an inefficient way to solve this kind of
problem, because a column may contain a large number of values and
the list can take a huge amount of memory.

Can this problem be solved with the use of iterators? What would be
the best approach?
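(For context, one memory-constant alternative I am aware of — this is a
sketch using Welford's online algorithm, not code from the program above —
keeps only a count, a running mean, and a sum of squared deviations in
step(), so no list is stored:)

import statistics

class StdDevOnline:
    """Welford's online algorithm: O(1) memory per aggregate.
    step() updates count, mean, and M2; finalize() returns the
    population standard deviation sqrt(M2 / n)."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def step(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def finalize(self):
        return (self.m2 / self.n) ** 0.5 if self.n else None

It would be registered the same way, e.g.
con.create_aggregate("stddev", 1, StdDevOnline).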

Regards

Manprit Singh
