From manpritsinghece at gmail.com  Sat Jul  2 22:49:43 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 08:19:43 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a
 sqlite3 database
Message-ID: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>

Dear sir,

I have tried writing a program that calculates the population standard
deviation of two columns, X1 and X2, of a table in an sqlite3 in-memory
database.

import sqlite3
import statistics

class StdDev:
    def __init__(self):
        self.lst = []

    def step(self, value):
        self.lst.append(value)

    def finalize(self):
        return statistics.pstdev(self.lst)


con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table table1(X1 int, X2 int)")
ls = [(2, 4),
      (3, 5),
      (4, 7),
      (5, 8)]
cur.executemany("insert into table1 values(?, ?)", ls)
con.commit()
con.create_aggregate("stddev", 1, StdDev)
cur.execute("select stddev(X1), stddev(X2) from table1")
print(cur.fetchone())
cur.close()
con.close()

This prints the output:

(1.118033988749895, 1.5811388300841898)

which is correct.

My question is this: as you can see, I have used a list inside the class
StdDev, which I think is an inefficient way to solve this kind of problem,
because a column may hold a large number of values and the list can take a
huge amount of memory.

Can this problem be solved with the use of iterators? What would be the best
approach?

Regards

Manprit Singh

From manpritsinghece at gmail.com  Sun Jul  3 00:23:05 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 09:53:05 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <015201d88e91$13d44470$3b7ccd50$@gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
Message-ID: <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>

Yes, it is obviously a homework kind of thing. I create problems for
myself, try to solve them, and try to find better ways.

I was trying to learn the use of create_aggregate().

Thank you for the hint. Let me try it.

My purpose is not to find the standard deviation. It is actually to learn
how to use the functions.


On Sun, 3 Jul, 2022, 09:28 , <avi.e.gross at gmail.com> wrote:

> Maybe a dumb question, but why the need to do a calculation of a standard
> deviation so indirectly, in SQL and in the database but starting from
> Python?
>
> R has a built-in function that calculates a standard deviation. You can
> easily save it where you want after.
>
> As for memory use in general, there are several ways to calculate a
> standard deviation but there is a tradeoff. You could read in an entry at
> a time and add it to a continuing sum while keeping track of the number of
> entries. You then calculate the mean. Then you can read it all in AGAIN
> and calculate the difference between each number and the mean, and do the
> rest of the calculation by squaring that and so on as you sum that, and
> finally play with a division and a square root.
>
> But that may not be needed except with large amounts of data.
>
> What am I missing? Is this an artificial HW situation?

From manpritsinghece at gmail.com  Sun Jul  3 03:54:50 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 13:24:50 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
Message-ID: <CAO1OCwb6UpZasJ0JxUrF_+4g-cGrs0VzNkMwbbqHNKmBDd_ckw@mail.gmail.com>

Dear Sir,

I chose standard deviation as an exercise because there are two steps:
first you have to find the mean, then subtract the mean from each value of
the column.

Writing an aggregate function for this using Python's sqlite3 seems a
little difficult, as the class used to build it has only a single step
method. Kindly shed some light on this.

This is just an exercise to understand how create_aggregate() works.
Kindly help.
Regards
Manprit Singh



From __peter__ at web.de  Sun Jul  3 05:02:19 2022
From: __peter__ at web.de (Peter Otten)
Date: Sun, 3 Jul 2022 11:02:19 +0200
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
Message-ID: <9fd97b5f-67cf-0e29-e7eb-1398eea478db@web.de>

On 03/07/2022 06:23, Manprit Singh wrote:
>> My question is, as you can see i have used list inside the class StdDev,
>> which
>>
>> I think is an inefficient way to do this kind of problem because there may
>> be
>>
>> a large number of values in a column and it can take a huge amount of
>> memory.
>>
>> Can this problem be solved with the use of iterators ?

I don't think that you can convert the callback into a generator here;
and if you could it probably wouldn't help as the statistics module uses
a two-pass algorithm.

You could switch to a simpler algorithm like the one used by some old
calculators that only keep track of sum(xi), sum(xi*xi) and count, and
calculate stddev from these in the finalize() method.
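
A minimal sketch of that suggestion, reusing the table from the original
post. The attribute names n, total and total_sq are illustrative and not
part of the sqlite3 API; skipping NULL values is an assumption that mirrors
how the built-in SQL aggregates behave. Only three numbers are kept per
column, however many rows there are.

```python
import math
import sqlite3


class StdDev:
    """Population standard deviation from running totals only."""

    def __init__(self):
        self.n = 0           # count of non-NULL values seen
        self.total = 0.0     # running sum(xi)
        self.total_sq = 0.0  # running sum(xi*xi)

    def step(self, value):
        if value is None:    # assumption: ignore SQL NULLs
            return
        self.n += 1
        self.total += value
        self.total_sq += value * value

    def finalize(self):
        if self.n == 0:
            return None      # no rows: return SQL NULL
        mean = self.total / self.n
        variance = self.total_sq / self.n - mean * mean
        # Rounding can push a tiny variance slightly below zero.
        return math.sqrt(max(variance, 0.0))


con = sqlite3.connect(":memory:")
con.create_aggregate("stddev", 1, StdDev)
cur = con.cursor()
cur.execute("create table table1(X1 int, X2 int)")
cur.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])
result = cur.execute("select stddev(X1), stddev(X2) from table1").fetchone()
print(result)  # roughly (1.118, 1.581), as in the original post
con.close()
```

The trade-off is numerical: when the values are large compared with their
spread, subtracting two nearly equal numbers in finalize() loses precision.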



From alan.gauld at yahoo.co.uk  Sun Jul  3 08:30:51 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 3 Jul 2022 13:30:51 +0100
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
Message-ID: <t9s25r$jpp$1@ciao.gmane.io>

On 03/07/2022 03:49, Manprit Singh wrote:

> con.create_aggregate("stddev", 1, StdDev)
> cur.execute("select stddev(X1), stddev(X2) from table1")

I just wanted to say thanks for posting this. I have never used,
nor seen anyone else use, the ability to create a user defined aggregate
function in SQLite - usually I just extract the data into python
and use python to do the aggregation. But your question made me
read up on how that all worked so it has taught me something new.
(It also makes me appreciate how the Python API is much easier
to use than the raw C API to SQLite!)

> My question is, as you can see i have used list inside the class StdDev, which
> I think is an inefficient way to do this kind of problem because there may be
> a large number of values in a column and it can take a huge amount of memory.
> Can this problem be solved with the use of iterators ? What would be the best
> approach to do it ?

If I'm working with so much data that this would be a problem I'd
use the database itself to store the intermediate data. That would
be much slower but much less memory dependent. But as others have
said, with aggregate functions you don't usually need to store
data from all rows; you just store a few intermediate results
which you combine at the end.

If you are trying to use an in-memory function - like the
stddev function here - then you need to fit all the data in
memory anyway so the function will simply not work if you can't
store the data in RAM. In that case you need to find (or write)
another function that doesn't use memory for storage or
uses less storage.
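
As one possible example of such a function, here is a sketch using
Welford's online algorithm. This technique is not mentioned in the thread;
it is swapped in as a well-known constant-memory method. The class name
WelfordStdDev and the table samples are hypothetical. It keeps a count, a
running mean and a running sum of squared deviations, and updates them one
row at a time.

```python
import math
import sqlite3
import statistics


class WelfordStdDev:
    """Population standard deviation in constant memory (Welford's method)."""

    def __init__(self):
        self.n = 0       # number of non-NULL values seen
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def step(self, value):
        if value is None:  # assumption: ignore SQL NULLs
            return
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def finalize(self):
        if self.n == 0:
            return None
        return math.sqrt(self.m2 / self.n)


# Values with a large offset and a small spread.
data = [1e9 + 2, 1e9 + 3, 1e9 + 4, 1e9 + 5]

con = sqlite3.connect(":memory:")
con.create_aggregate("stddev", 1, WelfordStdDev)
con.execute("create table samples(x real)")
con.executemany("insert into samples values(?)", [(v,) for v in data])
online = con.execute("select stddev(x) from samples").fetchone()[0]
con.close()

reference = statistics.pstdev(data)  # list-based result, for comparison
print(online, reference)
```

Because the mean is updated as the rows arrive, no second pass over the
column is needed and nothing grows with the number of rows.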

It is also worth pointing out that most industrial strength
SQL databases come with a far richer set of aggregate functions
than SQLite. So if you do have to work with large volumes of data
you should probably switch to something like Oracle, DB2, SQLServer(*),
etc and just use the functions built into the server. If they
don't have such a function they also have a much simpler way
of defining stored procedures. As ever, choose the appropriate
tool for the job.

(*)These are just the ones I know, I assume MySql, Postgres etc
have similarly broad libraries.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From manpritsinghece at gmail.com  Sun Jul  3 09:01:20 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 18:31:20 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <t9s25r$jpp$1@ciao.gmane.io>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <t9s25r$jpp$1@ciao.gmane.io>
Message-ID: <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>

Sir,
I am just going through all the functionality available in the sqlite3
module, to see whether I can use sqlite3 as a good data analysis tool.

Up to this point I have figured out that an SQLite database file can be an
excellent replacement for data stored in plain files.

You can preserve data in a structured form, email it to someone who needs
it, and so on.

But for good data analysis I found pandas to be superior. I use pandas
for data analysis and visualization.


Btw, this is true: you should use the right tool for your task.


Regards
Manprit Singh


From manpritsinghece at gmail.com  Sun Jul  3 12:59:41 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 22:29:41 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <007001d88ef3$3e032420$ba096c60$@gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
 <CAO1OCwb6UpZasJ0JxUrF_+4g-cGrs0VzNkMwbbqHNKmBDd_ckw@mail.gmail.com>
 <007001d88ef3$3e032420$ba096c60$@gmail.com>
Message-ID: <CAO1OCwa0qoeEpkjdPWEOOR7Egjn__NA91Uh=uXvTab8t3-gOhQ@mail.gmail.com>

Dear Sir,

Leaving the standard deviation aside, I would like to say something about
Python's sqlite3 module. It has a
create_aggregate(name, n_arg, aggregate_class) method, which can create a
user-defined aggregate function. This method requires an aggregate class as
an argument. The class must contain a step method and a finalize method, as
described in the Python documentation.

I just want to learn about using create_aggregate().
So far I have concluded that the step method of the class is called for
each element of the column, and finalize is called to return the final
result of the aggregate.

Again coming to the example given in the documentation :

import sqlite3
class MySum:
    def __init__(self):
        self.count = 0

    def step(self, value):
        self.count += value

    def finalize(self):
        return self.count

con = sqlite3.connect(":memory:")
con.create_aggregate("mysum", 1, MySum)
cur = con.cursor()
cur.execute("create table test(i)")
cur.execute("insert into test(i) values (1)")
cur.execute("insert into test(i) values (2)")
cur.execute("select mysum(i) from test")
print(cur.fetchone()[0])
con.close()

The answer is 3, which is correct: it is the sum of the values in the
column named i in the table test. It was easy to implement, as the only
need is to add all values of the column. For each value of the column i,
step is called and the value gets added to self.count. At the end,
self.count represents the sum of all values of the column and is returned
through the finalize method.

Now, just as sum is an aggregate function, population standard deviation
is also an aggregate function. We should be able to make a user-defined
function to find the population standard deviation of a column, or of
multiple columns, of a sqlite3 database. Hopefully you agree?

Now, if I am going to write the class for this, in the step method I can
only count the values in the column and find the sum of all values, to get
the mean. I do not see the mechanism to subtract the mean from each value
of the column, in the same step method or in any other way in the class.
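
One way to see why a single step method can still be enough: the sum of
squared deviations can be expanded so that the mean is only needed once,
at the end. With N values x_i and mean mu, the population variance is:

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
         = \frac{1}{N}\sum_{i=1}^{N}x_i^2
           - 2\mu\cdot\frac{1}{N}\sum_{i=1}^{N}x_i + \mu^2
         = \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \mu^2,
\qquad \mu = \frac{1}{N}\sum_{i=1}^{N}x_i
```

So step only has to accumulate N, the sum of x_i and the sum of x_i^2, and
finalize can combine them. For the values 2, 5, 7, 9, 10 this gives a sum
of 33, a sum of squares of 259, and so 259/5 - 6.6^2 = 51.8 - 43.56 = 8.24,
whose square root is about 2.8705. The formula is algebraically exact, but
in floating point it can lose precision when the mean is large compared
with the spread.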


Hopefully my question is more clear now.

Btw, I would like to write a short snippet to calculate the population
standard deviation of a list:
lst = [2, 5, 7, 9, 10]
mean = sum(lst)/len(lst)
std_dev = (sum((ele-mean)**2 for ele in lst)/len(lst))**0.5
print(std_dev)

prints
2.870540018881465 which is the right answer.


Regards

Manprit Singh


On Sun, Jul 3, 2022 at 9:10 PM <avi.e.gross at gmail.com> wrote:

> Manprit,
>
> At some point, questions are not about Python but are about a specialized
> package or even about SQL statements.
>
> I can understand your wanting to learn by experimentation, and the gist of
> your programming ideas seems to be to use Python and a set of functions to
> drive an SQL database when, as has been pointed out, it can be done
> directly in the database.
>
> But I think you have made your question clearer and you seem focused on
> ways to minimize memory use, such as you might find on a device like a
> Raspberry Pi.
>
> So if you asked without the SQL part, people might get into the question.
>
> If I understand it, you wish to make your own class, StdDev, to somehow
> manage getting the data incrementally and calculating a standard deviation,
> rather than reading it all at once. Is that the question?
>
> Your code, of course, makes absolutely no sense as written, as all it does
> is create a few items and store them in a new table, just so it can get
> them again! So of course you already fill your memory with the list you
> made. The code you want to share with us does not need any of these steps.
> It needs to start with a database out there that you want to read from.
> There is nothing wrong with your code, just that it gets in the way of
> seeing what you want to do.
>
> You then leave some of us (meaning me) having to do research to look up
> what the heck create_aggregate() does, and when I found out, I stopped
> wanting to continue.
>
> Someone else may want to help you but I am heading out for a long drive and
> already answered you in a way that I find reasonable.
>
> Look up the definition for a variance and then standard deviation. Decide
> which version to use, and note some divide by N and some by N-1 depending
> on whether you have all of the population or a sample. Overall the scheme
> is to take each number minus the mean and square it and sum that and
> finally divide by N-1 for the variance and then take the square root of
> that for the standard deviation. It can be done trivially using functions
> already available but that does use memory all at once.
>
> If you want to fetch one pair of numbers at a time, the algorithm is fairly
> simple.
>
> Start with two accumulator variables, one for each of the numbers you want
> to calculate the standard deviation for.
>
> In a loop, read one row at a time, or some small number of rows like 100.
> Add the right numbers to the right accumulator, handling any NA values if
> needed, and keeping track of how many valid items in each you processed.
>
> When done, calculate the mean of each.
>
> Restart a new query and again in a loop get your numbers one at a time (or
> a hundred) and this time use a new accumulator in which you keep adding the
> current number minus the mean calculated and squared.
>
> When all the data is calculated, you have the sums. Divide by N-1 then take
> the square root.
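
The two-pass scheme described above can be sketched as follows, assuming
the table1 layout from the first message. The helper name two_pass_pstdev
and its ddof parameter are hypothetical: ddof=0 divides by N (population,
as in the original post) and ddof=1 divides by N-1 (sample, as described
here). Rows are read one at a time from the cursor, so no list of values is
built.

```python
import math
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])


def two_pass_pstdev(con, column, ddof=0):
    """Standard deviation of one column of table1, read in two passes.

    The column name is pasted into the SQL text, so only pass trusted,
    hard-coded names.
    """
    query = f"select {column} from table1"

    # Pass 1: count and sum, to get the mean.
    n = 0
    total = 0.0
    for (value,) in con.execute(query):
        if value is not None:  # skip NULLs
            n += 1
            total += value
    if n - ddof <= 0:
        return None
    mean = total / n

    # Pass 2: sum of squared differences from the mean.
    sq_dev = 0.0
    for (value,) in con.execute(query):
        if value is not None:
            sq_dev += (value - mean) ** 2
    return math.sqrt(sq_dev / (n - ddof))


pop = two_pass_pstdev(con, "X1")           # divide by N
samp = two_pass_pstdev(con, "X1", ddof=1)  # divide by N-1
print(pop, samp)
con.close()
```

This runs outside create_aggregate(): an aggregate class sees each row only
once, so a second pass has to be a second query.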
>
> So you want to know how to use something that only gets one row of data at
> a time. That is here not really a Python issue as you are able to call some
> function in your module that presumably gets the next row of an active
> query. If you can call that directly, fine. If you want to hide it in an
> iterator, also fine.
>
> Given your current class, your question for this part of the exercise seems
> to be how to rewrite the class. But my perspective may not match yours as
> you seem to want to hand the object to a function that calls on it somehow
> repeatedly to get the task done. You show no code as to how you might do it
> in a way I am thinking of.
>
> The dunder init section creates an empty list. You no longer want the list.
> So what do you want? My guess is this could be the place you open a
> connection to the database or a specific query.
>
> Your step() function presumably no longer wants to append a value to the no
> longer existent list. What makes sense for you here? Is the calculation
> happening inside the class? If so, the step not only gets a row of data but
> is perhaps working on summing to eventually calculate a mean. As noted
> above, what I am talking about requires two passes through the data so two
> sets of steps like this. Maybe you want a step1() that is summing the
> original data and another function called step2() that is in some ways
> similar and sums the second part once it has a mean.
>
> And what does your finalize() method do? Right now it returns the
> calculation on the list that you no longer want to use. In the outline I am
> sketching, you might want to use it just to return the computed values and
> maybe close the database connection.
>
> Maybe I am confused but you seem to ask to calculate the standard deviation
> for TWO columns of data but your code seems to also work on one set of
> numbers. You need to get that straight.
>
> Have fun.
>
> I am dropping out of this one.
>
>
>
>
> -----Original Message-----
> From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
> Manprit Singh
> Sent: Sunday, July 3, 2022 3:55 AM
> To: tutor at python.org
> Subject: Re: [Tutor] toy program to find standard deviation of 2 columns of
> a sqlite3 database
>
> Dear Sir,
>
> I have chosen this standard deviation as an exercise because there are two
> steps: first you have to find the mean, then subtract the mean from each
> value of the column .
>
> Writing an aggregate function for this using python's sqlite3 seems a
> little
> difficult as there is only single step function inside the class, used to
> make that . Kindly put some light .
>
> This is just an exercise to understand how this create_aggregate() works .
> Kindly help
> Regards
> Manprit Singh
>
>
>
> On Sun, Jul 3, 2022 at 9:53 AM Manprit Singh <manpritsinghece at gmail.com>
> wrote:
>
> > Yes it is obviously a homework kind of thing ....I do create problems
> > for myself , try to solve and  try to find better ways .
> >
> > I was trying to learn the use of create- aggregate() .
> >
> > Thank you for the hint that is given by you . Let me try .
> >
> > My purpose is not to find the std dev. It is actually to learn how to
> > use the functions
> >
> >
> > On Sun, 3 Jul, 2022, 09:28 , <avi.e.gross at gmail.com> wrote:
> >
> >> Maybe a dumb question but why the need to do a calculation of a
> >> standard deviation so indirectly in SQL and in the database but
> >> starting from Python?
> >>
> >> R has a built-in function that calculates a standard deviation. You
> >> can easily save it where you want after.
> >>
> >> As for memory use in general, there are several ways to calculate a
> >> standard deviation but there is a tradeoff. You could read in an
> >> entry at a time and add it to a continuing sum while keeping track of
> >> the number of entries.
> >> You
> >> then calculate the mean. Then you can read it al in AGAIN and
> >> calculate the difference between each number and the mean, and do the
> >> rest of the calculation by squaring that and so on as you sum that
> >> and finally play with a division and a square root.
> >>
> >> But that may not be needed except with large amounts of data.
> >>
> >> What am I missing? Is this an artificial HW situation?
> >>
> >>
> >> -----Original Message-----
> >> From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On
> >> Behalf Of Manprit Singh
> >> Sent: Saturday, July 2, 2022 10:50 PM
> >> To: tutor at python.org
> >> Subject: [Tutor] toy program to find standard deviation of 2 columns
> >> of a
> >> sqlite3 database
> >>
> >> Dear sir ,
> >>
> >> I have tried writing a program in which I am calculating the
> >> population standard deviation of two columns X1  & X2 of a table of
> >> sqlite3  in - memory database .
> >> import sqlite3
> >> import statistics
> >>
> >> class StdDev:
> >>     def __init__(self):
> >>         self.lst = []
> >>
> >>     def step(self, value):
> >>         self.lst.append(value)
> >>
> >>     def finalize(self):
> >>         return statistics.pstdev(self.lst)
> >>
> >>
> >> con = sqlite3.connect(":memory:")
> >> cur = con.cursor()
> >> cur.execute("create table table1(X1 int, X2 int)") ls = [(2, 4),
> >>       (3, 5),
> >>       (4, 7),
> >>       (5, 8)]
> >> cur.executemany("insert into table1 values(?, ?)", ls)
> >> con.commit()
> >> con.create_aggregate("stddev", 1, StdDev)
> >> cur.execute("select stddev(X1), stddev(X2) from table1")
> >> print(cur.fetchone())
> >> cur.close()
> >> con.close()
> >>
> >> prints the output as :
> >>
> >> (1.118033988749895, 1.5811388300841898)
> >>
> >> which is correct .
> >>
> >> My question is, as you can see i have used list inside the class
> >> StdDev, which
> >>
> >> I think is an inefficient way to do this kind of problem because
> >> there may be
> >>
> >> a large number of values in a column and it can take a huge amount of
> >> memory.
> >>
> >> Can this problem be solved with the use of iterators ? What would be
> >> the best
> >>
> >> approach to do it ?
> >>
> >> Regards
> >>
> >> Manprit Singh
> >> _______________________________________________
> >> Tutor maillist  -  Tutor at python.org
> >> To unsubscribe or change subscription options:
> >> https://mail.python.org/mailman/listinfo/tutor
> >>
> >>

From wlfraed at ix.netcom.com  Sun Jul  3 13:05:56 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sun, 03 Jul 2022 13:05:56 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
Message-ID: <o5g3ch58dfvq2329aab1mhjjjo6fddkme5@4ax.com>

On Sun, 3 Jul 2022 08:19:43 +0530, Manprit Singh
<manpritsinghece at gmail.com> declaimed the following:

	{Seems gmane wasn't updating yesterday}


>
>My question is, as you can see i have used list inside the class StdDev, which
>
>I think is an inefficient way to do this kind of problem because there may be
>
>a large number of values in a column and it can take a huge amount of memory.
>
>Can this problem be solved with the use of iterators ? What would be the best
>
>approach to do it ?
>

	First off... What do you consider a "huge amount of memory"? 

>>> import sys
>>> lrglst = list(range(100000000))
>>> sys.getsizeof(lrglst)
800000056
>>> 

	That is a list of 100 MILLION integers. In Python, the list object alone
consumes 800 Mbytes (plus the int objects it refers to, which getsizeof()
does not count).
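
sys.getsizeof() on a list counts only the list's own array of references, not the int objects those references point to. A rough sketch of measuring both, on a smaller list so it runs quickly; the exact byte counts are CPython implementation details and will vary by version and platform:

```python
import sys

n = 1_000_000
lrglst = list(range(n))

# Size of the list object alone: a header plus one reference per element.
list_only = sys.getsizeof(lrglst)

# Add the size of each int object the list refers to.
# (CPython shares small ints, so this is an estimate, not an exact figure.)
with_items = list_only + sum(sys.getsizeof(x) for x in lrglst)

print("list object only:  ", list_only, "bytes")
print("list plus its ints:", with_items, "bytes")
```

Scaling the second figure up gives a better feel for what a list of values collected in step() might really cost.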

	Unless you are running your application on a Raspberry-Pi you shouldn't
have any concern for memory (and even then, if you have a spinning disk on
USB with a swap file it should only slow you down, not crash -- R-PI 4B can
be had with up to 8GB of RAM, so might not need swap at all). Who is
running a computer these days with less than 8GB? (My decade old system has
12GB, and rarely activates swap.)

	Granted, each invocation (you have two in the example) will add another
such list... Still, that would only come to 1.6GB of RAM. Which, ideally,
would be freed up just as soon as the SQL statement finished processing and
returned the results.

	 SQLite3 is already doing iteration -- it invokes the .step() method
for each record in the data set. The only way to avoid collecting that
(large?) list is to change the algorithm for standard deviation itself --
and not use the one provided by the statistics module.

	The "naive" algorithm has been mentioned (in association with
calculators, which work with single data point entry at a time; well, the
better ones handled X and Y on each step).

https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Na%C3%AFve_algorithm

	UNTESTED

from math import sqrt

class sd_Pop:
	def __init__(self):
		self.cnt = 0
		self.sum = 0.0
		self.sum_squares = 0.0

	def step(self, x):
		self.cnt += 1
		self.sum += x
		self.sum_squares += (x * x)

	def finalize(self):
		# population variance = (sum(x^2) - (sum(x))^2 / n) / n
		return sqrt((self.sum_squares -
			((self.sum * self.sum) / self.cnt)) / self.cnt)

	There... No accumulation of a long list, just one integer and two
floating point values.

	Note that the Wikipedia link under "Computing shifted data" is a
hypothetical improvement over the pure "naive" algorithm, and (if you made
the sample code a class so all those "global" references change to
self.###) could also fit into the SQLite3 aggregate.
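
For comparison, here is a minimal sketch of Welford's one-pass update (described further down the same Wikipedia page) written as an SQLite3 aggregate. The class name WelfordPStdDev is made up for this illustration, and the table is the one from the original post. It keeps only a count, a running mean and a running sum of squared deviations:

```python
import math
import sqlite3


class WelfordPStdDev:
    """Population standard deviation via Welford's one-pass update."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def step(self, x):
        if x is None:     # ignore SQL NULLs
            return
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def finalize(self):
        if self.n == 0:
            return None   # no data: hand back SQL NULL
        return math.sqrt(self.m2 / self.n)


con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])
con.create_aggregate("stddev", 1, WelfordPStdDev)
result = con.execute("select stddev(X1), stddev(X2) from table1").fetchone()
print(result)
con.close()
```

The running-mean form avoids subtracting two large, nearly equal sums, which is where the naive formula can lose precision.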


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From manpritsinghece at gmail.com  Sun Jul  3 14:12:26 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 23:42:26 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <o5g3ch58dfvq2329aab1mhjjjo6fddkme5@4ax.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <o5g3ch58dfvq2329aab1mhjjjo6fddkme5@4ax.com>
Message-ID: <CAO1OCwY_6A2m0MLUXjpqVvpPFGC34sfvZ8L8Vg0vTArLRqZ_Ug@mail.gmail.com>

Dear Sir,

Many Many thanks to Dennis Lee Bieber. (Not for the code he wrote ) But for
the last lines after the code in which he mentions various ways to
calculate Std. Dev (Naive method, two pass method & Welford's Algo) - That
clearly shows I need to study more  . I found out various ways to calculate
Standard deviation. My Question is finally answered.  Will be back with the
implementation once gone through all.

Regards
Manprit Singh



From wlfraed at ix.netcom.com  Sun Jul  3 16:41:59 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sun, 03 Jul 2022 16:41:59 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
 <CAO1OCwb6UpZasJ0JxUrF_+4g-cGrs0VzNkMwbbqHNKmBDd_ckw@mail.gmail.com>
 <007001d88ef3$3e032420$ba096c60$@gmail.com>
 <CAO1OCwa0qoeEpkjdPWEOOR7Egjn__NA91Uh=uXvTab8t3-gOhQ@mail.gmail.com>
Message-ID: <cvt3chdtqi66e5keg6dfdh24vk45ga8d9c@4ax.com>

On Sun, 3 Jul 2022 22:29:41 +0530, Manprit Singh
<manpritsinghece at gmail.com> declaimed the following:

>Now as sum is an aggregate function, same way population standard
>deviation is also an aggregate function. We should be able to make a
>user defined function
>

	It is also superfluous: SQLite3 already has count(), sum() and even
avg() built-in (though it lacks many of the bigger statistical computations
-- variance, std. dev, covariance, correlation, linear regression -- that
many of the bigger client/server RDBMs support).
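
As a sketch of how far the built-ins alone can go: the population variance is the mean of the squares minus the square of the mean, so avg() is enough and only the final square root is left to Python. This is the "naive" formula, so it shares that formula's rounding caveats; the table and data are the ones from the original post.

```python
import math
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])

# Population variance = avg(x*x) - avg(x)**2, computed entirely in SQL.
# avg() always returns a float, so no integer division is involved.
row = con.execute(
    "select avg(X1 * X1) - avg(X1) * avg(X1),"
    "       avg(X2 * X2) - avg(X2) * avg(X2) from table1"
).fetchone()
con.close()

# Clamp at zero in case rounding produces a tiny negative variance.
std_devs = tuple(math.sqrt(max(v, 0.0)) for v in row)
print(std_devs)
```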

>
>for getting mean.I am not getting the mechanism to subtract the mean
>from each value of the column in the same step method or by any other
>way
>
	Note that the definition for creating aggregates includes something for
number of arguments. Figure out how to specify multiple arguments and you
might be able to have SQLite3 provide "current item" and "mean" (avg) to
the step() method. I'm not going to take the time to experiment (for the
most part, I'd consider it simpler to just grab the entire dataset from the
database, and run the number crunching in Python, rather than the overhead
of having SQLite3 invoke a Python "callback" method for each item, just to
be able to have the SQLite3 return a single computed value).
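
A minimal sketch of that multi-argument idea, assuming the mean is supplied by a scalar subquery. The names PStdDevWithMean and pstdev2 are invented for this example; the second argument to create_aggregate() is the number of arguments the SQL function takes.

```python
import math
import sqlite3


class PStdDevWithMean:
    """Two-argument aggregate: step() gets the value and the column mean."""

    def __init__(self):
        self.n = 0
        self.sum_sq_dev = 0.0   # sum of (value - mean)**2

    def step(self, value, mean):
        self.n += 1
        self.sum_sq_dev += (value - mean) ** 2

    def finalize(self):
        return math.sqrt(self.sum_sq_dev / self.n) if self.n else None


con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])
con.create_aggregate("pstdev2", 2, PStdDevWithMean)

# The subquery supplies avg() as the second argument on every step() call.
row = con.execute(
    "select pstdev2(X1, (select avg(X1) from table1)),"
    "       pstdev2(X2, (select avg(X2) from table1)) from table1"
).fetchone()
print(row)
con.close()
```

This is effectively the two-pass method: the database reads the column once for avg() and once more for the aggregate.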


>Btw I would like to write a one liner to calculate population std
>deviation  of a list:
>lst = [2, 5, 7, 9, 10]
>mean = sum(lst)/len(lst)
>std_dev = (sum((ele-mean)**2 for ele in lst)/len(lst))**0.5
>print(std_dev)

	Literally, except for the imports, that is just...

print(statistics.pstdev(lst))

>>> import math as m
>>> import statistics as s
>>> lst = [2, 5, 7, 9, 10]
>>> print(s.pstdev(lst))
2.870540018881465
>>> 

	Going up a level in complexity (IE -- not using the imported pstdev())

>>> print("Population Std. Dev.: %s" % m.sqrt( s.mean( (ele - s.mean(lst)) ** 2 for ele in lst)))
Population Std. Dev.: 2.870540018881465
>>> 

	This has the problem  that it invokes mean(lst) for each element, so
may be slower for large data sets (that problem will also exist if you
manage a multi-argument step() for SQLite3).

	Anytime you have

		sum(equation-with-elements-of-data) / len(data)

you can replace it with just

		mean(equation...)
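
A small sketch of the same expression with the mean computed once, outside the generator, so mean(lst) is not re-evaluated for every element (mu is just a name chosen for this example):

```python
import math
import statistics

lst = [2, 5, 7, 9, 10]

mu = statistics.mean(lst)   # computed once, before the loop over elements
std_dev = math.sqrt(statistics.mean((ele - mu) ** 2 for ele in lst))
print("Population Std. Dev.: %s" % std_dev)
```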


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From anirudh.tamsekar at gmail.com  Sun Jul  3 17:05:57 2022
From: anirudh.tamsekar at gmail.com (Anirudh Tamsekar)
Date: Sun, 3 Jul 2022 14:05:57 -0700
Subject: [Tutor] Consecutive_zeros
Message-ID: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>

Hello All,

Any help on this function below is highly appreciated.
Goal: analyze a binary string consisting of only zeros and ones. Your code
should find the biggest number of consecutive zeros in the string.

For example, given the string:
Its failing on below test case

print(consecutive_zeros("0"))
It should return 1. Returns 0

I get the max(length) as 1, if I print it separately


def consecutive_zeros(string):
    zeros = []
    length = []
    result = 0
    for i in string:
        if i == "0":
            zeros.append(i)
        else:
            length.append(len(zeros))
            zeros.clear()
            result = max(length)
    return result


-Thanks,

Anirudh Tamsekar

From mats at wichmann.us  Sun Jul  3 18:10:46 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Sun, 3 Jul 2022 16:10:46 -0600
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
Message-ID: <a51738f3-1fda-a5e5-95f3-3e117559a242@wichmann.us>


On 7/3/22 15:05, Anirudh Tamsekar wrote:
> Hello All,
> 
> Any help on this function below is highly appreciated.
> Goal: analyze a binary string consisting of only zeros and ones. Your code
> should find the biggest number of consecutive zeros in the string.
> 
> For example, given the string:
> Its failing on below test case
> 
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
> 
> I get the max(length) as 1, if I print it separately
> 
> 
> def consecutive_zeros(string):
>     zeros = []
>     length = []
>     result = 0
>     for i in string:
>         if i == "0":
>             zeros.append(i)
>         else:
>             length.append(len(zeros))
>             zeros.clear()
>             result = max(length)
>     return result

you can certainly shorten that function, and even use a few tricks, but
that function looks like it would work for anything except the boundary
case you've given it.  Of course, boundary cases are one of the
challenges for programming - and writing good unit tests: "it works, but
what if you do something unexpected?"

Your problem is you only process the previous information if you see a
character that isn't zero, and since there's only a single zero in the
string it never triggers, so the saved count never gets collected.

Btw, there's no particular reason to use an array that you append zero
characters to and then take the length of, just use a counter.
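
A minimal sketch of the counter approach, as one possible shape rather than the only one. It updates the best count on every zero, so a run at the very end of the string is still collected:

```python
def consecutive_zeros(string):
    best = 0      # longest run of zeros seen so far
    current = 0   # length of the run we are in right now
    for ch in string:
        if ch == "0":
            current += 1
            best = max(best, current)   # update here, not only on a non-zero
        else:
            current = 0
    return best


print(consecutive_zeros("0"))
print(consecutive_zeros("0010001"))
```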

From hilarycarris at yahoo.com  Sun Jul  3 15:46:10 2022
From: hilarycarris at yahoo.com (Hilary Carris)
Date: Sun, 3 Jul 2022 14:46:10 -0500
Subject: [Tutor] (no subject)
References: <F4193E95-05C1-4400-AE6F-25A2C35C869A.ref@yahoo.com>
Message-ID: <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>

I am unable to get my python correctly loaded onto my Mac.  I have to have a 3.9 version for a class and can not get it to function.
Can you help me?

From wlfraed at ix.netcom.com  Sun Jul  3 20:37:25 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sun, 03 Jul 2022 20:37:25 -0400
Subject: [Tutor] Consecutive_zeros
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
Message-ID: <17c4chdfiou5fip8d34dhvvu706hqr51kf@4ax.com>

On Sun, 3 Jul 2022 14:05:57 -0700, Anirudh Tamsekar
<anirudh.tamsekar at gmail.com> declaimed the following:

>Hello All,
>
>Any help on this function below is highly appreciated.
>Goal: analyze a binary string consisting of only zeros and ones. Your code
>should find the biggest number of consecutive zeros in the string.
>

	This sounds very much like the core of run-length encoding
(https://en.wikipedia.org/wiki/Run-length_encoding), with a filter stage to
determine the longest run of 0s...

>def consecutive_zeros(string):

... which makes the name somewhat misleading. I'd generalize to something
like
		longest_run(source, item)
where item is just one of the potential values within source data. That
would permit calls in the nature of:

	longest_run(somedatasource, "0")
	longest_run(somedatasource, "A")
	longest_run(somedatasource, 3.141592654)	#source is floats


>            result = max(length)

	You are only updating "result" when you encounter a non-"0" element. If
there is no non-"0" following any amount of "0" you will not update.

	If not forbidden by the assignment/class, recommend you look at the
itertools module -- in particular groupby().

>>> import itertools as it
>>> source = "000100111010000122"
>>> for k, g in it.groupby(source):
... 	print("key %s: length %s:" % (k, len(list(g))))
... 
key 0: length 3:
key 1: length 1:
key 0: length 2:
key 1: length 3:
key 0: length 1:
key 1: length 1:
key 0: length 4:
key 1: length 1:
key 2: length 2:
>>> 

	.takewhile() and .filterfalse() might also be of use.
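
Putting the two ideas together, a sketch of the generalized longest_run(source, item) built on groupby(). The signature is the one suggested above; the body is only one way to fill it in:

```python
from itertools import groupby


def longest_run(source, item):
    """Length of the longest run of consecutive `item` values in `source`."""
    return max(
        # Each group is counted before groupby() advances to the next key.
        (sum(1 for _ in group) for key, group in groupby(source) if key == item),
        default=0,   # item never occurs in source
    )


source = "000100111010000122"
print(longest_run(source, "0"))
print(longest_run(source, "1"))
print(longest_run([3.141592654, 3.141592654, 1.0], 3.141592654))
```

Counting with sum(1 for _ in group) avoids building a temporary list for each run.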


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From manpritsinghece at gmail.com  Sun Jul  3 21:44:11 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Mon, 4 Jul 2022 07:14:11 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <cvt3chdtqi66e5keg6dfdh24vk45ga8d9c@4ax.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
 <CAO1OCwb6UpZasJ0JxUrF_+4g-cGrs0VzNkMwbbqHNKmBDd_ckw@mail.gmail.com>
 <007001d88ef3$3e032420$ba096c60$@gmail.com>
 <CAO1OCwa0qoeEpkjdPWEOOR7Egjn__NA91Uh=uXvTab8t3-gOhQ@mail.gmail.com>
 <cvt3chdtqi66e5keg6dfdh24vk45ga8d9c@4ax.com>
Message-ID: <CAO1OCwZWx4vvUr53xR+U20rn1e0x06F9tDU0dmF=ziBBBZFOAg@mail.gmail.com>

Dear Sir,

In pandas, handling an SQL query is as simple as shown below:
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table test(i, j)")
ls = [(2, 4), (3, 5), (4, 2), (7, 9)]
cur.executemany("insert into test(i, j) values (?, ?)", ls)

pd.read_sql("select i, j from test", con).std(ddof=0)

will give the desired result:

i    1.870829
j    2.549510
dtype: float64



From martin at linux-ip.net  Sun Jul  3 22:15:58 2022
From: martin at linux-ip.net (Martin A. Brown)
Date: Sun, 3 Jul 2022 19:15:58 -0700
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
Message-ID: <693a92f-afa-eeb4-a56-164a2b78fe27@wonderfrog.net>


Hello there,

> For example, given the string:
> Its failing on below test case
> 
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
> I get the max(length) as 1, if I print it separately

There are many ways to do this sort of thing, and I think Mats has 
already responded that your function (below) looks good aside from 
your boundary case.  Of course, boundary cases are one of the things 
that software quality assurance and test tooling are there to help 
you discover.

If your code never hits the boundary condition, then you'll always 
sleep soundly.  But, as an excellent systems programmer I worked 
with used to say:  "I like to think that my software inhabits a 
hostile universe."  It's a defensive programming mindset that I have 
also tried to adopt.

> def consecutive_zeros(string):
>     zeros = []
>     length = []
>     result = 0
>     for i in string:
>         if i == "0":
>             zeros.append(i)
>         else:
>             length.append(len(zeros))
>             zeros.clear()
>             result = max(length)
>     return result

> Goal: analyze a binary string consisting of only zeros and ones. Your code
> should find the biggest number of consecutive zeros in the string.

I took a look at the problem statement and I thought immediately of 
the itertools.groupby function [0].  While this is not generally one 
of the Python modules that you'd encounter at your first exposure to 
Python, I thought I'd mention it to illustrate that many languages 
provide advanced tooling that allow you to solve these sorts of 
computer science problems. There's no substitute for knowing how to 
write or apply this yourself. There's also value in knowing what 
other options are available to you in the standard library.  (Like 
learning when to use a rasp, chisel, smoothing plane or sandpaper 
when working with wood:  All work.  Some work better in certain 
situations.)

The itertools module is fantastic for dealing with streams and 
infinite sequences.

This function (and others in this module) is useful for stuff like your 
specific question, which appears to be a string that probably fits 
neatly into memory.

  def consecutive_zeros(s):
      zcount = 0
      for char, seq in itertools.groupby(s):
          if char != '0':
              continue
          t = len(list(seq))
          zcount = max((t, zcount))
      return zcount, s

I am mentioning this particular function not because I have any 
specific deep experience with the itertools module, but to share 
another way to think about approaching any specific problem.  I have 
never regretted reading in the standard libraries of any language 
nor the documentation pages of any system I've ever worked on.

In this case, your question sounded to me closer to a "pure" 
computer science question and in the vein of functional programming 
and itertools jumped directly into my memory.

One exercise I attempt frequently is to move a specific question, 
e.g. "count the number of zeroes in this string sequence" to a more 
general case of "give me the size of the longest repeated 
subsequence of items".  This often requires keeping extra data (e.g.
the collections.defaultdict) but means you sometimes have other 
answers at the ready.  In this case, you could also easily answer 
the question of 'what is the longest subsequence of "1"s' as well.

And, in keeping with the notion of testing* (as Mats has suggested) 
and worrying about boundaries, I include a toy program (below) that 
demonstrates the above function as well as an alternate, generalized 
example, as well as testing the cases in which the result is known 
ahead of time.

Best of luck,

-Martin

   * Note, what I did is not quite proper testing, but is a simple 
     illustration of how one could do it.  Using assert is a 
     convenient way to work with test tooling like py.test.

 [0] https://docs.python.org/3/library/itertools.html#itertools.groupby


#! /usr/bin/python
#
# -- response to Consecutive_zeros question

import os
import sys
import random
import itertools
import collections


def consecutive_items(s):
    counts = collections.defaultdict(int)
    for item, seq in itertools.groupby(s):
        counts[item] = max(sum(1 for _ in seq), counts[item])
    return counts, s


def consecutive_char_of_interest(s, item):
     counts, _ = consecutive_items(s)
     return counts[item], s


def consecutive_zeros(s):
    return(consecutive_char_of_interest(s, '0'))


# def consecutive_zeros(s):
#      zcount = 0
#      for char, seq in itertools.groupby(s):
#          if char != '0':
#              continue
#          t = len(list(seq))
#          zcount = max((t, zcount))
#      return zcount, s


def report(count, sample):
    sample = (sample[:50] + '...') if len(sample) > 75 else sample
    print('{:>9d} {}'.format(count, sample))


def fail_if_wrong(sample, correct):
    count, _ = consecutive_zeros(sample)
    assert count == correct
    report(count, sample)
    return count, sample


def cli(fin, fout, argv):
    fail_if_wrong('a', 0)          # - feed it "garbage"
    fail_if_wrong('Q', 0)          # - of several kinds
    fail_if_wrong('^', 0)          # - punctuation! What kind of planet?!
    fail_if_wrong('1', 0)          # - check boundary
    fail_if_wrong('0', 1)          # - check boundary
    fail_if_wrong('01', 1)         # - oh, let's be predictable
    fail_if_wrong('001', 2)
    fail_if_wrong('0001', 3)
    fail_if_wrong('00001', 4)
    fail_if_wrong('000001', 5)
    fail_if_wrong('0000001', 6)
    fail_if_wrong('00000001', 7)
    fail_if_wrong('000000001', 8)
    n = 37
    fail_if_wrong('0' * n, n)      # - favorite number?
    n = 1024 * 1024                # - some big number
    fail_if_wrong('0' * n, n)
    n = random.randint(9, 80)
    fail_if_wrong('0' * n, n)      # - let the computer pick

    report(*consecutive_zeros(''.join(random.choices('01', k=60))))
    report(*consecutive_zeros(''.join(random.choices('01', k=60))))
    report(*consecutive_zeros(''.join(random.choices('01', k=1024))))
    report(*consecutive_zeros(''.join(random.choices('01', k=1024*1024))))

    return os.EX_OK


if __name__ == '__main__':
    sys.exit(cli(sys.stdin, sys.stdout, sys.argv[1:]))

# -- end of file
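
For the "proper testing" mentioned in the note above, a rough sketch of the same checks in the shape pytest collects: plain functions whose names start with test_, using bare assert. The consecutive_zeros() here is a stand-in that returns only the count, and the test names are invented for this example.

```python
import itertools


def consecutive_zeros(s):
    """Stand-in for the function under test; returns only the count."""
    return max(
        (len(list(g)) for k, g in itertools.groupby(s) if k == "0"),
        default=0,
    )


def test_boundaries():
    assert consecutive_zeros("") == 0
    assert consecutive_zeros("1") == 0
    assert consecutive_zeros("0") == 1


def test_trailing_run():
    # The case the original function missed: zeros with nothing after them.
    assert consecutive_zeros("1000") == 3


def test_long_run():
    n = 1024 * 1024
    assert consecutive_zeros("0" * n) == n
```

Saved as a file named test_something.py, these would be picked up and run by the pytest command.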


-- 
Martin A. Brown
http://linux-ip.net/

From manpritsinghece at gmail.com  Mon Jul  4 01:49:54 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Mon, 4 Jul 2022 11:19:54 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <009401d88f65$b102a8c0$1307fa40$@gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
 <CAO1OCwb6UpZasJ0JxUrF_+4g-cGrs0VzNkMwbbqHNKmBDd_ckw@mail.gmail.com>
 <007001d88ef3$3e032420$ba096c60$@gmail.com>
 <CAO1OCwa0qoeEpkjdPWEOOR7Egjn__NA91Uh=uXvTab8t3-gOhQ@mail.gmail.com>
 <cvt3chdtqi66e5keg6dfdh24vk45ga8d9c@4ax.com>
 <CAO1OCwZWx4vvUr53xR+U20rn1e0x06F9tDU0dmF=ziBBBZFOAg@mail.gmail.com>
 <009401d88f65$b102a8c0$1307fa40$@gmail.com>
Message-ID: <CAO1OCwb8AYOierzi4cho8ssxx6MEq8sivVU26c7X9hchwcO1PA@mail.gmail.com>

Dear Sir,

My target is still to do this task in pure Python, without using an
iterable.
I just showed you in the last mail how easy it is using pandas. There are
several other easy ways too. I will come up with an implementation of the
solution given by Dennis Lee Bieber.


I am exploring other options also, The one using numpy was also explored by
me as given below:

import sqlite3
import numpy as np

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table table1(X1, X2)")
lst = [(2, 5),
         (4, 3),
         (5, 2),
         (7, 1)]
cur.executemany("insert into table1 values(?, ?)", lst)
cur.execute("select * from table1")
print(np.std(cur.fetchall(), axis=0))

array([1.80277564, 1.47901995])  which is the right answer

For column names :
col_names= [cur.description[i][0] for i in (0, 1)]
print(col_names)

['X1', 'X2']


My target is still to do this task in pure Python, without using an
iterable.

From __peter__ at web.de  Mon Jul  4 03:51:01 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 4 Jul 2022 09:51:01 +0200
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
Message-ID: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de>

On 03/07/2022 23:05, Anirudh Tamsekar wrote:
> Hello All,
>
> Any help on this function below is highly appreciated.
> Goal: analyze a binary string consisting of only zeros and ones. Your code
> should find the biggest number of consecutive zeros in the string.
>
> For example, given the string:
> Its failing on below test case

In case you haven't already fixed your function here's a hint that is a
bit more practical than what already has been said.

>
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
>
> I get the max(length) as 1, if I print it separately
>
>
> def consecutive_zeros(string):
>      zeros = []
>      length = []
>      result = 0
>      for i in string:
>          if i == "0":
>              zeros.append(i)

         else:
>              length.append(len(zeros))
>              zeros.clear()
>              result = max(length)

At this point in the execution of your function what does zeros look
like for the succeeding cases, and what does it look like for the
failing ones? Add a print(...) call if you aren't sure and run
consecutive_zeros() for examples with trailing ones, trailing runs of
zeros that have or don't have the maximum length for that string.

How can you bring result up-to-date?

>      return result

PS: Because homework problems are often simpler than what comes up in
the "real world" some programmers tend to come up with solutions that
are less robust or general. In that spirit I can't help but suggest

 >>> max(map(len, "011110000110001000001111100".split("1")))
5

which may also be written as

 >>> max(len(s) for s in "011110000110001000001111100".split("1"))
5

Can you figure out how this works?
What will happen if there is a character other than "1" or "0" in the string?
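
For comparison only, a sketch of a variant that does not depend on "1" being the only separator. It collects runs of "0" directly with a regular expression; the function name longest_zero_run is made up for this example.

```python
import re


def longest_zero_run(s):
    runs = re.findall(r"0+", s)              # every maximal run of zeros
    return max(map(len, runs), default=0)    # 0 when there are no zeros


print(longest_zero_run("011110000110001000001111100"))
print(longest_zero_run("0a00"))
```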

From PythonList at DancesWithMice.info  Mon Jul  4 04:21:24 2022
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 4 Jul 2022 20:21:24 +1200
Subject: [Tutor] (no subject)
In-Reply-To: <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>
References: <F4193E95-05C1-4400-AE6F-25A2C35C869A.ref@yahoo.com>
 <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>
Message-ID: <63eb83e1-c28a-5d61-2882-2b8929e0d823@DancesWithMice.info>

On 04/07/2022 07.46, Hilary Carris via Tutor wrote:
> I am unable to get my python correctly loaded onto my Mac.  I have to have a 3.9 version for a class and can not get it to function.
> Can you help me?

Please start with the documentation. If the necessary answer is not
featured, please mention where in the docs things diverge and provide
more information.

5. Using Python on a Mac?
https://docs.python.org/3/using/mac.html
-- 
Regards,
=dn

From learn2program at gmail.com  Mon Jul  4 04:45:29 2022
From: learn2program at gmail.com (Alan Gauld)
Date: Mon, 4 Jul 2022 09:45:29 +0100
Subject: [Tutor] (no subject)
In-Reply-To: <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>
References: <F4193E95-05C1-4400-AE6F-25A2C35C869A.ref@yahoo.com>
 <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>
Message-ID: <d7406442-f95e-80e4-036c-a8b2d41582f5@yahoo.co.uk>


On 03/07/2022 20:46, Hilary Carris via Tutor wrote:
> I am unable to get my python correctly loaded onto my Mac.  I have to have a 3.9 version for a class and can not get it to function.
> Can you help me?

What have you done so far?

Where did you download from?

What exactly is going wrong?

    Is the download failing?

    Is the install process failing?

    Does it start with an error - what error?

Please be as specific as possible about what you have done, and what is
going wrong.

Simply saying it doesn't work is not of much help.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From avi.e.gross at gmail.com  Sun Jul  3 23:09:30 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 3 Jul 2022 23:09:30 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <t9s25r$jpp$1@ciao.gmane.io>
 <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
Message-ID: <003c01d88f53$7a7e3a60$6f7aaf20$@gmail.com>

Once explained, the request makes some sense and I withdraw my earlier
suggestions as not responding to what you want.

You do NOT want to use Python almost at all except as a way to test
manipulating some database.  Fine, do that!

As has been pointed out, many versions of SQL come pre-built with functions
you can call from within your SQL directly and also remotely that do things
like calculate means and perhaps even standard deviations using queries
like:

mysql> SELECT STDDEV_SAMP (salary) FROM employee;  

mysql> SELECT STD(salary) FROM employee;  

mysql> SELECT STDDEV(salary) FROM employee;  

mysql> SELECT STDDEV_POP(salary) FROM employee;  

Depending on which one you want.

So if the above are SUPPORTED then your issue of not using much memory in
Python is quite irrelevant. 

If you want to learn to use a particular function that probably sends the
above command, or something similar, fine. Figure it out but my impression
is the function you are using may not be using a local Python function.

I know this group is for learning but I seem to not appreciate being asked
to think about a problem in ways that keep turning out to be very different
than what is wanted as it is usually a waste of time for all involved.
People with more focused and understandable needs may be a better use of any
time I devote here. I have negligible interest personally in continuing to
work on manipulating a database remotely at this time. 

When things get frustrating volunteers are mobile.


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Manprit Singh
Sent: Sunday, July 3, 2022 9:01 AM
To: tutor at python.org
Subject: Re: [Tutor] toy program to find standard deviation of 2 columns of
a sqlite3 database

Sir,
I am just going through all the functionalities available in sqlite3 module
, just to see if I can use sqlite3 as a good data analysis tool or not .

Upto this point I have figured out that and sqlite data base file can be an
excellent replacement for data stored in files .

You can preserve data in a structured form, email to someone who need it etc
etc .

But for good data analysis ....I found pandas is superior . I use pandas for
data analysis and visualization .


Btw ....this is true . You should use right tool for your task .


Regards
Manprit Singh

On Sun, 3 Jul, 2022, 18:02 Alan Gauld via Tutor, <tutor at python.org> wrote:

> On 03/07/2022 03:49, Manprit Singh wrote:
>
> > con.create_aggregate("stddev", 1, StdDev) cur.execute("select 
> > stddev(X1), stddev(X2) from table1")
>
> I just wanted to say thanks for posting this. I have never used, nor 
> seen anyone else use, the ability to create a user defined aggregate 
> function in SQLite - usually I just extract the data into python and 
> use python to do the aggregation. But your question made me read up on 
> how that all worked so it has taught me something new.
> (It also makes me appreciate how the Python API is much easier to use 
> than the raw C API to SQLite!)
>
> > My question is, as you can see i have used list inside the class 
> > StdDev,
> which
> > I think is an inefficient way to do this kind of problem because 
> > there
> may be
> > a large number of values in a column and it can take a huge amount 
> > of
> memory.
> > Can this problem be solved with the use of iterators ? What would be 
> > the
> best
> > approach to do it ?
>
> If I'm working with so much data that this would be a problem I'd use 
> the database itself to store the intermediate data. That would be much 
> slower but much less memory dependant. But as others have said, with 
> aggregate functions you don't usually need to store data from all rows 
> you just store a few intermediate results which you combine at the end.
>
> If you are trying to use an in-memory function - like the stddev 
> function here - then you need to fit all the data in memory anyway so 
> the function will simply not work if you can't store the data in RAM. 
> In that case you need to find(or write) another function that doesn't 
> use memory for storage or uses less storage.
>
> It is also worth pointing out that most industrial strength SQL 
> databases come with a far richer set of aggregate functions than 
> SQLite. So if you do have to work with large volumes of data you 
> should probably switch to something like Oracle, DB2, SQLServer(*), etc 
> and just use the functions built into the server. If they don't have 
> such a function they also have a much simpler way of defining stored 
> procedures. As ever, choose the appropriate tool for the job.
>
> (*)These are just the ones I know, I assume MySql, Postgres etc have 
> similarly broad libraries.
>
> --
> Alan G
> Author of the Learn to Program web site http://www.alan-g.me.uk/ 
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Mon Jul  4 00:24:51 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 4 Jul 2022 00:24:51 -0400
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
Message-ID: <005c01d88f5e$01937b50$04ba71f0$@gmail.com>

I read through all the suggestions sent in to date and some seem a bit
beyond the purpose of the exercise. Why not go all the way and use a regular
expression to return all matches of "0+" and then choose the length of the
longest one!

But seriously, the problem is actually fairly simple as long as you start by
stating the maximum found so far at the start is 0 long. If no longer one is
found, then the answer will be 0.

And no need to play with lists. Keep a counter. Start at 0. When you see a
1, if the current count is greater than the current maximum, update the
maximum. Either way reset the count to zero. When you see a zero, increment
the count.

And critically, when you reach the end, check the current count and if
needed update the maximum.

This is an exercise in a fairly small state machine with just a few states
as in a marking automaton or Turing machine. 

Is there a guarantee for the purposes of this assignment that there is
nothing else in the string and that it terminates? If there may be other
values or it terminates in a NULL of some kind, you may have to adjust the
algorithm in one of many ways but I suspect the assignment is
straightforward.
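
A minimal sketch of the counter approach described above, assuming the
string holds only "0" and "1" (anything that is not "0" is simply treated
as the end of a run). The variable names are mine.

```python
def consecutive_zeros(string):
    longest = 0   # longest run of zeros seen so far
    current = 0   # length of the run we are currently inside
    for ch in string:
        if ch == "0":
            current += 1
        else:
            # A run just ended: keep it if it beats the best so far.
            longest = max(longest, current)
            current = 0
    # The string may end in the middle of a run, so compare once more.
    return max(longest, current)

print(consecutive_zeros("0"))
print(consecutive_zeros("1010010001000010000001"))
print(consecutive_zeros("1111"))
```

The final max() after the loop is the step the original function was
missing, which is why a string ending in zeros came back too small.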

JUST FOR FUN --- DO NOT USE THIS:

import re

def consecutive_zeros(string):
    if (matches := re.findall("0+", string)) == [] :
        return 0
    else:
        return (max([len(it) for it in matches]))

print(consecutive_zeros("1010010001000010000001"))    #6
print(consecutive_zeros("00000000000000000000000000"))    #26
print(consecutive_zeros("1111"))    #0
print(consecutive_zeros("begin1x00x110001END"))    #3
print(consecutive_zeros("Boy did you pick the wrong string!"))    #0

prints out:

6
26
0
3
0

REPEAT: This is not a valid way for a normal assignment and would not have
been shown if I had not just finished a huge tome on using regular
expressions for everything imaginable and with very different
implementations in all kinds of programs and environments. But it is a tad
creative if also wasteful and does handle some edge cases. And note this may
not work in older versions of python as it uses the walrus operator. That
could easily be avoided with slightly longer code or different code but it
does present a different viewpoint on what a stretch of zeroes means.
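
For older interpreters, the same idea without the walrus operator might
look like this. It is only a sketch; the function name is changed so it
does not shadow the version above.

```python
import re

def consecutive_zeros_no_walrus(string):
    # findall returns every non-overlapping run of one or more zeros.
    matches = re.findall("0+", string)
    if not matches:
        return 0
    return max(len(run) for run in matches)

print(consecutive_zeros_no_walrus("1010010001000010000001"))
print(consecutive_zeros_no_walrus("begin1x00x110001END"))
print(consecutive_zeros_no_walrus("1111"))
```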

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Anirudh Tamsekar
Sent: Sunday, July 3, 2022 5:06 PM
To: tutor at python.org
Subject: [Tutor] Consecutive_zeros

Hello All,

Any help on this function below is highly appreciated.
Goal: analyze a binary string consisting of only zeros and ones. Your code
should find the biggest number of consecutive zeros in the string.

For example, given the string:
Its failing on below test case

print(consecutive_zeros("0"))
It should return 1. Returns 0

I get the max(length) as 1, if I print it separately


def consecutive_zeros(string):
    zeros = []
    length = []
    result = 0
    for i in string:
        if i == "0":
            zeros.append(i)
        else:
            length.append(len(zeros))
            zeros.clear()
            result = max(length)
    return result


-Thanks,

Anirudh Tamsekar


From avi.e.gross at gmail.com  Mon Jul  4 01:20:45 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 4 Jul 2022 01:20:45 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of a
 sqlite3 database
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <015201d88e91$13d44470$3b7ccd50$@gmail.com>
 <CAO1OCwbNy7a--pd-j_HVT6PuQ0AL2DZ3Od_FJNErRweDtk6Kew@mail.gmail.com>
 <CAO1OCwb6UpZasJ0JxUrF_+4g-cGrs0VzNkMwbbqHNKmBDd_ckw@mail.gmail.com>
 <007001d88ef3$3e032420$ba096c60$@gmail.com>
 <CAO1OCwa0qoeEpkjdPWEOOR7Egjn__NA91Uh=uXvTab8t3-gOhQ@mail.gmail.com>
 <cvt3chdtqi66e5keg6dfdh24vk45ga8d9c@4ax.com>
 <CAO1OCwZWx4vvUr53xR+U20rn1e0x06F9tDU0dmF=ziBBBZFOAg@mail.gmail.com> 
Message-ID: <009601d88f65$d0403f90$70c0beb0$@gmail.com>

Manprit,

Did you just violate your condition to not keep the entire list of results
in memory with your use of pandas? Yes, what you want is straightforward
when you are not trying to do it some other way.

Your earlier comment though suggested you were more interested in just using
a database as some kind of portable format. Pandas can equally trivially
open a CSV file which strikes me as even more portable.  Other formats with
more flexibility that you can bring in fairly portably can be in JSON format
or XML and so on. No reason not to use a database but a great strength of a
database is that it can be easy to do much more complex queries such as only
getting the data selectively such as the numbers for group A or those with
dates in some range. You can of course do the same thing with data you load
into pandas or numpy or any other format, but would need to do it within
python.

As mentioned earlier, if the size of memory is a concern, then asking the
database to calculate things like a standard deviation simply pushes the
memory usage elsewhere. If it is inside your own computer, that does not
sound like serious savings unless SQLite uses a method that does not keep
everything in memory. 
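
As a rough illustration of letting the database do the work, the sketch
below gets a population standard deviation out of SQLite using only its
built-in avg(), with no Python callback per row. It reuses the table and
values from the first message in this thread. The "mean of squares minus
square of mean" formula is my choice for the sketch and can lose precision
on large or nearly constant data, so treat it as a toy.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])

# variance = avg(x*x) - avg(x)**2 ; avg() always returns a float in SQLite
var_x1, var_x2 = con.execute(
    "select avg(X1 * X1) - avg(X1) * avg(X1),"
    "       avg(X2 * X2) - avg(X2) * avg(X2) from table1"
).fetchone()

# Guard against a tiny negative value caused by rounding.
std_x1 = max(var_x1, 0.0) ** 0.5
std_x2 = max(var_x2, 0.0) ** 0.5
print(std_x1, std_x2)
con.close()
```

Only two floats ever cross into Python here, so the memory used on the
Python side does not grow with the number of rows.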

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Manprit Singh
Sent: Sunday, July 3, 2022 9:44 PM
To: tutor at python.org
Subject: Re: [Tutor] toy program to find standard deviation of 2 columns of
a sqlite3 database

Dear Sir,

In Pandas, handling an sql query is so simple as given below:
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table test(i, j)")
ls = [(2, 4), (3, 5), (4, 2), (7, 9)]
cur.executemany("insert into test(i, j) values (?, ?)", ls)

pd.read_sql("select i, j from test", con).std(ddof=0)

will give the desired result:

i    1.870829
j    2.549510
dtype: float64


On Mon, Jul 4, 2022 at 2:13 AM Dennis Lee Bieber <wlfraed at ix.netcom.com>
wrote:

> On Sun, 3 Jul 2022 22:29:41 +0530, Manprit Singh 
> <manpritsinghece at gmail.com> declaimed the following:
>
> >Now as sum is an aggregate function, same way population standard 
> >deviation is also an aggregate function. We should be able to make a 
> >user defined function
> >
>
>         It is also superfluous: SQLite3 already has count(), sum() and 
> even
> avg() built-in (though it lacks many of the bigger statistical 
> computations
> -- variance, std. dev, covariance, correlation, linear regression -- 
> that many of the bigger client/server RDBMs support).
>
> >
> >for getting mean.I am not getting the mechanism to subtract the mean 
> >from each value of the column in the same step method or by any other 
> >way
> >
>         Note that the definition for creating aggregates includes 
> something for number of arguments. Figure out how to specify multiple 
> arguments and you might be able to have SQLite3 provide "current item"
> and "mean" (avg) to the step() method. I'm not going to take the time 
> to experiment (for the most part, I'd consider it simpler to just grab 
> the entire dataset from the database, and run the number crunching in 
> Python, rather than the overhead of having SQLite3 invoke a Python 
> "callback" method for each item, just to be able to have the SQLite3 
> return a single computed value.
>
>
> >Btw I would like to write a one liner to calculate population std 
> >deviation  of a list:
> >lst = [2, 5, 7, 9, 10]
> >mean = sum(lst)/len(lst)
> >std_dev = (sum((ele-mean)**2 for ele in lst)/len(lst))**0.5
> >print(std_dev)
>
>         Literally, except for the imports, that is just...
>
> print(statistics.pstdev(lst))
>
> >>> import math as m
> >>> import statistics as s
> >>> lst = [2, 5, 7, 9, 10]
> >>> print(s.pstdev(lst))
> 2.870540018881465
> >>>
>
>         Going up a level in complexity (IE -- not using the imported
> pstdev())
>
> >>> print("Population Std. Dev.: %s" % m.sqrt( s.mean( (ele -
> >>> s.mean(lst))
> ** 2 for ele in lst)))
> Population Std. Dev.: 2.870540018881465
> >>>
>
>         This has the problem  that it invokes mean(lst) for each 
> element, so may be slower for large data sets (that problem will also 
> exist if you manage a multi-argument step() for SQLite3).
>
>         Anytime you have
>
>                 sum(equation-with-elements-of-data) / len(data)
>
> you can replace it with just
>
>                 mean(equation...)
>
>
> --
>         Wulfraed                 Dennis Lee Bieber         AF6VN
>         wlfraed at ix.netcom.com
> http://wlfraed.microdiversity.freeddns.org/
>


From lawrencefdunn at gmail.com  Sun Jul  3 20:36:50 2022
From: lawrencefdunn at gmail.com (Lawrence Dunn)
Date: Sun, 3 Jul 2022 19:36:50 -0500
Subject: [Tutor] (no subject)
In-Reply-To: <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>
References: <F4193E95-05C1-4400-AE6F-25A2C35C869A.ref@yahoo.com>
 <F4193E95-05C1-4400-AE6F-25A2C35C869A@yahoo.com>
Message-ID: <CAJG2z+dCZzqfo4Dbb6L_47PAtKe-kAKXT_m9_pNcunLRT8gjXw@mail.gmail.com>

Just wondering, is it the M1 silicon?

On Sun, Jul 3, 2022 at 7:09 PM Hilary Carris via Tutor <tutor at python.org>
wrote:

> I am unable to get my python correctly loaded onto my Mac.  I have to have
> a 3.9 version for a class and can not get it to function.
> Can you help me?

From alan.gauld at yahoo.co.uk  Mon Jul  4 06:46:02 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Mon, 4 Jul 2022 11:46:02 +0100
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <t9s25r$jpp$1@ciao.gmane.io>
 <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
Message-ID: <t9ugda$jsd$1@ciao.gmane.io>

On 03/07/2022 14:01, Manprit Singh wrote:
> Sir,
> I am just going through all the functionalities available in sqlite3 module
> , just to see if I can use sqlite3 as a good data analysis tool or not .

SQLite is a good storage and retrieval system. It's not aimed at data
analysis, that's where tools like Pandas and R come into play.

SQLite will do a better job in pulling out specific subsets of
data and of organising your data with relationships etc. But it
makes no attempt to be a fully featured application environment
(unlike the bigger client/server databases like Oracle or DB2)

> Upto this point I have figured out that and sqlite data base file can be an
> excellent replacement for data stored in files .
> 
> You can preserve data in a structured form, email to someone who need it
> etc etc .

Yes, that is its strong point. Everything is stored in a single file
that can be easily shared by email or by storing it on a cloud server.

> But for good data analysis ....I found pandas is superior . I use pandas
> for data analysis and visualization .

And that's good because that is what Pandas (and SciPy in general)
is  designed for.
> 
> Btw ....this is true . You should use right tool for your task .

Absolutely. One of the key skills of a software engineer is
recognising which tools are best suited to which part of the
task and how to glue them together.
There is no universally best tool.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From manpritsinghece at gmail.com  Mon Jul  4 13:14:29 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Mon, 4 Jul 2022 22:44:29 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <t9ugda$jsd$1@ciao.gmane.io>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <t9s25r$jpp$1@ciao.gmane.io>
 <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
 <t9ugda$jsd$1@ciao.gmane.io>
Message-ID: <CAO1OCwZqMzCdYUNGyM_nsWc_2JjwtATfdpb_rmxPpkFh=eOutQ@mail.gmail.com>

Dear Sir,

Finally I came up with a solution which seems better to me than the
previous approach. In this solution I have used the shortcut method
for calculating the standard deviation.

import sqlite3

class StdDev:

    def __init__(self):
        self.cnt = 0
        self.sumx = 0
        self.sumsqrx = 0

    def step(self, x):
        self.cnt += 1
        self.sumx += x
        self.sumsqrx += x**2

    def finalize(self):
        return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table table1(X1 int, X2 int)")
ls = [(2, 5),
      (3, 7),
      (4, 2),
      (5, 1),
      (8, 6)]
cur.executemany("insert into table1 values(?, ?)", ls)
conn.commit()

conn.create_aggregate("stdev", 1, StdDev)
std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1")
print(std_dev)
cur.close()
conn.close()


gives  output

(2.0591260281974, 2.315167380558045)

That's all.  This is what I was looking for .So what will be the best
solution to this problem ? This one or the previous one posted by me ?

The whole credit goes to Dennis lee bieber & avi.e.gross at gmail.com


Regards
Manprit Singh


On Mon, Jul 4, 2022 at 4:17 PM Alan Gauld via Tutor <tutor at python.org>
wrote:

> On 03/07/2022 14:01, Manprit Singh wrote:
> > Sir,
> > I am just going through all the functionalities available in sqlite3
> module
> > , just to see if I can use sqlite3 as a good data analysis tool or not .
>
> SQLite is a good storage and retrieval system. It's not aimed at data
> analysis, thats where tools like Pandas and R come into play.
>
> SQLite will do a better job in pulling out specific subsets of
> data and of organising your data with relationships etc. But it
> makes no attempt to be a fully featured application environment
> (unlike the bigger client/server databases like Oracle or DB2)
>
> > Upto this point I have figured out that and sqlite data base file can be
> an
> > excellent replacement for data stored in files .
> >
> > You can preserve data in a structured form, email to someone who need it
> > etc etc .
>
> Yes, that is its strong point. Everything is stored in a single file
> that can be easily shared by email or by storing it on a cloud server.
>
> > But for good data analysis ....I found pandas is superior . I use pandas
> > for data analysis and visualization .
>
> And that's good because that is what Pandas (and SciPy in general)
> is  designed for.
> >
> > Btw ....this is true . You should use right tool for your task .
>
> Absolutely. One of the key skills of a software engineer is
> recognising which tools are best suited to which part of the
> task and how to glue them together.
> There is no universally best tool.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>

From avi.e.gross at gmail.com  Mon Jul  4 12:49:20 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 4 Jul 2022 12:49:20 -0400
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
 <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de>
Message-ID: <504401d88fc6$02af9e70$080edb50$@gmail.com>

Thanks for reminding me Peter. This may be a bit off topic for the original
question if it involved learning how to apply simple algorithms in Python
but is a good exercise as well for finding ways to use various built-in
tools perhaps in a more abstract way.

What Peter did was a nice use of the built-in split() function which does
indeed allow the removal of an arbitrary number of ones but, as he notes,
his solution, as written, depends on there not being anything except zeroes
and ones in the string. 

So I went back to my previous somewhat joking suggestion and thought of a
way of shortening it as the "re" module also has a regular-expression
version called re.split() that has the nice side effect of including an
empty string when I ask it to split on all runs of non-zero. 

re.split("[^0]", "1010010001000010000001")
['', '0', '00', '000', '0000', '000000', '']

That '' at the end of the resulting list takes care of the edge condition
for a string with no zeroes at all:

re.split("[^0]", "No zeroes")
['', '', '', '', '', '', '', '', '', '']

Yes, lots of empty string but when you hand something like that to get
lengths, you get at least one zero which handles always getting a number,
unlike my earlier offering which needed to check if it matched anything.

So here is a tad shorter and perhaps more direct version of the requested
function which is in effect a one-liner:

def consecutive_zeros(string):
    return(max([len(zs) for zs in re.split("[^0]", string)]))

The full code with testing would be:

import re

def consecutive_zeros(string):
    return(max([len(zs) for zs in re.split("[^0]", string)]))

def testink(string, expected):
    print(str(consecutive_zeros(string)) + ": " + string + "\n" +
          str(expected) + ": expected\n")

print("testing the function and showing what was expected:\n")
testink("1010010001000010000001", 6)
testink("00000000000000000000000000", 26)
testink("1111", 0)
testink("begin1x00x110001END", 3)
testink("Boy did you pick the wrong string!", 0)

The output is:

testing the function and showing what was expected:

6: 1010010001000010000001
6: expected

26: 00000000000000000000000000
26: expected

0: 1111
0: expected

3: begin1x00x110001END
3: expected

0: Boy did you pick the wrong string!
0: expected


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Peter Otten
Sent: Monday, July 4, 2022 3:51 AM
To: tutor at python.org
Subject: Re: [Tutor] Consecutive_zeros

On 03/07/2022 23:05, Anirudh Tamsekar wrote:
> Hello All,
>
> Any help on this function below is highly appreciated.
> Goal: analyze a binary string consisting of only zeros and ones. Your 
> code should find the biggest number of consecutive zeros in the string.
>
> For example, given the string:
> Its failing on below test case

In case you haven't already fixed your function here's a hint that is a bit
more practical than what already has been said.

>
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
>
> I get the max(length) as 1, if I print it separately
>
>
> def consecutive_zeros(string):
>      zeros = []
>      length = []
>      result = 0
>      for i in string:
>          if i == "0":
>              zeros.append(i)

>          else:
>              length.append(len(zeros))
>              zeros.clear()
>              result = max(length)

At this point in the execution of your function what does zeros look like
for the succeeding cases, and what does it look like for the failing ones?
Add a print(...) call if you aren't sure and run
consecutive_zeros() for examples with trailing ones, trailing runs of zeros
that have or don't have the maximum length for that string.

How can you bring result up-to-date?

>      return result

PS: Because homework problems are often simpler than what comes up in the
"real world" some programmers tend to come up with solutions that are less
robust or general. In that spirit I can't help but suggest

 >>> max(map(len, "011110000110001000001111100".split("1")))
5

which may also be written as

 >>> max(len(s) for s in "011110000110001000001111100".split("1"))
5

Can you figure out how this works?
What will happen if there is a character other than "1" or "0" in the string?


From sjeik_appie at hotmail.com  Mon Jul  4 18:31:07 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Tue, 05 Jul 2022 00:31:07 +0200
Subject: [Tutor] __debug__ and PYTHONOPTIMIZE
Message-ID: <DB6PR01MB3895EA0EBAD7868C5947152183BE9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>

   Hi,
   I am using PYTHONOPTIMIZE=1 (equivalent to start-up option "-O"). As I
   understood, assert statements are "compiled away" then. But how about
   __debug__. Does it merely evaluate to False when using "-O"? Or is it also
   really gone? More generally; how do I decompile a .pyc or .pyo file to
   verify this? With the "dis" module? This is a relevant code snippet:
   for record in billionrecords:
       if __debug__:  # evaluating this still takes some time!
           logger.debug(record)
   Thanks!
   Albert-Jan
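
One way to look into this without hunting for .pyc files is to compile a
small snippet at both optimization levels and inspect the result with dis.
The sketch below assumes CPython; marker() is a made-up name that is never
called, it only serves as something to search for in the bytecode.

```python
import dis

SOURCE = "if __debug__:\n    marker()\n"

def references_marker(optimize):
    # Compile at the given level (0 = no -O, 1 = like -O) and report
    # whether any bytecode instruction still refers to marker.
    code = compile(SOURCE, "<sketch>", "exec", optimize=optimize)
    return any(ins.argval == "marker" for ins in dis.get_instructions(code))

def debug_value(optimize):
    # Run a one-line assignment compiled at the given level and read
    # back what __debug__ evaluated to.
    ns = {}
    exec(compile("value = __debug__", "<sketch>", "exec", optimize=optimize), ns)
    return ns["value"]

print(debug_value(0), debug_value(1))
print(references_marker(0), references_marker(1))

# Full disassembly of the optimized version, for reading by eye.
dis.dis(compile(SOURCE, "<sketch>", "exec", optimize=1))
```

On the CPython versions I know of, the guarded block is dropped entirely
at level 1, so there is no test of __debug__ left to execute in the loop.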

From __peter__ at web.de  Tue Jul  5 03:20:41 2022
From: __peter__ at web.de (Peter Otten)
Date: Tue, 5 Jul 2022 09:20:41 +0200
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <CAO1OCwZqMzCdYUNGyM_nsWc_2JjwtATfdpb_rmxPpkFh=eOutQ@mail.gmail.com>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <t9s25r$jpp$1@ciao.gmane.io>
 <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
 <t9ugda$jsd$1@ciao.gmane.io>
 <CAO1OCwZqMzCdYUNGyM_nsWc_2JjwtATfdpb_rmxPpkFh=eOutQ@mail.gmail.com>
Message-ID: <bec9adb8-c8c5-7d84-7b16-cc0f46054765@web.de>

On 04/07/2022 19:14, Manprit Singh wrote:
> Dear Sir,
>
> Finally I came up with a solution which seems more good to me, rather than
> using the previous approach. In this solution I have used shortcut method
> for calculating the standard deviation.
>
> import sqlite3
>
> class StdDev:
>
>      def __init__(self):
>          self.cnt = 0
>          self.sumx = 0
>          self.sumsqrx = 0
>
>      def step(self, x):
>          self.cnt += 1
>          self.sumx += x
>          self.sumsqrx += x**2
>
>      def finalize(self):
>          return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5
>
> conn = sqlite3.connect(":memory:")
> cur = conn.cursor()
> cur.execute("create table table1(X1 int, X2 int)")
> ls = [(2, 5),
>        (3, 7),
>        (4, 2),
>        (5, 1),
>        (8, 6)]
> cur.executemany("insert into table1 values(?, ?)", ls)
> conn.commit()
>
> conn.create_aggregate("stdev", 1, StdDev)
> std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1")
> print(std_dev)
> cur.close()
> conn.close()
>
>
> gives  output
>
> (2.0591260281974, 2.315167380558045)
>
> That's all.  This is what I was looking for .So what will be the best
> solution to this problem ? This one or the previous one posted by me ?

As always -- it depends. I believe the numerical error for the above
algorithm tends to be much higher than for the one used in the
statistics module. I'd have to google for the details, though, and I am
lazy enough to leave that up to you.
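
If the numerical error is a worry, one commonly suggested alternative that
still keeps only a few numbers per column is Welford's online algorithm.
To be clear, this is a technique I am swapping in for illustration; I am
not claiming it is what the statistics module does. The class name
StdDevWelford is made up, and skipping NULLs is my assumption about the
desired behaviour.

```python
import sqlite3

class StdDevWelford:
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def step(self, x):
        if x is None:   # assumption: ignore SQL NULLs
            return
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def finalize(self):
        if self.n == 0:
            return None
        return (self.m2 / self.n) ** 0.5   # population standard deviation

con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 5), (3, 7), (4, 2), (5, 1), (8, 6)])
con.create_aggregate("stdev", 1, StdDevWelford)
result = con.execute("select stdev(X1), stdev(X2) from table1").fetchone()
print(result)
con.close()
```

Because the mean is updated as the rows arrive, the code never subtracts
two large, nearly equal sums, which is where the shortcut formula tends to
lose digits.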

> The whole credit goes to Dennis lee bieber & avi.e.gross at gmail.com

I think I mentioned it first ;)

From __peter__ at web.de  Tue Jul  5 03:55:11 2022
From: __peter__ at web.de (Peter Otten)
Date: Tue, 5 Jul 2022 09:55:11 +0200
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <504401d88fc6$02af9e70$080edb50$@gmail.com>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
 <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de>
 <504401d88fc6$02af9e70$080edb50$@gmail.com>
Message-ID: <6adf2afe-475d-2fc7-6051-867ceddd2015@web.de>

On 04/07/2022 18:49, avi.e.gross at gmail.com wrote:

> So I went back to my previous somewhat joking suggestion and thought of a
> way of shortening it as the "re" module also has a regular-expression
> version called re.split() that has the nice side effect of including an
> empty string when I ask it to split on all runs of non-zero.
>
> re.split("[^0]", "1010010001000010000001")
> ['', '0', '00', '000', '0000', '000000', '']
>
> That '' at the end of the resulting list takes care of the edge condition
> for a string with no zeroes at all:
>
> re.split("[^0]", "No zeroes")
> ['', '', '', '', '', '', '', '', '', '']
>
> Yes, lots of empty string but when you hand something like that to get
> lengths, you get at least one zero which handles always getting a number,
> unlike my earlier offering which needed to check if it matched anything.
>
> So here is a tad shorter and perhaps more direct version of the requested
> function which is in effect a one-liner:
>
> def consecutive_zeros(string):
>      return(max([len(zs) for zs in re.split("[^0]", string)]))

Python's developers like to tinker as much as its users -- but they
disguise it as adding syntactic sugar or useful features ;)

One of these features is a default value for max() which allows you to
stick with re.findall() while following the cult of the one-liner:

 >>> max(map(len, re.findall("0+", "1010010001000010000001")), default=-1)
6
 >>> max(map(len, re.findall("0+", "ham spam")), default=-1)
-1
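
Wrapped up as the requested function, the same idea might look like the sketch below. It assumes a default of 0 rather than -1, on the reasoning that "no zeros at all" means a longest run of length zero; that choice is an assumption, not part of the original task statement.

```python
import re


def consecutive_zeros(string):
    """Length of the longest run of '0' characters, or 0 if there is none."""
    # re.findall returns every maximal run of zeros; max() falls back to
    # the default when that list is empty.
    return max(map(len, re.findall("0+", string)), default=0)


print(consecutive_zeros("1010010001000010000001"))
print(consecutive_zeros("ham spam"))
```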


From __peter__ at web.de  Tue Jul  5 04:07:16 2022
From: __peter__ at web.de (Peter Otten)
Date: Tue, 5 Jul 2022 10:07:16 +0200
Subject: [Tutor] __debug__ and PYTHONOPTIMIZE
In-Reply-To: <DB6PR01MB3895EA0EBAD7868C5947152183BE9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
References: <DB6PR01MB3895EA0EBAD7868C5947152183BE9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
Message-ID: <505052f8-6697-c275-8124-f4f7809a4291@web.de>

On 05/07/2022 00:31, Albert-Jan Roskam wrote:
>     Hi,
>     I am using PYTHONOPTIMIZE=1 (equivalent to start-up option "-o"). As I
>     understood, assert statements are "compiled away" then. But how about
>     __debug__. Does it merely evaluate to False when using "-o"? Or is it also
>     really gone?


Why so complicated? Just try and see:

PS C:\> py -c "print(__debug__)"
True
PS C:\> py -Oc "print(__debug__)"
False


>     More generally; how do I decompile a .pyc or .pyo file to
>     verify this? With the "dis" module? This is a relevant code snippet:
>     for record in billionrecords:
>         if __debug__:  #evaluating this still takes some time!
>             logger.debug(record)
>     Thanks!

Again: stop worrying and start experimenting ;)

PS C:\tmp> type tmp.py
def f():
     if __debug__: print(42)
     assert False
PS C:\tmp> py -O
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:04:37) [MSC v.1929 32
bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> import tmp, dis
 >>> dis.dis(tmp.f)
   3           0 LOAD_CONST               0 (None)
               2 RETURN_VALUE
 >>> ^Z

PS C:\tmp> py
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:04:37) [MSC v.1929 32
bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> import tmp, dis
 >>> dis.dis(tmp.f)
   2           0 LOAD_GLOBAL              0 (print)
               2 LOAD_CONST               1 (42)
               4 CALL_FUNCTION            1
               6 POP_TOP

   3           8 LOAD_CONST               2 (False)
              10 POP_JUMP_IF_TRUE        16
              12 LOAD_ASSERTION_ERROR
              14 RAISE_VARARGS            1
         >>   16 LOAD_CONST               0 (None)
              18 RETURN_VALUE
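
The same experiment can also be scripted in a single interpreter session. This is a minimal sketch, assuming the built-in compile() is called with its optimize argument (0 for a normal run, 1 for the equivalent of -O); SOURCE and opnames() are names made up for illustration, and the exact opcodes vary between Python versions.

```python
import dis
import types

SOURCE = '''
def f():
    if __debug__: print(42)
    assert False
'''


def opnames(source, optimize):
    """Collect the opcode names used by a module and its nested functions."""
    module_code = compile(source, "<sketch>", "exec", optimize=optimize)
    names = set()
    stack = [module_code]
    while stack:
        code = stack.pop()
        names.update(ins.opname for ins in dis.get_instructions(code))
        # Function bodies live as code objects among the constants.
        stack.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return names


plain = opnames(SOURCE, optimize=0)
optimized = opnames(SOURCE, optimize=1)

# RAISE_VARARGS comes from the assert statement; it should only be
# present when the code was compiled without optimization.
print("RAISE_VARARGS" in plain, "RAISE_VARARGS" in optimized)
```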

From sjeik_appie at hotmail.com  Tue Jul  5 08:59:18 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Tue, 05 Jul 2022 14:59:18 +0200
Subject: [Tutor] __debug__ and PYTHONOPTIMIZE
In-Reply-To: <505052f8-6697-c275-8124-f4f7809a4291@web.de>
Message-ID: <DB6PR01MB3895156D83F4F8AEB055509C83819@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>

   On Jul 5, 2022 10:07, Peter Otten <__peter__ at web.de> wrote:

   Thanks Peter, that makes sense. It was half past midnight when I sent that
   mail from my phone. But I agree I should have tried a bit first.
   Best wishes,
   Albert-Jan

From manpritsinghece at gmail.com  Tue Jul  5 10:32:18 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Tue, 5 Jul 2022 20:02:18 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of
 a sqlite3 database
In-Reply-To: <bec9adb8-c8c5-7d84-7b16-cc0f46054765@web.de>
References: <CAO1OCwY2dRmwyNREKGEdkW0ph4gLa705_nwgJcGOge3MfY9=Qg@mail.gmail.com>
 <t9s25r$jpp$1@ciao.gmane.io>
 <CAO1OCwZA4CzVC+PEGVUT-7EB0DRoqcBb9qon1N=YhdOhmUunRw@mail.gmail.com>
 <t9ugda$jsd$1@ciao.gmane.io>
 <CAO1OCwZqMzCdYUNGyM_nsWc_2JjwtATfdpb_rmxPpkFh=eOutQ@mail.gmail.com>
 <bec9adb8-c8c5-7d84-7b16-cc0f46054765@web.de>
Message-ID: <CAO1OCwYq6EZB0wv0g_RF6q7ecrB9ni-itHpDQJDxg-ubvmhSZw@mail.gmail.com>

Dear Sir,

Actually it started with how to write an aggregate function in Python for
sqlite3. I have a fair idea of that now. A task can be done in many
ways, and each way has its own merits and demerits. This example has given
me a fair idea of that too.

This series of mails has taught me a lot of things.

I am thankful to wonderful people like Dennis Lee Bieber, Peter Otten and
avi.e.gross at gmail.com.

Regards
Manprit Singh


From avi.e.gross at gmail.com  Tue Jul  5 12:43:09 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Tue, 5 Jul 2022 12:43:09 -0400
Subject: [Tutor] Consecutive_zeros
In-Reply-To: <6adf2afe-475d-2fc7-6051-867ceddd2015@web.de>
References: <CAKG+6q8S7oTxVbgQjphss3rvxGEGQ=p8nc=4voH5Ni8DwjTgKw@mail.gmail.com>
 <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de>
 <504401d88fc6$02af9e70$080edb50$@gmail.com>
 <6adf2afe-475d-2fc7-6051-867ceddd2015@web.de>
Message-ID: <006601d8908e$50026680$f0073380$@gmail.com>

Peter, 

It may be tinkering but is a useful feature. Had I bothered to find the
manual page for max() and seen a way to specify a default of zero, indeed I
would have been quite happy.

There are many scenarios where you could argue a feature is syntactic sugar
but when something is done commonly and can result in a bad result, it can
be quite nice to have a way of avoiding it without lots of code. For example, I
have seen lots of functions that can be told to remove NA values in R, or
not to drop a dimension when consolidating info so the result has a known
size even if not an optimal size.

In this case, I would think it to be a BUG in the implementation of max()
for it to return an error like this:

max([])
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    max([])
ValueError: max() arg is an empty sequence

Basically it was given nothing and mathematically the maximum and minimum
and many other such functions are meaningless when given nothing. So
obviously a good approach is to test what you are working on before trying
such a call, or wrapping it in something like a try/except to catch the error
and deal with it then.

Those are valid approaches BUT we have many scenarios like the one being
solved that have a concept that zero is a floor for counting numbers and is
appropriate also as a description of the length of a pattern that does not
exist. In this case, an empty list still means no instances of the pattern
were found so the longest pattern is 0 units long.

So it is perfectly reasonable to be able to add a default so the function
does not blow up the program:

max([], default=0)
0

I hasten to add that in many situations this approach is not valid and the
code should blow up unless you take steps to avoid it as further work can be
meaningless.

I am curious how you feel about various other areas such as defaults when
fetching an item from a dictionary. Yes, you can write elaborate code that
prevents a mishap. Realistically, some people would create a slew of short
functions like safe_max() that probably would use one of the techniques
internally and return something more gently, so what is wrong with making a
small change in max() itself that does this upon request?
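
A minimal sketch of the two alternatives being compared, assuming a hypothetical safe_max() helper of the kind described above (the name and its fallback parameter are made up for illustration), next to the built-in defaults that max() and dict.get() already offer:

```python
def safe_max(items, fallback=0):
    """Return max(items), or fallback when items is empty."""
    try:
        return max(items)
    except ValueError:  # raised by max() for an empty iterable
        return fallback


# The hand-written wrapper and the built-in default agree.
print(safe_max([]), max([], default=0))
print(safe_max([3, 1, 2]), max([3, 1, 2], default=0))

# Dictionaries offer the same kind of opt-in default on lookup.
counts = {"spam": 2}
print(counts.get("spam", 0), counts.get("ham", 0))
```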

I know max() was not necessarily designed with regular expression matches
(and their length) in mind. But many real world problems share the same
concept albeit not always with a default of 0. 


_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Tue Jul  5 22:31:11 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Tue, 5 Jul 2022 22:31:11 -0400
Subject: [Tutor] this group and one liners
Message-ID: <008a01d890e0$756336a0$6029a3e0$@gmail.com>

Peter mentioned the cult of the one-liner and I admit I am sometimes one of the worshippers.

 
But for teaching a relatively beginner computer class, which in theory includes the purpose of this group to provide hints and some help to students, or help them when they get stuck even on a more advanced project, the goal often is to do things in a way that makes an algorithm clear, often using lots of shorter lines of code.

 
Few one-liners of any complexity are trivial to understand and often consist of highly nested code. 

 
As an example, lots of problems could look like f(g(h(arg, arg, arg, i(j(arg), arg)))) and I may not have matched my parentheses well. Throw in some pythonic constructs like [z**2 for z in ...] and it may take a while for a newcomer even to parse it, let alone debug it.

 
And worse, much such code almost has to be read from inside to outside.

 
What is wrong with code like:

 
Var1 = p(...)
Var2 = q(Var1, ...)
...

 
I mean sure, there are lots of extra variables sitting around, maybe some if statement with an else and maybe some visible loops. But they show how the problem is being approached and perhaps allow some strategic debugging or testing with print statements and the like.
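
As a rough illustration of that trade-off, here is the zero-run problem from the earlier thread written both ways. This is a sketch only; the two function names are made up for this example.

```python
import re


def longest_zero_run_nested(text):
    # Everything in one expression, read from the inside out.
    return max(map(len, re.findall("0+", text)), default=0)


def longest_zero_run_stepwise(text):
    # Each intermediate result gets a name that can be printed or inspected.
    runs = re.findall("0+", text)         # every maximal run of zeros
    lengths = [len(run) for run in runs]  # how long each run is
    if not lengths:                       # no zeros at all
        return 0
    longest = max(lengths)
    return longest


sample = "1010010001000010000001"
print(longest_zero_run_nested(sample), longest_zero_run_stepwise(sample))
```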

 
An interesting variation I appreciate in languages with some form of pipeline is to write the algorithm forwards rather than nested. Again, this is easier for some to comprehend. I mean using pseudocode, something like:

 
Do_this(args) piped to
Do_that(above, args) piped to
Do_some_more(above, args) placed in
Result

 
I do this all the time in R with code like:

 
data.frame |>
  select(which columns to keep) |>
  filter(rows with some condition) |>
  mutate(make some new columns with calculations from existing columns) |>
  group_by(one or more columns) |>
  summarize(apply various calculations by group generating various additional columns in a report) |>
  arrange(sort in various ways) |>
  ggplot(make a graph) + ... + ... ->
  variable

 
The above can be a set of changes one at a time that build on each other and saves the result in a variable, (or just prints it immediately) and you can build this in stages and run just up to some point and test the results and of course can intervene in many other ways. Functionally it is almost the same as using temporary variables and it can implement an algorithm in bite-sized pieces. The ggplot line is a tad complex as it was created ages ago and has its own pipeline of sorts, as "adding" another command lets you refine your graph and add layers to it by effectively passing a growing data structure around to be changed.
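
Python has no built-in pipe operator, but the forward-reading style can be approximated. The sketch below assumes a small hypothetical helper named pipe() built on functools.reduce; it is an illustration of the idea, not an established idiom.

```python
from functools import reduce


def pipe(value, *steps):
    """Feed value through each step in order, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


result = pipe(
    range(10),
    lambda xs: (x for x in xs if x % 2 == 0),  # keep the even numbers
    lambda xs: (x * x for x in xs),            # square each of them
    sum,                                       # add them up
)
print(result)
```

Cutting the list of steps short returns the intermediate value at that point, which gives roughly the same "run just up to here and look" workflow as the R version.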

 
The point I am making is whether when some of us do one-liners, are we really helping or just enjoying solving our puzzles?

 
Of course, if someone asks for a clever or obfuscated method, we can happily oblige.

 
From learn2program at gmail.com  Wed Jul  6 04:21:44 2022
From: learn2program at gmail.com (Alan Gauld)
Date: Wed, 6 Jul 2022 09:21:44 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
Message-ID: <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>

On 06/07/2022 03:31, avi.e.gross at gmail.com wrote:
> But for teaching a relatively beginner computer class, ...
> the goal often is to do things in a way that makes an algorithm clear,

Never mind a beginner. The goal should *always* be to make the algorithm
clear!

As someone who spent many years leading a maintenance team we spent many
hours disentangling "clever" one-liners. They almost never provided any
benefit and just slowed down comprehension and made debugging nearly
impossible.

The only thing one-liners do is make the code shorter. But the compiler
doesn't care and there are no benefits for short code except a tiny bit of
saved storage!

(I accept that some ancient interpreters did work slightly faster when
interpreting a one liner but I doubt that any such beast is still in
production use today!)

> I mean sure, there are lots of extra variables sitting around, maybe some if statement with an else and maybe some visible loops. But they show how the problem is being approached and perhaps allow some strategic debugging or testing with print statements and the like.

They also allow interspersed comments to explain what's being done

And make it easier to insert print statements etc.


> An interesting variation I appreciate in languages with some form of pipeline is to write the algorithm forwards rather than nested. Again, this is easier for some to comprehend. I mean using pseudcode, something like:

Smalltalk programmers do this all the time (in a slightly different way)
and it makes code easy to read and debug. And in the Smalltalk case
easier for the optimiser to speed up.

> The point I am making is whether when some of us do one-liners, are we
> really helping or just enjoying solving our puzzles?

In my experience one-liners are nearly always the result of:

a) an ego-trip by the programmer (usually by an "expert")

b) a misguided assumption that short code is somehow better (usually by
a beginner)

Very, very occasionally they are an engineering choice because they do
in fact give a minor speed improvement inside a critical loop. But that's
about 0.1% of the one-liners I've seen!


-- 

Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From avi.e.gross at gmail.com  Wed Jul  6 11:50:26 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Wed, 6 Jul 2022 11:50:26 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
Message-ID: <007601d89150$1cf50320$56df0960$@gmail.com>

Alan,

Well said.

I differentiate between code, often what we call one-liners, that is being done to impress or to solve a puzzle as in how compact you can make your code, as compared to  fairly valid constructs that can allow programming to happen at a higher level of abstraction.

As has often been discussed, quite a bit is done in languages like python using constructs like:

[ x+2 for x in iterator ]

In general, all of that can be written with one or more loops. Someone who started off programming in other languages may hit a wall reading that code embedded in a one-liner. I have written code with multiple nested constructs like that where I had trouble parsing it just a few days later.

Functional programming techniques are another example of replacing a loop construct over multiple lines by an often short construct like do_this(function, iterable), again sometimes nested. You can even find versions that accept a list of functions to apply to a list of arguments either pairwise or more exhaustively.
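
To make the comparison concrete, here is the same small transformation written three ways. This is a sketch; the variable names and the add_two() helper are made up for illustration.

```python
numbers = [1, 2, 3]

# 1. List comprehension, as in the example above.
shifted_comprehension = [x + 2 for x in numbers]

# 2. The explicit loop it stands for.
shifted_loop = []
for x in numbers:
    shifted_loop.append(x + 2)


# 3. The functional form: apply a function to every item.
def add_two(x):
    return x + 2


shifted_map = list(map(add_two, numbers))

print(shifted_comprehension, shifted_loop, shifted_map)
```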

The same arguments apply to the way some people do object-oriented techniques when not everything needs to be an object. So, yes, you can make a fairly complicated object that encapsulates a gigantic amount of code, then use that in a one-liner where everything is done magically.

I will say this though. Code that is brief and fits say on one screen, is far easier for many people to read and understand. But that can often be arranged by moving parts of the problem into other functions you create that each may also be easy to read and understand. A one-liner that simply calls one or two such functions with decent design and naming, may not qualify as using the smallest amount of code, but can be easy to read IN PARTS and still feel pleasing.

Specifically with some of the code I shared recently, I pointed out that if max() failed for empty lists as an argument, you could simply create a safe_max() function of your own that encapsulates one of several variations and returns a 0 when you pass it something without a maximum. 

BUT, I think it must be carefully pointed out to students that there is a huge difference between using functions guaranteed to exist in any Python program unless shadowed, and those that can be imported from some standard module, and those that a person creates for themselves that may have to be brought in some way each time you use them and that others may have no access to so sharing code using them without including those functions is not a good thing.

- Avi



From avi.e.gross at gmail.com  Wed Jul  6 13:02:39 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Wed, 6 Jul 2022 13:02:39 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
Message-ID: <008b01d8915a$337e77c0$9a7b6740$@gmail.com>

The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense.

Although that can be true, I think it is reasonable to say that often the exact opposite is true.

In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes.

I am not saying the above slowed things down seriously, but that other methods that could solve the problem, but not on one line, were ignored.

The pipelining methods I mentioned vary. Some languages have a variation using object-oriented techniques where you can chain calls by using methods or contents of objects as in:

a.b().c(arg).d 

and so on.

Some languages encourage writing that in a more obvious pipeline over multiple lines, where you may even be able to intersperse comments.
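
In Python, one way to get that layout is to wrap a method chain in parentheses so that each call sits on its own line with a comment. A minimal sketch, with made-up sample data:

```python
raw = "  Hello, Tutor List!  "

words = (
    raw.strip()            # drop the surrounding whitespace
       .lower()            # normalise the case
       .replace(",", "")   # remove the comma
       .split()            # break into a list of words
)
print(words)
```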

But is that more efficient than doing it line by line? Consider this code:

A = something
B = f(A)
C = g(B)
rm(B)
...

In many pipelines, there are lots of intermediate variables that are referenced once and meant to be deleted. In languages with garbage collection, that may happen eventually. But often the implementation of a pipeline may be such that only after all of it is completed, are the parts deleted on purpose or no longer held onto so garbage collection can see it as eligible.

I will end by mentioning a language like PERL where people take things to extremes and beyond. The language encourages you to be so compact that normal people reading it often get lost as they fail to see how it does anything.

Without giant details, the language maintains all sorts of variables with names like $_ that hold things like the last variable you set. Many of the verbs that accept arguments will default to using these automatic hidden ones as the target or source of something.

So you can read a line in with something like 

my $mystring = <STDIN>;
print "$mystring \n";
chomp($mystring);
$mystring =~ s/[hH]el+o/goodbye/g;
print $mystring;

The above (with possible errors on my part) is supposedly going to read in a line from Standard Input, remove any trailing newline and reassign it to the same variable without explicitly saying so, make a substitution from that string into another and replace every instance of "hello" with "goodbye" and print the result. But since everything is hidden in $_, you can write briefer and somewhat mysterious code like:

$mystring = <STDIN>;
chomp($mystring);
$mystring =~ s/[hH]el+o/goodbye/g;
print $mystring;

The above prompts me for a line of text and I can enter something like "Hello my friends and hello Heloise." (without the quotes) and it spits out a result that matches "hello" with "h" either upper or lower case, and any number of copies of the letter ell in a row:

$ perl -w prog.pl
Hello my friends and hello Heloise.
goodbye my friends and goodbye goodbyeise.

Well, yes, I deliberately used a regular expression that produced a bit more than is reasonable. BUT the much shorter code below does the same thing:

$_ = <STDIN>;
chomp;
s/[hH]el+o/goodbye/g;
print;

The lines like "chomp;" are interpreted as if I had written "chomp($_)", which strips the newline from $_ in place, and the substitution as if I had written it a bit like the first version and the same for the print.
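
For comparison, a Python version of the same substitution has to name its input and output explicitly. This is a sketch only: replace_hello() is a made-up name, and rstrip("\n") is just a rough stand-in for chomp.

```python
import re


def replace_hello(line):
    """Swap every 'hello'-like match for 'goodbye', as in the Perl example."""
    line = line.rstrip("\n")  # rough stand-in for chomp
    return re.sub(r"[hH]el+o", "goodbye", line)


print(replace_hello("Hello my friends and hello Heloise.\n"))
```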

And can it be a one liner? The semicolons are there to make something like this work fine:

$_ = <STDIN>; chomp; s/[hH]el+o/goodbye/g; print;

But WHY be so terse and cryptic?

As Alan points out, your code may be used and maintained by others. And the brevity above also comes with a cost of sorts. For every statement, the PERL interpreter must modify a slew of variables just in case you want them, many with cryptic names like $< and $0 but the other overhead is how many other function(alities) have to check the command line arguments and when missing, supply the hidden ones.

For beginning students, I think it wise to start with more standard methods and they can use more advanced techniques, or in my example perhaps more primitive techniques, after they have a firm understanding.

And my apologies if I bring in examples from other programming languages. Python has plenty of interesting usages that may seem equally mysterious to some.

- s/a.e.gross/Avi/


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of Alan Gauld
Sent: Wednesday, July 6, 2022 4:22 AM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On 06/07/2022 03:31, avi.e.gross at gmail.com wrote:
> But for teaching a relatively beginner computer class, ...
> the goal often is to do things in a way that makes an algorithm clear,

Never mind a beginner. The goal should *always* be to make the algorithm clear!

As someone who spent many years leading a maintenance team we spent many hours disentangling "clever" one-liners. They almost never provided any benefit and just slowed down comprehension and made debugging nearly impossible.

The only thing one-liners do is make the code shorter. But the compiler doesn't care and there are no benefits for short code except a tiny bit of saved storage!

(I accept that some ancient interpreters did work slightly faster when interpreting a one liner but I doubt that any such beast is still in production use today!)

?
> I mean sure, there are lots of extra variables sitting around, maybe some if statement with an else and maybe some visible loops. But they show how the problem is being approached and perhaps allow some strategic debugging or testing with print statements and the like.

They also allow interspersed comments to explain what's being done

And make it easier to insert print statements etc.


> An interesting variation I appreciate in languages with some form of pipeline is to write the algorithm forwards rather than nested. Again, this is easier for some to comprehend. I mean using pseudocode, something like:

Smalltalk programmers do this all the time(in a slightly different way) and it makes code easy to read and debug. And in the Smalltalk case easier for the optimiser to speed up.

> The point I am making is whether when some of us do one-liners, are we
> really helping or just enjoying solving our puzzles?

In my experience one-liners are nearly always the result of:

a) an ego-trip by the programmer (usually by an "expert")

b) a misguided assumption that short code is somehow better (usually by a beginner)

Very, very occasionally they are an engineering choice because they do in fact give a minor speed improvement inside a critical loop. But that's about 0.1% of the one-liners I've seen!


-- 

Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



From sjeik_appie at hotmail.com  Thu Jul  7 10:19:44 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Thu, 07 Jul 2022 16:19:44 +0200
Subject: [Tutor] this group and one liners
In-Reply-To: <007601d89150$1cf50320$56df0960$@gmail.com>
Message-ID: <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>

   On Jul 6, 2022 17:50, avi.e.gross at gmail.com

     As has often been discussed, quite a bit is done in languages like
     python using constructs like:

     [ x+2 for x in iterator ]

   ======
   My rule of thumb is: if black converts a list/set/dict comprehension into
   a multiline expression, I'll probably refactor it. I try to avoid nested
   loops in them. Nested comprehensions may be ok. Nested ternary expressions
   are ugly, even more so inside comprehensions!

     Specifically with some of the code I shared recently, I pointed out that
     if max() failed for empty lists as an argument, you could simply create
     a safe_max() function of your own that encapsulates one of several
     variations and returns a 0 when you pass it something without a maximum.

   ====
   Or this?
   max(items or [0])

From mats at wichmann.us  Thu Jul  7 11:38:20 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Thu, 7 Jul 2022 09:38:20 -0600
Subject: [Tutor] this group and one liners
In-Reply-To: <008b01d8915a$337e77c0$9a7b6740$@gmail.com>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
 <008b01d8915a$337e77c0$9a7b6740$@gmail.com>
Message-ID: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>


On 7/6/22 11:02, avi.e.gross at gmail.com wrote:

Boy, we "old-timers" do get into these lengthy discussions...  having
read several comments here, of which this is only one:

> The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense.
> 
> Although that can be true, I think it is reasonable to say that often the exact opposite is true.
> 
> In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes.


After a period when they felt awkward because lots of my programming
background was in languages that didn't have these, I now use simple
one-liners extensively - comprehensions and ternary expressions.  If
they start nesting, probably not.  Someone mentioned a decent metric -
if a code formatter like Black starts breaking your comprehension into
four lines, it probably got too complex.

My take on these is you can write a more compact function this way -
you're more likely to have the meat of what's going on right there
together in a few lines, rather than building mountain ranges of
indentation - and this can actually *improve* readability, not obscure
it.  I agree "clever" one-liners may be a support burden, but anyone
with a reasonable amount of Python competency (which I'd expect of
anyone in a position to maintain my code at a later date) should have no
trouble recognizing the intent of simple ones.

Sometimes thinking about how to write a concise one-liner exposes a
failure to have thought through what you're doing completely - so unlike
what is mentioned above - twisting a problem around unnaturally (no
argument that happens too), you might actually realize that there's a
simpler way to structure a step.

Just one more opinion.

From avi.e.gross at gmail.com  Thu Jul  7 18:01:13 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Thu, 7 Jul 2022 18:01:13 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
Message-ID: <006a01d8924d$133dbd60$39b93820$@gmail.com>

Albert,

 
It would be a great idea to use:

 
max(item or 0)

 
But if item == [] then 

 
max([] or 0) 

 
breaks down with an error and the [] does not evaluate to false.

 
Many things in python are truthy but on my setup, an empty list does not seem to be evaluated and return false before max() gets to it.

 
Am I doing anything wrong? As noted, max() accepts a default=0 argument, perhaps precisely because there isn't such an easy way around it.

 
Avi

 
From: Albert-Jan Roskam <sjeik_appie at hotmail.com> 
Sent: Thursday, July 7, 2022 10:20 AM
To: avi.e.gross at gmail.com
Cc: tutor at python.org
Subject: Re: [Tutor] this group and one liners

 
On Jul 6, 2022 17:50, avi.e.gross at gmail.com

As has often been discussed, quite a bit is done in languages like python using constructs like:

[ x+2 for x in iterator ]

 
======

My rule of thumb is: if black converts a list/set/dict comprehension into a multiline expression, I'll probably refactor it. I try to avoid nested loops in them. Nested comprehensions may be ok. Nested ternary expressions are ugly, even more so inside comprehensions!

 
Specifically with some of the code I shared recently, I pointed out that if max() failed for empty lists as an argument, you could simply create a safe_max() function of your own that encapsulates one of several variations and returns a 0 when you pass it something without a maximum.

 
====

Or this?

max(items or [0])

 
From wlfraed at ix.netcom.com  Thu Jul  7 18:50:29 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Thu, 07 Jul 2022 18:50:29 -0400
Subject: [Tutor] this group and one liners
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com>
Message-ID: <l9oechlj6c5hveavglb9pdc06o7j381ja1@4ax.com>

On Thu, 7 Jul 2022 18:01:13 -0400, <avi.e.gross at gmail.com> declaimed the
following:


>
>max([] or 0) 
>
> 
>
>breaks down with an error and the [] does not evaluate to false.
>

	That's because the 0 is not an iterable, not that [] didn't evaluate...

>>> [] or 0
0
>>> max([] or 0)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> max(0)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> 

	max() requires something it can iterate over. The expression

	<something falsey> or <anything>

returns <anything>, not true/false, and does NOT override max()'s normal
operation of trying to compare objects in an iterable.

>>> 
>>> [] or "anything"
'anything'
>>> max([] or "anything")
'y'
>>> 

	The "default" value option applies only if max() would otherwise raise
an exception for an empty sequence.

>>> max([], default=0)
0
>>> max([])
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
ValueError: max() arg is an empty sequence
>>> 
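
To put the spellings discussed in this thread side by side, here is a
minimal sketch. safe_max is a hypothetical helper name (it was only
suggested in the thread; it is not part of the standard library). One
caveat worth flagging: the "items or [0]" form relies on truthiness, so
it helps for containers such as lists, but a generator object is always
truthy, even when it yields nothing.

```python
def safe_max(items, fallback=0):
    """Hypothetical helper: max() that returns `fallback` for empty input."""
    return max(items, default=fallback)


print(safe_max([3, 1, 2]))  # ordinary case: the largest element
print(safe_max([]))         # empty list: the fallback is returned
print(max([] or [0]))       # empty list is falsey, so max() sees [0]

# An empty generator is truthy, so "or [0]" never kicks in here
empty_gen = (x for x in [])
try:
    max(empty_gen or [0])
except ValueError as exc:
    print("generator case raised ValueError:", exc)

# default= works for any iterable, including an empty generator
print(safe_max(x for x in []))
```

Under these assumptions, default= is the more general choice because it
does not depend on the argument having a meaningful truth value.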


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From alan.gauld at yahoo.co.uk  Thu Jul  7 18:59:00 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Thu, 7 Jul 2022 23:59:00 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
 <008b01d8915a$337e77c0$9a7b6740$@gmail.com>
 <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
Message-ID: <ta7ofk$h7u$1@ciao.gmane.io>

On 07/07/2022 16:38, Mats Wichmann wrote:

> background was in languages that didn't have these, I now use simple
> one-liners extensively - comprehensions and ternary expressions.  

Let me clarify. When I'm talking about one-liners I mean the practice
of putting multiple language expressions into a single line. I don't
include language features like comprehensions, generators or ternary
operators. These are all valid language features.

But when a comprehension is used for its side effects, or
ternary operators are used with multiple comprehensions and
so on, that's when it becomes a problem. Maintenance is the
most expensive part of any piece of long-lived software;
making maintenance cheap is the responsibility of any programmer.
Complex one-liners are the antithesis of cheap. But use of
regular idiomatic expressions is fine.
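
As an illustration of the distinction drawn above (all names are
invented for this sketch), compare a comprehension used purely for its
side effects with the plain loop and the idiomatic comprehension that
express the same intent:

```python
values = [1, 2, 3]

# Problematic: the comprehension is run only for its side effect.
# It also builds and discards a throwaway list of None values.
results = []
[results.append(v * 2) for v in values]

# Clear: an explicit loop, easy to comment and to debug
doubled_loop = []
for v in values:
    doubled_loop.append(v * 2)

# Idiomatic: the comprehension *is* the value being built
doubled = [v * 2 for v in values]

print(results, doubled_loop, doubled)
```

All three produce the same list; only the first hides what it is doing.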

It's also worth noting that comprehensions can be written
on multiple lines too, although that still loses debug
potential...

 var = [expr
        for item in seq
        if condition]

So if you do have to use more complex expressions or conditions
you can at least make them more readable.
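
A concrete instance of that layout might look like the following
sketch. The data and names are made up for illustration; the comments
map each line back to the expr / for / if skeleton above.

```python
words = ["alpha", "", "beta", "   ", "gamma"]

lengths = [len(word)          # expr
           for word in words  # for item in seq
           if word.strip()]   # if condition: skip blank entries

print(lengths)
```

Each clause sits on its own line, so a comment can be attached to it
even though the whole thing remains a single expression.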

The same applies to regex. They can be cryptic or readable. If
they must be complex (and sometimes they must) they can be built
up in stages with each group clearly commented.
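
Two ways this might be done are sketched below, using a made-up date
pattern. The names YEAR, MONTH, DAY, DATE_STAGED and DATE_VERBOSE are
hypothetical; re.VERBOSE is the standard-library flag that lets a
pattern carry whitespace and # comments.

```python
import re

# Stage 1: name each piece of the pattern
YEAR = r"(?P<year>\d{4})"    # four-digit year
MONTH = r"(?P<month>\d{2})"  # two-digit month
DAY = r"(?P<day>\d{2})"      # two-digit day

# Stage 2: assemble the pieces into the full pattern
DATE_STAGED = re.compile(YEAR + "-" + MONTH + "-" + DAY)

# Alternative: one pattern, commented group by group via re.VERBOSE
DATE_VERBOSE = re.compile(
    r"""
    (?P<year>\d{4})   # four-digit year
    -                 # literal separator
    (?P<month>\d{2})  # two-digit month
    -                 # literal separator
    (?P<day>\d{2})    # two-digit day
    """,
    re.VERBOSE,
)

text = "Sent: 2022-07-07 10:20"
staged = DATE_STAGED.search(text)
verbose = DATE_VERBOSE.search(text)
print(staged.groupdict())
print(verbose.groupdict())
```

Both forms match the same text; which one reads better is a matter of
taste and of how long the pattern gets.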

> My take on these is you can write a more compact function this way -
> you're more likely to have the meat of what's going on right there
> together in a few lines, rather than building mountain ranges of
> indentation

True to an extent, although recent studies suggest that functions can be
up to 100 lines long before they become hard to maintain (they used to
say 25!) But if we are getting to 4 or more levels of indentation it's
usually a sign that some refactoring needs to be done.

> anyone in a position to maintain my code at a later date) should have no
> trouble recognizing the intent of simple ones.

That's true, although in many organisations the maintenance team is the
first assignment for new recruits. So they may have limited experience.
But that usually affects their ability to deal with lots of code rather
than the code within a single function. (And it's to learn that skill
that they get assigned to maintenance first! - Along with learning the
house style, if such exists) Of course, much software maintenance is now
off-shored rather than kept in-house and the issue there is that the
cheapest programmers are used and these also tend to be the least
experienced or those with "limited career prospects" - aka old or mediocre.

> Sometimes thinking about how to write a concise one-liner exposes a
> failure to have thought through what you're doing completely

That's true too and many pieces of good quality code start off as
one-liners. But in the interests of maintenance they should be deconstructed
and/or refactored once the solution is understood. Good engineering
is all about cost reduction, so reducing maintenance cost is the
primary objective of good software engineering because maintenance
is by far the biggest cost of most software projects.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From PythonList at DancesWithMice.info  Thu Jul  7 19:14:29 2022
From: PythonList at DancesWithMice.info (dn)
Date: Fri, 8 Jul 2022 11:14:29 +1200
Subject: [Tutor] this group and one liners
In-Reply-To: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
 <008b01d8915a$337e77c0$9a7b6740$@gmail.com>
 <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
Message-ID: <44cd1e9c-487e-4bc5-f452-8c05d92b342c@DancesWithMice.info>

On 08/07/2022 03.38, Mats Wichmann wrote:
> On 7/6/22 11:02, avi.e.gross at gmail.com wrote:
> Boy, we "old-timers" do get into these lengthy discussions...  having
> read several comments here, of which this is only one:

Who is being called "old"?

[this conversation is operating at an 'advanced' level. If you are  more
of a Beginner and would like an explanation of any of what I've written
here, please don't hesitate to request expansion or clarification!]


>> The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense.

Rather than "excuse", is it a (misplaced?) sense of pride or ego?

That said, sometimes there are efficiency-gains. The problem though, is
that such may only apply on the particular system. So, when the code is
moved to my PC, an alternate approach is actually 'better' (for
whichever is/are the pertinent criteria).


>> Although that can be true, I think it is reasonable to say that often the exact opposite is true.
>>
>> In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes.

Is this a good point to talk 'testability'?

Normally, I would mention "TDD" - my preferred dev-approach. However, at
this time I'm working with a bunch of 'Beginners' and showing the use of
Python's Interactive Mode, aka the REPL - so, experiment-first rather
than test-first - or, more testing of code-constructs than the data
processing.

The problem with dense units of code is that they are, by-definition, a
'black box'. We can test 'around' them, but not 'through' them/within them.

Accordingly, the TDD approach would be to develop step-by-step, assuring
each step in the process, as it is built (as it should be built). This
produces a line-by-line result. From there, some tools will suggest that
a for-loop be turned into a comprehension (for example). Would it be
part of one's skill in refactoring to decide if such is actually a good
idea - and because 'testing' has been integral from the start, that may
provide a discouragement from too much integration.

Trainees, using the REPL, particularly those with caution as an
attribute (cf arrogance (?) ), are also more inclined to develop
multi-stage processes (such as the R example given earlier),
stage-by-stage. They will similarly test-as-you-go, even if after
writing each line of code (cf 'pure' TDD). Once a single worked-example
has been developed, the code is likely to be copied into an editor/IDE,
and now that the coder's confidence is raised, the code (one hopes) will
be subjected to a wider range of test-cases.

In the former case, there is some caution before consolidation - tests
may need to be removed/rendered impossible. In the latter, it is less
likely such thinking will apply, because the "confidence" and code-first
thinking may lead to over-confidence, without regard for 'testability'.

Maybe?


> After a period when they felt awkward because lots of my programming
> background was in languages that didn't have these, I now use simple
> one-liners extensively - comprehensions and ternary expressions.  If
> they start nesting, probably not.  Someone mentioned a decent metric -
> if a code formatter like Black starts breaking your comprehension into
> four lines, it probably got too complex.

Agreed. There is nothing inherently 'wrong' with the likes of
comprehensions, generators, ternary operators, etc. So, why not use
them? In short, they are idioms within the Python language, and would
not have been made available if they hadn't been deemed useful and
appropriate. (see PEP process)

Any argument that they are difficult to understand is going to be
correct - at least, on the face of it. This applies to natural-languages
as well. For example, the American "yeah, right" is an exclamation
consisting of two positive words - apparently a statement of agreement
("yes"), and an affirmation of correctness ("right", ie correct). Yet it
is actually used as an expression of disagreement through derision, eg
someone says "dn is the best-looking person in the room" but another
person disputes with feeling by sarcastically intoning "yeah, right!".

Idioms are learned (and often can't be easily or literally translated),
and part of the necessity for ("deliberate") practice in the use of
Python. Sure, someone who is only a few chapters into an intro-book will
not readily recognise a list-comprehension. However, that is not a good
reason why those of us who have finished the whole book should not use
these facilities!

[ x+2 for x in iterator ]
- is a reasonable Python idiom

max(items or [0])
- appears to be a reasonable use of the short-circuiting "or" operator (it is not a ternary), except that
max( items, default=0 )
- is more 'pythonic' (a feature built-into the language for this express
purpose/situation) and thus idiomatic Python

How about the popular 'gotcha' of using a mutable-collection as a
function-parameter, eg

def function_name( mutable_collection=None ):
  if mutable_collection is None:
    mutable_collection = []

or is this commonly-used idiom more pythonic?

def function_name( mutable_collection=None ):
  mutable_collection = mutable_collection if mutable_collection else []
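
For readers who have not met the 'gotcha', the sketch below shows it,
along with a difference between the two spellings above that is easy to
miss. The function names are invented for illustration. The truthiness
version also replaces an *empty* list that the caller passed in
deliberately, so the caller's list is never updated.

```python
def append_shared(item, bucket=[]):
    # The gotcha: the default list is created once and shared by all calls
    bucket.append(item)
    return bucket


def append_none_check(item, bucket=None):
    # Only a missing argument gets a fresh list
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def append_truthy(item, bucket=None):
    # Any falsey argument (None *or* an empty list) gets a fresh list
    bucket = bucket if bucket else []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)  # the same list object is reused

mine = []
append_none_check("x", mine)
print(mine)                     # the caller's list was updated

theirs = []
append_truthy("x", theirs)
print(theirs)                   # the caller's empty list was ignored
```

Whether that last behaviour matters depends on whether callers expect
their own list to be mutated; the "is None" test avoids the question.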


Can concise-forms be over-used?
Yes!

Rather than relying upon some external tool, I find that the IDE (or my
own) formatting of the various clauses provides immediate feedback that
things are becoming too complex (for my future-self to handle).

A longer ternary operator, or one that is contained within a more
complex construct could be spread over two lines (which would highlight
the two 'choices'). A list comprehension could split its expression, its
for-clause, and its conditional-clause over three lines.

Spreading a single such construct over more than three lines starts to
look 'complex' (to my "old" eyes). The indentation requires more thought
than I'd care to devote (I need my brain-power for problem-solving
rather than 'art work'!). Thus, these lead to the idea that 'simple is
better than complex...complicated' - regardless of one's interpretation
of "beautiful" [Zen of Python].


> My take on these is you can write a more compact function this way -
> you're more likely to have the meat of what's going on right there
> together in a few lines, rather than building mountain ranges of
> indentation - and this can actually *improve* readability, not obscure
> it.  I agree "clever" one-liners may be a support burden, but anyone
> with a reasonable amount of Python competency (which I'd expect of
> anyone in a position to maintain my code at a later date) should have no
> trouble recognizing the intent of simple ones.

Could we also posit that there is more than one definition of "complex"?
As well as "testability" being lost as groups of steps in a multi-stage
process are combined and/or compressed, there is a longer-term impact.
When (not "if") the code needs to be changed, how easy would you rate
that task? I guess the answer can't escape 'testability' in that some
code which comes with (good) tests already in-place, can be changed with
a greater sense of confidence - the idea that the code can be altered
and those alterations will not cause 'breakage' (because the tests still
'pass') - regression testing.

However, the main point here, is that someone charged with changing a
piece of code will take a certain amount of time to read and understand
it (before (s)he starts to make changes). Even assuming that-someone is
a Python-Master, comprehending existing code requires an understanding
of the domain and the algorithm (etc). Accordingly, the more 'dense' the
code, the harder such a task is likely to become.

If the algorithm is a common platform within the domain, the 'level' at
which things will be deemed 'complex' will change. For example, a recent
thread on the list dealt with calculating a standard deviation.
Statisticians will readily recognise such a calculation (more likely
there is a library routine/function to call, but...). Accordingly, the
domain doesn't require such code to be 'simplified' - even though 'mere
mortals' might scratch their heads trying to decipher its purpose, the
steps within it, and its place in the overall routine.


> Sometimes thinking about how to write a concise one-liner exposes a
> failure to have thought through what you're doing completely - so unlike
> what is mentioned above - twisting a problem around unnaturally (no
> argument that happens too), you might actually realize that there's a
> simpler way to structure a step.

Unnatural twisting sounds like something to avoid!
Twist again, like we did last summer...
Play Twister again, like we...


I prefer to break complex calculations into their working parts (not
just as a characteristic of TDD, as above). It can also be useful to
separate working-parts into their own function. In both cases, we need
'names', either for intermediate-results (see also the use of _ as a
placeholder-identifier) or to be able to call the function. Taking care
to choose a 'good' name improves readability and decreases apparent
complexity.
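
As a sketch of what that can look like, here is the population standard
deviation mentioned earlier, written once as a dense one-liner and once
as named steps. The data values are the X1 column from the earlier
sqlite3 thread; the variable names are illustrative choices, not a
recommendation over statistics.pstdev().

```python
import math
import statistics

data = [2, 3, 4, 5]

# Dense: correct, but the reader must unpick it (and the mean is
# recomputed for every element)
one_liner = math.sqrt(
    sum((x - sum(data) / len(data)) ** 2 for x in data) / len(data)
)

# Stepwise: each intermediate result has a name that can be
# printed, tested, or commented on
count = len(data)
mean = sum(data) / count
squared_deviations = [(x - mean) ** 2 for x in data]
variance = sum(squared_deviations) / count
std_dev = math.sqrt(variance)

print(one_liner, std_dev, statistics.pstdev(data))
```

The three printed values should agree to within floating-point
rounding; the stepwise form simply makes it possible to check each
stage on the way.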

A 'problem' I see trainees often evidencing (aka 'the rush to code',
'I'm not working if I'm not coding', etc), is that a name must be chosen
when the identifier is first defined ("LHS"). However, its use may not
be properly appreciated until that value is subsequently used ("RHS").
It is at this later step that the importance of the name becomes most
obvious (to the writer). If that is so, the assessment goes-double for a
subsequent reader attempting to divine the code's meaning and workings!

In the past, realising that the first choice of name might not be the
best may have led us to say 'oh well', sigh, and quietly (try to)
carry-on, because of the (considerable) effort of changing a name
(without introducing regression errors). These days, such "technical
debt" is quite avoidable. Capable IDEs enable one to quickly and easily
refactor a choice of name and to (find-replace) update the code to
utilise a better name, everywhere it is mentioned, with minimal manual
effort! Thus, the effort of ensuring the future-reader/maintainer has
competent implicit and fundamental documentation loses yet another
'excuse' and reaches the level of professional expectation.

-- 
Regards,
=dn

From PythonList at DancesWithMice.info  Thu Jul  7 19:47:31 2022
From: PythonList at DancesWithMice.info (dn)
Date: Fri, 8 Jul 2022 11:47:31 +1200
Subject: [Tutor] this group and one liners
In-Reply-To: <ta7ofk$h7u$1@ciao.gmane.io>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
 <008b01d8915a$337e77c0$9a7b6740$@gmail.com>
 <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
 <ta7ofk$h7u$1@ciao.gmane.io>
Message-ID: <c5453b25-caaf-275a-4122-229eb80dec1f@DancesWithMice.info>

On 08/07/2022 10.59, Alan Gauld via Tutor wrote:
> On 07/07/2022 16:38, Mats Wichmann wrote:
> 
>> background was in languages that didn't have these, I now use simple
>> one-liners extensively - comprehensions and ternary expressions.  
> 
> Let me clarify. When I'm talking about one-liners I mean the practice
> of putting multiple language expressions into a single line. I don't
> include language features like comprehensions, generators or ternary
> operators. These are all valid language features.

+1


> But when a comprehension is used for its side effects, or
> ternary operators are used with multiple comprehensions and
> so on, that's when it becomes a problem. Maintenance is the
> most expensive part of any piece of long-lived software;
> making maintenance cheap is the responsibility of any programmer.
> Complex one-liners are the antithesis of cheap. But use of
> regular idiomatic expressions is fine.

+1


> It's also worth noting that comprehensions can be written
> on multiple lines too, although that still loses debug
> potential...
> 
>  var = [expr
>         for item in seq
>         if condition]
> 
> So if you do have to use more complex expressions or conditions
> you can at least make them more readable.

Many texts introduce comprehensions with comments like "shorter" and
"more efficient".

In this example, the comprehension will run at C-speed, whereas the
long-form for-loop will run at interpreter-speed. Thus, one definition
of "efficient".

Yes, it would *appear* shorter if written on a single line. However, it is
not, when formatted for reading. Here's the long-form:

var = list()
for item in seq:
    if condition:
        var.append(expr)

NB I have a muscle-memory that inserts a blank-line before (and after) a
for-loop (for readability). However, in this example, the 'declaration'
of the result list (var) would become physically/vertically-separated from
the loop which has the sole purpose of populating same. Negative readability!

Also, my IDE will prefer to format a multi-line comprehension such as
this, by placing the last square-bracket on the following line (and
probably inserting a line-break after the opening bracket). Accordingly,
there is no 'shorter' because the 'length' of each alternative becomes
exactly the same, or the comprehension is 'longer' (counting as 'lines
of code')!
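
Rather than take the speed claim on trust, it can be measured. The
sketch below uses the standard timeit module on two functions invented
for the purpose. In CPython both forms are compiled to bytecode; the
comprehension mainly avoids the repeated lookup and call of .append, so
the gap is usually modest and will vary by machine and Python version.

```python
import timeit

seq = range(1000)


def with_loop():
    var = []
    for item in seq:
        if item % 2 == 0:
            var.append(item * 2)
    return var


def with_comprehension():
    return [item * 2
            for item in seq
            if item % 2 == 0]


# Same result either way; only the timing differs
assert with_loop() == with_comprehension()

loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```

No expected timings are shown because they depend entirely on the
system the code runs on, which is rather the point made earlier in the
thread.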


> The same applies to regex. They can be cryptic or readable. If
> they must be complex (and sometimes they must) they can be built
> up in stages with each group clearly commented.

+1
I've seen people tying themselves in knots to explain a complex RegEx.
Even to the point of trying to fit each 'operation' on its own line,
followed by a # explanation.


>> My take on these is you can write a more compact function this way -
>> you're more likely to have the meat of what's going on right there
>> together in a few lines, rather than building mountain ranges of
>> indentation
> 
> True to an extent, although recent studies suggest that functions can be
> up to 100 lines long before they become hard to maintain (they used to
> say 25!) But if we are getting to 4 or more levels of indentation it's
> usually a sign that some refactoring needs to be done.

Didn't we used to say ~60 lines - the number of lines on a page of
green-striped, continuous, line-flo[w], stationery?

Those were the days!


>> anyone in a position to maintain my code at a later date) should have no
>> trouble recognizing the intent of simple ones.
> 
> That's true, although in many organisations the maintenance team is the
> first assignment for new recruits. So they may have limited experience.
> But that usually affects their ability to deal with lots of code rather
> than the code within a single function. (And it's to learn that skill
> that they get assigned to maintenance first! - Along with learning the
> house style, if such exists) Of course, much software maintenance is now
> off-shored rather than kept in-house and the issue there is that the
> cheapest programmers are used and these also tend to be the least
> experienced or those with "limited career prospects" - aka old or mediocre.

OK, so now we're not just observing the grey in my beard?

Doesn't the root meaning of mediocrity come from observations of how
standards in professional journalism are steadily and markedly declining?


Jokes(?) aside, the observation is all-too correct though. Thus, the
added-virtue of providing tests alongside any code - or code not being
'complete' unless there is also a related test suite.

The problem is that many maintenance-fixes are performed under
time-pressure. Worst case: the company is at a standstill until you find
this bug...

The contention then, is that these 'learners-of-their-trade' should be
given *more* time. Time to look-up the docs and the reference books, to
see how various constructs work, eg list comprehensions. Time to learn!


Although when off-shoring, one is (imagines that you are) paying for
competence. So, the above scenario should not exist...

Oops!


>> Sometimes thinking about how to write a concise one-liner exposes a
>> failure to have thought through what you're doing completely
> 
> That's true too and many pieces of good quality code start off as
> one-liners. But in the interests of maintenance they should be deconstructed
> and/or refactored once the solution is understood. Good engineering
> is all about cost reduction, so reducing maintenance cost is the
> primary objective of good software engineering because maintenance
> is by far the biggest cost of most software projects.

+1
Sadly an observation that is seldom experienced by students and
hobbyists, and only becomes apparent - indeed relevant - when the
complexity of one's projects increases.

Such (only) 'in your own head' behavior is philosophically-discouraged
in the practice of TDD. (just sayin'...)

-- 
Regards,
=dn

From alan.gauld at yahoo.co.uk  Thu Jul  7 20:33:16 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Fri, 8 Jul 2022 01:33:16 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <006a01d8924d$133dbd60$39b93820$@gmail.com>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com>
Message-ID: <ta7u0c$6rm$1@ciao.gmane.io>

On 07/07/2022 23:01, avi.e.gross at gmail.com wrote:

> max([] or 0) 
> 
> breaks down with an error and the [] does not evaluate to false.

max([] or [0])


A sequence is all that's needed.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From avi.e.gross at gmail.com  Thu Jul  7 19:28:04 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Thu, 7 Jul 2022 19:28:04 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
 <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
 <008b01d8915a$337e77c0$9a7b6740$@gmail.com>
 <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
Message-ID: <02e601d89259$35982e20$a0c88a60$@gmail.com>

What sometimes amazes me, Mats, is how some functions people write get adjusted to support more and more options. It can be as simple as setting a default value for missing options, or the ability to support more kinds of data and converting it to the type needed, or an argument to specify how many significant digits you want in the output and lots more you can imagine.

Now if you built a one-liner or even a very condensed multi-line version, fitting in additional options becomes a challenge. You may introduce errors by shoehorning in something without enough consideration and testing.

Building your code more loosely and more step by step, can make it far easier to modify cleanly and maybe easier to document as you have some room for comments and so on.

What you are talking about may well be a different idea at times, that a function that is too long and complex might better be re-written as a collection of smaller functions so each part of the task is done at a level of abstraction and comprehension that makes sense. That can be taken too far, as well, especially if the names and ideas are not at the level people think at. And, of course, it can lead to errors if someone just copies a function without the parts it depends on.

So a dumb question is whether you approve of defining functions within a function just to be used ONCE within the function but that makes it easier to read when you finally get to the meat and are able to say something somewhat meaningful like:

if is_X(data) and is_Y(data) and not is_empty(data): pass

If the functions are not that useful for anything else, this keeps them all together, albeit there may be some overhead if the main function is called repeatedly, or will there be if it is properly interpreted and pseudo-compiled?
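On the overhead question, a minimal sketch may help. It assumes CPython, and the names describe, is_empty, is_numeric and make_helper are invented for illustration. As far as I know, the body of a nested function is compiled once together with the enclosing function; each call of the outer function then only builds a new, fairly cheap function object around that same code object.

```python
def describe(data):
    # Helpers used only inside describe(); defining them here keeps them together.
    def is_empty(d):
        return len(d) == 0

    def is_numeric(d):
        return all(isinstance(x, (int, float)) for x in d)

    if not is_empty(data) and is_numeric(data):
        return "non-empty numbers"
    return "something else"


def make_helper():
    # Returns the nested function so it can be inspected from outside.
    def helper():
        return None
    return helper


first, second = make_helper(), make_helper()
shares_code = first.__code__ is second.__code__  # the compiled body is reused
same_function = first is second                  # a new function object per call
print(describe([1, 2.5]), shares_code, same_function)
```

So the cost per call is the creation of a couple of small function objects, not a recompilation. Whether that matters depends on how hot the outer function is.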

I think it is high time I dropped this diversion and waited for a person asking for actual help.


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of Mats Wichmann
Sent: Thursday, July 7, 2022 11:38 AM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners


On 7/6/22 11:02, avi.e.gross at gmail.com wrote:

Boy, we "old-timers" do get into these lengthy discussions...  having read several comments here, of which this is only one:

> The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense.
> 
> Although that can be true, I think it is reasonable to say that often the exact opposite is true.
> 
> In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes.


After a period when they felt awkward because lots of my programming background was in languages that didn't have these, I now use simple one-liners extensively - comprehensions and ternary expressions.  If they start nesting, probably not.  Someone mentioned a decent metric - if a code formatter like Black starts breaking your comprehension into four lines, it probably got too complex.
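To make the "simple one-liner" idea concrete, here is a small sketch with made-up data (the list words and the -1 marker are assumptions chosen only for the example). It shows the same result built with an explicit loop and with a comprehension containing a ternary expression.

```python
words = ["alpha", "", "beta", "gamma", ""]

# Loop version: four levels of structure for one simple idea.
lengths_loop = []
for w in words:
    if w:
        lengths_loop.append(len(w))
    else:
        lengths_loop.append(-1)

# Comprehension plus ternary: the whole rule is visible on one line.
lengths = [len(w) if w else -1 for w in words]

print(lengths == lengths_loop)
```

Once a second condition or a nested loop is needed, the loop version usually becomes the more readable of the two again.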

My take on these is you can write a more compact function this way - you're more likely to have the meat of what's going on right there together in a few lines, rather than building mountain ranges of indentation - and this can actually *improve* readability, not obscure it.  I agree "clever" one-liners may be a support burden, but anyone with a reasonable amount of Python competency (which I'd expect of anyone in a position to maintain my code at a later date) should have no trouble recognizing the intent of simple ones.

Sometimes thinking about how to write a concise one-liner exposes a failure to have thought through what you're doing completely - so unlike what is mentioned above - twisting a problem around unnaturally (no argument that happens too), you might actually realize that there's a simpler way to structure a step.

Just one more opinion.
_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Thu Jul  7 19:36:45 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Thu, 7 Jul 2022 19:36:45 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <l9oechlj6c5hveavglb9pdc06o7j381ja1@4ax.com>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com>
 <l9oechlj6c5hveavglb9pdc06o7j381ja1@4ax.com>
Message-ID: <02f301d8925a$6c0ddf30$44299d90$@gmail.com>

Thanks, Dennis. Indeed I should be using a list that contains a single 0
(although a list of many zeros works too):

max([] or [0])

Amusingly, I played around and max() takes either a comma separated list as
in max(1,2) or it takes any other iterable as in [1,2] and I think what is
happening is it sees anything with a comma as a tuple which is iterable. I
mean this works:

max(1,2 or 2,3)

returning 3 oddly enough and so does this:

max([] or 0,0)

returning a 0.


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Dennis Lee Bieber
Sent: Thursday, July 7, 2022 6:50 PM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On Thu, 7 Jul 2022 18:01:13 -0400, <avi.e.gross at gmail.com> declaimed the
following:


>
>max([] or 0)
>
> 
>
>breaks down with an error and the [] does not evaluate to false.
>

	That's because the 0 is not an iterable, not that [] didn't
evaluate...

>>> [] or 0
0
>>> max([] or 0)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> max(0)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> 

	max() requires something it can iterate over. The expression

	<something falsey> or <anything>

returns <anything>, not true/false, and does NOT override max() normal
operation of trying to compare objects in an iterable..

>>> 
>>> [] or "anything"
'anything'
>>> max([] or "anything")
'y'
>>> 

	The "default" value option applies only if max() would otherwise
raise an exception for an empty sequence.

>>> max([], default=0)
0
>>> max([])
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
ValueError: max() arg is an empty sequence
>>> 
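A short sketch of why default= is usually the safer spelling (the function name longest and the sample values are invented for the example). The "or [0]" trick depends on the first operand being falsey, and a generator object is always truthy even when it will yield nothing.

```python
def longest(values, fallback=0):
    # default= is only consulted when the iterable turns out to be empty.
    return max(values, default=fallback)


empty_gen = (n for n in [])   # a generator object is truthy even if empty
try:
    max(empty_gen or [0])     # so "or [0]" never gets a chance here
    or_trick_failed = False
except ValueError:
    or_trick_failed = True

print(longest([]), longest(n for n in []), longest([3, 9, 2]), or_trick_failed)
```

With a real list, max(seq or [0]) and max(seq, default=0) agree; they only part ways for iterators and other objects whose truth value says nothing about their length.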


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From avi.e.gross at gmail.com  Thu Jul  7 21:52:45 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Thu, 7 Jul 2022 21:52:45 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <ta7u0c$6rm$1@ciao.gmane.io>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
Message-ID: <005101d8926d$6ba5fb00$42f1f100$@gmail.com>

OK, Alan, in the interest of tying several annoying threads together, I make
a one-line very brief generator to return a darn zero so I can set a default
of zero on an empty list to max!

def zeroed(): yield(0)

max([] or zeroed())
0
max([6, 66, 666] or zeroed())
666
max([] or zeroed())
0

Kidding aside, although any iterable will do such as [0] or (0,) or {0} it
does sound like it would be useful to have a silly function like the above
that makes anything such as a scalar into an iterable just to make programs
that demand an iterable happy. There probably is something out there with
some strange name like list(numb) but this way is harder for anyone
maintaining it to figure out WHY ...

def gener8r(numb): yield(numb)

max([] or gener8r(3.1415926535))
3.1415926535
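A more explicit version of the same idea, as a sketch only: as_iterable is a hypothetical helper, not something from the standard library, and treating strings as scalars is a design choice made for this example.

```python
from collections.abc import Iterable


def as_iterable(value):
    # Pass real iterables through untouched, but treat strings and bytes
    # as single values rather than as sequences of characters.
    if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
        return value
    # Anything else is wrapped in a one-element tuple.
    return (value,)


print(max([] or as_iterable(3.1415926535)))
print(list(as_iterable([1, 2])), as_iterable("abc"))
```

Compared with the generator, the tuple can be iterated more than once and the name says what the helper is for.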

But oddly although brackets work, an explicit call to list() generates an
error! Ditto for {number} working and set(number) failing. Is this an
anomaly with a meaning? 

max([] or [3.1415926535])
3.1415926535

max([] or list(3.1415926535))
Traceback (most recent call last):
  File "<pyshell#44>", line 1, in <module>
    max([] or list(3.1415926535))
TypeError: 'float' object is not iterable

max([] or list(3.1415926535, 0))
Traceback (most recent call last):
  File "<pyshell#46>", line 1, in <module>
    max([] or list(3.1415926535, 0))
TypeError: list expected at most 1 argument, got 2

max([] or set(3.1415926535))
Traceback (most recent call last):
  File "<pyshell#47>", line 1, in <module>
    max([] or set(3.1415926535))
TypeError: 'float' object is not iterable

max([] or {3.1415926535})
3.1415926535


From alan.gauld at yahoo.co.uk  Fri Jul  8 06:53:43 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Fri, 8 Jul 2022 11:53:43 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <005101d8926d$6ba5fb00$42f1f100$@gmail.com>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com>
Message-ID: <ta92bn$ssp$1@ciao.gmane.io>

On 08/07/2022 02:52, avi.e.gross at gmail.com wrote:

> But oddly although brackets work, an explicit call to list() generates an
> error! Ditto for {number} working and set(number) failing. Is this an
> anomaly with a meaning? 

list() and set() require iterables. They won't work with single values:

>>> set(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> set((1,))
{1}
>>> list(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> list([3])
[3]
>>> list((3,))
[3]
>>>

The error you are getting is not from max() its from list()/set()

What is slightly annoying is that, unlike max(), you cannot
just pass a sequence of values:

>>> set(1,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: set expected at most 1 argument, got 2
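The difference between the bracket notation and the constructors can be sketched like this (variable names are only for illustration). A display such as [5] lists the elements themselves, while list() and set() take one iterable and copy the items out of it.

```python
single_list = [5]            # display: one element, the int 5
from_tuple = list((5,))      # constructor: copies items out of an iterable
letters = list("abc")        # a string is iterable, so it is split up
one_string = ["abc"]         # the display keeps the string whole
as_set = set([1, 2, 2])      # duplicates collapse
from_values = {1, 2}         # the display form accepts separate values

print(single_list, from_tuple, letters, one_string, as_set, from_values)
```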


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From wlfraed at ix.netcom.com  Fri Jul  8 10:03:14 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Fri, 08 Jul 2022 10:03:14 -0400
Subject: [Tutor] this group and one liners
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com>
 <l9oechlj6c5hveavglb9pdc06o7j381ja1@4ax.com>
 <02f301d8925a$6c0ddf30$44299d90$@gmail.com>
Message-ID: <t4egchdeenqfcmi2l7t6dha6jj0f3g3gm4@4ax.com>

On Thu, 7 Jul 2022 19:36:45 -0400, <avi.e.gross at gmail.com> declaimed the
following:

>
>max(1,2 or 2,3)
>
>returning 3 oddly enough and so does this:
>

	Well, that evaluates, if I recall the operator precedence, as

1, (2 or 2), 3 => 1, 2, 3


>max([] or 0,0)
>
>returning a 0.
>

([] or 0), 0 => 0, 0
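The grouping can be checked with a tiny helper that simply reports the arguments it was handed (received is an invented name for this sketch). The commas separate arguments; "or" is evaluated inside each argument before the call happens.

```python
def received(*args):
    # Hand back the positional arguments exactly as the call delivered them.
    return args


first = received(1, 2 or 2, 3)   # parsed as 1, (2 or 2), 3
second = received([] or 0, 0)    # parsed as ([] or 0), 0

print(first, max(*first))
print(second, max(*second))
```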


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From avi.e.gross at gmail.com  Fri Jul  8 12:54:02 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Fri, 8 Jul 2022 12:54:02 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <ta92bn$ssp$1@ciao.gmane.io>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <ta92bn$ssp$1@ciao.gmane.io>
Message-ID: <00c901d892eb$53fd03d0$fbf70b70$@gmail.com>

Alan,

That shakes me up a bit as I do not recall ever needing to use iterables in
the context of set() or list() back when I was learning the language. Then
again, I mainly would use other notations or convert to that data type from
another.

But this can be a teaching moment.

Part of the assumption is based on my experience with quite a few other
languages and each has its own rules and even peculiarities. In some
languages, you use list() to create a list and something like as.list() to
coerce another object that is not a list.

A further source of confusion is that lists and sets also have their own
literal notations, [] and {}. So, it seems that
	var = [5]

And 

	var = list(5)

are far from the same thing.

So it seems one way to make a list is to use list() with no arguments to
make an empty list, or just use [] and then use list methods on the object
to append or insert or extend. But there is no trivial way to do that inside
the argument list of max() as described.

Similarly for set() which either makes an empty set (the only way to do that
as {} makes an empty dictionary) or coerces some iterable argument into a
set. 
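A quick sketch of the empty-set point (the variable names are only for illustration):

```python
empty_dict = {}              # braces alone make a dict, not a set
empty_set = set()            # the only spelling for an empty set
one_item_set = {0}           # braces with values make a set
from_iterable = set("hello") # coerces any iterable, dropping duplicates

print(type(empty_dict).__name__, type(empty_set).__name__,
      one_item_set, sorted(from_iterable))
```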

I am not convinced this is fantastic design. The concept of an iterable is
very powerful but this brings us back to the question we began with. Why not
have a version of max() that accepts numbers like 1,2,3 as individual
arguments?

In my view, just like a nonexistent stretch of zeros in a string is
considered to be of length zero for our purposes, not unknown, I can imagine
many scenarios where a single value should be useful in the context of many
functions as if it was an iterable that returned one value just once. Heck,
I can see scenarios where a null value that returned nothing would be an
iterable. Python allows you to build your own iterables in several ways that
do exactly that!
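As a sketch of what such home-made iterables could look like (Once and Nothing are invented class names, not anything from the standard library): any object with an __iter__ method that returns an iterator qualifies.

```python
class Once:
    """Iterable that hands back a single value, once per iteration."""

    def __init__(self, value):
        self.value = value

    def __iter__(self):
        # A generator method: each call to iter() starts a fresh pass.
        yield self.value


class Nothing:
    """Iterable that yields no values at all."""

    def __iter__(self):
        return iter(())


print(max(Once(3.14)), list(Nothing()), max(Nothing(), default=0))
```

Both can be passed to max(), list(), set() or a for loop, since those only ask for something they can iterate over.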

So, yes, you can get around these rules when needed but outside of
one-liners, that may rarely be an issue if you know the rules.

Every language has plusses and minuses from the perspective of a user and
you either deal with it or try to use another. But if people seem to want to
do things a certain way, the language often is added to in ways  that force
things to be doable, albeit with more work and since [3.14] is enough to
make the change, perhaps not necessary. Still for a flexible dynamic
language, this way of doing things looks like a throwback to me. Many
programming paradigms are like that. They are great when used as designed
and a source of lots of frustration when not.


From avi.e.gross at gmail.com  Fri Jul  8 17:52:59 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Fri, 8 Jul 2022 17:52:59 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <ta92bn$ssp$1@ciao.gmane.io>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <ta92bn$ssp$1@ciao.gmane.io>
Message-ID: <001501d89315$174b8090$45e281b0$@gmail.com>

Alan,

I get annoyed at myself when I find I do not apparently understand something
or it makes no sense to me. So I sometimes try to do something about it.

My first thought was to see if I could make a wrapper for max() that allowed
the use of a non-iterable.

My first attempt was to have this function that accepts any number of
arguments and places them in an iterable before calling max, and I
invite criticism or improvements or other approaches.

We know max(1,2,3) fails as it demands an iterable like [1,2,3] so I tried
this:

def maxim(*args): return(max(args))

maxim(1,2,3)
3
maxim([1,2,3])
[1, 2, 3]

Well, clearly this needs work to be more general and forgiving! But it sort
of works on an empty list with a sort of default:

maxim([] or 0)
0

To make this more general would take quite a bit of work. For example, if
you want to make sure max(iterator) works, you may need to check first if
you have a simple object and if that object is a list or set or perhaps
quite a few other things, may need to convert it so it remains intact rather
than a list containing a list which max() returns as the maximum.
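One possible shape for a more forgiving wrapper, offered as a sketch under assumptions: the default keyword and the rule that strings count as single values are choices made for this example. Note that the built-in max() already accepts several positional arguments; the wrapper only adds the lone-scalar case.

```python
from collections.abc import Iterable


def maxim(*args, default=None):
    # A single argument is either a collection to search or a lone value.
    if len(args) == 1:
        only = args[0]
        if isinstance(only, Iterable) and not isinstance(only, (str, bytes)):
            return max(only, default=default)
        return only  # a lone scalar is its own maximum
    # Zero or several arguments: treat the argument tuple as the collection.
    return max(args, default=default)


print(maxim(1, 2, 3), maxim([1, 2, 3]), maxim(5), maxim([], default=0))
```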

As always, I have to suspect that someone has already done this, and more,
and created some function that is a Swiss Army Knife and works on (almost)
anything.


From alan.gauld at yahoo.co.uk  Fri Jul  8 18:08:30 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Fri, 8 Jul 2022 23:08:30 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <001501d89315$174b8090$45e281b0$@gmail.com>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <ta92bn$ssp$1@ciao.gmane.io>
 <001501d89315$174b8090$45e281b0$@gmail.com>
Message-ID: <taa9su$ugs$1@ciao.gmane.io>

On 08/07/2022 22:52, avi.e.gross at gmail.com wrote:

> We know max(1,2,3) fails as it demands an iterable like [1,2,3] 

Nope.

max(1,2,3)

works just fine for me. Its only a single value that fails:

>>> max(1,2,3)
3
>>> max(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>>

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From wlfraed at ix.netcom.com  Sat Jul  9 01:23:11 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sat, 09 Jul 2022 01:23:11 -0400
Subject: [Tutor] this group and one liners
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <ta92bn$ssp$1@ciao.gmane.io>
 <00c901d892eb$53fd03d0$fbf70b70$@gmail.com>
Message-ID: <ju3ich549cjt0bti6aahl57ec9j0e23o3q@4ax.com>

On Fri, 8 Jul 2022 12:54:02 -0400, <avi.e.gross at gmail.com> declaimed the
following:


>I am not convinced this is fantastic design. The concept of an iterable is
>very powerful but this brings us back to the question we began with. Why not
>have a version of max() that accepts numbers like 1,2,3 as individual
>arguments?
>
>>> max(1, 2, 3)
3
>>> 

	In the absence of keyword arguments, the non-keyword arguments are
gathered up as a tuple -- which is an iterable.

>>> tpl = (1, 2, 3)
>>> max(tpl)
3
>>> max(*tpl)
3
>>> 

	First passes single arg tuple. Second /unpacks/ the tuple, passing
three args, which get gathered back into a tuple by max().
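The packing and unpacking can be made visible with an ordinary Python function. This is only an analogy: the built-in max() is implemented in C, and gathered is a made-up name for the sketch.

```python
def gathered(*args):
    # *args collects every positional argument into one tuple.
    return args


tpl = (1, 2, 3)
whole = gathered(tpl)     # one argument, which happens to be a tuple
spread = gathered(*tpl)   # three separate arguments, gathered into a tuple

print(whole, spread)
```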


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From avi.e.gross at gmail.com  Fri Jul  8 19:49:36 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Fri, 8 Jul 2022 19:49:36 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <taa9su$ugs$1@ciao.gmane.io>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <ta92bn$ssp$1@ciao.gmane.io>
 <001501d89315$174b8090$45e281b0$@gmail.com> <taa9su$ugs$1@ciao.gmane.io>
Message-ID: <004201d89325$61afe4e0$250faea0$@gmail.com>

You are right, Alan, so max mainly does what I expect. Since generally no
arguments makes no sense for having  a maximum and one argument that is
really just one also makes little sense as it is by default the maximum (and
minimum, median, mode, ...)  then it stands to reason to treat a single
argument as a kind of collection.

However, since max([1]) works fine, and max(1) fails, it seems a tad
inconsistent.

I note in my experiments that max("a") works but only because like many
things in python, it sees a character string as an iterator of sorts and
max("max") returns 'x' of course. Unfortunately, when that is supplied as a
list: max(["max"]) --> 'max'
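A compact sketch of the string case (the second word "min" is added only to have something to compare against): a bare string is iterated character by character, while a list of strings is compared string against string.

```python
by_character = max("max")           # compares 'm', 'a', 'x'
single_item = max(["max"])          # one element, so it is the maximum
by_string = max(["max", "min"])     # compares whole strings, left to right

print(by_character, single_item, by_string)
```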

 Time to move on from this topic except to say that debugging some python
constructs may need to be part of what I do if I am not careful.


From nathan-tech at hotmail.com  Mon Jul 11 06:58:47 2022
From: nathan-tech at hotmail.com (nathan tech)
Date: Mon, 11 Jul 2022 10:58:47 +0000
Subject: [Tutor] Question about python decorators
Message-ID: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>

As I understand it, decorators are usually created as:
@object.event
def function_to_be_executed():
    do_stuff

My question is, is there a way to create this after the function is created?
So:
def function():
    print("this is interesting stuff")

@myobject.event=function

Thanks
Nathan


From __peter__ at web.de  Mon Jul 11 12:57:17 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 11 Jul 2022 18:57:17 +0200
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <c3331073-63ae-2773-fc46-e1d5c2209259@web.de>

On 11/07/2022 12:58, nathan tech wrote:
> As I understand it, decorators are usually created as:
> @object.event
> def function_to_be_executed():
>     do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> def function():
>     print("this is interesting stuff")
>
> @myobject.event=function
>
> Thanks
> Nathan

@deco
def fun():
    ...

is basically a syntactic sugar for

def fun():
     ...

fun = deco(fun)

In your case you would write

function = object.event(function)

As this is just an ordinary function call followed by an ordinary
assignment you can of course use different names for the decorated and
undecorated version of your function and thus keep both easily accessible.
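A runnable sketch of keeping both versions around. The decorator shout and the function greet are invented for the example; functools.wraps is the standard helper that copies the name and docstring across and records the original as __wrapped__.

```python
import functools


def shout(func):
    # A decorator: takes a function, returns a new function that calls it.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


def greet(name):
    return f"hello, {name}"


loud_greet = shout(greet)   # ordinary call, ordinary assignment, new name

print(greet("nathan"), loud_greet("nathan"))
```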

From mats at wichmann.us  Mon Jul 11 13:23:13 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Mon, 11 Jul 2022 11:23:13 -0600
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>

On 7/11/22 04:58, nathan tech wrote:
> As I understand it, decorators are usually created as:
> @object.event
> def function_to_be_executed():
>     do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> def function():
>     print("this is interesting stuff")
> 
> @myobject.event=function

You don't have to use the @ form at all, but your attempt to assign
won't work.

def function():
    print("This is interesting stuff")

function = object.event(function)

Now the new version of function is the wrapped version; the original
version (from your def statement) is held by a reference in the instance
of the wrapper.
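To illustrate where the original ends up, here is a sketch with a made-up decorator called counted. In this closure-based form the reference to the original lives in the wrapper's closure cells; a class-based decorator would typically keep it in an instance attribute instead.

```python
def counted(func):
    # The wrapper closes over func, so the original stays reachable.
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


def function():
    return "This is interesting stuff"


original = function            # a second name for the undecorated version
function = counted(function)   # same effect as writing @counted above the def

result = function()
held = [cell.cell_contents for cell in function.__closure__]
print(result, function.calls, original in held)
```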

If that's what you are asking...


From nathan-tech at hotmail.com  Mon Jul 11 17:29:33 2022
From: nathan-tech at hotmail.com (Nathan Smith)
Date: Mon, 11 Jul 2022 22:29:33 +0100
Subject: [Tutor] Question about python decorators
In-Reply-To: <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
 <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>
Message-ID: <DB7PR07MB50938912AFDFA73DDA870D2CE4879@DB7PR07MB5093.eurprd07.prod.outlook.com>

Hiya,


I figured this is how they work, so it's nice to have that confirmed!

For some reason though I have a library that is not behaving this way,
for example:

@object.attr.method
def function():
    do stuff

works, but

object.attr.method=func

does not.

I'm likely missing something, but I'd be curious to hear experts' opinions.

The library I am working with is:

https://github.com/j-hc/Reddit_ChatBot_Python

On 11/07/2022 18:23, Mats Wichmann wrote:
> On 7/11/22 04:58, nathan tech wrote:
>> As I understand it, decorators are usually created as:
>> @object.event
>> def function_to_be_executed():
>>     do_stuff
>>
>> My question is, is there a way to create this after the function is created?
>> So:
>> def function():
>>     print("this is interesting stuff")
>>
>> @myobject.event=function
> You don't have to use the @ form at all, but your attempt to assign
> won't work.
>
> def function():
>      print("This is interesting stuff")
>
> function = object.event(function)
>
> Now the new version of function is the wrapped version; the original
> version (from your def statement) is held by a reference in the instance
> of the wrapper.
>
> If that's what you are asking...
>
-- 

Best Wishes,

Nathan Smith, BSC


My Website: https://nathantech.net


From cs at cskk.id.au  Mon Jul 11 19:09:43 2022
From: cs at cskk.id.au (Cameron Simpson)
Date: Tue, 12 Jul 2022 09:09:43 +1000
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938912AFDFA73DDA870D2CE4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938912AFDFA73DDA870D2CE4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <YsyttxFeWJKqCNNn@cskk.homeip.net>

On 11Jul2022 22:29, nathan tech <nathan-tech at hotmail.com> wrote:
>For some reason though I have a library that is not behaving this way,
>for example:
>
>@object.attr.method
>def function():
>    do stuff
>
>works, but
>
>object.attr.method=func
>
>does not

That should be:

    func = object.attr.method(func)

Remember, a decorator takes a function and returns a function (often a 
new function which calls the old function).
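A sketch of why the call works and the assignment does not, using a hypothetical EventHub class. This is not the API of the Reddit_ChatBot_Python library, only a stand-in with the same general shape: a method that registers a handler and hands it back.

```python
class EventHub:
    """Hypothetical stand-in for an object with a registering decorator."""

    def __init__(self):
        self.handlers = []

    def event(self, func):
        # Register the handler, then return it so the name stays usable.
        self.handlers.append(func)
        return func


def on_message():
    return "got a message"


hub = EventHub()
on_message = hub.event(on_message)   # what @hub.event does behind the scenes

other = EventHub()
other.event = on_message             # only replaces the method on this object
print(len(hub.handlers), len(other.handlers))
```

The assignment never calls the registering method, so nothing is added to the handler list; it just shadows the method with the plain function.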

Cheers,
Cameron Simpson <cs at cskk.id.au>

From nathan-tech at hotmail.com  Tue Jul 12 01:44:01 2022
From: nathan-tech at hotmail.com (Nathan Smith)
Date: Tue, 12 Jul 2022 06:44:01 +0100
Subject: [Tutor] Question about python decorators
In-Reply-To: <YsyttxFeWJKqCNNn@cskk.homeip.net>
References: <DB7PR07MB50938912AFDFA73DDA870D2CE4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
 <YsyttxFeWJKqCNNn@cskk.homeip.net>
Message-ID: <DB7PR07MB509398D71C6C155D0AB2AB86E4869@DB7PR07MB5093.eurprd07.prod.outlook.com>

Aha! I am with you.

Thanks a lot! :)

On 12/07/2022 00:09, Cameron Simpson wrote:
> On 11Jul2022 22:29, nathan tech <nathan-tech at hotmail.com> wrote:
>> For some reason though I have a library that is not behaving this way,
>> for example:
>>
>> @object.attr.method
>> def function():
>>     do stuff
>>
>> works, but
>>
>> object.attr.method=func
>>
>> does not
> That should be:
>
>      func = object.attr.method(func)
>
> Remember, a decorator takes a function and returns a function (often a
> new function which calls the old function).
>
> Cheers,
> Cameron Simpson <cs at cskk.id.au>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
-- 

Best Wishes,

Nathan Smith, BSC


My Website: https://nathantech.net


From alexanderrhodis at gmail.com  Sun Jul 10 02:51:19 2022
From: alexanderrhodis at gmail.com (alexander-rodis)
Date: Sun, 10 Jul 2022 09:51:19 +0300
Subject: [Tutor] Implicit passing of argument select functions being called
Message-ID: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>

I'm working on a project where accessibility is very important, as it's 
addressed to non-specialists who may have no knowledge of coding or of 
Python.

In a specific section, I've come up with this API to make data 
transformation pipeline:

make_pipeline(

    load_data(fpath,...),
    transform1(arg1,arg2,....),
    ....,
    transform2(arg1,arg2,....),
    transform3(arg1,arg2,....),
    transformN(arg1,arg2,....),

)

transform1 ... transformN are all callables (I'm writing this in a
functional style), decorated with a partial call in the background.
make_pipeline just uses a for loop to call the functions successively, like
so: e = arg(e). Which transforms are used, out of the many available in the
library I'm writing, will vary between uses. load_data returns a pandas
DataFrame.

Some transforms may need to operate on sub-arrays. I've written functions
(also decorated with a partial call) to select the sub-array and return it
to its original shape. The problem is, I currently have to pass the data
filtering function explicitly as an argument to each function that needs it,
but this seems VERY error prone, even if I document it carefully. I want the
filter function to be specified in one place and made automatically
available to all transforms that need it. Something roughly equivalent to:

make_pipeline(

    load_data(fpath,...),
    transform1(arg1,arg2,....),
    ....,
    transform2(arg1,arg2,....),
    transform3(arg1,arg2,....),
    transformN(arg1,arg2,....),
    data_filter = simple_filter(start=0,) )

I thought all aliases local to the caller, make_pipeline, become
automatically available to the called functions. This seems to work, but
only on some small toy examples, not in this case. global is usually
considered bad practice, so I'm trying to avoid it, and I'm not using any
classes, so no OOP. I've also tried using inspect.signature to check whether
each callable accepts a certain argument and pass it if that's the case;
however, this raises an "incorrect signature error" which I couldn't find
documented anywhere. I've also considered passing it to all functions with a
try/except and ignoring thrown errors, but it seems this could also be error
prone, namely it could catch other errors too. So my question is: is there
an elegant and Pythonic way to specify data_filter in only one place and
implicitly pass it to all functions that need it, without global or classes?


Thanks


From avi.e.gross at gmail.com  Sat Jul  9 18:08:56 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sat, 9 Jul 2022 18:08:56 -0400
Subject: [Tutor] maxed out
In-Reply-To: <ju3ich549cjt0bti6aahl57ec9j0e23o3q@4ax.com>
References: <007601d89150$1cf50320$56df0960$@gmail.com>
 <DB6PR01MB38950EE411948C03100BAAA683839@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
 <006a01d8924d$133dbd60$39b93820$@gmail.com> <ta7u0c$6rm$1@ciao.gmane.io>
 <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <ta92bn$ssp$1@ciao.gmane.io>
 <00c901d892eb$53fd03d0$fbf70b70$@gmail.com>
 <ju3ich549cjt0bti6aahl57ec9j0e23o3q@4ax.com>
Message-ID: <002c01d893e0$7bfd12d0$73f73870$@gmail.com>

Now that Alan and Dennis have pointed things out, I want to summarize some
of the perspective I have.

The topic is why the max() (and probably min() and perhaps other) python
functions made the design choices they did and that they are NOT the only
possible choices. It confused me as I saw parts of the elephant without
seeing the entire outline.

So lesson one is READ THE MANUAL PAGE before making often unwarranted
assumptions. https://docs.python.org/3/library/functions.html#max

The problem originally presented was that using max() to measure the maximum
length of multiple instances of 0's interspersed with 1's resulted in what I
considered anomalies. Specifically, it stopped with an error on an empty
list which my algorithm would always  emit if there were no zero's.

My understanding is that in normal use, there are several views.

The practical view is that you should not be calling max() unless you have
multiple things to compare of the same kind. Thus a common use it should
deal with is:

max(5, 3, 2.4, 8.6, -2.3)

This being Python, a related use is to hand it a SINGLE argument that is a
collection of its own, with the qualification that the items be compatible
with being compared to each other and are ordered. So tuples, lists and sets
and separately the keys and values of a dictionary come to mind. 

max(1,2,3)
3
max((1,2,3))
3
max([1,2,2])
2
max({1,3,4,3,5,2})
5
max({1:1.1, 2:1.2, 3:1.3}.keys())
3
max({1:1.1, 2:1.2, 3:1.3}.values())
1.3

There may well be tons of other things it can handle albeit I suspect
constructs from numpy or pandas may better be evaluated using their own
versions of functions like max that are built-in.

And it seems max now should handle iterables like this function:

def lowering(starting=100):
    using = int(starting)
    while (using > 0):
        yield(using)
        using //= 2

max(lowering(100))
100

min(lowering(100))
1

import statistics
statistics.mean(lowering(666))
132.7

len(list(lowering(42)))
6

But it is inconsistent as len will not see [lowering(42)] as more than one
argument as it sees an iterable function but does not iterate it.

Back to the point, from a practical point of view, you have two conditions. 

- Everything being measured should be of the same type or perhaps coercible
to the same type easily.
- You either have 2 or more things being asked about directly as in
max(1,2,3) OR you have a single argument that is an iterator.

So the decision they made almost makes sense to me except that it does not!

In a mathematical sense, a maximum also makes perfect sense for a single
value. The maximum for no arguments is clearly undefined. 

Following that logic, the max() function would need to test if the single
object it gets is SIMPLE or more complex (and I do not mean imaginary
numbers). I mean ANYTHING that evaluates to a single value, be it an integer,
floating point, Boolean or character string and maybe more, should be
returned as the maximum. If it can be evaluated as a container that ends up
holding NO values, then it should fail unless a default is provided. If it
returns a single value, again, that is the result. If it returns multiple
values ALL OF THE SAME KIND, compare those.

Not sure if that is at all easy to implement, but does that overall design
concept make a tad more sense?

The problem Dennis hinted at is darn TUPLES in Python. They sort of need a
comma after a value if it is alone as in 

a = 1,
a
(1,)

And the darn tuple() function stupidly has no easy way to make a tuple from a
single argument even if followed by a trailing comma. Who designed that? I
mean it works for a string because it is actually a compound object as in
tuple("a") but not for tuple(1) so you need something like tuple([1]) ...

My GUESS from what Dennis wrote is the creators of max() may have said that
a singleton argument should be expanded only by coercing it to a tuple and
that makes some singleton arguments FAIL!

I looked at another design element in that this version of max also supports
other kinds of ordering for selected larger collections such as lists
containing objects of the same type:

max( [ 1, 1, 1], [1, 2, 1])
[1, 2, 1]
max( [ 1, 1, 1], [1, 2, 1], [3, 0])
[3, 0]

It won't take my iterator unless I expand it in a list like this:

max(list(lowering(42)), list(lowering(666)))
[666, 333, 166, 83, 41, 20, 10, 5, 2, 1]

So, overall, I understand why max does what it wants BUT I am not happy with
the fact that programs often have no control over the size of a list they
generate as in my case where no 0's means an empty list. So the default=0
argument there helps if used, albeit using the single argument to max()
method fails when used as max([] or 0) and requires something like max([] or
[0]) since the darn thing insists a single argument must be a sort of
container or iterable.
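To make the empty-list case concrete, here is a rough sketch. The function
name longest_zero_run and the idea that the input is a string of "0" and "1"
characters are my assumptions for illustration; only max() and its default=
keyword come from the manual page.

```python
def longest_zero_run(bits):
    """Length of the longest run of '0' in a string of 0s and 1s."""
    # Splitting on "1" leaves the runs of zeros; drop the empty pieces.
    runs = [len(run) for run in bits.split("1") if run]
    # default= is what max() returns when the iterable is empty,
    # so a string with no zeros gives 0 instead of a ValueError.
    return max(runs, default=0)


print(longest_zero_run("1100010"))
print(longest_zero_run("111"))

# The older workaround: an empty list is falsy, so "or" swaps in [0].
print(max([] or [0]))
```

The default= form says what is meant; the "or [0]" form only works because
an empty list happens to be falsy.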

You learn something new every day if you look, albeit I sometimes wish I
hadn't!


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Dennis Lee Bieber
Sent: Saturday, July 9, 2022 1:23 AM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On Fri, 8 Jul 2022 12:54:02 -0400, <avi.e.gross at gmail.com> declaimed the
following:


>I am not convinced this is fantastic design. The concept of an iterable 
>is very powerful but this brings us back to the question we began with. 
>Why not have a version of max() that accepts numbers like 1,2,3 as 
>individual arguments?
>
>>> max(1, 2, 3)
3
>>> 

	In the absence of keyword arguments, the non-keyword arguments are
gathered up as a tuple -- which is an iterable.

>>> tpl = (1, 2, 3)
>>> max(tpl)
3
>>> max(*tpl)
3
>>> 

	First passes single arg tuple. Second /unpacks/ the tuple, passing
three args, which get gathered back into a tuple by max().
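	A rough pure-Python sketch of that gathering, under the name my_max (my
own stand-in for illustration; the real max() is implemented in C and also
handles key= and default=):

```python
def my_max(*args):
    """Illustrative stand-in for max(); NOT the real implementation."""
    if not args:
        raise TypeError("my_max expected at least 1 argument, got 0")
    # One positional argument: treat it as the iterable to scan.
    # Several: the gathered tuple `args` is itself the iterable.
    items = args[0] if len(args) == 1 else args
    iterator = iter(items)      # TypeError here for my_max(1)
    try:
        best = next(iterator)
    except StopIteration:
        raise ValueError("my_max() arg is an empty sequence") from None
    for item in iterator:
        if item > best:
            best = item
    return best


tpl = (1, 2, 3)
print(my_max(tpl), my_max(*tpl), my_max(1, 2, 3))
```

	A single non-iterable argument such as my_max(1) fails at iter(), which
mirrors the complaint earlier in the thread.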


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Mon Jul 11 13:26:30 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 11 Jul 2022 13:26:30 -0400
Subject: [Tutor] Python one-liners
Message-ID: <005e01d8954b$5c5f39a0$151dace0$@gmail.com>

Based on the recent discussion, I note with amusement that I just ordered
this book but note there is nothing wrong with solving simple problems in a
few lines of code, or even complex problems using code that calls
functionality already written elsewhere with just a few lines of your code.
Anyone read this yet? Should anyone?

 
Title: Python one-liners : write concise, eloquent Python like a professional

Author: Mayer, Christian (Computer Scientist)

ISBN: 9781718500501

Language: English

Publication: San Francisco : No Starch Press, [2020]

Summary: "Shows how to perform useful tasks with one line of Python code.
Begins with a brief overview of Python, then moves on to specific problems
that deal with essential topics such as regular expressions and lambda
functions, providing a concise one-liner Python solution for each"--
Provided by publisher.

Subject: Python (Computer program language)

 
From avi.e.gross at gmail.com  Mon Jul 11 13:51:01 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 11 Jul 2022 13:51:01 -0400
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <00ab01d8954e$c92bfac0$5b83f040$@gmail.com>

Nathan,

It depends on what you want to do.

Decorators are sort of syntactic sugar that you can sort of do on your own
if you wish. I am answering in general but feel free to look at some online
resources such as this:

https://realpython.com/primer-on-python-decorators/

The bottom line is that a function is just an object like anything else in
Python and you can wrap a function inside of another function so the new
function is called instead of the old one. This function, with the same
name, sort of decorates the original by having the ability to do something
using the arguments provided before calling the function, such as logging
the call, or making changes to the arguments or just doing validation and
then calling the function. After the inner (sort-of) function returns, it
can do additional things and return what it wishes to the original caller. 

I have no idea what your @object.event decorator is and what it does for
you, but presumably you can find one of several ways to apply it to a
completed function, including  slightly indirect methods albeit there may be
minor details that may not be quite right such as docstrings.
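A minimal sketch of the wrapping described above, assuming a made-up
decorator called logged (it is not your @object.event, whose behaviour I
have not checked). functools.wraps is the standard-library helper that takes
care of the docstring detail:

```python
import functools


def logged(func):
    """Hypothetical decorator: record each call before and after func runs."""
    calls = []

    @functools.wraps(func)      # copy __name__ and __doc__ onto the wrapper
    def wrapper(*args, **kwargs):
        calls.append(("before", args, kwargs))
        result = func(*args, **kwargs)
        calls.append(("after", result))
        return result

    wrapper.calls = calls       # expose the log for inspection
    return wrapper


def add(a, b):
    """Return a + b."""
    return a + b


add = logged(add)   # decorate a function that already exists
print(add.__name__, "-", add.__doc__)
```

Calling add(2, 3) afterwards still returns 5, and add.calls holds the
"before" and "after" records.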

Say you already have a function called original() and you want to wrap it in
@interior as a decorator. Could you imagine something like this:

Copy "original" to "subordinate" so you have a new function name.

Then do this:

@interior
def original(args):
    return(subordinate(args))

Again, some details are needed such as matching the args one way or another
but the above is by your definition a new function definition, albeit with
some overhead! LOL!

Yes, I know what you are going to say. This is re-decoration!

But kidding aside, as I said, it may be even simpler and you may be able to
do something as simple as:

func = myobject.event(func)

Experiment with it. Or read a bit and then experiment. 


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
nathan tech
Sent: Monday, July 11, 2022 6:59 AM
To: Tutor at python.org
Subject: [Tutor] Question about python decorators

As I understand it, decorators are usually created as:
@object.event
def function_to_be_executed():
    do_stuff

My question is, is there a way to create this after the function is created?
So:
def function():
    print("this is interesting stuff")

@myobject.event=function

Thanks
Nathan


_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Mon Jul 11 17:42:39 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 11 Jul 2022 17:42:39 -0400
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938912AFDFA73DDA870D2CE4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
 <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>
 <DB7PR07MB50938912AFDFA73DDA870D2CE4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <01cb01d8956f$24ab3c60$6e01b520$@gmail.com>

Nathan.

I don't think anyone suggested you type what you say you did:

	object.attr.method=func

That just makes object.attr.method a second name for your function. Nothing gets called, so func itself is left unchanged.

The suggestion I think several offered was to CALL object.attr.method and give it the function you want decorated, meaning "func" above, and CAPTURE the result as either a new meaning for the variable NAME called "func" or perhaps use a new handle.

The way it works is that object.attr.method not only accepts a function as an argument but also creates a brand new function that encapsulates it and RETURNS it. You want the brand new function that is an enhancement over the old one. The old one is kept alive in the process, albeit maybe not reachable directly.

You may be confused by the way decorating is done as syntactic sugar where it is indeed hard to see or understand what decorating means, let alone multiple levels as all you SEE is:

@decorator
Function definition.

So until you try the format several have offered, why expect your way to work ...?

Again, you want to try:

result = object.attr.method(func)

or to re-use the name func:

func = object.attr.method(func)


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of Nathan Smith
Sent: Monday, July 11, 2022 5:30 PM
To: tutor at python.org
Subject: Re: [Tutor] Question about python decorators

Hiya,


I figured this is how they work, so it's nice to have that confirmed!

For some reason though I have a library that is not behaving this way, to example:


@object.attr.method

def function():

  do stuff


works, but

object.attr.method=func

does not


I'm likely missing something, but I'd be curious to hear experts' opinions.

The library I am working with is:

https://github.com/j-hc/Reddit_ChatBot_Python

On 11/07/2022 18:23, Mats Wichmann wrote:
> On 7/11/22 04:58, nathan tech wrote:
>> As I understand it, decorators are usuallyed created as:
>> @object.event
>> Def function_to_be_executed():
>> Do_stuff
>>
>> My question is, is there a way to create this after the function is created?
>> So:
>> Def function():
>> Print("this is interesting stuff")
>>
>> @myobject.event=function
> You don't have to use the @ form at all, but your attempt to assign
> won't work.
>
> def function():
>      print("This is interesting stuff")
>
> function = object.event(function)
>
> Now the new version of function is the wrapped version; the original
> version (from your def statement) is held by a reference in the instance
> of the wrapper.
>
> If that's what you are asking...
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
-- 

Best Wishes,

Nathan Smith, BSC


My Website: https://nathantech.net


_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From roel at roelschroeven.net  Mon Jul 11 07:08:42 2022
From: roel at roelschroeven.net (Roel Schroeven)
Date: Mon, 11 Jul 2022 13:08:42 +0200
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <daf80a24-1a43-ce66-d840-7a3f09bcf156@roelschroeven.net>


Op 11/07/2022 om 12:58 schreef nathan tech:
> As I understand it, decorators are usuallyed created as:
> @object.event
> Def function_to_be_executed():
> Do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> Def function():
> Print("this is interesting stuff")
>
> @myobject.event=function
You can write it as

    function = myobject.event

Actually the syntax using @object.event in front of the function is 
syntactic sugar for that notation.

Example:

    import time
    import functools

    def slow():
        time.sleep(1)

    slow = functools.cache(slow)

Now slow() will only be slow on the first call, because subsequent calls 
will be served from the cache.


-- 

"Peace cannot be kept by force. It can only be achieved through understanding."
         -- Albert Einstein


From roel at roelschroeven.net  Mon Jul 11 07:11:45 2022
From: roel at roelschroeven.net (Roel Schroeven)
Date: Mon, 11 Jul 2022 13:11:45 +0200
Subject: [Tutor] Question about python decorators
In-Reply-To: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
References: <DB7PR07MB50938B20179ED25DC200B797E4879@DB7PR07MB5093.eurprd07.prod.outlook.com>
Message-ID: <21d2e0e6-a572-0ab6-811b-b6d229b51330@roelschroeven.net>

Op 11/07/2022 om 12:58 schreef nathan tech:
> As I understand it, decorators are usuallyed created as:
> @object.event
> Def function_to_be_executed():
> Do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> Def function():
> Print("this is interesting stuff")
>
> @myobject.event=function
>
Oops, sorry, there is an error in my other mail!

It should be:

    function = myobject.event(function)

instead of just

    function = myobject.event


-- 
"Peace cannot be kept by force. It can only be achieved through understanding."
         -- Albert Einstein


From __peter__ at web.de  Thu Jul 14 03:30:33 2022
From: __peter__ at web.de (Peter Otten)
Date: Thu, 14 Jul 2022 09:30:33 +0200
Subject: [Tutor] Implicit passing of argument select functions being
 called
In-Reply-To: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
Message-ID: <efa60224-02b1-3b5c-3a61-f7b5f9abffd4@web.de>

On 10/07/2022 08:51, alexander-rodis wrote:

> I'm working on a project, where accessibility is very important as it's
> addressed to non - specialists, who may even have no knowledge of coding
> of Python.
>
> In a specific section, I've come up with this API to make data
> transformation pipeline:
>
> make_pipeline(
>
>      load_data(fpath,...),
>
>      transform1(arg1,arg2,....),
>
>      ....,
>
>      transform2(arg1,arg2,....),

Frankly, I have no idea what your actual scenario might be.

If you are still interested in a comment it would help if you provide a
more concrete scenario with two or three actual transformations working
together on a few rows of toy data, with a filter and one or two globals
that you are hoping to avoid.

Actual code is far easier to reshuffle and improve than an abstraction
with overgeneralized function names and signatures where you cannot tell
the necessary elements from the artifacts of the generalization.

From avi.e.gross at gmail.com  Wed Jul 13 12:32:29 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Wed, 13 Jul 2022 12:32:29 -0400
Subject: [Tutor] Implicit passing of argument select functions being
 called
In-Reply-To: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
Message-ID: <012d01d896d6$259725e0$70c571a0$@gmail.com>

It would be helpful to understand a little more, Alexander.

Just FYI, although I know you are trying to make something simple for people, I hope you are aware of the sklearn pipeline as it may help you figure out how to make your own. Pandas has its own way too. And of course we have various ways to create a pipeline in TensorFlow and modules build on top of it like Keras and at least 4 others including some that look a bit like what you are showing. Your application may not be compatible and you want to avoid complexity but how others do things can give you ideas.

Your question though seems to be focused not on how to make a pipeline, but how you can evaluate one function after another and first load initial data and then pass that data as an argument to each successive function, replacing the data for each input with the output of the previous, and finally return the last output.

So, are you using existing functions or supplying your own?  If you want a generic data_filter to be available within all these functions, there seem to be quite a few ways you might do that if you control everything. For example, you could pass it as an argument to each function directly. Or have it available in some name space.

So could you clarify what exactly is not working now and perhaps give a concrete example? 

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of alexander-rodis
Sent: Sunday, July 10, 2022 2:51 AM
To: tutor at python.org
Subject: [Tutor] Implicit passing of argument select functions being called

I'm working on a project, where accessibility is very important as it's addressed to non - specialists, who may even have no knowledge of coding of Python.

In a specific section, I've come up with this API to make data transformation pipeline:

make_pipeline(

     load_data(fpath,...),

     transform1(arg1,arg2,....),

     ....,

     transform2(arg1,arg2,....),

     transform3(arg1,arg2,....),

     transformN(arg1,arg2,....),

)

transform1 ... transformN are all callables (I'm writing this in a functional style), decorated with a partial call in the background. make_pipeline just uses a for loop to call the functions successively, like so: e = arg(e). Which transforms are used, out of the many available in the library I'm writing, will vary between uses. load_data returns a pandas DataFrame.

Some transforms may need to operate on sub-arrays. I've written functions (also decorated with a partial call) to select the sub-array and return it to its original shape. The problem is, I currently have to pass the data filtering function explicitly as an argument to each function that needs it, but this seems VERY error prone, even if I document it carefully. I want the filter function to be specified in one place and made automatically available to all transforms that need it. Something roughly equivalent to:

make_pipeline(

     load_data(fpath,...),

     transform1(arg1,arg2,....),

     ....,

     transform2(arg1,arg2,....),

     transform3(arg1,arg2,....),

     transformN(arg1,arg2,....),

     data_filter = simple_filter(start=0,) )

I thought all aliases local to the caller, make_pipeline, become automatically available to the called functions. This seems to work, but only on some small toy examples, not in this case. global is usually considered bad practice, so I'm trying to avoid it, and I'm not using any classes, so no OOP. I've also tried using inspect.signature to check whether each callable accepts a certain argument and pass it if that's the case; however, this raises an "incorrect signature error" which I couldn't find documented anywhere. I've also considered passing it to all functions with a try/except and ignoring thrown errors, but it seems this could also be error prone, namely it could catch other errors too. So my question is: is there an elegant and Pythonic way to specify data_filter in only one place and implicitly pass it to all functions that need it, without global or classes?


Thanks

_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From alexanderrhodis at gmail.com  Thu Jul 14 03:56:31 2022
From: alexanderrhodis at gmail.com (Αλέξανδρος Ρόδης)
Date: Thu, 14 Jul 2022 10:56:31 +0300
Subject: [Tutor] Implicit passing of argument select functions being
 called
In-Reply-To: <efa60224-02b1-3b5c-3a61-f7b5f9abffd4@web.de>
References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
 <efa60224-02b1-3b5c-3a61-f7b5f9abffd4@web.de>
Message-ID: <CAPzH9RU0fr812=Qo0qha8kcC_CNhBG9nu0enVtpMx2nAxu4YuQ@mail.gmail.com>

I've finally figured out a solution using inspect.signature. I'll leave it
here in case it helps someone else.


The actual functions would look like this:
`
dfilter = simple_filter(start_col = 6,)
make_pipeline(
  load_dataset(fpath="some/path.xlsx",
               header=[0,1]),
  apply_internal_standard(target_col = "EtD5"),
  export_to_sql(fname  = "some/name"),
  data_filter = dfilter
  )

`
Though I doubt the actual code cleared things up: if make_pipeline is a
for loop and, let's say, apply_internal_standard needs access to dfilter
(along with others not shown here), a pretty good solution is

`
import inspect

def make_pipeline(*args, **kwargs):
  D = args[0]
  for arg in args[1:]:
    # Pass data_filter only to callables whose signature accepts it.
    k = inspect.signature(arg).parameters.keys()
    if "data_filter" in k:
      D = arg(D, data_filter=kwargs["data_filter"])
    else:
      D = arg(D)
  return D
`
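For anyone who wants to try the idea without the real library, here is a
self-contained toy sketch. Everything in it is made up for illustration:
plain lists stand in for the DataFrame, and scale, add and their helpers are
hypothetical transforms built with functools.partial, which is my guess at
what "decorated with a partial call" amounts to.

```python
import inspect
from functools import partial


def make_pipeline(*steps, **kwargs):
    """Feed steps[0] through the remaining callables, one after another."""
    data = steps[0]
    for step in steps[1:]:
        params = inspect.signature(step).parameters
        # Hand data_filter only to the steps that declare it.
        if "data_filter" in params and "data_filter" in kwargs:
            data = step(data, data_filter=kwargs["data_filter"])
        else:
            data = step(data)
    return data


def _scale(data, factor, data_filter=None):
    keep = data_filter or (lambda x: True)
    return [x * factor if keep(x) else x for x in data]


def _add(data, amount):
    return [x + amount for x in data]


def scale(factor):          # hypothetical transform that uses the filter
    return partial(_scale, factor=factor)


def add(amount):            # hypothetical transform that ignores it
    return partial(_add, amount=amount)


result = make_pipeline(
    [1, 2, 3, 4],
    scale(10),
    add(1),
    data_filter=lambda x: x % 2 == 0,
)
print(result)
```

inspect.signature works on partial objects too, so the check sees
data_filter on scale(10) but not on add(1).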

On Thu, Jul 14, 2022, 10:36 Peter Otten <__peter__ at web.de> wrote:

> On 10/07/2022 08:51, alexander-rodis wrote:
>
> > I'm working on a project, where accessibility is very important as it's
> > addressed to non - specialists, who may even have no knowledge of coding
> > of Python.
> >
> > In a specific section, I've come up with this API to make data
> > transformation pipeline:
> >
> > make_pipeline(
> >
> >      load_data(fpath,...),
> >
> >      transform1(arg1,arg2,....),
> >
> >      ....,
> >
> >      transform2(arg1,arg2,....),
>
> Frankly, I have no idea what your actual scenario might be.
>
> If you are still interested in a comment it would help if you provide a
> more concrete scenario with two or three actual transformations working
> together on a few rows of toy data, with a filter and one or two globals
> that you are hoping to avoid.
>
> Actual code is far easier to reshuffle and improve than an abstraction
> with overgeneralized function names and signatures where you cannot tell
> the necessary elements from the artifacts of the generalization.
>

From manpritsinghece at gmail.com  Sat Jul 16 05:26:22 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sat, 16 Jul 2022 14:56:22 +0530
Subject: [Tutor] Ways of removing consequtive duplicates from a list
Message-ID: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>

Dear Sir ,

I was just doing an experiment of removing consecutive duplicates from a
list. I did it in the following ways and they all worked. I just need to
know which one should be preferred. Which one is better?

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]
# Ways of removing consecutive duplicates
[ele for i, ele in enumerate(lst) if i==0 or ele != lst[i-1]]
[2, 3, 2, 5, 6, 3]
val = object()
[(val := ele) for ele in lst if ele != val]
[2, 3, 2, 5, 6, 3]
import itertools
[val for val, grp in itertools.groupby(lst)]
[2, 3, 2, 5, 6, 3]

Is there anything else more efficient ?

Regards
Manprit Singh

From avi.e.gross at gmail.com  Sat Jul 16 11:45:54 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sat, 16 Jul 2022 11:45:54 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
Message-ID: <005001d8992b$227b77b0$67726710$@gmail.com>

Manprit,

Your message is not formatted properly in my email and you just asked any
women present to not reply to you, nor anyone who has not been knighted by a
Queen. I personally do not expect such politeness but clearly some do.

What do you mean by most efficient? Seriously.

For a list this size, almost any method runs fast enough. Efficiency
considerations may still apply but mainly consist of startup costs that can
differ quite a bit and even change when some of the underlying functionality
is changed such as to fix bugs, or deal with special cases or added options.

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]


So are you going to do the above ONCE or repeatedly in your program? There
are modules and methods available for testing, say by running your choice a
million times, that might provide you with numbers. Asking people here will
probably get you mostly opinions or guesses. And it is not clear why you
need to know what is more efficient unless the assignment asks you to think
analytically and the thinking is supposed to be by you.
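
One such module is timeit from the standard library. The following is only
a sketch of how the three choices could be timed; the wrapper function names
(choice_one and so on) and the repeat count are made up for illustration, and
the numbers will differ from machine to machine and between Python versions:

```python
import itertools
import timeit

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]

def choice_one(seq):
    # Compare each element with its left neighbour by index.
    return [ele for i, ele in enumerate(seq) if i == 0 or ele != seq[i - 1]]

def choice_two(seq):
    # Remember the previous element with an assignment expression (3.8+).
    val = object()
    return [(val := ele) for ele in seq if ele != val]

def choice_three(seq):
    # Let itertools.groupby collapse runs of equal neighbours.
    return [val for val, grp in itertools.groupby(seq)]

# Time 10,000 calls of each choice; the count is an arbitrary assumption.
timings = {
    func.__name__: timeit.timeit(lambda func=func: func(lst), number=10_000)
    for func in (choice_one, choice_two, choice_three)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 10,000 calls")
```

Repeating the run with a much longer list is worthwhile, since fixed startup
costs matter less as the input grows.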

Here are your choices that I hopefully formatted in a way that lets them be
seen. But first, this is how your original looked:

[ele for i, ele in enumerate(lst) if i==0 or ele != lst[i-1]] [2, 3, 2, 5,
6, 3] val = object() [(val := ele) for ele in lst if ele != val] [2, 3, 2,
5, 6, 3] import itertools [val for val, grp in itertools.groupby(lst)] [2,
3, 2, 5, 6, 3]

The first one looks like a list comprehension albeit it is not easy to see
where it ends. I stopped when I hit an "or" but the brackets were not
finished:

[ele for i, ele in enumerate(lst) if i==0


And even with a bracket, it makes no sense!

So I read on:

#-----Choice ONE:
[ele for i, ele in enumerate(lst) if i==0 or ele != lst[i-1]]

OK, that worked and returned: [2, 3, 2, 5, 6, 3]

But your rendition shows the answer "[2, 3, 2, 5, 6, 3]"  which thus is not
code so I remove that and move on:

val = object() [(val := ele) for ele in lst if ele != val]

This seems to be intended as two lines:

#-----Choice TWO:
val = object() 
[(val := ele) for ele in lst if ele != val]

And yes it works and produces the same output I can ignore.

By now I know to make multiple lines as needed:

#-----Choice THREE:
import itertools 
[val for val, grp in itertools.groupby(lst)]

So how would you analyze the above three choices, once unscrambled? I am not
going to tell you what I think.

What do they have in common?

What I note is they are all the SAME in one way. All use a list
comprehension. If one would have used loops for example, that might be a
factor as they tend to be less efficient in python. But they are all the
same.

So what else may be different?

Choice THREE imports a module. There is a one-time cost involved, and note
that Python loads the entire module whether you write "import itertools" or
"from itertools import groupby", so the import adds a somewhat constant
cost. But if the module is already used elsewhere in your program, it is
sort of a free cost to use it here and if you use this method on large lists
or many times, the cost per unit drops. How much this affects efficiency is
something you might need to test and even then may vary.

Do you know what "enumerate()" does in choice ONE? It can really matter in
deciding what is efficient. If I have a list a million or billion units
long, will enumerate make another list of numbers from 0 to N-1 that long in
memory, or will it make an iterator that is called repeatedly to make the
next pair for you?
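
One way to check this for yourself, as a small sketch: hand enumerate() a
huge range and see whether it returns at once. The variable names below are
only for illustration.

```python
import sys

big = range(10**9)        # a lazy sequence; no billion-element list is built
pairs = enumerate(big)    # returns immediately, nothing is counted yet

print(type(pairs).__name__)    # enumerate
print(iter(pairs) is pairs)    # True: the object is its own iterator

first = next(pairs)            # pairs are produced one at a time on demand
second = next(pairs)
print(first, second)           # (0, 0) (1, 1)

# The enumerate object itself stays small regardless of the input length.
print(sys.getsizeof(pairs), "bytes")
```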

Choices ONE and TWO both have visible IF clauses but the first one has an
OR with two parts to test. In general, the more tests or other things done
in a loop, compared to a loop of the same number of iterations, the more
expensive it can be. But a careful study of the code

if i==0 or ele != lst[i-1]

suggests the first condition is only true the very first time but is
evaluated all N times so the second condition is evaluated N-1 times.
Basically, both are done with no real savings.

Choice TWO has a single test in the if, albeit it is for an arbitrary object
which can be more or less expensive depending on the object. The first
condition in choice ONE was a fairly trivial integer comparison and the
second, again, could be for any object. So these algorithms should work on
things other than integers.

Consider this list containing tuples and sets:

obj_list = [ (1,2), (1,2,3), (1,2,3), {"a", 1}, {"b", 2}, {"b", 2} ]

Should this work?

[ele for i, ele in enumerate(obj_list) if i==0 or ele != obj_list[i-1]]

[(1, 2), (1, 2, 3), {1, 'a'}, {'b', 2}]

I think it worked but the COMPARISONS between objects had to be more complex
and thus less efficient than for your initial example. So the number and
type of comparisons can be a factor in your analysis depending on how you
want to use each algorithm.

For completeness, I also tried the other two algorithms using this alternate
test list:

[(val := ele) for ele in obj_list if ele != val]

[(1, 2), (1, 2, 3), {1, 'a'}, {'b', 2}]

And

[val for val, grp in itertools.groupby(obj_list)]

[(1, 2), (1, 2, 3), {1, 'a'}, {'b', 2}]

Which brings us to the latter. What exactly does the groupby() function do?

If it is an iterator, and it happens to be, it may use less memory but for
small examples, the iterator overhead may be more than just using a short
list, albeit lists are iterables too (though not iterators themselves).

You can look at these examples analytically and find more similarities and
differences but at some point you need benchmarks to really know. The
itertools module is often highly optimized, meaning in many cases being done
mostly NOT in interpreted python but in C or C++ or whatever. If you wrote a
python version of the same idea, it might be less efficient. And in this
case, it may be overkill. I mean do you know what is returned by groupby? A
hint is that it returns TWO things and you are only using one. The second is
nonsense for your example as you are using the default function that
generates keys based on a sort of equality so all the members of the group
are the same. But the full power of groupby is if you supply a function
that guides choices such as wanting all items that are the same if written
in all UPPER case.
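
To make the "two things" concrete, here is a short sketch using a made-up
list of words. It shows the key and the group that groupby hands back, first
with the default key and then with key=str.upper:

```python
import itertools

words = ["apple", "APPLE", "Apple", "pear", "PEAR", "apple"]

# Default key: neighbours must compare equal to share a group,
# so nothing collapses here.
default_keys = [key for key, grp in itertools.groupby(words)]
print(default_keys)
# ['apple', 'APPLE', 'Apple', 'pear', 'PEAR', 'apple']

# key=str.upper: neighbours that match once upper-cased share a group.
# Each grp is itself an iterator over the original items in that run.
collapsed = [(key, list(grp))
             for key, grp in itertools.groupby(words, key=str.upper)]
print(collapsed)
# [('APPLE', ['apple', 'APPLE', 'Apple']), ('PEAR', ['pear', 'PEAR']),
#  ('APPLE', ['apple'])]

# Keeping the first original item of each run instead of the key:
first_of_each = [next(grp)
                 for key, grp in itertools.groupby(words, key=str.upper)]
print(first_of_each)
# ['apple', 'pear', 'apple']
```

Note that the last 'apple' forms its own group: groupby only merges adjacent
items, which is exactly the behaviour wanted for consecutive duplicates.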

So my guess is the itertools module chosen could be more than you need. But
if it is efficient and the defaults click right in and do the job, who
knows? My guess is there is more cost than the others for simple things but
perhaps not for more complex things. Note that it does not hash anything:
like the others, it compares each key with the one before it, which is why
it coped with the unhashable sets in the test list above.

I hope my thoughts are helpful even if they do not provide a single
unambiguous answer. They all seem like reasonable solutions and probably
NONE of them would be expected if this was homework for a class just getting
started. That class would expect a solution for a single type of object such
as small integers and a fairly trivial implementation in a loop that may be
an unrolled variant of perhaps close to choice TWO. Efficiency might be a
secondary concern, if at all.

And for really long lists, weirdly, I might suggest a variant that starts by
adding a unique item in front of the list and then removing it from the
results at the end.

 

From alexkleider at gmail.com  Sat Jul 16 18:01:24 2022
From: alexkleider at gmail.com (Alex Kleider)
Date: Sat, 16 Jul 2022 15:01:24 -0700
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <005001d8992b$227b77b0$67726710$@gmail.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID: <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>

On Sat, Jul 16, 2022 at 12:54 PM <avi.e.gross at gmail.com> wrote:
>
> Manprit,
>
> Your message is not formatted properly in my email and you just asked any
> women present to not reply to you, nor anyone who has not been knighted by a
> Queen. I personally do not expect such politeness but clearly some do.
>

I confess it took me longer than it should have to figure out to what
you were referring in the second half of the above but eventually the
light came on and the smile blossomed!
My next thought was that it wouldn't necessarily have had to have been
a Queen although anyone knighted (by a King) prior to the beginning of
our current Queen's reign is unlikely to be even alive let alone
interested in this sort of thing.
Thanks for the morning smile!
a
PS My (at least for me easier to comprehend) solution:

def rm_duplicates(iterable):
    last = ''
    for item in iterable:
        if item != last:
            yield item
            last = item

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]

if __name__ == '__main__':
    res = [res for res in rm_duplicates(lst)]
    print(res)
    assert res == [2, 3, 2, 5, 6, 3]

-- 
alex at kleider.ca  (sent from my current gizmo)

From wlfraed at ix.netcom.com  Sat Jul 16 20:17:15 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sat, 16 Jul 2022 20:17:15 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID: <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>

On Sat, 16 Jul 2022 11:45:54 -0400, <avi.e.gross at gmail.com> declaimed the
following:

>Your message is not formatted properly in my email and you just asked any

	Just a comment: Might be your client -- it did come in as correctly
"broken" lines in Gmane's news-server gateway to the mailing list.

	OTOH: I had problems with a genealogy mailing list (not available as a
"news group" on any server) with some posts. They are formatted properly
when reading, but become one snarled string when quoted in a reply. But
only from one or two posters -- so it is a combination of posting client vs
reading client... <G>


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From bouncingcats at gmail.com  Sat Jul 16 20:37:25 2022
From: bouncingcats at gmail.com (David)
Date: Sun, 17 Jul 2022 10:37:25 +1000
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
Message-ID: <CAMPXz=p0u0Uj27V3jovP-7KguOeWX_HXauME_NAABKau5hm_dw@mail.gmail.com>

On Sun, 17 Jul 2022 at 10:18, Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> On Sat, 16 Jul 2022 11:45:54 -0400, <avi.e.gross at gmail.com> declaimed the following:
>
> >Your message is not formatted properly in my email and you just asked any
>
>         Just a comment: Might be your client -- it did come in as correctly
> "broken" lines in Gmane's news-server gateway to the mailing list.

In case it is helpful for avi.e.gross in future, the original message
from Manprit Singh is formatted correctly in the mailing list archive
when I view it in my web browser at
  https://mail.python.org/pipermail/tutor/2022-July/119936.html

From __peter__ at web.de  Sun Jul 17 04:26:37 2022
From: __peter__ at web.de (Peter Otten)
Date: Sun, 17 Jul 2022 10:26:37 +0200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
Message-ID: <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>

On 17/07/2022 00:01, Alex Kleider wrote:

> PS My (at least for me easier to comprehend) solution:
>
> def rm_duplicates(iterable):
>      last = ''
>      for item in iterable:
>          if item != last:
>              yield item
>              last = item

The problem with this is the choice of the initial value for 'last':

 >>> list(rm_duplicates(["", "", 42, "a", "a", ""]))
[42, 'a', '']   # oops, we lost the initial empty string

Manprit avoided that in his similar solution by using a special value
that will compare false except in pathological cases:

 > val = object()
 > [(val := ele) for ele in lst if ele != val]

Another fix is to yield the first item unconditionally:

def rm_duplicates(iterable):
     it = iter(iterable)
     try:
         last = next(it)
     except StopIteration:
         return
     yield last
     for item in it:
         if item != last:
             yield item
             last = item

If you think that this doesn't look very elegant you may join me in the
https://peps.python.org/pep-0479/ haters' club ;)

From avi.e.gross at gmail.com  Sun Jul 17 00:52:20 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 17 Jul 2022 00:52:20 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
Message-ID: <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>

Dennis,

I did not blame the sender as much as say what trouble I had putting
together in proper order what THIS mailer showed me. I am using Microsoft
Outlook getting mail with IMAP4 from gmail and also forwarding a copy to AOL
mail which was also driving me nuts by making my text look like that when I
sent. Oddly, that one received the lines nicely!!!

Standards annoy some people but some standards really would be helpful if
various email vendors agreed to implement things as much the same as
possible. Realistically, many allow all kinds of customization. For some
things it matters less but for code and ESPECIALLY code in python where
indentation is part of the language, it is frustrating.

I may be getting touchy without the feely, but I am having trouble listening
to the way some people with cultural differences, or far left/right
attitudes, try to address me/us in forums like this. Alex may have been
amused by my retort, and there is NOTHING wrong with saying "Dear Sirs" when
done in many contexts, just like someone a while ago was writing to
something like "Esteemed Professors" but it simply rubs me wrong here.

Back to topic, if I may, sometimes things set our moods. I am here partially
to be helpful and partially for my own amusement and education as looking at
some of the puzzles presented presents opportunities to think and
investigate.

But before I could get to the reasonable question here, I was perturbed at
the overly formulaic politeness and wrongness of the greeting from my
perhaps touchy perspective for the reasons mentioned including the way it
seemingly assumes no Ladies are present and we are somehow Gentlemen, but also
by the mess I saw on one wrapped line that was a pain to take apart. Then I
wondered why the question was being asked. Yes, weirdly, it is a question
you and I have discussed before when wondering which way of doing something
worked better, was more efficient, or showed a more brilliant way to use the
wrong method to do something nobody designed it for!

But as this is supposed to be a TUTORIAL or HELP website, even if Alan
rightfully may disagree and it is his forum, I am conscious of not wanting
to make this into a major discussion group where the people we want to help
just scratch their heads.

I am not sure who read my longish message, but I hope the main point is that
sometimes you should just TEST it. This is not long and complex code.
However, there cannot be any one test everyone will agree on and it often
depends on factors other than CPU cycles. A robust implementation that can
handle multiple needs may well be slower and yet more cost effective in some
sense.

I have mentioned I do lots of my playing around with other languages too.
Many have a minor variant of the issue here as in finding unique items in
some collection such as a vector or data.frame. The problem here did not say
whether the data being used can be in random order or already has all
instances of the same value in order or is even in sorted order. Has anyone
guessed if that is the case?

Because if something is already sorted as described, such as
[0,0,1,1,1,1,2,4,4,4,4,4,4,5,5] then there are even more trivial solutions
by using something like numpy.unique() using just a few lines and I wonder
how efficient this is:

>>> import numpy as np
>>> np.unique([0,0,1,1,1,1,2,4,4,4,4,4,4,5,5] )
array([0, 1, 2, 4, 5])

Admittedly this is a special case. But only the one asking the question can
tell us if that is true. It also works with character data and probably much
more:

>>> np.unique(["a", "a", "b", "b", "b", "c"])
array(['a', 'b', 'c'], dtype='<U1')

But this was not offered as one of his three choices, so never mind!
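
To see why the sorted-input assumption matters, here is a sketch in plain
Python with no numpy required. dict.fromkeys is used only as an
order-preserving stand-in for "drop every repeat"; numpy.unique additionally
returns its result sorted. The helper names are invented for this example.

```python
import itertools

def drop_consecutive(seq):
    # Collapse runs of equal neighbours only.
    return [key for key, _ in itertools.groupby(seq)]

def drop_all_duplicates(seq):
    # Keep the first occurrence of each value (dicts keep insertion order).
    return list(dict.fromkeys(seq))

sorted_data = [0, 0, 1, 1, 1, 1, 2, 4, 4, 4, 4, 4, 4, 5, 5]
unsorted_data = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]

# On sorted input the two ideas agree.
print(drop_consecutive(sorted_data))       # [0, 1, 2, 4, 5]
print(drop_all_duplicates(sorted_data))    # [0, 1, 2, 4, 5]

# On the list from the original question they do not.
print(drop_consecutive(unsorted_data))     # [2, 3, 2, 5, 6, 3]
print(drop_all_duplicates(unsorted_data))  # [2, 3, 5, 6]
```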



From avi.e.gross at gmail.com  Sun Jul 17 12:59:15 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 17 Jul 2022 12:59:15 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
Message-ID: <00b001d899fe$8c3cb000$a4b61000$@gmail.com>

You could make the case, Peter, that you can use anything as a start that
will not likely match in your domain. You are correct if an empty string may
be in the data. 

Now an object returned by object() is pretty esoteric and ought to be rare
and indeed each new object seems to be individual.

val=object()

[(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
-->> [1, 2, <object object at 0x00000176F33150D0>, 3]

So the only way to trip this up is to use the same object or another
reference to it where it is silently ignored.

[(val := ele) for ele in [1,1,2,val,3,3,3] if ele != val]
-->> [1, 2, 3]

valiant = val
[(val := ele) for ele in [1,1,2,valiant,3,3,3] if ele != val]
-->> [1, 2, 3]

But just about any out-of-band will presumably do. I mean if you are
comparing just numbers, all you need do is slip in something else like "The
quick brown fox jumped over the lazy dog" or float("inf") or even val =
(math.inf, -math.inf) and so on.

I would have thought also of using the special value of None and it works
fine unless the list has a None!

So what I see here is a choice between a heuristic solution that can fail to
work quite right on a perhaps obscure edge case, or a fully deterministic
algorithm that knows which is the first and treats it special.

The question asked was about efficiency, so let me ask a dumb question.

Is there a difference in efficiency of comparing to different things over
and over again in the loop? I would think so. Comparing to None could turn
out to be trivial. math.inf as implemented in Python is the IEEE 754
floating-point infinity, the same value as float("inf"), and I have no idea
what an object() looks like but assume it is an instance of the parent class
of all other objects and thus has no content but some odd methods attached.
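
A few of these guesses can be checked directly. The sketch below wraps the
walrus version in a helper called dedupe (a name chosen here, not from the
thread) and also shows two "pathological" inputs of the kind Peter alluded
to; the AlwaysEqual class is invented purely for the demonstration.

```python
import math

sentinel = object()

print(math.inf == float("inf"))     # True: the same floating-point infinity
print(sentinel != 1, sentinel != "", sentinel != 0.0)   # True True True
print(sentinel != sentinel)         # False: a bare object() only equals itself

def dedupe(seq):
    # The walrus-based approach from the thread, wrapped in a function.
    val = object()
    return [(val := ele) for ele in seq if ele != val]

# NaN never compares equal to anything, itself included, so no run of
# NaNs is ever collapsed.
nan = float("nan")
nan_result = dedupe([nan, nan, nan])
print(len(nan_result))              # 3

class AlwaysEqual:
    # Hypothetical class that claims to equal everything.
    def __eq__(self, other):
        return True

# The first item "equals" the sentinel, so it is silently dropped.
odd_result = dedupe([AlwaysEqual(), 1])
print(odd_result)                   # [1]
```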

Clearly the simplest comparison might be variable depending on what the data
you are working on is.

So, yes, an unconditional way of dealing with the first item often is
needed. It is very common in many algorithms for the first and perhaps last
item to have no neighbor on one side.


From avi.e.gross at gmail.com  Sun Jul 17 14:02:23 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 17 Jul 2022 14:02:23 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
Message-ID: <015d01d89a07$5e34b230$1a9e1690$@gmail.com>

I was thinking of how expensive it is to push a copy of the first item in
front of a list to avoid special casing in this case.

I mean convert [first, ...] to [first, first, ...]

That neatly can deal with some algorithms such as say calculating a moving
average if you want the first N items to simply be the same as the start
item or a pre-calculated mean of the cumulative sum to that point, rather
than empty or an error.

But I realized that with the Python emphasis on iterables, there
probably is no easy way to push items into a sort of queue. Maybe you can
somewhat do it with a decorator that intercepts your first calls by
supplying the reserved content and only afterwards calls the iterator. But
as there are so many reasonable solutions, not an avenue needed to explore.
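
For the record, one standard-library option that may serve here is
itertools.chain, which lazily puts extra items in front of any iterable
without copying it. The helper name pad_front and its copies parameter are
assumptions made up for this sketch:

```python
import itertools

def pad_front(iterable, copies=1):
    """Return an iterator over the input with its first item repeated
    'copies' extra times at the front; an empty input stays empty."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        return iter(())
    # 'first' was consumed above, so emit it copies + 1 times,
    # then continue with the untouched remainder of the iterator.
    return itertools.chain(itertools.repeat(first, copies + 1), it)

print(list(pad_front([5, 6, 7])))                 # [5, 5, 6, 7]
print(list(pad_front(iter([5, 6, 7]), copies=2))) # [5, 5, 5, 6, 7]
print(list(pad_front([])))                        # []
```

Because nothing is materialized, this works for generators and other
one-shot iterators as well as for lists.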


From wlfraed at ix.netcom.com  Sun Jul 17 21:21:30 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sun, 17 Jul 2022 21:21:30 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
 <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
Message-ID: <tbc9dh131q2gi8iolg4le0g7r1vhg2ub0q@4ax.com>

On Sun, 17 Jul 2022 00:52:20 -0400, <avi.e.gross at gmail.com> declaimed the
following:

>I did not blame the sender as much as say what trouble I had putting
>together in proper order what THIS mailer showed me. I am using Microsoft
>Outlook getting mail with IMAP4 from gmail and also forwarding a copy to AOL
>mail which was also driving me nuts by making my text look like that when I
>sent. Oddly, that one received the lines nicely!!!
>
	Personally, I don't trust anything gmail and/or GoogleGroups produces
<G>... And, hopefully without offending, I find M$ Outlook to be the next
offender. My experience (when I was employed, and the companies used
Outlook as the only email client authorized) was that it went out of its
way to make it almost impossible to respond in accordance with RFC1855

"""
    - If you are sending a reply to a message or a posting be sure you
      summarize the original at the top of the message, or include just
      enough text of the original to give a context.  This will make
      sure readers understand when they start to read your response.
      Since NetNews, especially, is proliferated by distributing the
      postings from one host to another, it is possible to see a
      response to a message before seeing the original.  Giving context
      helps everyone.  But do not include the entire original!
"""

	Outlook, in my experience, attempts to replicate corporate mail
practices of ages past. Primarily by treating "quoted content" as if it
were a photocopy being attached as a courtesy copy/reminder (I've seen
messages that had something like 6 or more levels of indentation and
font-size reductions as it just took the content of a post, applied an
indent (not a standard > quote marker) shift and/or font reduction). 

	Attempting to do a trim to relevant content with interspersed reply
content was nearly impossible as one had to figure out how to undo the
style at the trim point -- otherwise one's inserted text ended up looking
just like the quoted text and not as new content.

	I'll admit that I've seen configuration options to allow for something
closer to RFC1855 format... But they were so buried most people never see
them. Instead we get heavily styled HTML for matters which only need simple
text.


>
>I may be getting touchy without the feely, but I am having trouble listening
>to the way some people with cultural differences, or far left/right
>attitudes, try to address me/us in forums like this. Alex may have been
>amused by my retort, and there is NOTHING wrong with saying "Dear Sirs" when
>done in many contexts, just like someone a while ago was writing to
>something like "Esteemed Professors" but it simply rubs me wrong here.
>
	The one that most affects me are those that start out with: "I have
doubt..." where "doubt" is being used in place of "question".


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From PythonList at DancesWithMice.info  Sun Jul 17 23:34:16 2022
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 18 Jul 2022 15:34:16 +1200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
Message-ID: <afb076e9-33f5-99e9-bfe8-b5ffa52ff1ec@DancesWithMice.info>

On 17/07/2022 20.26, Peter Otten wrote:
> On 17/07/2022 00:01, Alex Kleider wrote:
> 
>> PS My (at least for me easier to comprehend) solution:
>>
>> def rm_duplicates(iterable):
>>     last = ''
>>     for item in iterable:
>>         if item != last:
>>             yield item
>>             last = item
> 
> The problem with this is the choice of the initial value for 'last':

Remember "unpacking", eg

>> def rm_duplicates(iterable):
>>      current, *the_rest = iterable
>>      for item in the_rest:

Then there is the special case, which (assuming it is possible) can be
caught as an exception - which will likely need to 'ripple up' through
the function-calls because the final collection of 'duplicates' will be
empty/un-process-able. (see later comment about "unconditionally")


Playing in the REPL:

>>> iterable = [1,2,3]
>>> first, *rest = iterable
>>> first, rest
(1, [2, 3])
# rest is a list

>>> iterable = [1,2]
>>> first, *rest = iterable
>>> first, rest
(1, [2])
# rest is (technically) a list

>>> iterable = [1]
>>> first, *rest = iterable
>>> first, rest
(1, [])
# rest is an empty list

>>> iterable = []
>>> first, *rest = iterable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected at least 1, got 0)
# nothing to see here: no duplicates - and no 'originals' either!
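
Put together, the unpacking idea might look like the sketch below. Two
choices here are assumptions rather than anything stated above: the empty
case is handled by returning quietly instead of letting the exception ripple
up, and the input is assumed to fit in memory, because star-unpacking copies
the remainder into a new list.

```python
def rm_duplicates(iterable):
    try:
        # ValueError is raised here when the input is empty.
        last, *the_rest = iterable
    except ValueError:
        return              # nothing to yield for an empty input
    yield last              # the first item is always kept
    for item in the_rest:
        if item != last:
            yield item
            last = item

print(list(rm_duplicates(["", "", 42, "a", "a", ""])))
# ['', 42, 'a', '']
print(list(rm_duplicates([])))
# []
```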


>  >>> list(rm_duplicates(["", "", 42, "a", "a", ""]))
> [42, 'a', '']   # oops, we lost the initial empty string
> 
> Manprit avoided that in his similar solution by using a special value
> that will compare false except in pathological cases:
> 
>> val = object()
>> [(val := ele) for ele in lst if ele != val]
> 
> Another fix is to yield the first item unconditionally:
> 
> def rm_duplicates(iterable):
>     it = iter(iterable)
>     try:
>         last = next(it)
>     except StopIteration:
>         return
>     yield last
>     for item in it:
>         if item != last:
>             yield item
>             last = item
> 
> If you think that this doesn't look very elegant you may join me in the
> https://peps.python.org/pep-0479/ haters' club ;)

This does indeed qualify as 'ugly'. However, it doesn't need to be
expressed in such an ugly fashion!

--
Regards,
=dn

From PythonList at DancesWithMice.info  Mon Jul 18 01:34:08 2022
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 18 Jul 2022 17:34:08 +1200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
 <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
Message-ID: <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>

>> ... you just asked any
>> women present to not reply to you, nor anyone who has not been knighted by a
>> Queen. I personally do not expect such politeness but clearly some do.
>>
> 
> I confess it took me longer than it should have to figure out to what
> you were referring in the second half of the above but eventually the
> light came on and the smile blossomed!
> My next thought was that it wouldn't necessarily have had to have been
> a Queen although anyone knighted (by a King) prior to the beginning of
> our current Queen's reign is unlikely to be even alive let alone
> interested in this sort of thing.
> Thanks for the morning smile!

I've not been knighted, but am frequently called "Sir". Maybe when you
too have a grey beard? (hair atop same head, optional)

It is not something that provokes a positive response, and if in a
grumpy-mood may elicit a reply describing it as to be avoided "unless we
are both in uniform".

Thus, the same words hold different dictionary-meanings and different
implications for different people, in different contexts, and between
cultures!

I'm not going to invoke any Python community Code of Conduct terms or
insist upon 'political-correctness', but am vaguely-surprised that
someone has not...

The gender observation is appropriate, but then how many of the OP's
discussions feature responses from other than males?
(not that such observation could be claimed as indisputable)

Writing 'here', have often used constructions such as "him/her", with
some question about how many readers might apply each form...


The dislocation in response to the OP is cultural. (In this) I have
advantage over most 'here', having lived and worked in India.
(also having been brought-up, way back in the last century, in an
old-English environment, where we were expected to address our elders
using such titles. Come to think of it, also in the US of those days)

In India, and many other parts of Asia, the respectful address of
teachers, guides, and elders generally, is required-behavior. In the
Antipodes, titles of almost any form are rarely used, and most will
exchange first-names at an introduction. Whereas in Germany (for
example), the exact-opposite applies, and one must remember to use the
Herr-s, Doktor-s, du-forms etc.

How to cope when one party is at the opposite 'end' of the scale from
another? I'm reminded of 'Postel's law': "Be liberal in what you accept,
and conservative in what you send".

Is whether someone actually knows what they're talking-about more
relevant (and more telling) than their qualifications, rank, title,
whatever - or does that only apply in the tech-world where we seem to
think we can be a technocracy?

Living in an 'immigrant society' today, and having gone through such a
process (many times, in many places) I'm intrigued by how quickly - or
how slowly, some will adapt to the local culture possibly quite alien to
them. Maybe worst of all are the ones who observe, but then react by
assuming (or claiming) superiority - less an attempt to 'fit in', but
perhaps an intent to be 'more equal'...


> I may be getting touchy without the feely, but I am having trouble listening
> to the way some people with cultural differences, or far left/right
> attitudes, try to address me/us in forums like this. Alex may have been
> amused by my retort, and there is NOTHING wrong with saying "Dear Sirs" when

Disagree: when *I* read the message, I am me. I am in the singular. When
*you* write, you (singular) are writing to many of us (plural). Who is
the more relevant party to the communication?

Accordingly, "Dear Sir" not "Sirs" - unless you are seeking a collective
or corporate reply, eg from a firm of solicitors.
(cf the individual replies (plural, one might hope) you expect from
multiple individuals - who happen to be personal-members of the
(collective) mailing-list).


> done in many contexts, just like someone a while ago was writing to
> something like "Esteemed Professors" but it simply rubs me wrong here.

Like it appears do you, I quickly lose respect for 'esteemed
professors/professionals' who expect this, even revel in it.

However, if one is a student or otherwise 'junior', it is a
career-limiting/grade-reducing move not to accede!

That said, two can play at that game: someone wanting to improve his/her
grade (or seeking some other favor) will attempt ingratiation through
more effusive recognition and compliment ("gilding the lily"). Whither
'respect'?


I recall a colleague, on an International Development team assigned to a
small Pacific country, who may have been junior or at most 'equal' to
myself in 'rank'. Just as in India, he would introduce himself formally
as "Dr Chandrashekar" plus full position and assignment. In a more
relaxed situation, his informal introduction was "Dr Chandra". It was
amusing to watch the reactions both 'westerners' and locals had to this.
Seeing how it didn't 'fit' with our host-culture, we took sardonic
delight in referring to him as "Chandra". (yes, naughty little boys!)
One day my (local, and non-tech, and female) assistant, visibly shaking,
requested a private meeting with another member of the team and myself.
Breaking-down into tears she apologised for interrupting the urgent-fix
discussion we'd been having with senior IT staff the day before, even as
we knew we were scheduled elsewhere. Her 'apology' was that Chandra was
(twice) insistent for our presence and demanded that meeting be
interrupted, even terminated - and that she had to obey, she said,
"because he is Doctor". (we tried really hard not to laugh) For our
part, knowing the guy, we knew that she should not be the recipient of
any 'blow-back'. After plentiful reassurance that she was not 'in
trouble' with either of us, and a talk (similar to 'here') about the
[ab]use of 'titles', she not only understood, but paid us both a great
compliment saying something like: I call you (first-name) because we all
work together, but I call him "Doctor" because he expects me to do
things *for* him! Being called by my given-name, unadorned, always
proved a 'buzz' thereafter!


> Back to topic, if I may, sometimes things set our moods. I am here partially
> to be helpful and partially for my own amusement and education as looking at
> some of the puzzles presented presents opportunities to think and
> investigate.
> 
> But before I could get to the reasonable question here, I was perturbed at
> the overly formulaic politeness and wrongness of the greeting from my
> perhaps touchy perspective for the reasons mentioned including the way it
> seemingly assumes no Ladies are present and we are somehow Gentlemen, but also
> by the mess I saw on one wrapped line that was a pain to take apart. Then I
> wondered why the question was being asked. Yes, weirdly, it is a question
> you and I have discussed before when wondering which way of doing something
> worked better, was more efficient, or showed a more brilliant way to use the
> wrong method to do something nobody designed it for!

Yep, rubs me the wrong way too!
(old grumpy-guts is likely to say "no!", on principle - and long before
they've even finished their wind-up!)

BTW such is not just an Asian 'thing' either - I recall seeing, and
quickly avoiding, the latest version of a perennial discussion about
protocol. Specifically, the sequence of email-addresses one should use
in the To: and Cc: fields of email-messages (and whether or not Bcc: is
"respectful"). Even today, in the US and UK, some people and/or
organisations demand that the more 'important' names should precede
those of mere-minions. "We the people" meets "some, more equal than others"!

Yes, and the OP does irritate by not answering questions from 'helpers'.
He does publish (for income/profit). I don't know if he has ever
used/repeated any of the topics discussed 'here' - nor if in doing-so he
attributes and credits appropriately (by European, UK, US... standards).


> I am not sure who read my longish message, but I hope the main point is that
> sometimes you should just TEST it. This is not long and complex code.
> However, there cannot be any one test everyone will agree on and it often
> depends on factors other than CPU cycles. A robust implementation that can
> handle multiple needs may well be slower and yet more cost effective in some
> sense.

Another source of irritation: define terms-used, eg what is the metric
for "better" or "best"?

Frankly, the succession of 'academic questions' with dubious application
in the real world (CRC-checks notwithstanding) have all the flavor of
someone writing an old-fashioned text-book - emphasis on facts, with
professional application relegated to lesser (if any) import, and
perhaps more than a little "I'm so much smarter than you".

NB the Indian and many Asian education systems use techniques which are
regarded as 'old', yet at the same time they are apparently effective!


-- 
Regards,
=dn

From __peter__ at web.de  Mon Jul 18 02:49:21 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 18 Jul 2022 08:49:21 +0200
Subject: [Tutor] Implicit passing of argument select functions being
 called
In-Reply-To: <CAPzH9RU0fr812=Qo0qha8kcC_CNhBG9nu0enVtpMx2nAxu4YuQ@mail.gmail.com>
References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
 <efa60224-02b1-3b5c-3a61-f7b5f9abffd4@web.de>
 <CAPzH9RU0fr812=Qo0qha8kcC_CNhBG9nu0enVtpMx2nAxu4YuQ@mail.gmail.com>
Message-ID: <915d5595-34bd-fd58-0819-4fb13a09a50b@web.de>

On 14/07/2022 09:56, ?????????? ????? wrote:
> I've finally figured out a solution, I'll leave it here in case it helps
> someone else, using inspect.signature
>
>
> The actual functions would look like this:
>
> dfilter = simple_filter(start_col=6)
> make_pipeline(
>     load_dataset(fpath="some/path.xlsx", header=[0, 1]),
>     apply_internal_standard(target_col="EtD5"),
>     export_to_sql(fname="some/name"),
>     data_filter=dfilter,
> )

Whatever works ;) I think I would instead ensure that all
transformations have the same signature. functools.partial() could be
helpful to implement this. Simple example:

 >>> def add(items, value):
	return [item + value for item in items]

 >>> def set_value(items, value, predicate):
	return [value if predicate(item) else item for item in items]

 >>> def transform(items, *transformations):
	for trafo in transformations:
		items = trafo(items)
	return items

 >>> from functools import partial
 >>> transform(
	[-3, 7, 5],
         # add 5 to each item
	partial(add, value=5),
         # set items > 10 to 0
	partial(set_value, value=0, predicate=lambda x: x > 10)
)
[2, 0, 10]


From __peter__ at web.de  Mon Jul 18 03:15:14 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 18 Jul 2022 09:15:14 +0200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <00b001d899fe$8c3cb000$a4b61000$@gmail.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
 <00b001d899fe$8c3cb000$a4b61000$@gmail.com>
Message-ID: <28dee502-7274-b2cb-26f5-89010761f42e@web.de>

On 17/07/2022 18:59, avi.e.gross at gmail.com wrote:
> You could make the case, Peter, that you can use anything as a start that
> will not likely match in your domain. You are correct if an empty string may
> be in the data.
>
> Now an object returned by object is pretty esoteric and ought to be rare and
> indeed each new object seems to be individual.
>
> val=object()
>
> [(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
> -->> [1, 2, <object object at 0x00000176F33150D0>, 3]
>
> So the only way to trip this up is to use the same object or another
> reference to it where it is silently ignored.

When you want a general solution for removal of consecutive duplicates
you can put the line

val = object()

into the deduplication function which makes it *very* unlikely that val
will also be passed as an argument to that function.
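
A minimal sketch of that idea, assuming a generator-style function; the name rm_duplicates_sentinel is made up for illustration. The sentinel is created inside the function, so a caller has no easy way to smuggle the same object into the data.

```python
def rm_duplicates_sentinel(iterable):
    """Yield items from iterable, dropping consecutive duplicates."""
    # A fresh sentinel per call; it cannot already be part of the input.
    prev = object()
    for item in iterable:
        if item != prev:
            yield item
            prev = item


print(list(rm_duplicates_sentinel(["", "", 42, "a", "a", ""])))
# ['', 42, 'a', '']
```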

To quote myself:

> Manprit avoided that in his similar solution by using a special value
> that will compare false except in pathological cases:
>
>> val = object()
>> [(val := ele) for ele in lst if ele != val]

What did I mean by "pathological"?

One problematic case would be an object that compares equal to everything,

class A:
     def __eq__(self, other): return True
     def __ne__(self, other): return False

but that is likely to break the algorithm anyway.
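
To make the first case concrete, here is a small sketch (Python 3.8+ for the walrus operator). The names AlwaysEqual and dedupe_with_sentinel are hypothetical; the point is only that an object claiming equality with everything also claims equality with the sentinel, so it is silently dropped when it comes first.

```python
class AlwaysEqual:
    """Pathological object that claims to be equal to everything."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False

    def __repr__(self):
        return "AlwaysEqual()"


def dedupe_with_sentinel(items):
    # Same sentinel trick as in the list comprehension quoted above.
    prev = object()
    return [(prev := item) for item in items if item != prev]


print(dedupe_with_sentinel([AlwaysEqual(), 1, 1, 2]))
# [1, 2]   (the leading AlwaysEqual() has vanished)
```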

Another problematic case: objects that only implement comparison for
other objects of the same type. For these deduplication will work if you
avoid the out-of-band value:

 >>> class A:
	def __init__(self, name):
		self.name = name
	def __eq__(self, other): return self.name == other.name
	def __ne__(self, other): return self.name != other.name
	def __repr__(self): return f"A(name={self.name})"


 >>> prev = object()
 >>>
 >>> [(prev:=item) for item in map(A, "abc") if item != prev]
Traceback (most recent call last):
   File "<pyshell#57>", line 1, in <module>
     [(prev:=item) for item in map(A, "abc") if item != prev]
   File "<pyshell#57>", line 1, in <listcomp>
     [(prev:=item) for item in map(A, "abc") if item != prev]
   File "<pyshell#54>", line 5, in __ne__
     def __ne__(self, other): return self.name != other.name
AttributeError: 'object' object has no attribute 'name'


 >>> def rm_duplicates(iterable):
     it = iter(iterable)
     try:
         last = next(it)
     except StopIteration:
         return
     yield last
     for item in it:
         if item != last:
             yield item
             last = item

 >>> list(rm_duplicates(map(A, "aabccc")))
[A(name=a), A(name=b), A(name=c)]
 >>>

From avi.e.gross at gmail.com  Mon Jul 18 00:22:32 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 18 Jul 2022 00:22:32 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <afb076e9-33f5-99e9-bfe8-b5ffa52ff1ec@DancesWithMice.info>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
 <afb076e9-33f5-99e9-bfe8-b5ffa52ff1ec@DancesWithMice.info>
Message-ID: <012301d89a5e$00863b70$0192b250$@gmail.com>

Dennis,

Unpacking is an interesting approach. Your list example seems to return a shorter list which remains iterable. But what does it mean to unpack other iterables like a function that yields? Does the unpacking call it as often as needed to satisfy the first variables you want filled and then pass a usable version of the iterable to the last argument?

Since the question asked was about what approach is in some way better, unpacking can be a sort of hidden cost or it can be done very efficiently.
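
A quick experiment suggests an answer to my own question: star-unpacking does not hand back a lazy remainder, it runs the generator to exhaustion and collects the rest into a list. The sketch below uses a made-up generator, numbers(), that records each value as it is produced.

```python
produced = []


def numbers():
    """Generator that records each value as it is produced."""
    for n in (1, 2, 3):
        produced.append(n)
        yield n


first, *rest = numbers()

# Everything was produced by the unpacking itself, before rest is ever used.
print(produced)      # [1, 2, 3]
print(first, rest)   # 1 [2, 3]
print(type(rest))    # <class 'list'>
```

So the hidden cost is memory proportional to the length of the input, and the approach cannot finish at all on an endless iterator.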


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of dn
Sent: Sunday, July 17, 2022 11:34 PM
To: tutor at python.org
Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list

On 17/07/2022 20.26, Peter Otten wrote:
> On 17/07/2022 00:01, Alex Kleider wrote:
> 
>> PS My (at least for me easier to comprehend) solution:
>>
>> def rm_duplicates(iterable):
>>      last = ''
>>      for item in iterable:
>>          if item != last:
>>              yield item
>>              last = item
> 
> The problem with this is the choice of the initial value for 'last':

Remember "unpacking", eg

>> def rm_duplicates(iterable):
>>      current, *the_rest = iterable
>>      for item in the_rest:

Then there is the special case, which (assuming it is possible) can be caught as an exception - which will likely need to 'ripple up' through the function-calls because the final collection of 'duplicates' will be empty/un-process-able. (see later comment about "unconditionally")


Playing in the REPL:

>>> iterable = [1,2,3]
>>> first, *rest = iterable
>>> first, rest
(1, [2, 3])
# rest is a list

>>> iterable = [1,2]
>>> first, *rest = iterable
>>> first, rest
(1, [2])
# rest is (technically) a list

>>> iterable = [1]
>>> first, *rest = iterable
>>> first, rest
(1, [])
# rest is an empty list

>>> iterable = []
>>> first, *rest = iterable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected at least 1, got 0)
# nothing to see here: no duplicates - and no 'originals' either!


>>>> list(rm_duplicates(["", "", 42, "a", "a", ""]))
> [42, 'a', '']   # oops, we lost the initial empty string
> 
> Manprit avoided that in his similar solution by using a special value 
> that will compare false except in pathological cases:
> 
>> val = object()
>> [(val := ele) for ele in lst if ele != val]
> 
> Another fix is to yield the first item unconditionally:
> 
> def rm_duplicates(iterable):
>     it = iter(iterable)
>     try:
>         last = next(it)
>     except StopIteration:
>         return
>     yield last
>     for item in it:
>         if item != last:
>             yield item
>             last = item
> 
> If you think that this doesn't look very elegant you may join me in 
> the https://peps.python.org/pep-0479/ haters' club ;)
This does indeed qualify as 'ugly'. However, it doesn't need to be expressed in such an ugly fashion!

--
Regards,
=dn


From __peter__ at web.de  Mon Jul 18 04:25:19 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 18 Jul 2022 10:25:19 +0200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <afb076e9-33f5-99e9-bfe8-b5ffa52ff1ec@DancesWithMice.info>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
 <afb076e9-33f5-99e9-bfe8-b5ffa52ff1ec@DancesWithMice.info>
Message-ID: <ba124001-fae6-3634-ee9c-fc373f024b8d@web.de>

On 18/07/2022 05:34, dn wrote:
> On 17/07/2022 20.26, Peter Otten wrote:
>> On 17/07/2022 00:01, Alex Kleider wrote:
>>
>>> PS My (at least for me easier to comprehend) solution:
>>>
>>> def rm_duplicates(iterable):
>>>      last = ''
>>>      for item in iterable:
>>>          if item != last:
>>>              yield item
>>>              last = item
>>
>> The problem with this is the choice of the initial value for 'last':
>
> Remember "unpacking", eg

Try

first, *rest = itertools.count()

;)

If you want the full generality of the iterator-based approach unpacking
is right out.
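
By contrast, the iterator-based version copes with endless input, as this small sketch shows. The generator expression and the use of islice() to take just the first five results are illustrative choices, not part of the earlier posts.

```python
from itertools import count, islice


def rm_duplicates(iterable):
    """Yield items lazily, skipping consecutive duplicates."""
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        return
    yield last
    for item in it:
        if item != last:
            yield item
            last = item


# n // 3 gives 0, 0, 0, 1, 1, 1, 2, ... without end.
endless = (n // 3 for n in count())
print(list(islice(rm_duplicates(endless), 5)))
# [0, 1, 2, 3, 4]
```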


>> Another fix is to yield the first item unconditionally:
>>
>> def rm_duplicates(iterable):
>>     it = iter(iterable)
>>     try:
>>         last = next(it)
>>     except StopIteration:
>>         return
>>     yield last
>>     for item in it:
>>         if item != last:
>>             yield item
>>             last = item
>>
>> If you think that this doesn't look very elegant you may join me in the
>> https://peps.python.org/pep-0479/ haters' club ;)

> This does indeed qualify as 'ugly'. However, it doesn't need to be
> expressed in such an ugly fashion!

On second thought, I'm not exception-shy, and compared to the other options

last = next(it, out_of_band)
if last is out_of_band: return

or

for last in it:
    break
else:
    return

I prefer the first version. I'd probably go with

[key for key, _value in groupby(items)]

though.

However, if you look at the Python equivalent to groupby()

https://docs.python.org/3/library/itertools.html#itertools.groupby

you'll find that this just means that the "ugly" parts have been written
by someone else. I generally like that strategy -- if you have an "ugly"
piece of code, stick it into a function with a clean interface, add some
tests to ensure it works as advertised, and move on.
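
As a sketch of that strategy applied here: wrap the groupby() one-liner in a function and pin its behaviour down with a few assertions. The function name rm_duplicates_groupby and the choice of test cases are mine, made up for illustration.

```python
from itertools import groupby


def rm_duplicates_groupby(iterable):
    """Return a list with each run of equal consecutive items collapsed to one."""
    return [key for key, _group in groupby(iterable)]


# A few checks that it "works as advertised".
assert rm_duplicates_groupby([]) == []
assert rm_duplicates_groupby(["", "", 42, "a", "a", ""]) == ["", 42, "a", ""]
assert rm_duplicates_groupby("aabccc") == ["a", "b", "c"]
print("all checks passed")
```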

From marcus.luetolf at bluewin.ch  Mon Jul 18 05:59:51 2022
From: marcus.luetolf at bluewin.ch (marcus.luetolf at bluewin.ch)
Date: Mon, 18 Jul 2022 11:59:51 +0200
Subject: [Tutor] problem solving with lists: final (amateur) solution
In-Reply-To: <c021a502-f4b8-232b-17b8-97abcc10eb86@DancesWithMice.info>
References: <000f01d888c3$c32e6eb0$498b4c10$@bluewin.ch>
 <c021a502-f4b8-232b-17b8-97abcc10eb86@DancesWithMice.info>
Message-ID: <000b01d89a8d$1f6fda80$5e4f8f80$@bluewin.ch>

Hello Experts, hello dn,
after having studied your valuable critiques I revised my code as below.

A few remarks: 
The terms "pythonish" and "dummy_i" I got from a Rice University's online lecture on python style.

If there is concern about a reader having to "switch gears" between reading a for loop and a list comprehension, then the concern
should be even greater about reading a nested list comprehension that codes the 4 flights for day 1.
I've read that a list comprehension should not exceed one line of code; otherwise a for loop should be used for the sake of readability.

I'm quite sure that my revised code does not meet all the critique points; in particular, there are still separate code parts (snippets?) for
day_1 and day_2 to day_5, though not separate functions, because at present I'm unable to "fold them in".

I also updated my code on github accordingly and adjusted docstrings and comments: 
https://github.com/luemar/player_startlist/blob/main/start_list.py

def start_list(all_players, num_in_flight, num_days_played):
    print('...............day_1................')
    history = {'a':[], 'b':[],'c':[],'d':[],'e':[],'f':[],'g':[],'h':[],\
              'i':[],'j':[],'k':[],'l':[],'m':[],'n':[],'o':[],'p':[]}
    
    for lead_player_index in range(0, len(all_players), num_in_flight):
        players = all_players[lead_player_index: lead_player_index + num_in_flight]
        [history[pl_day_1].extend(players) for pl_day_1 in players]
        print(all_players[lead_player_index] + '_flight_day_1:',players)

    for i in range(num_days_played - 1):
        flights = {}
        c_all_players = all_players[:]
        print('...............day_' + str(i+2)+'................')
        flights['a_flight_day_'+str(i+2)]= []
        flights['b_flight_day_'+str(i+2)]= []
        flights['c_flight_day_'+str(i+2)]= []
        flights['d_flight_day_'+str(i+2)]= []            
        lead = list('abcd')                   
        flight_list = [flights['a_flight_day_'+str(i+2)], flights['b_flight_day_'+str(i+2)],\
                       flights['c_flight_day_'+str(i+2)], flights['d_flight_day_'+str(i+2)]]

        for j in range(len(flight_list)):            
            def flight(cond, day):
                for player in all_players:
                    if player not in cond:
                        day.extend(player)
                        cond.extend(history[player])
                        history[lead[j]].extend(player)
                day.extend(lead[j])
                day.sort()
                [history[pl_day_2_5].extend(day) for pl_day_2_5 in day[1:]]
                return lead[j]+'_flight_day_'+str(i+2)+ ': ' + str(flight_list[j])
               
            conditions = [history[lead[j]], history[lead[j]] + flights['a_flight_day_'+str(i+2)],\
                          history[lead[j]] + flights['a_flight_day_'+str(i+2)] + \
                          flights['b_flight_day_'+str(i+2)], \
                          history[lead[j]] + flights['a_flight_day_'+str(i+2)] + \
                          flights['b_flight_day_'+str(i+2)]+ flights['c_flight_day_'+str(i+2)]] 
            print(flight(list(set(conditions[j])), flight_list[j]))
num_in_flight = 4
if num_in_flight != 4:
    raise ValueError('out of size of flight limit')
num_days_played = 5
if num_days_played >5 or num_days_played <2:
    raise ValueError('out of playing days limit')
all_players = list('abcdefghijklmnop')
start_list(all_players,num_in_flight , num_days_played)

Many thanks and regards, Marcus.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-----Original Message-----
From: dn <PythonList at DancesWithMice.info> 
Sent: Sunday, 26 June 2022 02:35
To: marcus.luetolf at bluewin.ch; tutor at python.org
Subject: Re: AW: [Tutor] problem solving with lists: final (amateur) solution

On 26/06/2022 06.45, marcus.luetolf at bluewin.ch wrote:
> Hello Experts, hello dn,
> it's a while since I - in terms of Mark Lawrence - bothered you with 
> my problem.
> Thanks to your comments, especially to dn's structured guidance I've 
> come up with the code below, based on repeatability.
> I am sure there is room for improvement concerning pythonish style 
> but for the moment the code serves my purposes.
> A commented version can be found on
> https://github.com/luemar/player_startlist.
> 
> def startlist(all_players, num_in_flight):
>     c_all_players = all_players[:]
>     history = {'a':[], 'b':[],'c':[],'d':[],'e':[],'f':[],'g':[],'h':[],\
>                'i':[],'j':[],'k':[],'l':[],'m':[],'n':[],'o':[],'p':[]}
>     print('...............day_1................')
>     def day_1_flights():
>         key_hist = list(history.keys())
>         c_key_hist = key_hist[:]
>         for dummy_i in c_key_hist:
>             print('flights_day_1: ', c_key_hist[:num_in_flight])
>             for key in c_key_hist[:num_in_flight]:
>                 [history[key].append(player) for player in c_all_players[0:num_in_flight]]
>             del c_key_hist[:num_in_flight]
>             del c_all_players[0:num_in_flight]
>     day_1_flights()
>     
>     def day_2_to_5_flights():
>         flights = {}
>         for i in range(2,6):
>             print('...............day_' + str(i)+'................')
>             flights['a_flight_day_'+str(i)]= []
>             flights['b_flight_day_'+str(i)]= []
>             flights['c_flight_day_'+str(i)]= []
>             flights['d_flight_day_'+str(i)]= []            
>             lead = list('abcd')                   
>             flight_list = [flights['a_flight_day_'+str(i)], flights['b_flight_day_'+str(i)],\
>                            flights['c_flight_day_'+str(i)], flights['d_flight_day_'+str(i)]]
> 
>             for j in range(len(flight_list)):            
>                 def flight(cond, day):
>                     for player in all_players:
>                         if player not in cond:
>                             day.extend(player)
>                             cond.extend(history[player])
>                             history[lead[j]].extend(player)
>                     day.extend(lead[j])
>                     day.sort()
>                     [history[pl].extend(day) for pl in day[1:]]
>                     return lead[j]+'_flight_day_'+str(i)+ ': ' + str(flight_list[j])
>                    
>                 conditions = [history[lead[j]], history[lead[j]] + flights['a_flight_day_'+str(i)],\
>                               history[lead[j]] + flights['a_flight_day_'+str(i)] + \
>                               flights['b_flight_day_'+str(i)], \
>                               history[lead[j]] + flights['a_flight_day_'+str(i)] + \
>                               flights['b_flight_day_'+str(i)]+ flights['c_flight_day_'+str(i)]]
>                 print(flight(list(set(conditions[j])), flight_list[j]))
>     day_2_to_5_flights()
> startlist(list('abcdefghijklmnop'), 4)
> 
>  Many thanks, Marcus.
...

> The word "hardcoded" immediately stopped me in my tracks!
> 
> The whole point of using the computer is to find 'repetition' and have 
> the machine/software save us from such boredom (or nit-picking detail 
> in which we might make an error/become bored).
...

> The other 'side' of both of these code-constructs is the data-construct.
> Code-loops require data-collections! The hard-coded "a" and "day_1" 
> made me shudder.
> (not a pretty sight - the code, nor me shuddering!)
...
> Sadly, the 'hard-coded' parts may 'help' sort-out week-one, but (IMHO) 
> have made things impossibly-difficult to proceed into week-two (etc).
...


It works. Well done!
What could be better than that?


[Brutal] critique:

- "pythonish" in German becomes "pythonic" in English (but I'm sure we all understood)

- position the two inner-functions outside and before startlist()

- whereas the "4", ie number of players per flight (num_in_flight), is defined as a parameter in the call to startlist(), the five "times or days" is a 'magic constant' (worse, it appears in day_2_to_5_flights() as part of "range(2,6)" which 'disguises' it due to Python's way of working)

- the comments also include reference to those parameters as if they are constants (which they are - if you only plan to use the algorithm for this 16-4-5 configuration of the SGP). Thus, if the function were called with different parameters, the comments would be not only wrong but have the potential to mislead the reader

- in the same vein (as the two points above), the all_players (variable) argument is followed by the generation of history as a list of constants ("constant" cf "variable")

- on top of which: day_1_flights() generates key_hist from history even though it already exists as all_players

- the Python facility for a 'dummy value' (that will never be used, or perhaps only 'plugged-in' to 'make things happen') is _ (the under-score/under-line character), ie

    for _ in c_key_hist:

- an alternative to using a meaningless 'placeholder' with no computational-purpose, such as _ or dummy_i, is to choose an identifier which aids readability, eg

    for each_flight in c_key_hist:

- well done for noting that a list-comprehension could be used to generate history/ies. Two thoughts:

  1 could the two for-loops be combined into a single nested list-comprehension?

  2 does the reader's mind have to 'change gears' between reading the outer for-loop as a 'traditional-loop' structure, and then the inner-loop as a list-comprehension? ie would it be better to use the same type of code-construct for both?

- both the code- and data-structures of day_1_flights() seem rather tortured (and tortuous), and some are unused and therefore unnecessary.
Might it be possible to simplify, if the control-code commences with:

for first_player_index in range( 0, len( all_players ), num_in_flight ):
    print( first_player_index,
           all_players[ first_player_index:
                        first_player_index+num_in_flight
                      ]
         )

NB the print() is to make the methodology 'visible'.

- the docstring for day_1_flights() is only partly-correct. Isn't the function also creating and building the history set?

- that being the case, should the initial set-definition be moved inside the function?

- functions should not depend upon global values. How does the history 'pass' from one function to another - which is allied to the question:
how do the functions know about values such as all_players and num_in_flight? To make the functions self-contained and "independent", these values should be passed-in as parameters/arguments and/or return-ed

- much of the above also applies to day_2_to_5_flights()

- chief concern with day_2_to_5_flights() is: what happens to d_flight_day_N if there are fewer/more than four players per flight, or what if there are fewer/more than 5 flights?

- the observation that the same players would always be the 'lead' of a flight, is solid. Thus, could the lead-list be generated from a provided-parameter, rather than stated as a constant? Could that construct (also) have been used in the earlier function?

- we know (by definition) that flight() is an unnecessary set of conditions to apply during day_1, but could it be used nonetheless? If so, could day_1_flights() be 'folded into' day_2_to_5_flights() instead of having separate functions?
(yes, I recall talking about the essential differences in an earlier post - and perhaps I'm biased because this was how I structured the
draft-solution)
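
To make the 'pass it in, return it out' suggestion concrete, here is a minimal sketch built on the range()-stepping loop above. The function name make_flights is hypothetical (it is not from the code under review), and the sketch deliberately says nothing about how the history should be built, only how the flights can be produced without relying on globals:

```python
def make_flights(all_players, num_in_flight):
    """Split all_players into consecutive flights of num_in_flight players.

    Both values arrive as arguments, so nothing is read from globals.
    The final flight is simply shorter if the numbers do not divide evenly.
    """
    return [
        all_players[first_player_index:first_player_index + num_in_flight]
        for first_player_index in range(0, len(all_players), num_in_flight)
    ]


if __name__ == "__main__":
    # Player names here are invented purely for the demonstration.
    for flight in make_flights(list("ABCDEFGHIJ"), 4):
        print(flight)
```

Whether a short final flight is acceptable is a design decision for the caller; the sketch merely makes that case visible rather than hiding it.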


[More than] enough for now?
--
Regards,
=dn


From wlfraed at ix.netcom.com  Mon Jul 18 11:17:43 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Mon, 18 Jul 2022 11:17:43 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
 <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
 <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>
Message-ID: <dmtadhh56kthdujcimblqq9pnjt4ud2ljr@4ax.com>

On Mon, 18 Jul 2022 17:34:08 +1200, dn <PythonList at DancesWithMice.info>
declaimed the following:

>Yes, and the OP does irritate by not answering questions from 'helpers'.
>He does publish (for income/profit). I don't know if he has ever
>used/repeated any of the topics discussed 'here' - nor if in doing-so he
>attributes and credits appropriately (by European, UK, US... standards).
>

	You managed to pique my interest -- so I hit Google. I don't have a
LinkedIn account, so I can't get beyond the intro page, but...

https://in.linkedin.com/in/manprit-singh-87961a1ba

has recent "activity" posts that appear to be derived from the SQLite3
Aggregate function thread (LinkedIn blocks my cut&pasting the heading
text). 

	Presuming this IS the same person, I begin to have /my/ doubts: a
"technical trainer" who seems to be using the tutor list for his own
training? Constantly posting toy examples with the question "which is
better/more efficient/etc." yet never (apparently) bothering to learn
techniques for profiling/timing these examples (nor making examples with
real-world data quantities for which profiling would show differences).
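
	For what it is worth, a minimal sketch of the kind of timing run I mean, using the standard-library timeit module. The two function names and the data size are my own choices for illustration, not anything posted in the earlier threads:

```python
import timeit


def dedupe_loop(items):
    """Remove consecutive duplicates with an explicit loop."""
    result = []
    for item in items:
        if not result or item != result[-1]:
            result.append(item)
    return result


def dedupe_walrus(items):
    """Remove consecutive duplicates with a comprehension and a sentinel."""
    prev = object()
    return [(prev := item) for item in items if item != prev]


if __name__ == "__main__":
    # Runs of three equal values; large enough that differences can show up.
    data = [n // 3 for n in range(100_000)]
    for func in (dedupe_loop, dedupe_walrus):
        # Take the best of three repeats to reduce noise from other processes.
        seconds = min(timeit.repeat(lambda: func(data), number=5, repeat=3))
        print(f"{func.__name__}: {seconds:.3f}s for 5 calls")
```

The numbers will differ between machines and Python versions, which is rather the point: measure on data that resembles the real workload.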


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From avi.e.gross at gmail.com  Mon Jul 18 12:20:11 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 18 Jul 2022 12:20:11 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <dmtadhh56kthdujcimblqq9pnjt4ud2ljr@4ax.com>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
 <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
 <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>
 <dmtadhh56kthdujcimblqq9pnjt4ud2ljr@4ax.com>
Message-ID: <006801d89ac2$41311be0$c39353a0$@gmail.com>

Dennis,

You may be changing the topic from pythons to lions as an extremely common
Sikh name is variations on Singh and it is also used by some Hindus and
others.

I, too, have had no incentive to join LinkedIn but can see DOZENS of people
with a name that might as well be John Smith. 


How can you zoom in on the right one? They are literally everywhere
including working at LinkedIn, Microsoft and so on and plenty of them are in
a search with SQL as part of it. Your link fails for me.

However, I had not considered some possibilities of how someone might use a
group like this. I mean it can be to collect mistakes people make or where
people get stuck or how the people posting replies think, make assumptions,
suggest techniques and so on.

But so what? Some questions may well be reasonable even if for a purpose.
And yes, some people abuse things and can make it worse for others. I know
that some people/questions do after a while motivate me to ignore them. The
fact is that within a few days of people discussing something here, it is
considered decent for the one who started it to chime in and answer our
questions or comment on what we said or simply say the problem is solved and
we can move on.

I will say this, I have had what I wrote ending up published when it still
contained a spelling error and I was not thrilled as my informal posts are
not meant to be used this way without my permission. If someone here told us
up-front what they wanted to do with our work, I might be way more careful or
opt out.

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Dennis Lee Bieber
Sent: Monday, July 18, 2022 11:18 AM
To: tutor at python.org
Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list

On Mon, 18 Jul 2022 17:34:08 +1200, dn <PythonList at DancesWithMice.info>
declaimed the following:

>Yes, and the OP does irritate by not answering questions from 'helpers'.
>He does publish (for income/profit). I don't know if he has ever 
>used/repeated any of the topics discussed 'here' - nor if in doing-so 
>he attributes and credits appropriately (by European, UK, US... standards).
>

	You managed to pique my interest -- so I hit Google. I don't have a
LinkedIn account, so I can't get beyond the intro page, but...

https://in.linkedin.com/in/manprit-singh-87961a1ba

has recent "activity" posts that appear to be derived from the SQLite3
Aggregate function thread (LinkedIn blocks my cut&pasting the heading text).


	Presuming this IS the same person, I begin to have /my/ doubts: a
"technical trainer" who seems to be using the tutor list for his own
training? Constantly posting toy examples with the question "which is
better/more efficient/etc." yet never (apparently) bothering to learn
techniques for profiling/timing these examples (nor making examples with
real-world data quantities for which profiling would show differences).


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Mon Jul 18 12:45:19 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 18 Jul 2022 12:45:19 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
 <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
 <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>
Message-ID: <008001d89ac5$c49bcf90$4dd36eb0$@gmail.com>

I will try a short answer to the topic of why some of us (meaning in this
case ME) react to what we may see not so much as cultural differences but to
feeling manipulated.

I have had people in my life who say things like "You are so smart you can
probably do this for me in just a few minutes" and lots of variations on
such a theme. As it happens, that is occasionally true when I already happen
to know how to do it. But sometimes I have to do lots of research and
experimentation or ask others with specific expertise. Being smart or
over-educated in one thing is not the same as being particularly good at
other things. How would you feel if asked to write a program (for no pay)
and after spending lots of time and showing the results, the other guy says
that this is pretty much how they already did it and they just wanted to see
if it was right or the best way so they asked you? What a waste of time for
no real result!

I have found there are people in this world who use techniques ranging from
flattery to guilt to get you to do things for them. One of these finally
really annoyed me by asking if I knew of a good lawyer to do real estate for
a friend of his about 25 miles from where I live. He lives perhaps a hundred
miles away and neither of us is particularly knowledgeable about the area
and I don't even know lawyers in my area. We both can use the darn internet.
So why ask me? The answer is because they are users.

Note many people ask serious questions here and we ask the same question.
Did you do any kind of search before asking? Did you write any code and see
where it fails? 

So next time we get a question like this one, how about we reply with a
request that they provide their own thoughts FIRST and also spell out what
the meaning of words like "best" is and only once they convince us they have
tried and really need help, do we jump in. I am not necessarily talking
about everyone with a question, but definitely about repeaters.

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of dn
Sent: Monday, July 18, 2022 1:34 AM
To: tutor at python.org
Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list

>> ... you just asked any
>> women present to not reply to you, nor anyone who has not been 
>> knighted by a Queen. I personally do not expect such politeness but
>> clearly some do.
>>
> 
> I confess it took me longer than it should have to figure out to what 
> you were referring in the second half of the above but eventually the 
> light came on and the smile blossomed!
> My next thought was that it wouldn't necessarily have had to have been 
> a Queen although anyone knighted (by a King) prior to the beginning of 
> our current Queen's reign is unlikely to be even alive let alone 
> interested in this sort of thing.
> Thanks for the morning smile!

I've not been knighted, but am frequently called "Sir". Maybe when you too
have a grey beard? (hair atop same head, optional)

It is not something that provokes a positive response, and if in a
grumpy-mood may elicit a reply describing it as to be avoided "unless we are
both in uniform".

Thus, the same words hold different dictionary-meanings and different
implications for different people, in different contexts, and between
cultures!

I'm not going to invoke any Python community Code of Conduct terms or insist
upon 'politically-correctness', but am vaguely-surprised that someone has
not...

The gender observation is appropriate, but then how many of the OP's
discussions feature responses from other than males?
(not that such observation could be claimed as indisputable)

Writing 'here', have often used constructions such as "him/her", with some
question about how many readers might apply each form...


The dislocation in response to the OP is cultural. (In this) I have
advantage over most 'here', having lived and worked in India.
(also having been brought-up, way back in the last century, in an
old-English environment, where we were expected to address our elders using
such titles. Come to think of it, also in the US of those days)

In India, and many other parts of Asia, the respectful address of teachers,
guides, and elders generally, is required-behavior. In the Antipodes, titles
of almost any form are rarely used, and most will exchange first-names at an
introduction. Whereas in Germany (for example), the exact-opposite applies,
and one must remember to use the Herr-s, Doktor-s, du-forms etc.

How to cope when one party is at the opposite 'end' of the scale from
another? I'm reminded of 'Postel's law': "Be liberal in what you accept, and
conservative in what you send".

Is whether someone actually knows what they're talking-about more relevant
(and more telling) than their qualifications, rank, title, whatever - or
does that only apply in the tech-world where we seem to think we can be a
technocracy?

Living in an 'immigrant society' today, and having gone through such a
process (many times, in many places) I'm intrigued by how quickly - or how
slowly, some will adapt to the local culture possibly quite alien to them.
Maybe worst of all are the ones who observe, but then react by assuming (or
claiming) superiority - less an attempt to 'fit in', but perhaps an intent
to be 'more equal'...


> I may be getting touchy without the feely, but I am having trouble 
> listening to the way some people with cultural differences, or far 
> left/right attitudes, try to address me/us in forums like this. Alex 
> may have been amused by my retort, and there is NOTHING wrong with 
> saying "Dear Sirs" when

Disagree: when *I* read the message, I am me. I am in the singular. When
*you* write, you (singular) are writing to many of us (plural). Who is the
more relevant party to the communication?

Accordingly, "Dear Sir" not "Sirs" - unless you are seeking a collective or
corporate reply, eg from a firm of solicitors.
(cf the individual replies (plural, one might hope) you expect from multiple
individuals - who happen to be personal-members of the
(collective) mailing-list).


> done in many contexts, just like someone a while ago was writing to 
> something like "Esteemed Professors" but it simply rubs me wrong here.

Like it appears do you, I quickly lose respect for 'esteemed
professors/professionals' who expect this, even revel in it.

However, if one is a student or otherwise 'junior', it is a
career-limiting/grade-reducing move not to accede!

That said, two can play at that game: someone wanting to improve his/her
grade (or seeking some other favor) will attempt ingratiation through more
effusive recognition and compliment ("gilding the lily"). whither 'respect'?


I recall a colleague, on an International Development team assigned to a
small Pacific country, who may have been junior or at most 'equal' to myself
in 'rank'. Just as in India, he would introduce himself formally as "Dr
Chandrashekar" plus full position and assignment. In a more relaxed
situation, his informal introduction was "Dr Chandra". It was amusing to
watch the reactions both 'westerners' and locals had to this.
Seeing how it didn't 'fit' with our host-culture, we took sardonic delight
in referring to him as "Chandra". (yes, naughty little boys!) One day my
(local, and non-tech, and female) assistant, visibly shaking, requested a
private meeting with another member of the team and myself.
Breaking-down into tears she apologised for interrupting the urgent-fix
discussion we'd been having with senior IT staff the day before, even as we
knew we were scheduled elsewhere. Her 'apology' was that Chandra was
(twice) insistent for our presence and demanded that meeting be interrupted,
even terminated - and that she had to obey, she said, "because he is
Doctor". (we tried really hard not to laugh) For our part, knowing the guy,
we knew that she should not be the recipient of any 'blow-back'. After
plentiful reassurance that she was not 'in trouble' with either of us, and a
talk (similar to 'here') about the [ab]use of 'titles', she not only
understood, but paid us both a great compliment saying something like: I
call you (first-name) because we all work together, but I call him "Doctor"
because he expects me to do things *for* him! Being called by my given-name,
unadorned, always proved a 'buzz' thereafter!


> Back to topic, if I may, sometimes things set our moods. I am here 
> partially to be helpful and partially for my own amusement and 
> education as looking at some of the puzzles presented presents 
> opportunities to think and investigate.
> 
> But before I could get to the reasonable question here, I was 
> perturbed at the overly formulaic politeness and wrongness of the 
> greeting from my perhaps touchy perspective for the reasons mentioned 
> including the way it seemingly assumes no Ladies are present and we are 
> somehow Gentlemen, but also by the mess I saw on one wrapped line that 
> was a pain to take apart. Then I wondered why the question was being 
> asked. Yes, weirdly, it is a question you and I have discussed before 
> when wondering which way of doing something worked better, was more 
> efficient, or showed a more brilliant way to use the wrong method to do
> something nobody designed it for!

Yep, rubs me the wrong way too!
(old grumpy-guts is likely to say "no!", on principle - and long before
they've even finished their wind-up!)

BTW such is not just an Asian 'thing' either - I recall seeing, and quickly
avoiding, the latest version of a perennial discussion about protocol.
Specifically, the sequence of email-addresses one should use in the To: and
Cc: fields of email-messages (and whether or not Bcc: is "respectful"). Even
today, in the US and UK, some people and/or organisations demand that the
more 'important' names should precede those of mere-minions. "We the people"
meets "some, more equal than others"!

Yes, and the OP does irritate by not answering questions from 'helpers'.
He does publish (for income/profit). I don't know if he has ever
used/repeated any of the topics discussed 'here' - nor if in doing-so he
attributes and credits appropriately (by European, UK, US... standards).


> I am not sure who read my longish message, but I hope the main point 
> is that sometimes you should just TEST it. This is not long and complex
> code.
> However, there cannot be any one test everyone will agree on and it 
> often depends on factors other than CPU cycles. A robust 
> implementation that can handle multiple needs may well be slower and 
> yet more cost effective in some sense.

Another source of irritation: define terms-used, eg what is the metric for
"better" or "best"?

Frankly, the succession of 'academic questions' with dubious application in
the real world (CRC-checks notwithstanding) have all the flavor of someone
writing an old-fashioned text-book - emphasis on facts, with professional
application relegated to lesser (if any) import, and perhaps more than a
little "I'm so much smarter than you".

NB the Indian and many Asian education systems use techniques which are
regarded as 'old', yet at the same time they are apparently effective!


--
Regards,
=dn


From avi.e.gross at gmail.com  Mon Jul 18 13:42:54 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 18 Jul 2022 13:42:54 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <28dee502-7274-b2cb-26f5-89010761f42e@web.de>
References: <CAO1OCwZWOnS7bVLoW44-OkVUx4JX-ChYkTvDZ5EuO7Dk0ZQHXw@mail.gmail.com>
 <005001d8992b$227b77b0$67726710$@gmail.com>
 <CAMCEyD5-gVczEpnmYh+2a2TWNr5vsewy3OapnGBJmYE=xggU=g@mail.gmail.com>
 <f3fb5ee4-3d34-f4d0-3a0f-4edd4084538e@web.de>
 <00b001d899fe$8c3cb000$a4b61000$@gmail.com>
 <28dee502-7274-b2cb-26f5-89010761f42e@web.de>
Message-ID: <008d01d89acd$cf82d220$6e887660$@gmail.com>

Peter,

I studied Pathology in school but we used human bodies rather than the
pythons you are abusing.

The discussion we are having is almost as esoteric and has to do with
theories of computation and what algorithms are in some sense provable and
which are probabilistic to the point of them failing being very rare and
which are more heuristic and tend to work and perhaps get a solution that is
close enough to optimal and which ones never terminate and so on.

My point was that I played with your idea and was convinced it should work
as long as you only create the object once and never copy it or in any way
include it in the list or iterable. That seems very doable.

But your new comment opens up another door. 

Turn your class A around:

class A:
     def __eq__(self, other): return True
     def __ne__(self, other): return False

make it:

class all_alone:
     def __eq__(self, other): return False
     def __ne__(self, other): return True

If you made an object of that class, it won't even report being equal to
itself!

That is very slightly better but not really an important distinction. But
what happens when A is compared to all_alone may depend on which is first.
Worth a try? You can always flip the order of the comparison as needed.
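
Here is a quick sketch of that experiment, reusing the two classes exactly as written above. The variable names are just mine for the demonstration. Because neither class is a subclass of the other and neither method returns NotImplemented, Python asks the left-hand operand first and takes its answer:

```python
class A:
    def __eq__(self, other): return True
    def __ne__(self, other): return False


class all_alone:
    def __eq__(self, other): return False
    def __ne__(self, other): return True


agreeable, loner = A(), all_alone()

print(loner == loner)      # False: it does not even equal itself
print(loner != loner)      # True
print(agreeable == loner)  # True: A.__eq__ is asked first
print(loner == agreeable)  # False: all_alone.__eq__ is asked first
print(loner in [loner])    # True: list membership checks identity before ==
```

The last line is worth noting: containers test "is" before "==", so the object can still be found in a list even though it claims to equal nothing.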

I do note back in my UNIX days, we often needed a guaranteed unique ID as in
a temporary filename and often used a process ID deemed to be unique. But
processes come and go and eventually that process ID is re-used and odd
things can happen if it finds files already there or ...
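
In Python these days I would not build such a name from the process ID at all. A minimal sketch, assuming all you need is a scratch file that nobody else is using: the standard-library tempfile module creates the file and hands back its name in one step. The prefix and suffix below are arbitrary choices of mine:

```python
import os
import tempfile

# mkstemp() creates a new file that did not previously exist and returns
# an open OS-level file descriptor plus the path it chose.
fd, path = tempfile.mkstemp(prefix="demo_", suffix=".tmp")
try:
    with os.fdopen(fd, "w") as handle:
        handle.write("scratch data\n")
    print("wrote to", path)
finally:
    # The caller is responsible for cleaning up after mkstemp().
    os.remove(path)
```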

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Peter Otten
Sent: Monday, July 18, 2022 3:15 AM
To: tutor at python.org
Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list

On 17/07/2022 18:59, avi.e.gross at gmail.com wrote:
> You could make the case, Peter, that you can use anything as a start 
> that will not likely match in your domain. You are correct if an empty 
> string may be in the data.
>
> Now an object returned by object is pretty esoteric and ought to be 
> rare and indeed each new object seems to be individual.
>
> val=object()
>
> [(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
> -->> [1, 2, <object object at 0x00000176F33150D0>, 3]
>
> So the only way to trip this up is to use the same object or another 
> reference to it where it is silently ignored.

When you want a general solution for removal of consecutive duplicates you
can put the line

val = object()

into the deduplication function which makes it *very* unlikely that val will
also be passed as an argument to that function.
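
A minimal sketch of what that could look like, wrapping the comprehension quoted above in a function. The function name dedupe_consecutive is only for illustration, and the sketch still inherits the pathological cases discussed below:

```python
def dedupe_consecutive(lst):
    """Return a new list with runs of equal neighbours collapsed to one item."""
    # A fresh object() per call: callers have no reference to it, so it
    # cannot appear in lst unless someone goes out of their way.
    val = object()
    return [(val := ele) for ele in lst if ele != val]


if __name__ == "__main__":
    print(dedupe_consecutive([1, 1, 2, 3, 3, 3]))
    print(dedupe_consecutive(["", "", "a", ""]))
```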

To quote myself:

> Manprit avoided that in his similar solution by using a special value 
> that will compare false except in pathological cases:
>
>> val = object()
>> [(val := ele) for ele in lst if ele != val]

What did I mean with "pathological"?

One problematic case would be an object that compares equal to everything,

class A:
     def __eq__(self, other): return True
     def __ne__(self, other): return False

but that is likely to break the algorithm anyway.

Another problematic case: objects that only implement comparison for other
objects of the same type. For these deduplication will work if you avoid the
out-of-band value:

 >>> class A:
	def __init__(self, name):
		self.name = name
	def __eq__(self, other): return self.name == other.name
	def __ne__(self, other): return self.name != other.name
	def __repr__(self): return f"A(name={self.name})"


 >>> prev = object()
 >>>
 >>> [(prev:=item) for item in map(A, "abc") if item != prev]
Traceback (most recent call last):
   File "<pyshell#57>", line 1, in <module>
     [(prev:=item) for item in map(A, "abc") if item != prev]
   File "<pyshell#57>", line 1, in <listcomp>
     [(prev:=item) for item in map(A, "abc") if item != prev]
   File "<pyshell#54>", line 5, in __ne__
     def __ne__(self, other): return self.name != other.name
AttributeError: 'object' object has no attribute 'name'


 >>> def rm_duplicates(iterable):
     it = iter(iterable)
     try:
         last = next(it)
     except StopIteration:
         return
     yield last
     for item in it:
         if item != last:
             yield item
             last = item

 >>> list(rm_duplicates(map(A, "aabccc")))
[A(name=a), A(name=b), A(name=c)]


From siddharthsatish93 at gmail.com  Wed Jul 20 19:23:02 2022
From: siddharthsatish93 at gmail.com (Siddharth Satishchandran)
Date: Wed, 20 Jul 2022 18:23:02 -0500
Subject: [Tutor] Need help installing a program on my computer using python
 (Bruker2nifti)
Message-ID: <E990F36C-F6E1-4CF8-A51E-91702A9B7D68@gmail.com>

Hi I am a new user to Python. I am interested in installing a program on Python (Bruker2nifti). I am having trouble writing out the appropriate code to install the program. I know pip install bruker is needed but it is not working on my console. I would like to know the correct code I need to install the program. I can provide the GitHub if needed. 

-Sidd

From alan.gauld at yahoo.co.uk  Wed Jul 20 20:25:29 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Thu, 21 Jul 2022 01:25:29 +0100
Subject: [Tutor] Need help installing a program on my computer using
 python (Bruker2nifti)
In-Reply-To: <E990F36C-F6E1-4CF8-A51E-91702A9B7D68@gmail.com>
References: <E990F36C-F6E1-4CF8-A51E-91702A9B7D68@gmail.com>
Message-ID: <tba6dq$125m$1@ciao.gmane.io>

On 21/07/2022 00:23, Siddharth Satishchandran wrote:
> I know pip install bruker is needed but it is not working on my console. 

Please be specific. "Not working" is not helpful.
What exactly did you type? what exactly was the result?
cut n paste from your console into the message,
do not paraphrase the errors or output.

Also tell us the OS you are using and the python version.

The actual command needed, according to the PyPi page is

pip install bruker2nifti

And you can use the copy-to-clipboard link to paste it
into your terminal. This needs to be run from your OS
prompt (after installing Python, if necessary). That
should download and install all necessary files.

Most folks usually prefer to use the following however,
because it gives more reliable results and better debugging
data...

python3 -m pip install bruker2nifti
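
The reason is that "python3 -m pip" runs the pip that belongs to that particular interpreter, so the package lands where that Python will look for it. If you are unsure whether the install reached the interpreter you are actually running, a small check along these lines may help. It is only a sketch: the helper name is_installed is made up, and I am assuming the importable module has the same name as the PyPI package (bruker2nifti):

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python is running, then whether it can see the package.
    print("Interpreter:", sys.executable)
    print("bruker2nifti importable:", is_installed("bruker2nifti"))
```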

The package maintainer seems to have an issues page, you
should try contacting him/her directly if pip does not
succeed.

https://github.com/SebastianoF/bruker2nifti/issues

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From ahires99 at gmail.com  Thu Jul 21 02:58:58 2022
From: ahires99 at gmail.com (Sagar Ahire)
Date: Thu, 21 Jul 2022 12:28:58 +0530
Subject: [Tutor] PermissionError: [WinError 32]
Message-ID: <CAEgMS7s7btHy8cZQEYomL_V-YZv0n=cNZik5HR0D88-8yEcgug@mail.gmail.com>

Hello sir,


I am getting below error while I install "pip install sasl" in cmd


Error: PermissionError: [WinError 32] The process cannot access the file
because it is being used by another process:
'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\tmpxlfme666'


I tried multiple things to resolve it but no luck, below is the list I tried

Uninstall python and re-install
Uninstall python and re-install with other directory folder
Delete the temp folder 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\'
Net Stop http and net start http in command prompt (getting error to stop
http)


Any help to resolve this will be greatly appreciated.


Thank you

Sagar Ahire

From Sagar.Ahire at asurion.com  Thu Jul 21 02:54:25 2022
From: Sagar.Ahire at asurion.com (Ahire, Sagar)
Date: Thu, 21 Jul 2022 06:54:25 +0000
Subject: [Tutor] WinError 32
Message-ID: <MW3PR16MB388231F60856E9D2136DFC79F7919@MW3PR16MB3882.namprd16.prod.outlook.com>

Hello sir,

I am getting below error while I install "pip install sasl" in cmd

Error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\tmpxlfme666'

I tried multiple things to resolve it but no luck, below is the list I tried

  1.  Uninstall python and re-install
  2.  Uninstall python and re-install with other directory folder
  3.  Delete the temp folder 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\'
  4.  Net Stop http  and net start http in command prompt (getting error to stop http)

Any help to resolve this will be greatly appreciated.

Thank you
Sagar Ahire


From connectsachit at gmail.com  Fri Jul 22 10:36:24 2022
From: connectsachit at gmail.com (Sachit Murarka)
Date: Fri, 22 Jul 2022 20:06:24 +0530
Subject: [Tutor] Error while connecting to SQL Server using Python
Message-ID: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>

Hello Users,

Facing below error while using pyodbc to connect to SQL Server.

conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver
Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0)
(SQLDriverConnect)")

Can anyone pls suggest what could be done?

Kind Regards,
Sachit Murarka

From wlfraed at ix.netcom.com  Fri Jul 22 13:27:14 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Fri, 22 Jul 2022 13:27:14 -0400
Subject: [Tutor] Error while connecting to SQL Server using Python
References: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>
Message-ID: <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>

On Fri, 22 Jul 2022 20:06:24 +0530, Sachit Murarka
<connectsachit at gmail.com> declaimed the following:

>Facing below error while using pyodbc to connect to SQL Server.
>
>conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver
>Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0)
>(SQLDriverConnect)")
>
>Can anyone pls suggest what could be done?
>

	What OS? (the "unixODBC" suggests it may be some variant of Linux,
but...).

	What is the connect string (obscure any passwords, maybe hostname, but
leave the rest). Your "error" doesn't look like a normal Python traceback,
and/or newlines have been stripped.

	Have you installed the required ODBC module? Or some other DB-API...
For example (from a stale Debian "apt search"): 

python3-pymssql/oldstable 2.1.4+dfsg-1 amd64
  Python database access for MS SQL server and Sybase - Python 3


https://docs.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server?view=sql-server-ver16
https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver16
(actual current version is 18, not 17)


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From mats at wichmann.us  Fri Jul 22 13:32:30 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Fri, 22 Jul 2022 11:32:30 -0600
Subject: [Tutor] Error while connecting to SQL Server using Python
In-Reply-To: <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>
References: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>
 <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>
Message-ID: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us>

On 7/22/22 11:27, Dennis Lee Bieber wrote:
> On Fri, 22 Jul 2022 20:06:24 +0530, Sachit Murarka
> <connectsachit at gmail.com> declaimed the following:
> 
>> Facing below error while using pyodbc to connect to SQL Server.
>>
>> conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver
>> Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0)
>> (SQLDriverConnect)")
>>
>> Can anyone pls suggest what could be done?
>>
> 
> 	What OS? (the "unixODBC" suggests it may be some variant of Linux,
> but...).

If it *is* Linux, check that the driver you're trying to use is listed
in /etc/odbcinst.ini.  There are a bunch of defaults there, but I think
the Microsoft ODBC drivers, of which there are about two zillion different
versions, are not among them.

There should be documentation on that topic, if that is indeed the issue.


From mats at wichmann.us  Fri Jul 22 14:12:45 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Fri, 22 Jul 2022 12:12:45 -0600
Subject: [Tutor] Error while connecting to SQL Server using Python
In-Reply-To: <CAA6YSGE1nB2JwEFKzgDmwytevCtwm3mX623wEXkijQx+x7Htpw@mail.gmail.com>
References: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>
 <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>
 <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us>
 <CAA6YSGE1nB2JwEFKzgDmwytevCtwm3mX623wEXkijQx+x7Htpw@mail.gmail.com>
Message-ID: <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us>

On 7/22/22 12:08, Sachit Murarka wrote:
> Hi Mats/Dennis,
> 
> Following is the output.
> 
>  $cat /etc/odbcinst.ini
> [ODBC Driver 17 for SQL Server]
> Description=Microsoft ODBC Driver 17 for SQL Server
> Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1
> UsageCount=1
> 
> We are using Ubuntu and we have installed odbc driver as well.

>>     If it *is* Linux, check that the driver you're trying to use is listed
>>     in /etc/odbcinst.ini.  There are a bunch of defaults there, but I think
>>     the Microsoft ODBC drivers, of which there are about two zillion different
>>     versions, are not among them.

Well, there goes my one idea, already taken care of (assuming all is
correct with that - it did say "file not found")... hope somebody else
has some ideas!


From bouncingcats at gmail.com  Fri Jul 22 19:03:40 2022
From: bouncingcats at gmail.com (David)
Date: Sat, 23 Jul 2022 09:03:40 +1000
Subject: [Tutor] Error while connecting to SQL Server using Python
In-Reply-To: <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us>
References: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>
 <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>
 <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us>
 <CAA6YSGE1nB2JwEFKzgDmwytevCtwm3mX623wEXkijQx+x7Htpw@mail.gmail.com>
 <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us>
Message-ID: <CAMPXz=p-3qA5tvTgh0-cpoPsjYWr6MbDtv8_yTCjxikUpy4xuA@mail.gmail.com>

On Sat, 23 Jul 2022 at 04:14, Mats Wichmann <mats at wichmann.us> wrote:
> On 7/22/22 12:08, Sachit Murarka wrote:

> >  $cat /etc/odbcinst.ini
> > [ODBC Driver 17 for SQL Server]
> > Description=Microsoft ODBC Driver 17 for SQL Server
> > Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1
> > UsageCount=1

Another easy thing to check could be to find out what user the Python
process is running as, and confirm that this user can read the file at
  /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1
and all its parent directories.
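
A small sketch of that permission check, standard library only. It has to be
run as the same user that runs the failing script, because os.access() tests
the real user id of the calling process. The function name is made up here:

```python
import os
from pathlib import Path


def access_report(target):
    """Yield (path, ok) for every parent directory, then for the file itself.

    Directories need execute (search) permission to be traversed;
    the file itself needs read permission.
    """
    path = Path(target)
    # Path.parents runs from the nearest directory up to the root; walk it top-down.
    for parent in list(path.parents)[::-1]:
        yield str(parent), os.access(parent, os.X_OK)
    yield str(path), os.access(path, os.R_OK)


if __name__ == "__main__":
    lib = "/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1"
    for checked_path, ok in access_report(lib):
        print("ok  " if ok else "FAIL", checked_path)
```

The first "FAIL" line, if any, is the directory or file to look at.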

From connectsachit at gmail.com  Fri Jul 22 14:08:34 2022
From: connectsachit at gmail.com (Sachit Murarka)
Date: Fri, 22 Jul 2022 23:38:34 +0530
Subject: [Tutor] Error while connecting to SQL Server using Python
In-Reply-To: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us>
References: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>
 <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>
 <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us>
Message-ID: <CAA6YSGE1nB2JwEFKzgDmwytevCtwm3mX623wEXkijQx+x7Htpw@mail.gmail.com>

Hi Mats/Dennis,

Following is the output.

 $cat /etc/odbcinst.ini
[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1
UsageCount=1

We are using Ubuntu and we have installed odbc driver as well.

Have referred following documentation:-

https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver16#ubuntu17


Kind Regards,
Sachit Murarka


On Fri, Jul 22, 2022 at 11:04 PM Mats Wichmann <mats at wichmann.us> wrote:

> On 7/22/22 11:27, Dennis Lee Bieber wrote:
> > On Fri, 22 Jul 2022 20:06:24 +0530, Sachit Murarka
> > <connectsachit at gmail.com> declaimed the following:
> >
> >> Facing below error while using pyodbc to connect to SQL Server.
> >>
> >> conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver
> >> Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0)
> >> (SQLDriverConnect)")
> >>
> >> Can anyone pls suggest what could be done?
> >>
> >
> >       What OS? (the "unixODBC" suggests it may be some variant of Linux,
> > but...).
>
> If it *is* Linux, check that the driver you're trying to use is listed
> in /etc/odbcinst.ini.  There are a bunch of defaults there, but I think
> the Microsoft ODBC drivers, of which there are about two zillion different
> versions, are not among them.
>
> There should be documentation on that topic, if that is indeed the issue.
>
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>

From trent.shipley at gmail.com  Fri Jul 22 11:27:57 2022
From: trent.shipley at gmail.com (trent shipley)
Date: Fri, 22 Jul 2022 08:27:57 -0700
Subject: [Tutor] Volunteer teacher
Message-ID: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>

I've volunteered to do some informal Python teaching.

What are some useful online resources and tutorials?

What are some good, introductory books--whether ebooks or dead tree.  I'm
thinking of something very reader friendly, like the "teach yourself in 24
hours", or "for dummies series", but with good exercises.

Has anyone used https://exercism.org/tracks/python?  I've had good luck
with the four JavaScript exercises I did, but I did one Scala exercise and
the grader was broken (I confirmed it with a live mentor.)


Trent

From connectsachit at gmail.com  Fri Jul 22 23:15:10 2022
From: connectsachit at gmail.com (Sachit Murarka)
Date: Sat, 23 Jul 2022 08:45:10 +0530
Subject: [Tutor] Error while connecting to SQL Server using Python
In-Reply-To: <CAMPXz=p-3qA5tvTgh0-cpoPsjYWr6MbDtv8_yTCjxikUpy4xuA@mail.gmail.com>
References: <CAA6YSGHK2ahNrpWXW8dMLhjc5aG-0=4E5PE4f7Pv_QgBAATGWw@mail.gmail.com>
 <vqmldh9bakl3ojlai259djlac1objpkheh@4ax.com>
 <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us>
 <CAA6YSGE1nB2JwEFKzgDmwytevCtwm3mX623wEXkijQx+x7Htpw@mail.gmail.com>
 <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us>
 <CAMPXz=p-3qA5tvTgh0-cpoPsjYWr6MbDtv8_yTCjxikUpy4xuA@mail.gmail.com>
Message-ID: <CAA6YSGEa5xMgmNsevDNwEp+hz=KWpCnPZqnJLb=WYPAejoyi8g@mail.gmail.com>

Hey David,

Thanks for the response. The user that runs Python has access to the
file below.

Kind Regards,
Sachit Murarka


On Sat, Jul 23, 2022 at 4:35 AM David <bouncingcats at gmail.com> wrote:

> On Sat, 23 Jul 2022 at 04:14, Mats Wichmann <mats at wichmann.us> wrote:
> > On 7/22/22 12:08, Sachit Murarka wrote:
>
> > >  $cat /etc/odbcinst.ini
> > > [ODBC Driver 17 for SQL Server]
> > > Description=Microsoft ODBC Driver 17 for SQL Server
> > > Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1
> > > UsageCount=1
>
> Another easy thing to check could be to find out what user the Python
> process is running as, and confirm that this user can read the file at
>   /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1
> and all its parent directories.

From leamhall at gmail.com  Sat Jul 23 05:53:22 2022
From: leamhall at gmail.com (Leam Hall)
Date: Sat, 23 Jul 2022 04:53:22 -0500
Subject: [Tutor] Volunteer teacher
In-Reply-To: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
Message-ID: <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>

Trent,

Two variables are the ages of the students, and their existing coding skills. Are they teens who need a first language or grizzled Assembler veterans in need of recovery?

For the former, I'd recommend the combination of "Practical Programming" (https://www.amazon.com/Practical-Programming-Introduction-Computer-Science/dp/1680502689) and the Coursera courses by the authors (https://www.coursera.org/learn/learn-to-program  and   https://www.coursera.org/learn/program-code). That gives an easy introduction into doing stuff with computers, and hits both visual and book learners.

For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).

There's a new book out, Python Distilled (https://www.amazon.com/Python-Essential-Reference-Developers-Library/dp/0134173279) by David Beazley. I liked David's work on the "Python Cookbook", but can't speak about "Python Distilled" from experience.

The Exercism stuff is usually good, but I'd suggest you go through it first. There are quirks in the problem explanations, and some bugs.

Leam

On 7/22/22 10:27, trent shipley wrote:
> I've volunteered to do some informal Python teaching.
> 
> What are some useful online resources and tutorials?
> 
> What are some good, introductory books--whether ebooks or dead tree.  I'm
> thinking of something very reader friendly, like the "teach yourself in 24
> hours", or "for dummies series", but with good exercises.
> 
> Has anyone used https://exercism.org/tracks/python?  I've had good luck
> with the four JavaScript exercises I did, but I did one Scala exercise and
> the grader was broken (I confirmed it with a live mentor.)
> 
> 
> Trent

-- 
Automation Engineer        (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

From wlfraed at ix.netcom.com  Sat Jul 23 15:14:47 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sat, 23 Jul 2022 15:14:47 -0400
Subject: [Tutor] Volunteer teacher
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
Message-ID: <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>

On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall <leamhall at gmail.com>
declaimed the following:

>
>For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).
>

	<snicker>

	A rather unfortunate name... Acronym POOP... "Object Oriented
Programming in Python" avoids such <G>

	Of course -- my view is that, if one is going to focus on OOP, one
should precede it with an introduction to a language-neutral OOAD textbook.


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From leamhall at gmail.com  Sat Jul 23 15:24:31 2022
From: leamhall at gmail.com (Leam Hall)
Date: Sat, 23 Jul 2022 14:24:31 -0500
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
Message-ID: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>

On 7/23/22 14:14, Dennis Lee Bieber wrote:
> On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall <leamhall at gmail.com>
> declaimed the following:

>> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).
>>
> 
> 	<snicker>
> 
> 	A rather unfortunate name... Acronym POOP... "Object Oriented
> Programming in Python" avoids such <G>
> 
> 	Of course -- my view is that, if one is going to focus on OOP, one
> should precede it with an introduction to a language-neutral OOAD textbook.

Worse, the book is published by Packt; so it's "Packt POOP".  :)

I disagree on the "OOAD first" opinion, though. Programming is about exploration, and we learn more by exploring with fewer third party constraints. Those OOAD tomes are someone else's opinion on how we should do things, and until we have a handle on what we're actually able to do then there's no frame of reference for the OOAD to stick to.

I'm a prime example of "needs to read less and code more". Incredibly bad habit, see a good book and buy it before really understanding the last half-dozen or so books I already have on that topic. Well, with Python I'm over a dozen, but other languages not so much.

-- 
Automation Engineer        (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

From mats at wichmann.us  Sat Jul 23 15:27:09 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Sat, 23 Jul 2022 13:27:09 -0600
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
Message-ID: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>

On 7/23/22 13:14, Dennis Lee Bieber wrote:
> On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall <leamhall at gmail.com>
> declaimed the following:
> 
>>
>> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).
>>
> 
> 	<snicker>
> 
> 	A rather unfortunate name... Acronym POOP... "Object Oriented
> Programming in Python" avoids such <G>
> 
> 	Of course -- my view is that, if one is going to focus on OOP, one
> should precede it with an introduction to a language-neutral OOAD textbook.


Maybe... I haven't looked at one for so long, but I'd worry that they'd
nod too much to existing implementations like Java which enforce a
rather idiotic "everything must be a class even if it isn't, like your
main() routine".


From learn2program at gmail.com  Sat Jul 23 19:25:06 2022
From: learn2program at gmail.com (Alan Gauld)
Date: Sun, 24 Jul 2022 00:25:06 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
Message-ID: <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>


On 23/07/2022 20:24, Leam Hall wrote:
>
>> 	Of course -- my view is that, if one is going to focus on OOP, one
>> should precede it with an introduction to a language-neutral OOAD textbook.
> I disagree on the "OOAD first" opinion, though. Programming is about exploration,

It's one view. But not a universal one and certainly not what the
founding fathers thought.

And definitely not what the originators of "software engineering" thought.
To them programming was akin to bricklaying: the final part of the process,
after you had analyzed the system and designed the solution. Then you got
your materials and followed the design. And, agile theories notwithstanding,
it's still how many large organisations view things.

> Those OOAD tomes are someone else's opinion on how we should do things,

That's true, albeit based on a lot of data-driven science rather than
the gut-feel and "personal experience" theory that drives much of modern
software development.

But OOP especially is a style of programming that needs understanding of
the principles before programming constructs like classes etc. make sense.
OOP came about before classes as we know them. Classes were borrowed from
Simula as a convenient mechanism for building OOP systems.

> until we have a handle on what we're actually able to do then there's no
> frame of reference for the OOAD to stick to.

I'd turn that around and say without the OOAD frame of reference you
can't make sense of OOP constructs. Sadly many students today are not
taught OOP but only taught how to build classes, as if classes were OOP.

Then they call themselves OOP programmers but in reality build procedural
programs using quasi abstract-data-types implemented as classes. And many
never do understand the difference between programming with objects and
building genuinely object-oriented programs.
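
One possible reading of that distinction, as a toy sketch. The class and
function names are invented for illustration, and this is only one way to
draw the line, not a definition:

```python
class AccountRecord:
    """A class used as a plain data holder (an abstract data type)."""

    def __init__(self, balance):
        self.balance = balance


def withdraw(record, amount):
    """Procedural style: outside code inspects and changes the record's data."""
    if amount > record.balance:
        raise ValueError("insufficient funds")
    record.balance -= amount


class Account:
    """Object style: the object owns its rules; callers just send a request."""

    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


if __name__ == "__main__":
    record = AccountRecord(100)
    withdraw(record, 30)      # a function acting on data
    account = Account(100)
    account.withdraw(30)      # a request sent to an object
    print(record.balance, account.balance)
```

Both versions use a class, but only in the second does the object decide
for itself how a withdrawal works.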

-- 

Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From leamhall at gmail.com  Sat Jul 23 20:53:07 2022
From: leamhall at gmail.com (Leam Hall)
Date: Sat, 23 Jul 2022 19:53:07 -0500
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
Message-ID: <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>


On 7/23/22 18:25, Alan Gauld wrote:
> 
> On 23/07/2022 20:24, Leam Hall wrote:
>>
>>> 	Of course -- my view is that, if one is going to focus on OOP, one
>>> should precede it with an introduction to a language-neutral OOAD textbook.
>> I disagree on the "OOAD first" opinion, though. Programming is about exploration,
> 
> It's one view. But not a universal one and certainly not what the
> founding fathers thought.
>
> And definitely not what the originators of "software engineering" thought.
> To them programming was akin to bricklaying: the final part of the process,
> after you had analyzed the system and designed the solution. Then you got
> your materials and followed the design. And, agile theories notwithstanding,
> it's still how many large organisations view things.
>
>> Those OOAD tomes are someone else's opinion on how we should do things,
>
> That's true, albeit based on a lot of data-driven science rather than
> the gut-feel and "personal experience" theory that drives much of modern
> software development.
>
> But OOP especially is a style of programming that needs understanding of
> the principles before programming constructs like classes etc. make sense.
> OOP came about before classes as we know them. Classes were borrowed from
> Simula as a convenient mechanism for building OOP systems.
>
>> until we have a handle on what we're actually able to do then there's no
>> frame of reference for the OOAD to stick to.
>
> I'd turn that around and say without the OOAD frame of reference you
> can't make sense of OOP constructs. Sadly many students today are not
> taught OOP but only taught how to build classes, as if classes were OOP.
>
> Then they call themselves OOP programmers but in reality build procedural
> programs using quasi abstract-data-types implemented as classes. And many
> never do understand the difference between programming with objects and
> building genuinely object-oriented programs.
> 

About the only truly universal things are hydrogen and paperwork, most everything else is contextual.

I'd be surprised that the founding fathers couldn't code in anything before they came up with OOP. It seems odd to design a building before you know how bricks or electrical systems work. The building architects and civil engineers I know really do have a handle on the nuts and bolts of things, and then they spend years as underlings before they ever get to be lead designer.

Design isn't code, it won't run on the computer. It is a nice skill to have, and large organizations often spend a considerable time on design. And they spend a lot of resources on failed projects and no-longer-useful designs. Does anyone not have at least one experience where the designers cooked up something that wouldn't work?

I feel one of Python's strengths is that it can do OOP, as well as other styles of programming. That lets people create actual working "stuff", and then evaluate how to improve the system as new environmental data and requirements come in. What people call themselves, and what paradigms they use is irrelevant; working code wins.


-- 
Automation Engineer        (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

From PythonList at DancesWithMice.info  Sat Jul 23 20:54:32 2022
From: PythonList at DancesWithMice.info (dn)
Date: Sun, 24 Jul 2022 12:54:32 +1200
Subject: [Tutor] Volunteer teacher
In-Reply-To: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
Message-ID: <29035bba-b1b5-dc58-7379-f009eeae12a5@DancesWithMice.info>

On 24/07/2022 07.27, Mats Wichmann wrote:
> On 7/23/22 13:14, Dennis Lee Bieber wrote:
>> On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall <leamhall at gmail.com>
>> declaimed the following:
>>
>>>
>>> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).
>>>
>>
>> 	<snicker>
>>
>> 	A rather unfortunate name... Acronym POOP... "Object Oriented
>> Programming in Python" avoids such <G>
>>
>> 	Of course -- my view is that, if one is going to focus on OOP, one
>> should precede it with an introduction to a language-neutral OOAD textbook.
> 
> 
> Maybe... I haven't looked at one for so long, but I'd worry that they'd
> nod too much to existing implementations like Java which enforce a
> rather idiotic "everything must be a class even if it isn't, like your
> main() routine".


+1 (here, and +1 with @Alan's post)

and +1 to the 'paralysis through analysis' syndrome @Leam mentions. The
best way to 'learn' is to 'do'! Reading the same facts in similar
books/sources is unlikely to improve 'learning' as much as you might hope!


As a trainer (my $job is not in Python) my colleagues and I struggle
with trying to find a balance between 'doing stuff' with code, and
learning the more academic theory 'behind' ComSc and/or
software-development. It's a lot easier if one only covers 'Python', and
not the theory behind OOP (for example). However, professionals will
(usually) benefit from both.

Worse, there are a number of topics, eg Source Code Management,
Test-Driven Development, O-O, even 'modules' and (the infamous)
'comments'; which are/will always be problematic topics at the 'Beginner
level' - because if one is only writing short 'toy' program[me]s, there
is no *apparent* need or purpose - which makes it very much an exercise
in theory!


We are running something of a (pedagogical) experiment at our Python
Users' Group. The PUG consists of many 'groups' of folks, including
hobbyists and professionals, ranging from 101-students and 'Beginners',
through to 'Python Masters' - so it is quite a challenge to find
meeting-topics which will offer 'something for everyone'.

Currently, we have Olaf running a Presentation-based series on "Software
Craftsmanship". Accordingly, he talks of systems at a 'high level'. For
example 'inversion of control' is illustrated with 'Uncle Bob's' very
theoretical "The Clean Architecture" diagram (which I call 'the circles
diagram'):
https://image.slidesharecdn.com/cleanarchitectureonandroid-160528160204/95/clean-architecture-on-android-11-638.jpg?cb=1464451395

In complementary (and complimentary) fashion, I am working 'bottom up'
on a series of 'Coding Evenings' called "Crafting Software" using the
ultimate practical 'code-along-at-home' approach. This has started with
using the REPL (PyCharm's Python Console - our meetings feature a 'door
prize', sponsored by JetBrains - thanks guys!). We commenced
implementing a very (very) simple Business Rule/spec, and immediately
launched into grabbing some input data. Coders who 'dive straight in'
make me wince, so you can only imagine the required manful
self-discipline...

Newcomers were learning how to use input() - that's how 'low' we
started! The object of this series is to build a routine and then
gradually expand and improve same. Along the way we will quickly
discover the hassles of changing from a single (constant) price (for
1KG/2lb of apples), to (say) an in-code price-list, to having product
detail stored in a database. Hopefully there will be realisation(s): 'we
should have asked that question before warming-up the keyboard', as well
as 'if we design with change in-mind, our later-lives will be a lot
easier'... 'SOLID by stealth'!

So, the series' aim is to show that a bit of thought (essentially the
implementation of U.Bob's diagram showing 'inversion', independence,
cohesion, and coupling) up-front is a worthwhile investment - as well as
demonstrating 'how to do it' in Python, and a bunch of paradigms and
principles, etc, along-the-way.
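
To make the apples example concrete, here is a minimal sketch of how the
three stages might look if the costing rule is handed its price source
instead of reaching for it. Everything in it (function names, the
'products' table, the prices) is invented for illustration and is not the
code used at the Coding Evenings:

```python
import sqlite3
from typing import Callable

# Any function that maps a product name to a unit price per kilo.
PriceLookup = Callable[[str], float]


def total_cost(product: str, kilos: float, price_of: PriceLookup) -> float:
    """The business rule: it asks for a price but does not know where prices live."""
    return round(kilos * price_of(product), 2)


def constant_price(product: str) -> float:
    """Stage 1: one hard-coded price for everything."""
    return 3.50


PRICE_LIST = {"apples": 3.50, "pears": 4.20}


def price_from_list(product: str) -> float:
    """Stage 2: an in-code price list."""
    return PRICE_LIST[product]


def make_db_lookup(con: sqlite3.Connection) -> PriceLookup:
    """Stage 3: prices stored in a database table named 'products'."""
    def price_from_db(product: str) -> float:
        row = con.execute(
            "SELECT price FROM products WHERE name = ?", (product,)
        ).fetchone()
        if row is None:
            raise KeyError(product)
        return row[0]
    return price_from_db


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE products (name TEXT PRIMARY KEY, price REAL)")
    con.executemany("INSERT INTO products VALUES (?, ?)", PRICE_LIST.items())
    for lookup in (constant_price, price_from_list, make_db_lookup(con)):
        print(total_cost("apples", 2, lookup))
    con.close()
```

total_cost() never changes across the three stages; only the lookup passed
in does, which is the point of designing with change in mind.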


Relevance to the OP:
is to get-going, and realise that any 'later' "refactoring" is not a
crime/sin - indeed may be a valuable part of the learning-experience.


Relevance to you/more details about the two series:
- Olaf's series runs bi-monthly (s/b mid-August*), but Coding Evenings
monthly
* KiwiPyCon will be held (in-person and also on-line) 19-21 August
(after three postponements! https://kiwipycon.nz)
- the PUG gathers for two meetings pcm
- meeting-details are published through
https://www.meetup.com/nzpug-auckland/
- although the labels say "New Zealand" and "Auckland", we are no-longer
'local', running in the virtual world
- accordingly, relative time-zones are the deciding-factor, so we often
refer to ourselves as 'UTC+12' (or the PUG at the end of the universe?)
- all welcome - learners and contributors alike!


PS we are looking at introducing git (or...) as part of the "Crafting
Software" series, and UML to ease Olaf's 'load'. Would you please
volunteer a 'lightning talk' and demo on one/the other/both subject(s)?
-- 
Regards,
=dn

From PythonList at DancesWithMice.info  Sat Jul 23 21:01:49 2022
From: PythonList at DancesWithMice.info (dn)
Date: Sun, 24 Jul 2022 13:01:49 +1200
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
Message-ID: <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info>

On 24/07/2022 12.53, Leam Hall wrote:
...

> I feel one of Python's strengths is that it can do OOP, as well as other
> styles of programming. That lets people create actual working "stuff",
> and then evaluate how to improve the system as new environmental data
> and requirements come in. What people call themselves, and what
> paradigms they use is irrelevant; working code wins.

Agree: at one level.

However, consider the statement saying "code is read more often than it
is written". Thereafter, the meaning of "working"...

Yes, it is likely that if we both tackled the same requirement, our
delivered-code would be different. We may have used different paradigms
or structures to 'get there'. This is not particularly at-issue if, as
you say, they are 'working code'.

However, software is subject to change.

At which time, the ease with which we can read each other's code,
or code from our 6-months-ago self, becomes a major contributor to the
success of the new project! Thus, my code *works* on the computer, but
does it *work* for you - will you be able to take it and *win*?

-- 
Regards,
=dn

From leamhall at gmail.com  Sun Jul 24 07:23:43 2022
From: leamhall at gmail.com (Leam Hall)
Date: Sun, 24 Jul 2022 06:23:43 -0500
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
 <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info>
Message-ID: <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com>

On 7/23/22 20:01, dn wrote:
> On 24/07/2022 12.53, Leam Hall wrote:
> ...
> 
>> I feel one of Python's strengths is that it can do OOP, as well as other
>> styles of programming. That lets people create actual working "stuff",
>> and then evaluate how to improve the system as new environmental data
>> and requirements come in. What people call themselves, and what
>> paradigms they use is irrelevant; working code wins.
> 
> Agree: at one level.
> 
> However, consider the statement saying "code is read more often than it
> is written". Thereafter, the meaning of "working"...
> 
> Yes, it is likely that if we both tackled the same requirement, our
> delivered-code would be different. We may have used different paradigms
> or structures to 'get there'. This is not particularly at-issue if, as
> you say, they are 'working code'.
> 
> However, software is subject to change.
> 
> At which time, the ease with which we can read each other's code,
> or code from our 6-months-ago self, becomes a major contributor to the
> success of the new project! Thus, my code *works* on the computer, but
> does it *work* for you - will you be able to take it and *win*?


Totally agree, on two levels.

First, Python is a lot easier to read than many languages. My first Python task, years ago, was to try and do something with Twisted. I was successful, and I didn't even know Python at the time. The language was just that clear.

Secondly, you could argue that the Twisted code was particularly well written and that's an argument for good design. I would take you at your word, I don't know the quality of Twisted code. I would very much agree with you that a good design, implemented well, beats a lot of the code I have seen. It beats a lot of code I have written, too.

I chuckled as I read your earlier response. Imagine the dev team trying to work through a spaghetti of undesigned codebase, and the design person saying "Now do you believe me that design is important?" I'm all for good design, and if you or Alan look at my code, swear under your breath, and then ask me if I'd consider fixing things with a good design, I'm going to listen.

Unfortunately, we can't just open our skulls up, drop in the GoF or Booch's OOAD, and magically do good design. To learn how to implement good design, our brains need to play. First with the language itself, and Python is becoming the language of choice for many on-line college courses (Coursera, EdX). This play will be just like learning a human language; we'll sound awful and not make a lot of sense, but learning takes time. In both computer and human language, a lot of people can't get past the early learning failures, never realizing that failure is implicit in play, and play is mandatory for learning. Too often we burden them with rules and expectations that kill the joy of play.

Once we have the basics, hopefully a mentor shows up that can take us to the next level. Either a strong base in verb declensions or an introduction to design concepts. Then we have to play with those new toys. Play will integrate the concepts into our skills, but it takes lots of time and lots of play. Having the toys to play "design this" with engages our brains and gives us a chance to deeply learn.

We agree that good design is good. My opinion, even if it's mine alone, is that design is not the first thing to learn.

Leam


-- 
Automation Engineer        (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

From alan.gauld at yahoo.co.uk  Sun Jul 24 07:47:24 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 24 Jul 2022 12:47:24 +0100
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
Message-ID: <tbjbgc$gbo$1@ciao.gmane.io>

On 24/07/2022 01:53, Leam Hall wrote:

>> programming was akin to bricklaying. The final part of the process after
>> you had analyzed the system and designed the solution. Then you got your
>> materials and followed the design. And, agile theories notwithstanding,
>> it's still how many large organisations view things.
>>
>>> Those OOAD tomes are someone else's opinion on how we should do things,
>>
>> That's true, albeit based on a lot of data driven science rather than
>> the gut-feel and "personal experience" theory that drives much of modern
>> software development.
>>
>> But especially OOP is a style of programming that needs understanding of
>> the principles before programming constructs like classes etc make sense. OOP
>> came about before classes as we know them. Classes were borrowed from
>> Simula as a convenient mechanism for building OOP systems.
>>
>>> until we have a handle on what we're actually able to do then there's no
>>> frame of reference for the OOAD to stick to.
>>
>> I'd turn that around and say without the OOAD frame of reference you
>> can't make sense of OOP constructs. Sadly many students today are not
>> taught OOP but only taught how to build classes, as if classes were OOP.
>>
>> Then they call themselves OOP programmers but in reality build procedural
>> programs using quasi abstract-data- types implemented as classes. And many
>> never do understand the difference between programming with objects and
>> building genuinely object-oriented programs.
>>
> 
> About the only truly universal things are hydrogen and paperwork, most everything else is contextual.
> 

> I'd be surprised that the founding fathers couldn't code in anything before they came up with OOP.

Not OOP. But analysis and design. Remember the founding fathers of
programming didn't have any interactive capability, they couldn't easily
experiment. They had to write their code in assembler and transfer it to
punch cards or tape. So they made sure to design their code carefully
before they started writing it. They only had limited tools such as flow
charts and pseudo code but the structure had to be clear. Code was
a means of translating a design idea into something the machine understood
and could execute.

It was only after the development of teletypes and interpreters that
experimental programming came about or even became possible. But it was
still the exception. Even in the mid 80's a friend working for a large
insurance company was limited to one compile run per day, anything
beyond that needed formal sign-off from his boss. And in the early 90s I
was working on a project where we could only build our own module, we
weren't allowed to build the entire system (in case we broke it!).
System builds ran overnight (and took several hours).

So it is only in relatively recent times that the idea of programming as
an experimental/explorative activity has become commonplace. And there
is definitely a place for that, it's a good way to learn language
features. But if looking at things like OOP which are much higher level
it still helps to have a context and an idea of what we are trying to
achieve. Focusing on the minutiae of language features to build classes
can lead to very bad practices, misusing features. The classic example
is the abuse of inheritance as a code reuse tool rather than as part of
an OOD.
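
As a rough illustration of that last point (the class names ListStack and
Stack are invented for this sketch, they are not from any real project):
inheriting from list just to borrow its storage drags in every list
method, while holding the list as an attribute keeps the interface small.

```python
class ListStack(list):
    """Inheritance used only for code reuse: a stack is not a general list."""

    def push(self, item):
        self.append(item)


class Stack:
    """Composition: the list is a hidden implementation detail."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


leaky = ListStack()
leaky.push(1)
leaky.push(2)
leaky.insert(0, 99)   # allowed, but breaks the last-in-first-out promise

safe = Stack()
safe.push(1)
safe.push(2)
# safe.insert(0, 99) would raise AttributeError: Stack has no insert()
```

Whether composition is the right choice still depends on the design; the
sketch only shows what the inherited interface lets callers do.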

>  It seems odd to design a building before you know how bricks or 
> electrical systems work.

But that is exactly what happens. You design the structure first
then choose the appropriate materials. Yes you need to understand
about bricks etc, and what they are capable of but you don't need
to be proficient in the craft of laying them.

>  The building architects and civil engineers I know really do 
> have a handle on the nuts and bolts of things, 

In my experience (in electrical engineering) very few engineers actually
have much practical experience of assembling and maintaining electrical
components. They have an army of technicians to do that. They know what
the components are for, how they work and may have basic construction
skills for prototyping. But they don't normally get involved at that
level. Software engineering is one of the few fields where the designer
is often the constructor too.

> they spend years as underlings before they ever get to be lead designer.

Of course, but they still need to know the design principles.

> Design isn't code, it won't run on the computer. 

Neither will code without a design. You can design it in your head as
you go along, but it will generally take longer and be less flexible
and harder to maintain. Especially when it goes above a few hundred
lines. And that's where things like OOP come in because each object is
like a small standalone program. Which makes it easier to design minimally.

> spend a lot of resources on failed projects and no-longer-useful designs.

This is a flat out myth! The vast majority of large scale projects
succeed, very few are cancelled or go badly wrong (they are often
over budget and over time, but that's only measuring against the
initial budgets and timescales). It's just that when they do fail they
attract a lot of attention because they cost a lot of money!
If a $10-50k 4-man project goes belly up nobody notices. But when
a 4 year, 1000 man, project costing $100 million goes belly up it is
very noticeable. But you can't build 4000 man-year projects using
agile - it's been tried and invariably winds up moving to more
traditional methods. Usually after a lot of wasted time and money.
But the fact is that our modern world is run by large scale software
projects successfully delivered by the Fortune 500 companies. We just
don't think about it and take it for granted every time we board
a train or plane, turn on the electricity or water, collect our
wages, etc.

>  Does anyone not have at least one experience where the designers 
> cooked up something that wouldn't work?

I've seen stuff that was too ambitious or didn't run fast enough.
But I've never seen anything designed that just didn't work at all.
(I've seen designs rejected for that reason, but they never got built -
thus saving a huge amount of money! That's the purpose of design.)
But I've seen dozens of programs that were attempted without design
that just didn't run, did the wrong thing, soaked up memory, etc etc.
It's just easier to hide when there is limited evidence (documentation etc).

> I feel one of Python's strengths is that it can do OOP, 

You can do OOP in any language, even assembler.
OOP is a style of programming. Language features help make
it easier, that's all. OOP helps control complexity. Sometimes
it's the best approach, sometimes not. But that depends on
the nature of the problem not the language used.
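
To make that concrete, here is a small sketch in Python that deliberately
avoids the class statement. The helper names make_counter and send are
made up for the example; the point is only that state plus a table of
operations, reached by message name, already behaves like an object.

```python
def make_counter(start=0):
    """Build an 'object' from private state plus functions that act on it."""
    state = {"count": start}

    def increment():
        state["count"] += 1
        return state["count"]

    def value():
        return state["count"]

    # This dict of functions plays the role of a method table.
    return {"increment": increment, "value": value}


def send(obj, message, *args):
    """Dispatch a message (a method name) to the receiving object."""
    return obj[message](*args)


counter = make_counter()
send(counter, "increment")
send(counter, "increment")
print(send(counter, "value"))   # 2
```

A class does the same job with less ceremony, which is all the language
support really buys you.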

>  what paradigms they use is irrelevant; working code wins.

Working code wins over non working code, for sure.
But working code alone is not good enough. It also needs to be
maintainable, efficient, and economical. For that you usually
need a design. Spaghetti code has caused far more project
failures than design failures ever did.


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From alan.gauld at yahoo.co.uk  Sun Jul 24 07:52:14 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 24 Jul 2022 12:52:14 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
Message-ID: <tbjbpf$17pn$1@ciao.gmane.io>

On 23/07/2022 20:27, Mats Wichmann wrote:

>> should precede it with an introduction to a language-neutral OOAD textbook.
> 
> Maybe... I haven't looked at one for so long, but I'd worry that they'd
> nod too much to existing implementations like Java

There are very few language neutral OOAD books and most are
from the early days of OOP in the 80's and 90's.

Sadly OOP has become synonymous with class based programming
(with Java a prime example) and as a result has acquired a
poor reputation with a generation of programmers who have
never really understood what it was about. Partly because
they never had to struggle with a world where OOP was not
an option.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From avi.e.gross at gmail.com  Sat Jul 23 21:16:14 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sat, 23 Jul 2022 21:16:14 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
Message-ID: <00c801d89efa$f87d6650$e97832f0$@gmail.com>

You guys have it all wrong about naming and marketing.

It is Snake Language Object Oriented Programming

-- SLOOP ---

--POOLS-- if using it backward.


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Leam Hall
Sent: Saturday, July 23, 2022 3:25 PM
To: tutor at python.org
Subject: Re: [Tutor] Volunteer teacher

On 7/23/22 14:14, Dennis Lee Bieber wrote:
> On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall <leamhall at gmail.com> 
> declaimed the following:

>> For the latter, Python Object Oriented Programming
>> (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).
>>
> 
> 	<snicker>
> 
> 	A rather unfortunate name... Acronym POOP... "Object Oriented 
> Programming in Python" avoids such <G>
> 
> 	Of course -- my view is that, if one is going to focus on OOP, one 
> should precede it with an introduction to a language-neutral OOAD
> textbook.

Worse, the book is published by Packt; so it's "Packt POOP".  :)

I disagree on the "OOAD first" opinion, though. Programming is about
exploration, and we learn more by exploring with fewer third party
constraints. Those OOAD tomes are someone else's opinion on how we should do
things, and until we have a handle on what we're actually able to do then
there's no frame of reference for the OOAD to stick to.

I'm a prime example of "needs to read less and code more". Incredibly bad
habit, see a good book and buy it before really understanding the last
half-dozen or so books I already have on that topic. Well, with Python I'm
over a dozen, but other languages not so much.

-- 
Automation Engineer        (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)
_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From avi.e.gross at gmail.com  Sat Jul 23 21:23:26 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sat, 23 Jul 2022 21:23:26 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
Message-ID: <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>

Dumb Question.

Every damn language I have done so-called object-oriented programming in
DOES IT DIFFERENT.

Some have not quite proper objects and use tricks to fake it to a point.
Some have very precise objects you can only access parts of in certain
ways, if at all, and others are so free-for-all that it takes ingenuity to
hide things so a user can not get around you and so on.

If you had a book on generic object-oriented techniques and then saw Python
or R or JAVA and others, what would their experience be?

And I think things do not always exist in a vacuum. Even when writing a
program that uses OO I also use functional methods, recursion and anything
else I feel like. Just learning OO may leave them stranded in Python!

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Mats Wichmann
Sent: Saturday, July 23, 2022 3:27 PM
To: tutor at python.org
Subject: Re: [Tutor] Volunteer teacher

On 7/23/22 13:14, Dennis Lee Bieber wrote:
> On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall <leamhall at gmail.com> 
> declaimed the following:
> 
>>
>> For the latter, Python Object Oriented Programming
>> (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266).
>>
> 
> 	<snicker>
> 
> 	A rather unfortunate name... Acronym POOP... "Object Oriented 
> Programming in Python" avoids such <G>
> 
> 	Of course -- my view is that, if one is going to focus on OOP, one 
> should precede it with an introduction to a language-neutral OOAD
> textbook.


Maybe... I haven't looked at one for so long, but I'd worry that they'd nod
too much to existing implementations like Java which enforce a rather
idiotic "everything must be a class even if it isn't, like your
main() routine".



From bouncingcats at gmail.com  Sun Jul 24 09:02:09 2022
From: bouncingcats at gmail.com (David)
Date: Sun, 24 Jul 2022 23:02:09 +1000
Subject: [Tutor] Volunteer teacher
In-Reply-To: <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
Message-ID: <CAMPXz=qmWiMSrce5ZfVugQ_bCDiW=os_=FU-=Ybw=1-i-jM3Xg@mail.gmail.com>

On Sat, 23 Jul 2022 at 09:25, trent shipley <trent.shipley at gmail.com> wrote:
>
> I've volunteered to do some informal Python teaching.
>
> What are some useful online resources and tutorials?

The topic of discussion seems to be drifting away from the
question asked.

This is the Tutor list. As I understand it, the point of this list
is to respond to questions.

Out of consideration for the OP, I am repeating the question
that they asked, in the hope that it might focus replies
towards addressing the OP questions.

From alan.gauld at yahoo.co.uk  Sun Jul 24 09:04:57 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 24 Jul 2022 14:04:57 +0100
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
 <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info>
 <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com>
Message-ID: <tbjg1p$i6c$1@ciao.gmane.io>

On 24/07/2022 12:23, Leam Hall wrote:

> First, Python is a lot easier to read than many languages. 

True but that only really helps when working at the detail level. It
does nothing to help with figuring out how a system works - which
functions call which other functions. How data structures relate to
each other etc. And that's where design comes in. One of the big failures
in design is to go too deep and try to design every line of code. Design
should only go to the level where it becomes easier to write code than
design. When that stage is reached it is time to write code!

> Imagine the dev team trying to work through a spaghetti of 
> undesigned codebase, and the design person saying "Now do 
> you believe me that design is important?" 

I've been there several times. We once received about 1 million
lines of C code with no design. We had a team of guys stepping
through the code in debuggers for 6 months documenting how it
works and reverse engineering the "design". It took us 3 years
to fully document the system. But once we did we could turn around
bugs in 24 hours and new starts could be productive within 2 days,
rather than the 2 or 3 months it took when we first got the code!

When I ran a maintenance team and we took on a new project one of
the first tasks was to review the design and if necessary update
it (or even rewrite it) in a useful style. A good design is
critical to efficient maintenance, even if it has to be retro-fitted.

> Unfortunately, we can't just open our skulls up, drop in the GoF 
> or Booch's OOAD, and magically do good design.

Absolutely true. You need to start with the basics.
Branching, loops, functions, modules, coupling v cohesion.
Separation of concerns, data hiding. Clean interface design.
Then you build up to higher level concepts like state machines,
table driven code, data driven code, data structures and
normalisation.

And even if using a book like Booch (which is very good) it
should be worked through in parallel with the language constructs.
But just as reading Booch alone would be useless, so is learning
to define functions without understanding their purpose. Or
building classes without understanding why.

Learning is an iterative process. And this is especially
true with software. But you need to understand the why
and how equally. Learning how without knowing why leads to bad
programming practices - global variables, mixing logic and
display, tight coupling  etc.

> Once we have the basics, hopefully a mentor shows up 

Ideally we have the mentor in place before we even look at the basics.
Even the basics can be baffling without guidance.
I've seen so many beginners completely baffled by a line like:

x = x + 1

It makes no sense to a non-programmer. It is mathematical nonsense!
(It's also why many languages have a distinct assignment operator:

x := x+1

is much easier to assimilate.)
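
For a beginner it can help to see the two meanings side by side. This is
only a sketch of how I would show it at the Python prompt: assignment
rebinds the name, while the mathematical reading is a comparison, and a
false one.

```python
x = 1
x = x + 1           # evaluate the right-hand side (1 + 1), then rebind x
print(x)            # 2

# Equality is a separate operator, and as a claim this is simply false:
print(x == x + 1)   # False
```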

> We agree that good design is good. My opinion, even if it's mine alone, 
> is that design is not the first thing to learn.

I don't think I'm arguing for design as a skill - certainly not
things like UML or SSADM or even flow charts. But rather the
rationale behind programming constructs. Why do we have loops?
Why so many of them? And why are functions useful? Why not just
cut 'n paste the code?

For OOP it's about explaining why we want to use classes/objects.
What does an OOP program look like - "a set of objects communicating
by messages". How do we send a message from one object to another?
What happens when an object receives a message? What method does
the receiver choose to fulfil the message request? How does
the receiver reply to the requestor? These ideas can then be
translated/demonstrated in the preferred language.
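
One possible translation into Python, as a sketch only (Account,
OverdraftAccount and Customer are names invented for the example): the
sender knows the message name, each receiver picks its own method to
fulfil it, and the return value is the reply.

```python
class Account:
    """Receiver: chooses its own method to fulfil a 'withdraw' message."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            return False          # reply: request refused
        self.balance -= amount
        return True               # reply: request fulfilled


class OverdraftAccount(Account):
    """Same message, different method chosen by the receiver."""

    def withdraw(self, amount):
        self.balance -= amount    # allowed to go negative
        return True


class Customer:
    """Sender: knows only the message name, not the receiver's internals."""

    def pay(self, account, amount):
        ok = account.withdraw(amount)   # send the message, get the reply
        return "paid" if ok else "declined"


customer = Customer()
print(customer.pay(Account(50), 80))            # declined
print(customer.pay(OverdraftAccount(50), 80))   # paid
```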

A decent OOAD book will describe those concepts better than a
programming language tutorial in my experience. (Again, the
first section of Booch with its famous cat cartoons is very
good at that)

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From alan.gauld at yahoo.co.uk  Sun Jul 24 09:15:25 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 24 Jul 2022 14:15:25 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
Message-ID: <tbjgld$qis$1@ciao.gmane.io>

On 24/07/2022 02:23, avi.e.gross at gmail.com wrote:
> Dumb Question.
> 
> Every damn language I have done so-called object-oriented programming in
> DOES IT DIFFERENT.

Of course, because OOP is not a language feature. Languages implement
tools to facilitate OOP. And each language designer will have different
ideas about which features of OOP need support and how best to provide
that. In some it will be by classes, in others actors, in others
prototyping. Some will try to make OOP look like existing procedural
code where others will create a special syntax specifically for objects.

> If you had a book on generic object-oriented techniques and then saw Python
> or R or JAVA and others, what would their experience be?

That's what happens every time I meet a new language. I look
to see how that language implements the concepts of OOP.

> And I think things do not always exist in a vacuum. Even when writing a
> program that uses OO I also use functional methods, recursion and anything
> else I feel like. Just learning OO may leave them stranded in Python!

OOP doesn't preclude these other programming techniques.
OOP is a design idiom that allows for any style of lower
level coding. (What is more difficult is taking a high level
functional design and introducing OOP into that - those
two things don't blend well at all!)

I've also never succeeded in doing OOP in Prolog.
Maybe somebody has done it, but it beats me! I've also
never felt quite comfortable shoe-horning objects into SQL
despite the alleged support for the OOP concepts of some
database systems/vendors...

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From bouncingcats at gmail.com  Sun Jul 24 09:18:39 2022
From: bouncingcats at gmail.com (David)
Date: Sun, 24 Jul 2022 23:18:39 +1000
Subject: [Tutor] Volunteer teacher
In-Reply-To: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
Message-ID: <CAMPXz=p2C8nkY-EubN8Dpkv0SM96Ua--UykgcySGpukbxdaFjw@mail.gmail.com>

On Sat, 23 Jul 2022 at 09:25, trent shipley <trent.shipley at gmail.com> wrote:
>
> I've volunteered to do some informal Python teaching.
>
> What are some useful online resources and tutorials?

Hi Trent,

I don't have much knowledge of this area, but one site
that I have noticed seems to be good quality is realpython.com.
It looks like most of their tutorials require a fee to be paid, but
some of them do not and might give you some ideas.
Their basic courses are here:
  https://realpython.com/tutorials/basics/
An example free one that might give some ideas:
  https://realpython.com/python-dice-roll/

From avi.e.gross at gmail.com  Sun Jul 24 12:35:58 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 24 Jul 2022 12:35:58 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <CAMPXz=qmWiMSrce5ZfVugQ_bCDiW=os_=FU-=Ybw=1-i-jM3Xg@mail.gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
 <CAMPXz=qmWiMSrce5ZfVugQ_bCDiW=os_=FU-=Ybw=1-i-jM3Xg@mail.gmail.com>
Message-ID: <004801d89f7b$74b7dc70$5e279550$@gmail.com>

dAVId,

That happens quite often when a topic comes up and people react to it.

You are correct that the original request was about resources for teaching a
subset of python.

It is then quite reasonable to ask what exactly the purpose was as the right
materials need to be chosen based on the level of the students before they
enter, and various other goals.

My view is that it is harder to appreciate the advantages or uses of
object-oriented styles without at least some idea of the alternatives and
showing why it is better. 

I go back to a time it was common to work with multiple variables such as
one array to hold lots of names and another for birthdays and another for
salaries and how it became advantageous to collect them in a new structure
called a "structure" so they could be moved around as a unit and not
accidentally get messed up if you deleted from one array but not another,
for example. 

Later "classes" were built atop such structures by adding additional layers
such as limiting access to the internals while adding member functions that
were specific to the needs and so on. 
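
A small Python sketch of that history, with made-up sample data and a
hypothetical Employee record: deleting from one parallel list silently
misaligns the others, while a list of records removes everything at once.

```python
from dataclasses import dataclass

# Parallel lists: three containers that must be kept in step by hand.
names = ["Ann", "Bob", "Cy"]
birthdays = ["1990-01-01", "1985-06-15", "2000-12-31"]
salaries = [50000, 60000, 40000]

del names[1]      # Bob's name is gone, but his birthday and salary remain,
                  # so index 1 now pairs Cy's name with Bob's salary.
mismatched = (names[1], salaries[1])


@dataclass
class Employee:
    name: str
    birthday: str
    salary: int


staff = [
    Employee("Ann", "1990-01-01", 50000),
    Employee("Bob", "1985-06-15", 60000),
    Employee("Cy", "2000-12-31", 40000),
]
del staff[1]      # the whole record goes at once; nothing can drift apart
```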

But I will stop my digression here. I have no specific materials to offer.

AVI

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
David
Sent: Sunday, July 24, 2022 9:02 AM
To: Python Tutor <tutor at python.org>
Subject: Re: [Tutor] Volunteer teacher

On Sat, 23 Jul 2022 at 09:25, trent shipley <trent.shipley at gmail.com> wrote:
>
> I've volunteered to do some informal Python teaching.
>
> What are some useful online resources and tutorials?

The topic of discussion seems to be drifting away from the question asked.

This is the Tutor list. As I understand it, the point of this list is to
respond to questions.

Out of consideration for the OP, I am repeating the question that they
asked, in the hope that it might focus replies towards addressing the OP
questions.


From avi.e.gross at gmail.com  Sun Jul 24 13:00:40 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 24 Jul 2022 13:00:40 -0400
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <tbjbgc$gbo$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> <tbjbgc$gbo$1@ciao.gmane.io>
Message-ID: <005301d89f7e$e8066ae0$b81340a0$@gmail.com>

At the risk of not answering the question, I am responding to the thoughts
others express here.

Object-Oriented programming is a fairly meaningless idea as it means a
little bit of everything with some people focusing on only some parts. Other
ideas like functional programming have similar aspects.

As Alan pointed out, one major goal is to solve a much larger problem pretty
much in smaller chunks that can each be understood or made by a small team.
But it does not end there as implementations have added back too much
complexity in the name of generality and things like multiple inheritance
can make it a huge challenge to figure out what is inherited in some
programs.

I would beware books and resources that have an almost religious orientation
towards a particular use of things. The reality is that most good programs
should not use any one style but mix and match them as makes sense. If you
never expect to use it in more than one place, why create a new class to
encapsulate something just so you can say it is object oriented. I have seen
students make a class with a single variable within it and several instance
functions that simply set or get the content and NOTHING ELSE. True, you may
later want to expand on that and add functionality but why add overhead just
in case when you can change it later?
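
In Python specifically, "change it later" is cheap. The sketch below uses
invented class names (Temperature, CheckedTemperature): start with a
plain attribute, and if validation is ever needed a property adds it
without changing how callers read or assign the value.

```python
class Temperature:
    """Start simple: a plain attribute, no get/set boilerplate."""

    def __init__(self, celsius):
        self.celsius = celsius


class CheckedTemperature:
    """Later version: same attribute syntax, now with validation."""

    def __init__(self, celsius):
        self.celsius = celsius      # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value


t = CheckedTemperature(20)
t.celsius = 25      # callers write exactly what they wrote before
```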

I have seen books that brag about how you can do pretty much anything using
recursion. But why? Why would I even want to decide if a billion is more or
less than 2 billion by recursively subtracting one from each of two
arguments until one or the other hits zero?
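
Just to show how silly it gets, here is that comparison as a sketch (the
function name is_less is mine). It works for tiny numbers and gives up
long before a billion, because CPython limits recursion depth to roughly
a thousand frames by default.

```python
import sys


def is_less(a, b):
    """Return True if a < b for non-negative ints, by counting both down."""
    if b == 0:
        return False        # b ran out first (or together): a is not less
    if a == 0:
        return True         # a ran out first: a is less
    return is_less(a - 1, b - 1)


print(is_less(3, 5))        # True, after three recursive calls

try:
    is_less(10**9, 2 * 10**9)
except RecursionError:
    print("gave up; recursion limit is", sys.getrecursionlimit())

print(10**9 < 2 * 10**9)    # True, in a single comparison
```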

If your goal is to teach the principles of object-oriented approaches, in
the abstract, you still end up using a form of pseudo-code. Picking an
actual language can be helpful to show some instantiations. And python can
be a good choice alongside many others and also can be a bad choice. It now
supports pretty much everything as an object including the number 42.
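
A quick way to see that at the interpreter, as a minimal sketch using
only built-ins:

```python
n = 42
print(isinstance(n, object))   # True
print(type(n).__name__)        # int
print(n.bit_length())          # 6, since 42 is 0b101010
print((42).__add__(1))         # 43; the + operator calls this method
```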

I did a quick search for "object oriented programming in python" and then
substituted "java", "c++", "R" and others. There are plenty of such matches and
the big question is which fits YOUR class and needs.


From mayoadams at gmail.com  Sun Jul 24 17:10:21 2022
From: mayoadams at gmail.com (Mayo Adams)
Date: Sun, 24 Jul 2022 17:10:21 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <004801d89f7b$74b7dc70$5e279550$@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
 <CAMPXz=qmWiMSrce5ZfVugQ_bCDiW=os_=FU-=Ybw=1-i-jM3Xg@mail.gmail.com>
 <004801d89f7b$74b7dc70$5e279550$@gmail.com>
Message-ID: <CALaREKYp4w=0pnb2QSnp47AKDNjkq+RNZCk7D7JDh7jYG6HQNQ@mail.gmail.com>

It happens quite often, and that is not in itself a reason to let it pass.

On Sun, Jul 24, 2022 at 2:00 PM <avi.e.gross at gmail.com> wrote:

> dAVId,
>
> That happens quite often when a topic comes up and people react to it.
>
> You are correct that the original request was about resources for teaching
> a
> subset of python.
>
> It is then quite reasonable to ask what exactly the purpose was as the
> right
> materials need to be chosen based on the level of the students before they
> enter, and various other goals.
>
> My view is that it is harder to appreciate the advantages or uses of
> object-oriented styles without at least some idea of the alternatives and
> showing why it is better.
>
> I go back to a time it was common to work with multiple variables such as
> one array to hold lots of names and another for birthdays and another for
> salaries and how it became advantageous to collect them in a new structure
> called a "structure" so they could be moved around as a unit and not
> accidentally get messed up if you deleted from one array but not another,
> for example.
>
> Later "classes" were built atop such structures by adding additional layers
> such as limiting access to the internals while adding member functions that
> were specific to the needs and so on.
>
> But I will stop my digression here. I have no specific materials to offer.
>
> AVI
>
> -----Original Message-----
> From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
> David
> Sent: Sunday, July 24, 2022 9:02 AM
> To: Python Tutor <tutor at python.org>
> Subject: Re: [Tutor] Volunteer teacher
>
> On Sat, 23 Jul 2022 at 09:25, trent shipley <trent.shipley at gmail.com>
> wrote:
> >
> > I've volunteered to do some informal Python teaching.
> >
> > What are some useful online resources and tutorials?
>
> The topic of discussion seems to be drifting away from the question asked.
>
> This is the Tutor list. As I understand it, the point of this list is to
> respond to questions.
>
> Out of consideration for the OP, I am repeating the question that they
> asked, in the hope that it might focus replies towards addressing the OP
> questions.
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>


-- 

Mayo Adams

From alan.gauld at yahoo.co.uk  Mon Jul 25 07:10:04 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Mon, 25 Jul 2022 12:10:04 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <CALaREKYp4w=0pnb2QSnp47AKDNjkq+RNZCk7D7JDh7jYG6HQNQ@mail.gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
 <CAMPXz=qmWiMSrce5ZfVugQ_bCDiW=os_=FU-=Ybw=1-i-jM3Xg@mail.gmail.com>
 <004801d89f7b$74b7dc70$5e279550$@gmail.com>
 <CALaREKYp4w=0pnb2QSnp47AKDNjkq+RNZCk7D7JDh7jYG6HQNQ@mail.gmail.com>
Message-ID: <tbltmd$52v$1@ciao.gmane.io>

On 24/07/2022 22:10, Mayo Adams wrote:
> It happens quite often, and that is not in itself a reason to let it pass.

We should always strive to answer the original question.

However, a group like this is designed to foster wider
understanding of programming issues, especially with
regard to Python, so side discussions are not only
permitted but encouraged.

Although the OP is looking for a specific answer, the
hundreds of other readers are often interested in the
other, wider, issues raised. It is the nature of public
fora and mailing lists.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From avi.e.gross at gmail.com  Mon Jul 25 14:23:54 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 25 Jul 2022 14:23:54 -0400
Subject: [Tutor] teaching for purpose
Message-ID: <007501d8a053$b3033050$190990f0$@gmail.com>

The recent discussion inspires me to open a new thread.

 
It is about teaching a topic like python or kinds of programming it supports
and doing it for a purpose in a way that does not overwhelm students but
helps them understand the motivation and use of features.

 
The recent topic was OOP and my thought was in several directions.

 
One was bottom-up. Show what the world was like when you had individual
variables only (meaning memory locations of sorts) and then some sort of
array data structure was added so you can ask for var[5] and then if you
wanted to store info about a group of say employees, you had one array for
each attribute like name, salary, age. So a user could be identified as
name[5] and salary[5] and age[5].

 
Then show what a pain it was to say fire a user or move them elsewhere
without remembering to make changes all over the place.

 
So you show the invention of structures like the C struct and how they now
hold an employee together. Then you show how functions that operate on
structs had to know what it was in order to be effective and general and
that enhancing a struct to contain additional member functions and other
features like inheritance, could lead to a concept where an object was
central. Yes, there are more rungs on this ladder but this is enough for my
purpose.
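
A minimal sketch of that bottom-up progression, in Python rather than C.
The names (Employee, give_raise, the sample people) are invented purely
for illustration:

```python
from dataclasses import dataclass

# Stage 1: parallel lists, one per attribute, tied together only by index.
names = ["Ann", "Bob", "Cy"]
salaries = [50000, 60000, 70000]
ages = [30, 40, 50]

# "Firing" Bob means deleting from every list. Forget one of them and the
# remaining records silently go out of step.
del names[1]
del salaries[1]
# ages was forgotten, so Cy now appears to be 40 years old.
mismatched = (names[1], salaries[1], ages[1])


# Stage 2: one structure per employee, plus a function that belongs to it.
@dataclass
class Employee:
    name: str
    salary: int
    age: int

    def give_raise(self, percent: float) -> None:
        self.salary = round(self.salary * (1 + percent / 100))


staff = [
    Employee("Ann", 50000, 30),
    Employee("Bob", 60000, 40),
    Employee("Cy", 70000, 50),
]
del staff[1]             # one deletion removes the whole record
staff[1].give_raise(10)  # the record knows how to update itself
```

After the single del, Cy's name, salary and age still travel together,
which is the whole point of the struct stage.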

 
The other way is to work more from top-down. Show them some complex network
they all use and say this could not easily be built all at once, or
efficiently, if it was not seen as parts and then parts within parts, with
some parts re-used as needed. At the level of the programming language, the
parts can be made somewhat smart and with goals if designed well. If you
want to be able to process a container holding many kinds of parts, you may
want each part to know how to do several things to itself so if you want all
parts to be able to print some version of what they contain, you want to say
object.printYourSelf() and each will do it their own way. You may want to
invent protocols and insist that any object that wishes to be considered a
member of this protocol, must implement the methods needed by the protocol
and thus you can pass a collection of such objects to anything that wants
the protocol to be used without worrying about errors if the wrong kind of
object is there too.
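
One way to sketch the protocol idea in Python is typing.Protocol. The
names here (Printable, print_yourself, Invoice, Widget, describe_all) are
hypothetical, chosen only to mirror the printYourSelf() example above:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Printable(Protocol):
    """Anything that can produce a printable version of itself."""

    def print_yourself(self) -> str: ...


class Invoice:
    def __init__(self, total: float) -> None:
        self.total = total

    def print_yourself(self) -> str:
        return f"Invoice for {self.total:.2f}"


class Widget:
    def __init__(self, name: str) -> None:
        self.name = name

    def print_yourself(self) -> str:
        return f"Widget<{self.name}>"


def describe_all(parts):
    """Ask every part to describe itself; reject parts outside the protocol."""
    lines = []
    for part in parts:
        # With runtime_checkable, isinstance only checks that the method exists.
        if not isinstance(part, Printable):
            raise TypeError(f"{part!r} does not implement print_yourself()")
        lines.append(part.print_yourself())
    return lines


descriptions = describe_all([Invoice(12.5), Widget("gear")])
```

Invoice and Widget share no base class; they qualify only because each
implements the method the protocol asks for.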

 
Again, I hope you get the idea. You can lead up to objects from basics or
lead down as you deconstruct. Of course there are many other ways.

 
I mention this because I personally find some features almost meaningless
without the right context. Consider any GUI such as a browser where events
seem to happen asynchronously depending on movements of the mouse, clicks,
key-presses, and even events coming in from outside. Various parts of the
program are not RUN in a deterministic manner but must pop into being, do a
few things, and go away till needed again. In a sense, you set event
listeners and attach bits of code such as functions or perhaps a method call
for any object. In such an environment, all kinds of things get interesting
and choices for implementation that do things in what may seem odd or
indirect ways, suddenly may make some sense.

 
So imagine first building an almost empty object that holds nothing and has
a very few methods. Why would anyone want this? A while later you show how
to make one and then more objects that inherit from the same object but
BECAUSE they share a base, they have something in common, including in some
languages the ability to be in a collection that insists everything be the
SAME. Then you show how an object can be made easier and faster if it is
only a bit different than another or by combining several others.

 
At some point you want to make sure the students get the ideas behind it,
not just some silly localized syntax useful only in that language.

 
Once you have an idea of purpose, you may show them that others had similar
purposes in mind but chose other ideas. Take the five or more ways python
allows formatting text. As soon as you realize they are all variants with
some similarities in purpose, you can more easily classify them and not
struggle so much as to why anyone would do this. At least that is how my
mind works.
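
As a rough illustration of the formatting point, here are five ways to
build the same string using only the standard library. The variable names
are arbitrary, and this is not meant as a complete list:

```python
from string import Template

name, score = "Ada", 91.5

# 1. printf-style % operator
by_percent = "%s scored %.1f" % (name, score)

# 2. str.format() method
by_format = "{} scored {:.1f}".format(name, score)

# 3. f-string (formatted string literal)
by_fstring = f"{name} scored {score:.1f}"

# 4. string.Template with $-placeholders
by_template = Template("$name scored $score").substitute(
    name=name, score=f"{score:.1f}"
)

# 5. plain concatenation with explicit conversion
by_concat = name + " scored " + str(round(score, 1))

all_results = {by_percent, by_format, by_fstring, by_template, by_concat}
```

All five produce the same text, which is what makes them easy to classify
as variants of one purpose.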

 
What I find helpful in motivating using objects when they really are NOT
NEEDED is when it is explained that part of the purpose was to make a set of
tools that work well together. Modules in python like sklearn stopped
looking weird after I got that. I mean you can easily make functions for
yourself (well, maybe not easily) that implement taking some data and return
N clusters that are mostly closer within each cluster to some centroid than
to other clusters. 

 
So why ask the user to set a variable to a new object of some sort, then
repeatedly ask the object to do things to itself like accept data or add to
existing data, set various internal attributes like values telling it to be
fast or accurate or which technique to use, or to transform the data by
normalizing it some way, and run analyses and supply aspects of what it came
up with or predict how new data would be transformed by the internal model?
This does not, at first, seem necessary or at all useful.

 
But consider how this scales up if say you want to analyze many files of
data and do some comparisons. Each file can be held by its own object and
the objects can be kept in something like a list or matrix and can be used
almost interchangeably with other objects that implement very different ways
to analyze the data if they all take the same overall set of commands, as
appropriate. All may be called, say, with new data, and asked to predict
results based on previous learning steps. 
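
A toy sketch of why a shared set of commands scales. It loosely imitates
the fit/predict convention but uses no scikit-learn code; MeanSplit and
MedianSplit are invented classes:

```python
from statistics import mean, median


class MeanSplit:
    """Learns the mean, then labels new values relative to it."""

    def fit(self, values):
        self.cutoff_ = mean(values)
        return self  # returning self allows Model().fit(data) in one step

    def predict(self, value):
        return "high" if value > self.cutoff_ else "low"


class MedianSplit:
    """Same commands, different technique: uses the median as the cutoff."""

    def fit(self, values):
        self.cutoff_ = median(values)
        return self

    def predict(self, value):
        return "high" if value > self.cutoff_ else "low"


data = [1, 2, 3, 10]

# Different kinds of model can sit in one list because they answer
# the same commands.
models = [MeanSplit().fit(data), MedianSplit().fit(data)]
labels = [model.predict(3) for model in models]
```

The loop at the end never needs to know which technique each object
uses, only that every object accepts fit() and predict().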

 
The point in all the above, which may easily be dismissed by the volume, is
that I think part of learning is a mix of ideas in the abstract as well as
some really concrete programming along the lines of prototyping. Learning
Object Oriented Programming in the abstract may seem like a good idea,
albeit implementations vary greatly. But knowing WHY some people developed
those ideas and what they were meant to improve or handle, is just as important.

 
I shudder sometimes when a functional programming person tries to sell me on
using closures to retain values and so on, until I realize that in a sense,
it overlaps object-oriented programming. Or does it?
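
To make the overlap concrete, here is the same running counter written
once as a closure and once as a class. Both names are made up for the
comparison:

```python
def make_counter(start=0):
    """Closure version: the state lives in the enclosing scope."""
    count = start

    def increment(step=1):
        nonlocal count  # rebind the variable captured from make_counter
        count += step
        return count

    return increment


class Counter:
    """Object version: the state lives in an attribute."""

    def __init__(self, start=0):
        self.count = start

    def increment(self, step=1):
        self.count += step
        return self.count


closure_counter = make_counter(10)
closure_counter()
closure_total = closure_counter(5)

object_counter = Counter(10)
object_counter.increment()
object_total = object_counter.increment(5)
```

Both retain a value between calls. The closure exposes exactly one
operation, while the class can grow more methods and be inherited from.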

 
From avi.e.gross at gmail.com  Mon Jul 25 15:58:26 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 25 Jul 2022 15:58:26 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tbjgld$qis$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
Message-ID: <011301d8a060$e7b62c50$b72284f0$@gmail.com>

Alan,

I can't say the FIRST thing I look at in a language is OOP. LOL!

I stopped counting languages long ago and many did not have anything like
OOP then, albeit some may have aspects of it now, if they are still being
used.

What I look at depends on my needs and how it is being learned. I consider
life to be the accumulation of lots of little tricks and techniques that
often allow you to use them to solve many kinds of problems, often in
association with others. In my UNIX days back at Bell Labs, (and still in
many places like LINUX distributions) there actually were sets of tools
ranging from echo and cat to cut and sed and grep and awk that could be
combined in pipelines to do all kinds of rapid prototyping and, if it became
commonly used and important, might be rewritten in more than one-liners in
C or awk or PERL or whatever.

There are many levels of mental tools you can use including some that try to
guess which of many tools may be best suited (in combination with others) to
get a task done. Smart people are often not so much more gifted overall than
those who strive to learn and master some simple tools and find ways to
combine them in a semi-optimal way to get jobs done. They do not always have
to re-invent the wheel!

So when I start a new language, I start by getting a feel for the language
so I can compare and contrast it with others and see if there are some
purposes it is designed for and some it is very much NOT designed for and a
royal pain. If so, I look to see if add-ons help.

If you look at Python, it has some glaring gaps as they considered
heterogeneous lists to be the most abstract and neglected to make something
basic in other languages like an array/vector. Hence, many people just load
numpy and perhaps pandas and go on from there and suddenly python is
friendly for doing things with objects like dataframes that depend on
columns having a uniform nature. Perhaps amusing is how a language like R
where a data.frame was a strictly-enforced list of vectors, now allows list
columns as well since allowing varied content can sometimes be useful.
Languages start with certain premises but some evolve.

I do not know why people say python is simple. Compared to what? It may well
be that the core has some simplicity but nothing that supports so many ways
of programming as well as many ways of doing similar things, can be simple.
Not even debatable. It may be that the deliberate use of indentation rather
than braces, for grouping, makes it look simpler. I think the opposite and
my own code in other languages has lots of such indentation deliberately
along with braces and other such structures. Not because it is required, but
because I like to see trends at a glance. Copying code from one place to
another is trivial and will work without reformatting, albeit tools in
editors easily can reformat it. On the other hand, copying python code can
be a mess and a source of error if you copy it to a region of different
indentation and things dangle.

So back to first appearances, I look for themes and differences. Does the
language require something special such as a semicolon to terminate an
instruction, or a colon to denote the beginning of a body of text? What are
the meanings of symbols I use elsewhere and are they different? Think of how
many differences there are in how some languages use single and double
quotes (Python uses them interchangeably and in triples) or what characters
may need to be escaped when used differently. I look at issues of scope
which vary widely. And I look for idioms, often highly compressed notation
like x //= 10 and so on.

But overall, there is a kernel in which most languages seem almost identical
except for pesky details. Or are they? Yes, everything has an IF statement,
often followed by an ELSE (upper case just for emphasis) but some use an
ELIF and others an ELSE IF and some provide alternatives like some kind of
CASE or SWITCH statement with variations or ternary operations like ?: or
operations designed to apply vectorized like ifelse(condition, this, that)
and so on. Lots of creative minds have come up with so many variations. You
can get looked at strangely if you end up programming in very basic tiny
steps using literal translations from your days writing in BASIC until you
get stuck with how to translate GOSUB, LOL!

In my opinion, to teach OOP using Python in a single class is an exercise in
what NOT to teach. Yes, inside an object you create there may lurk recursive
method calls or functional programming constructs or who knows what cute
method. All you care about is the damn object sorts the contents when asked
to or when it feels like it.

Do they need to know why or how the following works?

a_sorted = [a.pop(a.index(min(a))) for _ in range(len(a))]

Or this quicksort one liner:

q = lambda l: q([x for x in l[1:] if x <= l[0]]) + [l[0]] + q([x for x in l
if x > l[0]]) if l else []

The above can easily be done in this more understandable way:

def quicksort(my_list):
    # recursion base case - empty list
    if not my_list:
        return []
    # first element is pivot
    pivot = my_list[0]
    # break up problem
    smaller = [x for x in my_list[1:] if x < pivot]
    greater = [x for x in my_list[1:] if x >= pivot]
    # recursively solve problem and recombine solutions
    return quicksort(smaller) + [pivot] + quicksort(greater)

The goal is to let them study more python on their own when they feel like
it but focus in on OOP in general, unless that is not the full purpose of
the course.

I actually enjoy courses at times that are heterogeneous and show dozens of
ways to solve a particular problem using lots of sides of a language. This
forum often gets a question answered many different ways. But a focused
course is best not pushed off the track. After all, a major focus on OOP is
to hide how it is done and to allow existing objects to change how they do
it as long as the outside view is the same. As an example, an object could
re-sort every time an item is added or perhaps changed or deleted, but it
could also NOT do that but mark the fact that the data is not currently
sorted, and any attempt to use the data would notice that and sort it before
handing back anything. In some cases, the latter approach may be more
efficient. But the user rarely knows or cares what happens as long as it
happens as expected from the outside of a black box.
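
A minimal sketch of the second strategy (sort only when somebody looks).
LazySortedBag and its sort_count attribute are hypothetical names, and
the counter exists only to show how often a sort really happens:

```python
class LazySortedBag:
    """Keeps items unsorted until someone asks to see them."""

    def __init__(self, items=()):
        self._items = list(items)
        self._dirty = True    # True means "not known to be sorted"
        self.sort_count = 0   # bookkeeping for the demonstration only

    def add(self, item):
        self._items.append(item)
        self._dirty = True    # just mark it; do not sort yet

    def items(self):
        if self._dirty:
            self._items.sort()
            self._dirty = False
            self.sort_count += 1
        return list(self._items)  # hand back a copy, not the internals


bag = LazySortedBag([3, 1])
bag.add(2)
bag.add(0)
first_view = bag.items()   # triggers the one and only sort
second_view = bag.items()  # already sorted, nothing to do
```

From the outside, items() always returns sorted data. Whether the sort
ran on every add() or once at read time is invisible to the caller.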


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Alan Gauld via Tutor
Sent: Sunday, July 24, 2022 9:15 AM
To: tutor at python.org
Subject: Re: [Tutor] Volunteer teacher

On 24/07/2022 02:23, avi.e.gross at gmail.com wrote:
> Dumb Question.
> 
> Every damn language I have done so-called object-oriented programming 
> in DOES IT DIFFERENT.

Of course, because OOP is not a language feature. Languages implement tools
to facilitate OOP. And each language designer will have different ideas
about which features of OOP need support and how best to provide that. In
some it will be by classes, in others actors, in others prototyping. Some
will try to make OOP look like existing procedural code where others will
create a special syntax specifically for objects.

> If you had a book on generic object-oriented techniques and then saw 
> Python or R or JAVA and others, what would their experience be?

That's what happens every time I meet a new language. I look to see how that
language implements the concepts of OOP.

> And I think things do not always exist in a vacuum. Even when writing 
> a program that uses OO I also use functional methods, recursion and 
> anything else I feel like. Just learning OO may leave them stranded in
> Python!

OOP doesn't preclude these other programming techniques.
OOP is a design idiom that allows for any style of lower level coding. (What
is more difficult is taking a high level functional design and introducing
OOP into that - those two things don't blend well at all!)

I've also never succeeded in doing OOP in Prolog.
Maybe somebody has done it, but it beats me! I've also never felt quite
comfortable shoe-horning objects into SQL despite the alleged support for
the OOP concepts of some database systems/vendors...

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



From avi.e.gross at gmail.com  Mon Jul 25 17:09:54 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 25 Jul 2022 17:09:54 -0400
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <tbjg1p$i6c$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com>
 <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info>
 <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com> <tbjg1p$i6c$1@ciao.gmane.io>
Message-ID: <016401d8a06a$e3a8fcf0$aafaf6d0$@gmail.com>

Alan,

Good points. I find a major reason for a published design is to highlight
easily what CANNOT and SHOULD NOT be done.

Too often people ask for new features or other changes without knowing (or
caring) if it can be done trivially, or not at all, or perhaps would require
a set of new designs/requirements followed by a complete rewrite, perhaps in
another language.

It can be something as simple as pointing out how the code has a function
that takes TWO arguments and the new feature would require adding a third.
In some languages that can be simple, and in others it might mean
searching all existing code and adding some harmless third argument for all
cases that do not want or need it, and recompiling everything in sight and
hoping you did not miss anything or break something else. Ditto for making
one argument optional but with a default.

Now in python, some things like this may be easier to change. But my point
is asking a program to do something it was not designed to do is easier to
refuse to accept when you can show how it clashes with the design. Yes, they
can still ask for it, but cannot expect to get it soooooon.
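
As a small illustration of the easy case in Python: a third parameter
with a default can be added without touching existing two-argument
callers. The function shipping_cost and its pricing rule are invented
for this example:

```python
def shipping_cost(weight_kg, distance_km, express=False):
    """Return a cost in cents; 'express' is the newly added third argument."""
    base = 200 + 50 * weight_kg + distance_km
    return base * 2 if express else base


# Existing callers that pass two arguments keep working unchanged.
old_style = shipping_cost(4, 100)

# New callers opt in to the new feature by name.
new_style = shipping_cost(4, 100, express=True)
```

This only covers the trivial end of the scale. A change that alters what
the existing two arguments mean would still require revisiting every
caller.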


-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Alan Gauld via Tutor
Sent: Sunday, July 24, 2022 9:05 AM
To: tutor at python.org
Subject: Re: [Tutor] Pair 'o dimes wuz: Volunteer teacher

On 24/07/2022 12:23, Leam Hall wrote:

> First, Python is a lot easier to read than many languages. 

True but that only really helps when working at the detail level. It does
nothing to help with figuring out how a system works - which functions call
which other functions. How data structures relate to each other etc. And
that's where design comes in. One of the big failures in design is to go too
deep and try to design every line of code. Design should only go to the
level where it becomes easier to write code than design. When that stage is
reached it is time to write code!

> Imagine the dev team trying to work through a spaghetti of undesigned 
> codebase, and the design person saying "Now do you believe me that 
> design is important?"

I've been there several times. We once received about 1 million lines of C
code with no design. We had a team of guys stepping through the code in
debuggers for 6 months documenting how it works and reverse engineering the
"design". It took us 3 years to fully document the system. But once we did
we could turn around bugs in 24 hours and new starts could be productive
within 2 days, rather than the 2 or 3 months it took when we first got the
code!

When I ran a maintenance team and we took on a new project one of the first
tasks was to review the design and if necessary update it (or even rewrite
it) in a useful style. A good design is critical to efficient maintenance,
even if it has to be retro-fitted.

> Unfortunately, we can't just open our skulls up, drop in the GoF or 
> Booch's OOAD, and magically do good design.

Absolutely true. You need to start with the basics.
Branching, loops, functions, modules, coupling v cohesion.
Separation of concerns, data hiding. Clean interface design.
Then you build up to higher level concepts like state machines, table driven
code, data driven code, data structures and normalisation.

And even if using a book like Booch (which is very good) it should be worked
through in parallel with the language constructs.
But just as reading Booch alone would be useless, so is learning to define
functions without understanding their purpose. Or building classes without
understanding why.

Learning is an iterative process. And this is especially true with software.
But you need to understand the why and how equally. Learning how without
knowing why leads to bad programming practices - global variables, mixing
logic and display, tight coupling, etc.

> Once we have the basics, hopefully a mentor shows up

Ideally we have the mentor in place before we even look at the basics.
Even the basics can be baffling without guidance.
I've seen so many beginners completely baffled by a line like:

x = x + 1

It makes no sense to a non-programmer. It is mathematical nonsense!
(It's also why many languages have a distinct assignment operator:

x := x+1

is much easier to assimilate....)

> We agree that good design is good. My opinion, even if it's mine 
> alone, is that design is not the first thing to learn.

I don't think I'm arguing for design as a skill - certainly not things like
UML or SSADM or even flow charts. But rather the rationale behind
programming constructs. Why do we have loops?
Why so many of them? And why are functions useful? Why not just cut 'n paste
the code?

For OOP it's about explaining why we want to use classes/objects.
What does an OOP program look like - "a set of objects communicating by
messages". How do we send a message from one object to another?
What happens when an object receives a message? What method does the
receiver choose to fulfil the message request? How does the receiver reply
to the requestor? These ideas can then be translated/demonstrated in the
preferred language.

A decent OOAD book will describe those concepts better than a programming
language tutorial in my experience. (Again, the first section of Booch with
its famous cat cartoons is very good at that)

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



From alan.gauld at yahoo.co.uk  Mon Jul 25 19:53:58 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 00:53:58 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <011301d8a060$e7b62c50$b72284f0$@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com>
Message-ID: <tbnaen$nfb$1@ciao.gmane.io>

On 25/07/2022 20:58, avi.e.gross at gmail.com wrote:

> I can't say the FIRST thing I look at in a language is OOP. LOL!

I didn't say it was the first thing I looked at, just one
of the things I looked at.

Although as someone who learned to design code (and indeed analyse
problems) using OOP I probably do look at OOP features earlier
than most.

> I stopped counting languages long ago and many did not have anything like
> OOP then,

As I said in an earlier post you can do OOP in most any language.
I've used it in Assembler, COBOL and vanilla C as well as many OOPLs.


> life to be the accumulation of lots of little tricks and techniques that
> often allow you to use them to solve many kinds of problems, often in
> association with others. 

Absolutely and especially at the small scale level.
I hardly ever use OOP when I'm programming things with less than 1000
lines of code. But as the code count goes up so does the likelihood of
my using OOP. But on the little projects I will use any number of tools
from Lisp to awk to nroff and SQL.

> ranging from echo and cat to cut and sed and grep and awk that could be
> combined in pipelines

Indeed. And even in Windows NT I wrote a suite of small programs
(using awk, cut and sed from the cygwin package for NT) that
were combined together via a DOS batch file to create the
distribution media for a project which had to be installed
in over 30 sites each with very specific network configurations.
(And eventually that suite was rewritten entirely in VBA as an
Excel spreadsheet macro!)

>  rewritten in more  than one-liners in
> C or awk or PERL or whatever.

At work my main use of Python (actually Jython) was to write
little prototype objects to demonstrate concepts to our
offshore developers who then turned them into working
strength Java. [Jython had the wonderful feature of allowing
me to import our industrial strength Java code and create
objects and call Java methods from my Python prototypes.]

But I've also worked on large projects where there is no coding
done by humans at all. These are mostly real-time projects
such as telephone exchange control systems and a speech
recognition platform where the design was done using SDL
(Specification & Design Language) and the design tool
generated C code (that was all but unreadable by humans).
But if you didn't like C you could flip a configuration option
and it would spew out ADA or Modula3 instead... it didn't
matter to the developers they only worked at the SDL level.

> If you look at Python, it has some glaring gaps as they considered
> heterogeneous lists to be the most abstract and neglected to make something
> basic in other languages like an array/vector.

But that's something I've never found a need for in my 20 years
of using Python. I can fill a list with homogenous objects
as easily as with heterogeneous ones. It's only when dealing with
third party tools (often written in other languages under the hood)
that the need for an array becomes useful IME. I'm not saying that
nobody has a genuine use for a strictly homogenous container,
just that I've never needed such a thing personally.
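
For readers who do want a strictly homogeneous container, the standard
library's array module provides one. A minimal sketch, with arbitrary
variable names:

```python
from array import array

# A list happily mixes types.
mixed = [1, "two", 3.0]

# An array is declared with a type code ("i" = signed int) and enforces it.
ints = array("i", [1, 2, 3])
ints.append(4)

try:
    ints.append("five")  # wrong type for an "i" array
    rejected = False
except TypeError:
    rejected = True
```

The array refuses the string, whereas a list would accept it. Whether
that strictness is worth having is exactly the judgment call discussed
above.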

> I do not know why people say python is simple. 

Python used to be simple (compared to almost every other
general purpose language) but it is not so today. So many
bells and whistles and abstract mechanisms have been added
that Python is quite a complex language. (In the same way
C++ went from a few simple add-ons to C to being a
completely different, and vastly complex, animal!) Last
time I updated my programming tutorial I seriously
considered choosing a different language, but when I went
looking I could find nothing simpler that offered
all of the real-world usefulness of Python! But it is
a much more difficult language to learn now than it
was 25 years ago when I first found it.

> It may be that the deliberate use of indentation rather
> than braces, for grouping, makes it look simpler. 

Simpler for beginners to understand. Remember Python came about
largely as a response to Guido's experience teaching programming
with ABC. So it took the features that students found helpful.
Block delimiters have always been a major cause of bugs for beginners.

> ...Copying code from one place to another is trivial 

This is one of the strongest arguments for them,
and there are others too. However, as a language deliberately
targeted at beginners (but with potential use by experts too)
Python (ie Guido) chose ease of learning over ease of copying.

> So back to first appearances, I look for themes and differences. Does the
> language require something special such as a semicolon to terminate an
> instruction, or a colon to denote the beginning of a body of text. What are
> the meanings of symbols I use elsewhere and are they different.

I confess I don't look at those kinds of things at all. They are just
part of the inevitable noise of learning a language but I don't care
whether they are there or not. I just accept it whatever it is.
But then, I rarely choose a language, I usually have it thrust upon me
as a necessity for some project or other. The only languages I've ever
chosen to teach myself are Python, Eiffel, Prolog, Haskell and Logo.
The other 20 or 30 that I know have been learned to complete some task
or other. (However I do intend to teach myself Erlang, Lua and Clojure,
and am currently studying Swift so the list will get longer!)

> a major focus on OOP is to hide how it is done 

That's not OOP, it's information hiding and predates OOP by quite
a way. But it's a good example of how language implementations
of OOP have obscured the whole point of OOP which is purely
and simply about objects communicating via messages. Information
hiding is a nice thing to include and OOP implementation details
like classes can provide a vehicle to enforce it. But it's not
part of the essence of OOP.

> and to allow existing objects to change how they do
> it as long as the outside view is the same.

But that is polymorphism and a fundamental part of OOP.
The fact that you can send a message to an object and the
object itself is responsible for figuring out the method
of processing that message is absolutely key to OOP.
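
A rough Python sketch of that idea, with the message made explicit as a
string. The helper send() and the classes Dog and Robot are hypothetical;
in everyday Python the "message" is simply a method call:

```python
def send(receiver, message, *args):
    """Deliver a message; the receiver's own class decides how to handle it."""
    handler = getattr(receiver, message, None)
    if not callable(handler):
        return f"{type(receiver).__name__} does not understand {message!r}"
    return handler(*args)


class Dog:
    def speak(self):
        return "Woof"


class Robot:
    def speak(self):
        return "Beep"

    def charge(self, minutes):
        return f"Charged for {minutes} min"


# Same message, different receivers, different methods chosen.
replies = [send(obj, "speak") for obj in (Dog(), Robot())]

# A receiver may also have no method for a given message.
dog_charge = send(Dog(), "charge", 5)
robot_charge = send(Robot(), "charge", 5)
```

The sender never inspects the receiver's type; each object picks its own
method for the message it is given.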

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From leamhall at gmail.com  Mon Jul 25 20:10:06 2022
From: leamhall at gmail.com (Leam Hall)
Date: Mon, 25 Jul 2022 19:10:06 -0500
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <tbjbgc$gbo$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> <tbjbgc$gbo$1@ciao.gmane.io>
Message-ID: <cb3233d2-31f4-2671-953c-0a1edd6d6cf5@gmail.com>

On 7/24/22 06:47, Alan Gauld via Tutor wrote:
> On 24/07/2022 01:53, Leam Hall wrote:

>> spend a lot of resources on failed projects and no-longer-useful designs.
> 
> This is a flat out myth! The vast majority of large scale projects
> succeed, very few are cancelled or go badly wrong (they are often
> over budget and over time, but that's only measuring against the
> initial budgets and timescales). It's just that when they do fail they
> attract a lot of attention because they cost a lot of money!
> If a $10-50k 4-man project goes belly up nobody notices. But when
> a 4 year, 1000 man, project costing $100 million goes belly up it is
> very noticeable. 

If they are over budget or over time, then the project and the design failed. You can still pour resources into something, and stretch the calendar, but I've worked for some of those Fortune 500 companies. "On time and within budget" is the extreme rarity.

> But the fact is that our modern world is run by large scale software
> projects successfully delivered by the fortune 500 companies. We just
> don't think about it and take it for granted every time we board
> a train or plane, turn on the electricity or water, collect our
> wages, etc.

Our modern world is run by large scale, complex software that is buggy and often unsupportable. When weekly or monthly patch updates may or may not work, but they are always needed, then you have another sort of failure.

(Taking this out of context, but in partial agreement)
> But you can't build 4000 man-year projects using
> agile - it's been tried and invariably winds up moving to more
> traditional methods. Usually after a lot of wasted time and money.

Yes and no. I wouldn't want to build an aircraft carrier totally with Agile methodology. However, aircraft carriers are mostly known technology, and in theory design shouldn't have too many surprises. Some software is like that; you can have a three tier application (web, middleware, db) in a myriad of variations, but the tiers are pretty standard. If you can nail down how things connect then you can keep the internals fluid and responsive.

I've been on those million dollar projects that bust the budget and the schedule, and I've seen some really good project management skills displayed during high visibility projects. Waterfall, or Big Design Up Front, in theory can work when every component is well known by all responsible teams. That is seldom, if ever, the case. That's a lot of what fueled Agile, I would guess. Re-reading Uncle Bob's history on it now.

Does tradition, and sticking to what the founding fathers meant, have a place? Again, yes and no. Both can have value, depending on the context. Neither is intrinsically evil, but neither are they universally useful. Follow tradition and stick to the intent when it helps you build better software. Find something else when it causes you to fail.

Leam

-- 
Automation Engineer        (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

From alan.gauld at yahoo.co.uk  Mon Jul 25 20:31:57 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 01:31:57 +0100
Subject: [Tutor] teaching for purpose
In-Reply-To: <007501d8a053$b3033050$190990f0$@gmail.com>
References: <007501d8a053$b3033050$190990f0$@gmail.com>
Message-ID: <tbnclu$12hu$1@ciao.gmane.io>

On 25/07/2022 19:23, avi.e.gross at gmail.com wrote:
> The recent discussion inspires me to open a new thread.

Probably a good idea! :-)

> It is about teaching a topic like python or kinds of programming it supports
> and doing it for a purpose in a way that does not overwhelm students but
> helps them understand the motivation and use of features.

I think motivation is key here. But many of the things we have been
discussing cannot be understood by true beginners until they have the
elementary programming constructs under their belt (sequence, loop and
selection plus input/output and basic data types).

Only once you get to the level where you can introduce the creation
of functions (and in some languages records or structures) do higher
level issues like procedural v functional v OO become even
comprehensible, let alone relevant.

> One was bottom-up. Show what the world was like...
> Then show what a pain it was to say fire a user or move them elsewhere
> without remembering to make changes all over the place.

That is the usual starting point for OOP, it is certainly
where Booch goes.

> The other way is to work more from top-down. Show them some complex network
> they all use and say this could not easily be built all at once, or
> efficiently, if it was not seen as parts and then parts within parts, 

The Smalltalk community (from whence the term OOP came) took such
a high level view and discussed programming as a process of
communicating objects. And likened it to real world physical objects
like a vehicle etc. You could demonstrate how a vehicle is constructed of
parts each of which is an object, then deconstruct those objects further
etc. Thus their students came to programming thinking about objects
first and the low level "data entities" were just low-level objects,
no different to the higher level ones they already used.

Seymour Papert took a similar approach with Logo which is
not explicitly OOP focused but does share the idea of issuing
commands to code objects. He used robots(turtles) to teach the
concepts but the ideas were the same.

> program are not RUN in a deterministic manner but must pop into being, 

They may not run in a sequential manner but they most assuredly are
deterministic for any given sequence of events. And from an OOP point of
view an OOP program does not normally appear to run sequentially but
more like an event driven system. As messages are received from other
objects the receiving objects respond. Of course there must be some
initial message to start the system off but it's conceptually more
like a GUI than a traditional compiler or payroll system.

> At some point you want to make sure the students get the ideas behind it,
> not just some silly localized syntax useful only in that language.

This is exactly the point I've been trying (badly) to make.
It is the conceptual understanding that is critical, the syntax
is a nice to have and will change with every language.

> ...Take the five or more ways python allows formatting text.

Hmm, yes that's one of the features of modern python I really dislike.
It's simply confusing and a result of the open-source development model.
Even with a benevolent dictator too many variations exist to do the same
job. The community should pick one, make it fit for all purposes (that
already exist) and deprecate the others. But historic code would get
broken so we have to endure (and become familiar with) them all!
It's the nightmare of legacy code and the same reason that modern
C++ is such a mess. Java is showing signs of going the same way.
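
For anyone who has not met them all, here is a small sketch of five
mechanisms that produce the same string. The variable names and the
message text are made up for the example, and the list is not meant to
be exhaustive:

```python
from string import Template

name, count = "Alan", 3

# 1. printf-style % operator (the oldest mechanism)
s1 = "%s has %d messages" % (name, count)

# 2. str.format() method
s2 = "{} has {} messages".format(name, count)

# 3. f-string (formatted string literal, Python 3.6+)
s3 = f"{name} has {count} messages"

# 4. string.Template from the standard library
s4 = Template("$name has $count messages").substitute(name=name, count=count)

# 5. plain concatenation with str()
s5 = name + " has " + str(count) + " messages"

print(s1 == s2 == s3 == s4 == s5)  # True
```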

> What I find helpful in motivating using objects when they really are NOT
> NEEDED is when it is explained that part of the purpose was to make a set of
> tools that work well together.

One point that needs to be made clear to students is that code objects
are useful and don't need to be used in OOP. In fact most objects are
not used in OOP. Coding with objects is not the same as coding in OOP.
Objects are a useful extension to records/structures and can be used in
procedural code just as well as in OOP code.

It is the same with functions. You can use functions in non functional
code, and indeed the vast majority of functions are non functional in
nature. But you can't write functional code without functions. And you
can't write OOP code without objects(in the general sense, not in the
"instance of a class" sense).
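
A short sketch of both halves of that distinction, using made-up names
(word_counts, record, shout and log are purely illustrative):

```python
from collections import Counter


def word_counts(text):
    # Procedural code: a plain function that happens to use objects
    # (a str, a list and a Counter) as convenient data structures.
    words = text.lower().split()
    return Counter(words)


counts = word_counts("the cat and the hat")
print(counts["the"])  # 2

log = []


def record(message):
    # A function, but not "functional": it changes state outside itself.
    log.append(message)


def shout(message):
    # A pure function: the result depends only on the argument.
    return message.upper()
```

Nothing here is OOP or FP as a style, yet it uses objects and functions
throughout.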


> So why ask the user to set a variable to a new object of some sort, then
> repeatedly ask the object to do things to itself like accept data or add to
> existing data, set various internal attributes like values telling it to be
> fast or accurate or which technique to use, or to transform the data by
> normalizing it some way, and run analyses and supply aspects of what it came
> up with or predict how new data would be transformed by the internal model?
> This does not, at first, seem necessary or at all useful.

And certainly not OOP!

> But consider how this scales up if say you want to analyze many files of
> data and do some comparisons. Each file can be held by its own object and
> the objects can be kept in something like a list or matrix and can be used
> almost interchangeably with other objects that implement very different ways
> to analyze the data if they all take the same overall set of commands, as
> appropriate. All may be called, say, with new data, and asked to predict
> results based on previous learning steps. 

Again, none of which is OOP.

> The point in all the above, which may easily be dismissed by the volume, is
> that I think part of learning is a mix of ideas in the abstract as well as
> some really concrete programming along the lines of prototyping. Learning
> Object Oriented Programming in the abstract may seem like a good idea,
> albeit implementations vary greatly. But knowing WHY some people developed
> those ideas and what they were meant to improve or handle, .

Learning anything in the purely abstract rarely works.
The problem with OOP is that it only makes sense in fairly
big projects, certainly much bigger than most programming
learners ever experience. Probably bigger than most university
students experience too. Arguably only where multiple programmers
are involved does it really show value - or where the requirements
are very complex.

> I shudder sometimes when a functional programming person tries to sell me on
> using closures to retain values and so on, until I realize that in a sense,
> it overlaps object-oriented programming. Or does it?

Most functional adherents view OOP as the very opposite of
good functional practice with the implicit retention of state.
I've seen arguments to suggest that instance attributes are
just special forms of closure but I'm not convinced. I tend
to view them as orthogonal views of the world that rarely
intersect (but can be usefully combined).
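
To make the comparison concrete, here is a minimal sketch of the same
counter written both ways. The names make_counter and HitCounter are
invented for the example:

```python
def make_counter():
    count = 0

    def increment():
        # The closure keeps "count" alive between calls.
        nonlocal count
        count += 1
        return count

    return increment


class HitCounter:
    def __init__(self):
        # The instance attribute plays the role of the closed-over variable.
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


closure_counter = make_counter()
object_counter = HitCounter()
closure_counter()
object_counter.increment()
print(closure_counter(), object_counter.increment())  # 2 2
```

Both retain state between calls. The closure hides that state
completely, while the object exposes it as an attribute and can grow
further methods that share it.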

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From alan.gauld at yahoo.co.uk  Mon Jul 25 21:01:15 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 02:01:15 +0100
Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher
In-Reply-To: <cb3233d2-31f4-2671-953c-0a1edd6d6cf5@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com>
 <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk>
 <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> <tbjbgc$gbo$1@ciao.gmane.io>
 <cb3233d2-31f4-2671-953c-0a1edd6d6cf5@gmail.com>
Message-ID: <tbnecr$803$1@ciao.gmane.io>

On 26/07/2022 01:10, Leam Hall wrote:

>> succeed, very few are cancelled or go badly wrong (they are often
>> over budget and over time, but that's only measuring against the
>> initial budgets and timescales). 

> If they are over budget or over time, then the project and the design failed.

Not at all. If the requirements changed (and they invariably do) then
extra cost and time will be required. That's true of all large projects
not just software. Large building projects like the English Channel
Tunnel went over budget and took longer than expected. It is impossible
to accurately estimate cost and timescales based on an initial
requirement spec. That is understood and accepted by anyone working on
large projects. It's true of small projects as well but the margin of
error is much smaller and therefore typically absorbed by contingency.
But accountants don't like "contingencies" of $10 million!

But if you dream big you must expect to spend big too.

A project only fails if it doesn't deliver the benefits required in a
cost effective manner. Judging a multi-year project based on time/cost
estimates given before any work is started is foolishness.
Most large projects do deliver eventually.

> Our modern world is run by large scale, complex software
> that is buggy and often unsupportable. 

That's not my experience, especially in safety-critical systems
like air-traffic control, power-station control systems etc. Sure there
will be bugs, even formal methods won't find all of them. But our world
would simply not function if the software was that bad.

> When weekly or monthly patch updates may or may not work,

That's bad. Most large mission critical projects I've worked on work on
6-monthly (occasionally quarterly) releases rigorously tested and that
rarely go wrong.

> Yes and no. I wouldn't want to build an aircraft carrier totally 
> with Agile methodology. 

That's the point. I've used Agile on large projects at the component
level. Each component team runs an Agile shop. But the overall project
is controlled with more traditional techniques and the overall
architecture is designed up front (albeit subject to change, see above).

> However, aircraft carriers are mostly known technology, 

Mostly, but each generation has a lot of cutting edge new stuff too!

> a three tier application

That's what design patterns are for.

> I've been on those million dollar projects that bust the budget 
> and the schedule, and I've seen some really good project management
> skills... Waterfall, or Big Design Up Front, in theory can work 
> when every component is well known by all responsible teams. 

Which, as you point out, is the case for 80% of the code in a large
system. (Also in most small systems too for that matter!) The
trick is to identify the 20% that doesn't fit (but that can be a
difficult trick up front!) and apply more flexible approaches
such as prototyping and Agile in those areas.

And then risk manage them like crazy!

> Does tradition, and sticking to what the founding fathers meant, have a place? 

Only to provide context. Unlike much of modern software engineering the
early programmers were driven by research and had the time to study the
best approaches based on data not anecdote. They couldn't afford to do
anything else given the tools available. Our modern tools are so
powerful we can afford to waste cycles trying things and if it doesn't
work roll back and try again. But the hard-won lessons of the past
should not be forgotten because they mostly remain valid.

But at the same time we must maximize the benefits of the
modern computing power too. Otherwise we'd still all be
drawing flowcharts and using teletypes!

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From avi.e.gross at gmail.com  Mon Jul 25 23:03:16 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 25 Jul 2022 23:03:16 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tbnaen$nfb$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
Message-ID: <024101d8a09c$40e32540$c2a96fc0$@gmail.com>

Alan,

 
I will selectively quote just a few parts where I want to clarify as overall
we often mainly agree.

 
<AG-UK> As I said in an earlier post you can do OOP in most any language.

<AG-UK> I've used it in Assembler, COBOL and vanilla C as well as many
OOPLs.

 
I differentiate between using IDEAS and code built-in to the language meant
to facilitate it. If you mean you can implement a design focusing on objects
with serious amounts of work, sure. I mean I can define an object as
multiple pieces of data and pass all of them to every function I write that
handles them instead of a pointer of sorts to a class. Or if pointers to
functions are allowed I can simulate many things. I just mean that languages
like FORTRAN and Pascal and BASIC that I used way back when, or PROLOG which
you mentioned earlier, came in my timeline BEFORE I had heard much about
Object-Oriented. 

 
I was programming in plain vanilla C when Bjarne Stroustrup started talking
about making an OO version he eventually put out as C++ and was the first
one in my area who jumped in and started using it and got the agreement to
use it on an existing project where I saw a way to use it for a sort of
polymorphism. I may have told this story already but I was SHOCKED they
allowed me to do that and a tad suspicious. I mean I also had to change the
"makefile" to properly distinguish between my files and regular C files and
properly call the right compilers with the right arguments and then link
them together.

 
Turns out I was right and the product we were building was never used. This
being the AT&T Bell Labs that wasted money, it seems management pulled
permission for the project but gave them 6 months to find a new project. But
rather than telling us and letting us use some vacation time or take classes
and add to our skills or look for other work, they simply called a meeting
to announce that a NEW Project was starting NOW and that management had been
working it out for months and was reshuffling staff among supervisors
assigned to each part and so on! Sheesh!

 
And, yes, a project or two later, we did start using C++. My point was not
that concepts cannot be used, and arguably were sometimes used. 

 
<AG-US> If you look at Python, it has some glaring gaps as they considered 

<AG-US> heterogeneous lists to be the most abstract and neglected to make 

<AG-US> something basic in other languages like an array/vector.

 
<AG-UK> But that's something I've never found a need for in my 20 years of
using Python. I can fill a list with 

<AG-UK> homogenous objects as easily as with heterogeneous ones. It's
only when dealing with third party

<AG-UK>  tools (often written in other languages under the  hood) that the
need for an array becomes useful 

<AG-UK> IME. I'm not saying that nobody has a genuine use for a strictly
homogenous container, just that 

<AG-UK> I've never needed such a thing personally.

 
I think it goes deeper than that, Alan. Yes, other than efficiency
considerations, I fully agree that a list data structure can do a superset
of what a uniform data structure can do. So can a tuple if you want
immutability. In a sense you could argue that vectors or arrays or whatever
name a language uses for some data structures that only can contain one kind
of thing are old-school and based on the ideas and technology of an earlier
time. Mind you, it is not that simple. Old arrays typically were of fixed
length upon creation and usually took work to extend by creating a new one
and often led to memory faults or leaks if you followed them too far. But
when I look at some implementation in say R, they have no fixed length and
lots of details are handled behind the scenes for you. 

 
What I was talking about may be subtle. There are times you want to
guarantee things or perhaps have automatic conversions. You can build a
class, of course, that internally maintains both a list and the right TYPE
it accepts and access methods that enforce that anything changed or added is
only allowed if it either is the right type, or perhaps can be coerced to
that type. Or, perhaps, that if a new type is added that cannot mingle, then
change everything else to the new type. R is built on something like that. A
vector can hold one or more (or zero if you know how) contents of the same
type. It can expand and contract and even flip the content of all its parts
to a new one. I can start with a vector of integers and add or change a
float and they all become floats. I can do the same with a character string
and they all become character strings. But they are pretty much guaranteed
to all be the same type (or an NA, albeit there are type specific NA
variables under the hood).
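
A minimal sketch of such a class in Python, under the assumption that
simple coercion through the type's constructor is good enough. The name
TypedList is made up, and this only covers the "coerce or refuse" case,
not the R behaviour of converting every existing element to a new type:

```python
class TypedList:
    """A list-like container whose items all share one type (illustrative only)."""

    def __init__(self, item_type, items=()):
        self.item_type = item_type
        self._items = [self._coerce(item) for item in items]

    def _coerce(self, value):
        # Accept the value if it already has the right type, otherwise try
        # to convert it; item_type(value) raises if that is impossible.
        if isinstance(value, self.item_type):
            return value
        return self.item_type(value)

    def append(self, value):
        self._items.append(self._coerce(value))

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

    def __repr__(self):
        return f"TypedList({self.item_type.__name__}, {self._items!r})"


numbers = TypedList(float, [1, 2])
numbers.append("3.5")     # coerced with float("3.5")
print(numbers)            # TypedList(float, [1.0, 2.0, 3.5])
```

Appending something that cannot be converted, such as the string "abc",
raises ValueError from float() and leaves the container unchanged.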

 
This becomes useful because R encourages use of data structures like
data.frames in which it is necessary (or used to be necessary) for columns
in such tables to all reflect the same kind of thing. And, yes, it now
allows you to use lists which are technically a kind of vector too. I have
created some rather bizarre data structures this way including taking a
larger data.frame, grouping it on selected columns, folding up the remaining
columns to be smaller data.frames embedded in such a list column, doing
statistical analyses on it and saving the results in another list column,
extracting some of the results from that column to explode out into more
individual columns, and so on, including sometimes taking the smaller
data.frames and re-expanding them back into the main data.frame. The
structures can be quite convoluted, and in some sense resemble LISP
constructs where you used functions like CADDAR(.) to traverse a complex
graph, as many "classes" in R are made of named lists nested within.

 
I will note that many programming languages that tried to force you to have
containers that only held one kind of thing, often cheated by techniques like
a UNION of structures so everything was in some sense the same type.
Languages like JAVA and others  can really play the game using tricks like
inheriting from a common ancestor or just claim to implement the same
interface, to allow some generic ideas that let you ride comfortably enough
together.

 
In a sense, Python just did away with much of that and allowed you to hold
anything in a list or tuple, and perhaps not quite everything in hashable
constructs like a dictionary. 
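
A tiny sketch of that distinction (the variable names are arbitrary): a
list accepts anything, but a dictionary key has to be hashable, which
rules out mutable containers such as lists.

```python
mixed = [1, "two", 3.0, None, [4, 5]]      # a list holds anything
lookup = {(1, 2): "tuple key is fine"}     # tuples of hashables are hashable

try:
    lookup[[1, 2]] = "list key"            # lists are mutable, so unhashable
except TypeError as err:
    print("rejected:", err)
```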

 
I won't quote what you said about Python being simple or complex except to
say that it depends on perspective. As an experienced programmer, I do not
want something so simple it is hard to do much with it without lots of extra
work. I like that you can extend the core language in many ways including
with so many modules out there that are tailored to make some chores easier
- including oodles of specialized or general objects.

 
But as a teaching tool, it reminds me a bit of conversations I have had with
my kids where everything I said seemed to have a reference or vocabulary
word they did not know. If you said Abyssinia and they asked what that was
and you answered it is now mostly Ethiopia across the Red Sea then they want
to know what that is or why it is called Red. If you explain it may be more
about a misunderstanding of Reed, and get into the Exodus story you find
words and ideas you have to explain and so on until you wonder when they will
learn all this stuff and be able to have a real conversation!!!!

 
So there is always one kid in a class where you are teaching fairly basic
stuff who wants to know why you are using a loop rather than a comprehension
or not doing it recursively or using itertools and how many times can you
keep saying that is more advanced and some will be covered later and others
are beyond this intro course so ask me AFTER class or look it up yourself?

 
<AG-US> a major focus on OOP is to hide how it is done

 
<AG-UK> That's not OOP, it's information hiding and predates OOP by quite a
way.

 
I accept that in a sense data hiding may be partially or even completely
independent from OOP and the same for hiding an implementation so if it
changes, it does not cause problems. The idea is the object is a sort of
black box that only does what it says in the documentation and can only be
accessed exactly as it says. Some languages allow you to make all kinds of
things private and if they are compiled, it is not easy to make changes.
Some languages use fixed slots in classes to hold data while python allows
any object to pick up attributes so something like a function can create
memory attached to itself sort of on the outside and so can any
object. Someone posted some Python code recently that did things in ways
often discouraged such as changing a variable within an object directly and
was shown ways to not allow that in cases where it is a calculated value
that should change when another value changes but remain in sync.
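
Both points can be sketched briefly in Python. The names greet and
Rectangle are invented for the example and this is not the code that was
posted earlier:

```python
def greet(name):
    # Functions are objects, so they can carry attributes of their own.
    greet.calls += 1
    return f"Hello, {name}"


greet.calls = 0
greet("Avi")
greet("Alan")
print(greet.calls)  # 2


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Computed on demand, so it stays in sync with width and height,
        # and without a setter it cannot be assigned to directly.
        return self.width * self.height


r = Rectangle(2, 3)
r.width = 10
print(r.area)  # 30
```

Trying r.area = 5 raises AttributeError, which is one way of stopping
callers from setting a calculated value by hand.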

 
In that sense, we could argue (and I would lose) about what in a language is
OOP and what is something else or optional. My views are not based on one
formal course in the abstract but maybe I should read a recent book on the
topic, not just see how each language brags it supports OOP. In particular,
your idea it involves message passing is true in some arenas but mostly NOT
how it is done elsewhere. Calling a member function is not a general message
passing method, nor is being called while asleep on a queue and being told
nothing except that it is time to wake up.

 
Perhaps it does make sense to not just teach OOP in Python but also snippets
of how it is implemented in pseudocode or in other languages that perhaps do
it more purely.

 
-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of
Alan Gauld via Tutor
Sent: Monday, July 25, 2022 7:54 PM
To: tutor at python.org
Subject: Re: [Tutor] Volunteer teacher

 
On 25/07/2022 20:58, avi.e.gross at gmail.com wrote:

 
> I can't say the FIRST thing I look at in a language is OOP. LOL!

 
I didn't say it was the first thing I looked at, just one of the things I
looked at.

 
Although as someone who learned to design code (and indeed analyse

problems) using OOP I probably do look at OOP features earlier than most.

 
> I stopped counting languages long ago and many did not have anything 

> like OOP then,

 
As I said in an earlier post you can do OOP in most any language.

I've used it in Assembler, COBOL and vanilla C as well as many OOPLs.

 
> life to be the accumulation of lots of little tricks and techniques 

> that often allow you to use them to solve many kinds of problems, 

> often in association with others.

 
Absolutely and especially at the small scale level.

I hardly ever use OOP when I'm programming things with less than 1000 lines
of code. But as the code count goes up so does the likelihood of my using
OOP. But on the little projects I will use any number of tools from Lisp to
awk to nroff and SQL.

 
> ranging from echo and cat to cut and sed and grep and awk that could 

> be combined in pipelines

 
Indeed. And even in Windows NT I wrote a suite of small programs (using awk,
cut and sed from the cygwin package for NT) that were combined together via
a DOS batch file to create the distribution media for a project which had to
be installed in over 30 sites each with very specific network
configurations.

(And eventually that suite was rewritten entirely in VBA as an Excel
spreadsheet macro!)

 
>  rewritten in more  than one-liners in C or awk or PERL or whatever.

 
At work my main use of Python (actually Jython) was to write little
prototype objects to demonstrate concepts to our offshore developers who
then turned them into working strength Java. [Jython had the wonderful
feature of allowing me to import our industrial strength Java code and
create objects and call Java methods from my Python prototypes.]

 
But I've also worked on large projects where there is no coding done by
humans at all. These are mostly real-time projects such as telephone
exchange control systems and a speech recognition platform where the design
was done using SDL (Specification & Design Language) and the design tool
generated C code (that was all but unreadable by humans).

But if you didn't like C you could flip a configuration option and it would
spew out ADA or Modula3 instead... it didn't matter to the developers they
only worked at the SDL level.

 
> If you look at Python, it has some glaring gaps as they considered 

> heterogeneous lists to be the most abstract and neglected to make 

> something basic in other languages like an array/vector.

 
But that's something I've never found a need for in my 20 years of using
Python. I can fill a list with homogenous objects as easily as with
heterogeneous ones. It's only when dealing with third party tools (often
written in other languages under the hood) that the need for an array
becomes useful IME. I'm not saying that nobody has a genuine use for a
strictly homogenous container, just that I've never needed such a thing
personally.

 
> I do not know why people say python is simple. 

 
Python used to be simple (compared to almost every other general purpose
language) but it is not so today. So many bells and whistles and abstract
mechanisms have been added that Python is quite a complex language. (In the
same way C++ went from a few simple add-ons to C to being a completely
different, and vastly complex, animal!) Last time I updated my
programming tutorial I seriously considered choosing a different language,
but when I went looking I could find nothing simpler that offered all of the
real-world usefulness of Python! But it is a much more difficult language to
learn now than it was 25 years ago when I first found it.

 
> It may be that the deliberate use of indentation rather than braces, 

> for grouping, makes it look simpler.

 
Simpler for beginners to understand. Remember Python came about largely as a
response to Guido's experience teaching programming with ABC. So it took the
features that students found helpful.

Block delimiters have always been a major cause of bugs for beginners.

 
> ...Copying code from one place to another is trivial

 
This is one of the strongest arguments for them.

and there are others too, however, as a language deliberately targeted at
beginners (but with potential use by experts too) Python (ie Guido) chose
ease of learning over ease of copying.

 
> So back to first appearances, I look for themes and differences. Does 

> the language require something special such as a semicolon to 

> terminate an instruction, or a colon to denote the beginning of a body 

> of text. What are the meanings of symbols I use elsewhere and are they
> different.

 
I confess I don't look at those kinds of things at all. They are just part
of the inevitable noise of learning a language but I don't care whether they
are there or not. I just accept it whatever it is.

But then, I rarely choose a language, I usually have it thrust upon me as a
necessity for some project or other. The only languages I've ever chosen to
teach myself are Python, Eiffel, Prolog, Haskell and Logo.

The other 20 or 30 that I know have been learned to complete some task or
other. (However I do intend to teach myself Erlang, Lua and Clojure, and am
currently studying Swift so the list will get longer!)

 
> a major focus on OOP is to hide how it is done

 
That's not OOP, it's information hiding and predates OOP by quite a way. But
it's a good example of how language implementations of OOP have obscured the
whole point of OOP which is purely and simply about objects communicating
via messages. Information hiding is a nice thing to include and OOP
implementation details like classes can provide a vehicle to enforce it. But
it's not part of the essence of OOP.

 
> and to allow existing objects to change how they do it as long as the 

> outside view is the same.

 
But that is polymorphism and a fundamental part of OOP.

The fact that you can send a message to an object and the object itself is
responsible for figuring out the method of processing that message is
absolutely key to OOP.

 
--

Alan G

Author of the Learn to Program web site

http://www.alan-g.me.uk/

http://www.amazon.com/author/alan_gauld

Follow my photo-blog on Flickr at:

http://www.flickr.com/photos/alangauldphotos

 
_______________________________________________

Tutor maillist  -   <mailto:Tutor at python.org> Tutor at python.org

To unsubscribe or change subscription options:

 <https://mail.python.org/mailman/listinfo/tutor>
https://mail.python.org/mailman/listinfo/tutor


From alan.gauld at yahoo.co.uk  Tue Jul 26 05:21:46 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 10:21:46 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <024101d8a09c$40e32540$c2a96fc0$@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
 <024101d8a09c$40e32540$c2a96fc0$@gmail.com>
Message-ID: <tbobnb$o94$1@ciao.gmane.io>

On 26/07/2022 04:03, avi.e.gross at gmail.com wrote:

> <AG-UK> As I said in an earlier post you can do OOP in most any language.
> 
> I differentiate between using IDEAS and code built-in to the language meant
> to facilitate it. If you mean you can implement a design focusing on objects
> with serious amounts of work, sure. 

That's exactly what I mean. My point is that OOP is a style of
programming not a set of language features (that's an OOPL). It's
the same with functional programming. You can discuss how languages
like ML, Lisp, Haskell or Python implement functional features but
you cannot fully discuss FP purely by using any one of those languages.
And you can write non FP code in any of those languages while
still using the FP features.

> <AG-UK> IME. I'm not saying that nobody has a genuine use for a strictly
> homogenous container, just that 
> <AG-UK> I've never needed such a thing personally.
> 
> What I was talking about may be subtle. There are times you want to
> guarantee things or perhaps have automatic conversions. ...

Sure, and I have used arrays when dealing with code from say C
libraries that require type consistency. But that's a requirement
of the tools I'm using and their interface. Efficiency is
another pragmatic reason to use them but I've never needed
that much efficiency in my Python projects.

> I will note that many programming languages that tried to force you to have
> containers that only held one kind of thing, often cheated by techniques like
> a UNION...
> Languages like JAVA and others inheriting from a common ancestor

That's how most OOPL programs work by defining a collection of "object"
and all objects inherit from "object". And generics allow you to do the
same in non OOPLs like ADA.

> I won't quote what you said about Python being simple or complex except to
> say that it depends on perspective. As an experienced programmer, I do not
> want something so simple it is hard to do much with it without lots of extra
> work. 

Sure, and Guido had to walk the tightrope between ease of use for
beginners and experts. In the early days the bias was towards beginners
but nowadays it is towards experts, but we have historic legacy that
clearly favours the learners.

> But as a teaching tool, it reminds me a bit of conversations I have had with
> my kids where everything I said seemed to have a reference or vocabulary
> word they did not know.

That's always a problem teaching programming. I tried very hard when
writing my tutorial not to use features before describing them but
it is almost impossible. You can get away with a cursory explanation
then go deeper later but you can't avoid it completely. I suspect that's
true of any complex topic (music, art etc.)

> <AG-UK> That's not OOP, it's information hiding and predates OOP by quite a
> way.
> 
> I accept that in a sense data hiding may be partially or even completely
> independent from OOP 

> In that sense, we could argue (and I would lose) about what in a language is
> OOP and what is something else or optional. 

My point is that language features are never OOP. They are tools to
facilitate OOP. Data hiding is a curious case because it's a concept
that is separate from OOP but often conflated with OOP in the
implementation.

> topic, not just see how each language brags it supports OOP. In particular,
> your idea it involves message passing is true in some arenas but mostly NOT
> how it is done elsewhere. 

OOP is all about message passing. That's why we have the terminology we
do. Why is a method called that? It's because when a message is received
by an object the object knows the method it should use to service that
message. The fact that the method is implemented as a function with the
same name as the message is purely an implementation detail. There are
OOPLs that allow you to map messages to methods internally and these
(correctly from an OOP standpoint) allow the same method to be used to
process multiple messages.
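
As a rough illustration of that mapping, here is a minimal Python
sketch. The Account class, its receive() method and the message names
are all made up for the example, and this is not how any particular
OOPL implements dispatch. It only shows two different messages being
serviced by the same internal method.

```python
class Account:
    """Toy object that maps incoming message names to internal methods."""

    def __init__(self, balance=0):
        self.balance = balance
        # Message name -> method. "deposit" and "credit" share one method.
        self._methods = {
            "deposit": self._add,
            "credit": self._add,
            "balance": self._report,
        }

    def receive(self, message, *args):
        """Look up the method for a message and run it."""
        try:
            method = self._methods[message]
        except KeyError:
            raise ValueError(f"message not understood: {message}") from None
        return method(*args)

    def _add(self, amount):
        self.balance += amount
        return self.balance

    def _report(self):
        return self.balance


acct = Account()
acct.receive("deposit", 10)
acct.receive("credit", 5)
print(acct.receive("balance"))
```

The sender only knows the message names; which method answers each
one is the receiving object's own business.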

> Calling a member function is not a general message
> passing method, 

No, but that's a language feature not OOP.
Languages like Lisp Flavors used a different approach where you
wrote code like:

(Send (anObject, "messageName", (object list)))

The receiving object then mapped the string "messageName" to
an internal method function.

So the important point is that programmers writing OOP code in an
OOPL should *think* that they are passing messages not calling
functions. That's a subtle but very important distinction. And
of course, in one sense it's true because, when you have a class
hierarchy (something that is also not an essential OOP feature!),
and send a message to a leaf node you may not in fact be calling
a function in that node, it may well be defined in a totally
different class further up the hierarchy.
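
A small Python sketch of that last point, using hypothetical Shape,
Rectangle and Square classes invented for the example: the message
"describe" is sent to a Square, but the method that services it is
defined two classes further up.

```python
class Shape:
    def describe(self):
        # Defined once, at the top of the hierarchy.
        return f"{type(self).__name__} with area {self.area()}"


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)


sq = Square(3)
text = sq.describe()          # the sender just sends "describe" to sq

# Find which class in the lookup chain actually defines the method.
owner = next(cls for cls in type(sq).__mro__ if "describe" in vars(cls))
print(text)
print(owner.__name__)
```

The sender never needs to know where describe() lives; the object
and its class chain sort that out.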

[ And we can say the same about calling functions. That's a
conceptual thing inherited from math. In practice we are doing
a goto in the assembler code. But we think of it as a conceptual
function invocation like we were taught at high school. The
difference with OOP is that message passing is a new concept
to learn (unless coming from a traditional engineering background)
whereas function calls we already know about from math class.]

> Perhaps it does make sense to not just teach OOP in Python but also snippets
> of how it is implemented in pseudocode or in other languages that perhaps do
> it more purely.

Certainly Booch takes that approach in his first and third editions
of his book. But even that leads to confusion between OOP and
language features (OOPLs). The essence of OOP is about how you
think about the program. Is it composed of objects communicating -
object *oriented* -  or is it composed of a functional decomposition
that uses objects in the mix (the more common case). Objects are a
powerful tool that can add value to a procedural solution. But OOP
is a far more powerful tool, especially when dealing with larger systems.

One of the best books IMHO to describe the difference in the
approaches is Peter Coad's book "Object Models: Strategies,
Patterns, and Applications". It's not an especially well written book
and makes some controversial claims, but the concepts within
are clearly explained. But it is written at the UML level not code.

And of course there comes a point in almost all OOPLs where you
have to drop out of OOP thinking to use low level raw data types.
(Smalltalk and a few others being the exceptions where
absolutely everything is an object) Ultimately, pragmatism has
to take over from principle and the trick is working with OOP
at the program structure level and  raw data at the detailed
implementation level. Knowing where that line lives is still
one of the hardest aspects of using any OOPL. But it is still
an implementation issue not an OOP issue.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From o1bigtenor at gmail.com  Tue Jul 26 06:32:14 2022
From: o1bigtenor at gmail.com (o1bigtenor)
Date: Tue, 26 Jul 2022 05:32:14 -0500
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tbobnb$o94$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com>
 <tbjgld$qis$1@ciao.gmane.io> <011301d8a060$e7b62c50$b72284f0$@gmail.com>
 <tbnaen$nfb$1@ciao.gmane.io> <024101d8a09c$40e32540$c2a96fc0$@gmail.com>
 <tbobnb$o94$1@ciao.gmane.io>
Message-ID: <CAPpdf5-fBKNs64UCeSbTy7+bqg1FkMKOEz6fj-1w1G4rf9ko1w@mail.gmail.com>

On Tue, Jul 26, 2022 at 4:22 AM Alan Gauld via Tutor <tutor at python.org> wrote:
>
> On 26/07/2022 04:03, avi.e.gross at gmail.com wrote:
>

Fascinating discussion!!!!

Thank you for not only engaging in it but also for allowing it.
I learnt a long time ago that I could find plenty of nuggets for learning in the
digressions - - -in fact sometimes even more things of interest in such than
the original topic(s).

A lurker (here to learn - - -grin!)

From avi.e.gross at gmail.com  Tue Jul 26 13:19:23 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Tue, 26 Jul 2022 13:19:23 -0400
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tbobnb$o94$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
 <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <tbobnb$o94$1@ciao.gmane.io>
Message-ID: <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com>

The way I recall it, Alan, is that many language designers and users were
once looking for some kind of guarantees that a program would run without
crashing AND that every possible path would lead to a valid result. One such
attempt was to impose very strict typing. You could not just add two numbers
without first consciously converting them to the same type and so on.
Programming in these languages rapidly became tedious and lots of people
worked around them to get things done. Other languages were a bit too far in
the other direction and did things like flip a character string into a
numeric form if it was being used in a context where that made sense.

My point, perhaps badly made, was that one reason some OOP ideas made it
into languages INCLUDED attempts to do useful things when the original
languages made them hard. Ideas like allowing anything that CLAIMED to
implement an INTERFACE to be allowed in a list whose type was that
interface, could let you create some silly interface that did next to
nothing and add that interface to just about anything and get around
restrictions. But that also meant their programs were not really very safe
in one sense.

Languages like python largely did away with some such mathematical
restrictions and seem easier to new programmers because of that and other
similar things.

I remember getting annoyed at having to spell out every variable before
using it as being of a particular type, sometimes a rather complex type like
address of a pointer to an integer. Many languages now simply make educated
guesses when possible so a=1 makes an integer and b=a^2 is also obviously an
integer while c=a*2.0 must be a float and so on. Such a typing system may
still have the ability to specify things precisely but most people stop
using that except when needed and the language becomes easier, although
subtle bugs can creep in. What makes some languages really hard to understand,
including some things like JAVA, is their attempt to create sort of generic
functions. There is an elusive syntax that declares abstract types that are
instantiated as needed and if you use the function many times using the
object types allowed, it compiles multiple actual functions with one for
each combo. So if a function takes 4 arguments that can all be 5 kinds, it
may end up compiling as many as 625 functions internally. 
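
To make the two points above concrete, here is a small Python sketch.
Python works out these types at run time rather than at compile time,
so it only loosely stands in for the inference described; the list of
five "kinds" is hypothetical and just there to check the arithmetic.

```python
from itertools import product

a = 1
b = a ** 2      # in Python ** is exponentiation; ^ would be bitwise XOR
c = a * 2.0
print(type(a).__name__, type(b).__name__, type(c).__name__)

# Four arguments, each of which may be any of five kinds of value.
kinds = ["int", "float", "str", "bytes", "bool"]
combos = list(product(kinds, repeat=4))
print(len(combos))
```

Five choices for each of four positions gives 5 * 5 * 5 * 5
combinations, which is where the 625 comes from.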

Python does make something like that much easier. You just create a function
that takes ANYTHING and at run time, the arguments arrive with known types
and are simply handled. But the simplicity can cost you as you cannot
trivially restrict  what can be used and may have to work inside the
function to verify you are only getting the types you want to handle.
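
A minimal sketch of that kind of checking inside the function,
assuming a made-up mean() helper that should only accept real
numbers:

```python
from numbers import Real


def mean(values):
    """Average a sequence of real numbers, rejecting anything else."""
    values = list(values)
    if not values:
        raise ValueError("mean() needs at least one value")
    for v in values:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(v, bool) or not isinstance(v, Real):
            raise TypeError(f"expected a real number, got {type(v).__name__}")
    return sum(values) / len(values)


print(mean([1, 2.5, 4]))
try:
    mean([1, "2", 3])
except TypeError as err:
    print("rejected:", err)
```

Nothing stops a caller passing the wrong thing; the function only
finds out when it looks.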

You make some points about efficiency that make me wonder. There are so many
obvious tradeoffs but it clearly does not always pay to make something
efficient as the original goal for every single project. As noted, parts of
python or added modules are often written first in python then strategic
bits of compiled code are substituted. But in a world where some software
(such as nonsense like making digital money by using lots of electricity)
uses up lots of energy, it makes sense to try to cut back on some more
grandiose software. And it is not just about efficiency. Does a program need
to check if I have new mail in a tight loop or can it check once a second or
even every ten minutes?

I am getting to understand your viewpoint in focusing on ideas not so much
implementations and agree. The method of transmitting a message can vary as
long as you have objects communicating and influencing each other. Arguably
sending interrupts or generating events and many other such things are all
possible implementations. 

I think the ability to try to catch various errors and interrupts is part of
why languages like python can relax some rules as an error need not stop a
program.

Heading out to the beach.




From alan.gauld at yahoo.co.uk  Tue Jul 26 17:19:37 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 22:19:37 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
 <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <tbobnb$o94$1@ciao.gmane.io>
 <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com>
Message-ID: <tbplpd$seh$1@ciao.gmane.io>

On 26/07/2022 18:19, avi.e.gross at gmail.com wrote:
> The way I recall it, Alan, is that many language designers and users were
> once looking for some kind of guarantees that a program would run without
> crashing AND that every possible path would lead to a valid result. One such
> attempt was to impose very strict typing. 

Yes and many still follow that mantra. C++ is perhaps the most
obsessive of the modern manifestations.

> Programming in these languages rapidly became tedious and lots of people
> worked around them to get things done. 

Yep, I first learned to program in Pascal(*) which was super strict.

####### Pseudo Pascal - too lazy to look up the exact syntax!  ###

Type
   SingleDigit = INTEGER(0..9);

function f(SingleDigit):Boolean
....

var x: INTEGER
var y: SingleDigit
var b: Boolean

begin
  x := 3
  y := 3

  b := f(x)   (* Fails because x is not a SingleDigit *)
  b := f(y)   (* Succeeds because y is... even though both are 3 *)
end.

The problem with these levels of strictness is that people are
forced to convert types for trivial reasons like above. And every
type conversion is a potential bug. In my experience of maintaining
C code, type conversions (especially "casting") are one of the
top 5 causes of production code bugs.
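
For comparison, a rough Python imitation of the pseudo Pascal above.
Python does not enforce any of this by itself, so the SingleDigit
class and the check inside f() are hand-written assumptions made up
for the sketch, as is the body of f(). It shows where the explicit
conversion creeps in.

```python
class SingleDigit:
    """Hypothetical range-checked type: an integer from 0 to 9."""

    def __init__(self, value):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("SingleDigit needs an int")
        if not 0 <= value <= 9:
            raise ValueError(f"{value} is not in the range 0..9")
        self.value = value


def f(digit):
    """Accept only SingleDigit, mimicking the strict signature."""
    if not isinstance(digit, SingleDigit):
        raise TypeError("f() requires a SingleDigit")
    return digit.value % 2 == 0     # arbitrary body: is the digit even?


x = 3
y = SingleDigit(3)

try:
    f(x)                            # rejected: x is a plain int
except TypeError as err:
    print("f(x) failed:", err)

print(f(y))                         # accepted: y is a SingleDigit
print(f(SingleDigit(x)))            # explicit conversion of x
```

The last line is the trivial conversion in question: it works while
x is 3 but raises ValueError the day x turns out to be 10.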

(*)Actually I did a single-term class in programming in BASIC in the
early 70s at high school, but the technology meant we didn't go beyond
sequences, loops and selection. In the mid 80s at University I did a
full two years of Pascal. (While simultaneously studying Smalltalk
and C++ having discovered OOP in the famous BYTE magazine article)

> the other direction and did things like flip a character string into a
> numeric form if it was being used in a context where that made sense.

Javascript, Tcl, et al...

> My point, perhaps badly made, was that one reason some OOP ideas made it
> into languages INCLUDED attempts to do useful things when the original
> languages made them hard. Ideas like allowing anything that CLAIMED to
> implement an INTERFACE to be allowed in a list whose type was that
> interface, could let you create some silly interface that did next to
> nothing and add that interface to just about anything and get around
> restrictions. But that also meant their programs were not really very safe
> in one sense.

Absolutely but that's a result of somebody trying to hitch their
particular programming irk onto the OOP bandwagon. It has nothing
whatsoever to do with OOP. There were a lot of different ideas
circulating around the 80s/90s and language implementors used
the OOP hype to include their pet notions. So lots of ideas
all got conflated into "OOP" and the core principles got lost
completely in the noise!

> address of a pointer to an integer. Many languages now simply make educated
> guesses when possible so a=1 makes an integer and b=a^2 is also obviously an
> integer 

Java does a little of this and Swift is very good at it.

> including some things like JAVA, is their attempt to create sort of generic
> functions. 

But generics are another topic again...

> ...There is an elusive syntax that declares abstract types that are
> instantiated as needed and if you use the function many times using the
> object types allowed, it compiles multiple actual functions with one for
> each combo. So if a function takes 4 arguments that can all be 5 kinds, it
> may end up compiling as many as 625 functions internally.

True, that's also what happens in C++. But it is only an issue at
assembler level - and if you care about the size of the executable
which is rare these days. At the source code level the definitions are
fairly compact and clear and still enable the compiler to do strict
typing.

> trivially restrict  what can be used and may have to work inside the
> function to verify you are only getting the types you want to handle.

Although, if you stick to using the interfaces, then you should be able
to trust the objects to "do the right thing". But there is a measure of
responsibility on the programmer not to wilfully do stupid things!

> I am getting to understand your viewpoint in focusing on ideas not so much
> implementations and agree. The method of transmitting a message can vary as
> long as you have objects communicating and influencing each other. Arguably
> sending interrupts or generating events and many other such things are all
> possible implementations. 

Absolutely and in real-time OOP systems it's common to wrap the OS
interrupt handling into some kind of dispatcher object which collects
the interrupt and determines the correct receiver and then sends the
interrupt details as a message to that object. And from a purely
theoretical systems engineering viewpoint, where an OOP system is
a form of a network of sequential machines, interrupts are about
the closest to a pure OOP architecture.
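
A minimal Python sketch of such a dispatcher object. Everything in it
(the Dispatcher and KeyboardDriver classes, the interrupt numbers and
the receive() protocol) is hypothetical; a real system would hook
on_interrupt() into the OS rather than call it by hand.

```python
class Dispatcher:
    """Collects raw interrupts and forwards them as messages."""

    def __init__(self):
        self._receivers = {}          # interrupt number -> receiving object

    def register(self, interrupt_no, receiver):
        self._receivers[interrupt_no] = receiver

    def on_interrupt(self, interrupt_no, details):
        receiver = self._receivers.get(interrupt_no)
        if receiver is None:
            return False              # no object has claimed this interrupt
        receiver.receive("interrupt", interrupt_no, details)
        return True


class KeyboardDriver:
    """Hypothetical receiver that just records what it is sent."""

    def __init__(self):
        self.log = []

    def receive(self, message, interrupt_no, details):
        self.log.append((message, interrupt_no, details))


dispatcher = Dispatcher()
keyboard = KeyboardDriver()
dispatcher.register(1, keyboard)

print(dispatcher.on_interrupt(1, {"key": "a"}))   # routed to keyboard
print(dispatcher.on_interrupt(7, {}))             # nobody registered
print(keyboard.log)
```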

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From alan.gauld at yahoo.co.uk  Tue Jul 26 17:52:40 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 22:52:40 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: <tbobnb$o94$1@ciao.gmane.io>
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
 <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <tbobnb$o94$1@ciao.gmane.io>
Message-ID: <tbpnn9$sq8$1@ciao.gmane.io>

You know things are bad when you reply to your own emails!

On 26/07/2022 10:21, Alan Gauld via Tutor wrote:
> That's exactly what I mean. My point is that OOP is a style of
> programming not a set of language features (that's an OOPL). ...
> OOP is all about message passing. 

I realize I may be causing confusion by over-simplifying.

In reality the term OOP has come to mean many different
things. The reason for this is historic because what we now
call OOP is an amalgam of programming developments during
the late 60s-late 80s. There were basically 3 different groups
all working along similar lines but with different goals and
interests:

Group 1 - Simula/Modula
This group were focused on improving the implementation of
abstract data types and modules and their use in simulating
real-world systems. This resulted in the invention of classes
as we know them, in the language Simula. Languages influenced
by this group include C++, Object Pascal, Java and most of
the others that we consider OOPLs today.

Group 2 - Alan Kay and his Dynabook project at Xerox PARC.
Kay was intent on developing a computing device that could
be used by the masses. He focused on teaching programming
to youngsters as a representative group. The lessons learnt
included the fact that youngsters could relate to objects
sending messages to each other and from this he developed
the ideas and coined the term "Object Oriented Programming".
He built Smalltalk (in 3 different versions culminating
in Smalltalk 80) to implement those concepts. Along the
way he picked up Simula's class concept. It is he who
"defines" OOP as a message passing mechanism and a programming
style rather than a set of language features. Object Pascal,
Objective C, Actor and Python all include strong influences
from this group.

Group 3 - Lisp and the Academics
At the same time lots of academics were experimenting with
different programming styles to try to find a way to accelerate
program development. This was driven by the "software crisis"
where software development times were increasing exponentially
with complexity. They picked up on the activity by groups 1
and 2 and added some spice of their own. Mostly they used
Lisp and came up with several OOP systems, best known of
which are Flavors and CLOS. CLOS in particular is intended
to support multiple styles of object based programming
including pure OOP (as defined by Kay).

[Bertrand Meyer developed his Eiffel language in parallel
with Group 3 but strongly influenced by Group 1 too. Eiffel
along with CLOS are probably the most complete implementations
of all currently existing OO concepts]

[Seymour Papert was working on similar concepts to Kay but was
ideologically more closely aligned with group 3 but never espoused
objects per se. Instead he developed Logo which is closely
related to Lisp but includes the concept of sending messages
but not the encapsulation of the receiver data and functions.]


When I talk about OOP I am firmly in the Group 2 category.
You can do OOP in almost any language. You can use objects
in almost any style of programming. But don't make the mistake
that just building and using classes means you are doing OOP.
Even in the purest of OOP languages (Smalltalk?) it is entirely
possible to write non OOP code. The manual for Smalltalk/V
includes a good example of non OOP code written in Smalltalk
and how it looks when re-written in an OOP style. The point
being that simply learning Smalltalk does not mean you are
learning OOP! I can probably dig that example out (and
maybe even translate it to Python) if anyone is interested.
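
In the meantime, a rough Python sketch of the distinction. This is
NOT the Smalltalk/V manual example, just a made-up illustration with
hypothetical shape classes: the first half uses classes as passive
records with the logic outside them, the second sends every object
the same "area" message and lets each one answer for itself.

```python
import math


# Style 1: classes used as passive records, logic kept outside them.
class CircleData:
    def __init__(self, radius):
        self.radius = radius


class RectData:
    def __init__(self, width, height):
        self.width = width
        self.height = height


def total_area_procedural(shapes):
    total = 0.0
    for shape in shapes:
        # The caller inspects each object's type and pokes at its data.
        if isinstance(shape, CircleData):
            total += math.pi * shape.radius ** 2
        elif isinstance(shape, RectData):
            total += shape.width * shape.height
        else:
            raise TypeError(f"unknown shape: {type(shape).__name__}")
    return total


# Style 2: each object is sent the same "area" message and answers it.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rect:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area_oop(shapes):
    return sum(shape.area() for shape in shapes)


print(total_area_procedural([CircleData(1), RectData(2, 3)]))
print(total_area_oop([Circle(1), Rect(2, 3)]))
```

Both halves use classes and give the same answer, but only the
second has the objects doing the work. Adding a new shape to the
first means editing total_area_procedural(); in the second it means
writing one more class that understands "area".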

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From wlfraed at ix.netcom.com  Tue Jul 26 19:12:03 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Tue, 26 Jul 2022 19:12:03 -0400
Subject: [Tutor] Volunteer teacher
References: <CAEFLybLsOF9j-dV5EfQkCVRRVgYyvfGQMKYhu_23FkLR-+iE0w@mail.gmail.com>
 <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
 <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <tbobnb$o94$1@ciao.gmane.io>
 <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com>
Message-ID: <3os0ehduim97utbff3ea0teqv6r7h1rj93@4ax.com>

On Tue, 26 Jul 2022 13:19:23 -0400, <avi.e.gross at gmail.com> declaimed the
following:

>The way I recall it, Alan, is that many language designers and users were
>once looking for some kind of guarantees that a program would run without
>crashing AND that every possible path would lead to a valid result. One such
>attempt was to impose very strict typing. You could not just add two numbers
>without first consciously converting them to the same type and so on.
>Programming in these languages rapidly became tedious and lots of people
>worked around them to get things done. Other languages were a bit too far in
>the other direction and did things like flip a character string into a
>numeric form if it was being used in a context where that made sense.
>

	Why pussy-foot?

	You've essentially described Ada and REXX <G>

Ada:
	Conceptually define a data type for every discrete component (so you
can't compare "apples" to "oranges" without transmuting one into the other
or to some common type, "fruit").

REXX:
	Everything is a string unless context says otherwise. And statements
beginning with unknown keywords (or explicitly quoted strings) are assumed
to be commands to an external command processor. In most implementations
that is the "shell"; the IBM mainframe mostly supported addressing an editor
as command processor, and Amiga ARexx supported ANY application opening a
"RexxPort" to which commands could be sent -- even another ARexx script, or
(with Irmen's work) Python.


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From wlfraed at ix.netcom.com  Tue Jul 26 19:22:49 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Tue, 26 Jul 2022 19:22:49 -0400
Subject: [Tutor] Volunteer teacher
References: <a1f1f3d5-9385-853e-3db1-54a65435980b@gmail.com>
 <tuhodhp3lgl9ukvnb5ncuecs3i03qmqa38@4ax.com>
 <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us>
 <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <tbjgld$qis$1@ciao.gmane.io>
 <011301d8a060$e7b62c50$b72284f0$@gmail.com> <tbnaen$nfb$1@ciao.gmane.io>
 <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <tbobnb$o94$1@ciao.gmane.io>
 <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com> <tbplpd$seh$1@ciao.gmane.io>
Message-ID: <oat0eht8n170anlcig5eamhhacuga7qoop@4ax.com>

On Tue, 26 Jul 2022 22:19:37 +0100, Alan Gauld via Tutor <tutor at python.org>
declaimed the following:

>(*)Actually I did a single-term class in programming in BASIC in the
>early 70s at high school, but the technology meant we didn't go beyond
>sequences, loops and selection. In the mid 80s at University I did a
>full two years of Pascal. (While simultaneously studying Smalltalk
>and C++ having discovered OOP in the famous BYTE magazine article)
>

	Welcome to the club... I needed a final 3 credits for graduation, so
took BASIC in my senior year of college. No effort class -- considering I'd
already used BASIC in the data structures course implementing an assignment
using "hashed head multiply linked lists" (and have never seen such used
anywhere except for the Amiga disk directory/file structure -- hash into the
root directory block to find a pointer to the start of a linked list of
file header blocks [file name, followed by pointers to data blocks and a
pointer to an overflow list of data blocks] and/or directory blocks
[directory name followed by hashed list of pointers to next level of
chains])

	This (BASIC course) was AFTER FORTRAN (-IV), Advanced FORTRAN, Assembly
[sequence A], COBOL, Advanced COBOL, Database [sequence B, Database text
covered Hierarchical, Network, and Relational as theoretical -- subsequent
editions covered Relational, and treated Hierarchical and Network as
historical], the aforesaid data structures, and a language design course.
<G>


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From bobxander87 at gmail.com  Tue Jul 26 16:58:06 2022
From: bobxander87 at gmail.com (bobx ander)
Date: Tue, 26 Jul 2022 22:58:06 +0200
Subject: [Tutor] Building dictionary from large txt file
Message-ID: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>

Hi all,
I'm trying to build a dictionary from a rather large file of the following
format after it has been read into a list (excerpt from start of list below)
--------

Atomic Number = 1
    Atomic Symbol = H
    Mass Number = 1
    Relative Atomic Mass = 1.00782503223(9)
    Isotopic Composition = 0.999885(70)
    Standard Atomic Weight = [1.00784,1.00811]
    Notes = m
--------

My goal is to extract the content into a dictionary that displays each
unique triplet as indicated below
{'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
              'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
               ...} etc
My code that I have attempted is as follows:

filename='ex.txt'

afile=open(filename,'r') #opens the file
content=afile.readlines()
afile.close()
isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
each case of atoms with its unique keys and values
for line in content:
    data=line.strip().split()

    if len(data)<1:
        pass
    elif data[0]=="Atomic" and data[1]=="Number":
        atomic_number=data[3]


     elif data[0]=="Mass" and data[1]=="Number":
        mass_number=data[3]


    elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
        relative_atomic_mass=data[4]


isotope_data['Z']=atomic_number
isotope_data['A']=mass_number
isotope_data['A']=relative_atomic_mass
isotope_data

the output from the programme is only

{'Z': '118', 'A': '295', 'm': '295.21624(69#)'}

I seem to be overwriting each dictionary and end up with the above
result. Somehow I think I have to put the assignment of the key, value
pairs elsewhere.

I have tried directly below the elif statements also, but that did not work.

Any hints or ideas?

Regards

Bob

From learn2program at gmail.com  Tue Jul 26 20:33:39 2022
From: learn2program at gmail.com (Alan Gauld)
Date: Wed, 27 Jul 2022 01:33:39 +0100
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
References: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
Message-ID: <0aa4daf4-6c45-ab98-ed5f-f9d381ca6299@yahoo.co.uk>


On 26/07/2022 21:58, bobx ander wrote:
> Hi all,
> I'm trying to build a dictionary from a rather large file of following
> format after it has being read into a list(excerpt from start of list below)
> --------
>
> Atomic Number = 1
>     Atomic Symbol = H
>     Mass Number = 1
>     Relative Atomic Mass = 1.00782503223(9)
>     Isotopic Composition = 0.999885(70)
>     Standard Atomic Weight = [1.00784,1.00811]
>     Notes = m
> --------
>
> My goal is to extract the content into a dictionary that displays each
> unique triplet as indicated below
> {'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
>               'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
>                ...} etc

Unfortunately to those of us unfamiliar with your data that is as clear
as mud.

You refer to a triplet but your sample file entry has 7 fields, some of
which have multiple values. Where is the triplet among all that data?

Then you show us a dictionary with keys that do not correspond to any of
the fields in your data sample. How do the fields correspond - the only
"obvious" one is the mass which evidently corresponds with the key 'm'.

But what are H1 and D2? Another file record or some derived value from
the record shown above? Similarly for Z, A and m. How do they relate to
the data?

You need to specify your requirement more explicitly for us to be sure we
are giving valid advice.


> My code that I have attempted is as follows:
>
> filename='ex.txt'
>
> afile=open(filename,'r') #opens the file
> content=afile.readlines()
> afile.close()

You probably don't need to read the file into a list if you
are going to process it line by line. Just read the lines
from the file and process them as you go.


> isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
> each case of atoms with its unique keys and values
> for line in content:
>     data=line.strip().split()
>
>     if len(data)<1:
>         pass
>     elif data[0]=="Atomic" and data[1]=="Number":
>         atomic_number=data[3]
>
>
>      elif data[0]=="Mass" and data[1]=="Number":
>         mass_number=data[3]
>
>
>
>     elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>         relative_atomic_mass=data[4]
>
Rather than split the line then compare each field it might be easier
(and more readable) to compare the full strings using the startswith()
method then split the string:

for line in file:
    if line.startswith("Atomic Number"):
        atomic_number = line.strip().split()[3]
    etc...
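
A minimal sketch of the startswith() idea, run against the sample record
from the original post. The helper name field_value is hypothetical. It
takes everything to the right of the "=" sign instead of counting
whitespace-separated fields, which assumes every data line has the form
"name = value". Note that the sample indents most lines, so each line is
stripped before the startswith() test.

```python
SAMPLE = """\
Atomic Number = 1
    Atomic Symbol = H
    Mass Number = 1
    Relative Atomic Mass = 1.00782503223(9)
"""

def field_value(line):
    """Return the text to the right of the first '=' on the line."""
    _name, _sep, value = line.partition("=")
    return value.strip()

for line in SAMPLE.splitlines():
    line = line.strip()          # most lines in the sample are indented
    if line.startswith("Atomic Number"):
        atomic_number = field_value(line)
    elif line.startswith("Mass Number"):
        mass_number = field_value(line)
    elif line.startswith("Relative Atomic Mass"):
        relative_atomic_mass = field_value(line)

print(atomic_number, mass_number, relative_atomic_mass)
# prints: 1 1 1.00782503223(9)
```

The values are still strings at this point, including the "(9)" suffix.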

> isotope_data['Z']=atomic_number
> isotope_data['A']=mass_number
> isotope_data['A']=relative_atomic_mass
> isotope_data
>
> the output from the programme is only
>
> {'Z': '118', 'A': '295', 'm': '295.21624(69#)'}
>
> I seem to be owerwriting each dictionary 

Yes, you never detect the end of a record - you never explain how records
are separated in the file either!

You need something like


master = {}   # empty dict.

for line in file:
    if line.startswith("Atomic Number"):
        create variable....
    if line.startswith(....): ....etc
    if <record separator detected>:   # we don't know what this is...
        # save variables in a dictionary
        record = { key1:variable1, key2:variable2....}
        # insert dictionary to master dictionary
        master[key] = record

How you generate the keys is a mystery to me but presumably you know.


You could write the values directly into the master dictionary if you
prefer.

Also note that you are currently storing strings. If you want the
numeric data you will need to convert it with int() or float() as
appropriate.
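
As a rough illustration of that conversion step: the helper names to_int
and to_float below are made up for this sketch, and dropping everything
from the first "(" onwards is an assumption about what should happen to
values such as "1.00782503223(9)" or "295.21624(69#)".

```python
def to_int(text):
    """Convert a field such as ' 118 ' to the integer 118."""
    return int(text.strip())

def to_float(text):
    """Convert '1.00782503223(9)' to 1.00782503223, dropping the '(...)' part."""
    number, _sep, _rest = text.strip().partition("(")
    return float(number)

print(to_int("118"))                 # prints: 118
print(to_float("1.00782503223(9)"))  # prints: 1.00782503223
print(to_float("295.21624(69#)"))    # prints: 295.21624
```

A value with no parenthesis at all passes through partition() unchanged,
so to_float("2.5") also works.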

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


From PythonList at DancesWithMice.info  Tue Jul 26 22:36:37 2022
From: PythonList at DancesWithMice.info (dn)
Date: Wed, 27 Jul 2022 14:36:37 +1200
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
References: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
Message-ID: <b5636889-7836-de3e-a484-d415f98c3d55@DancesWithMice.info>

On 27/07/2022 08.58, bobx ander wrote:
> Hi all,
> I'm trying to build a dictionary from a rather large file of following
> format after it has being read into a list(excerpt from start of list below)
> --------
> 
> Atomic Number = 1
>     Atomic Symbol = H
>     Mass Number = 1
>     Relative Atomic Mass = 1.00782503223(9)
>     Isotopic Composition = 0.999885(70)
>     Standard Atomic Weight = [1.00784,1.00811]
>     Notes = m
> --------
> 
> My goal is to extract the content into a dictionary that displays each
> unique triplet as indicated below
> {'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
>               'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
>                ...} etc
> My code that I have attempted is as follows:
> 
> filename='ex.txt'
> 
> afile=open(filename,'r') #opens the file
> content=afile.readlines()
> afile.close()
> isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
> each case of atoms with its unique keys and values
> for line in content:
>     data=line.strip().split()
> 
>     if len(data)<1:
>         pass
>     elif data[0]=="Atomic" and data[1]=="Number":
>         atomic_number=data[3]
> 
> 
>      elif data[0]=="Mass" and data[1]=="Number":
>         mass_number=data[3]
> 
> 
> 
>     elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>         relative_atomic_mass=data[4]
> 
> 
> isotope_data['Z']=atomic_number
> isotope_data['A']=mass_number
> isotope_data['A']=relative_atomic_mass
> isotope_data


+1 after @Alan: it is difficult to ascertain how the dictionary is
transformed from the input file (not list!).

Because things are "not work[ing]" the code is evidently 'too complex'.
NB this is not an insult to your intelligence. It is however a
reflection on your programming expertise/experience, and/or your Python
expertise. The (recommended) answer is to break-down the total problem
into smaller units which you-personally can 'see' and confidently
manage. (which level of detail or size of "chunk", is different for each
of us!)


Is the file *guaranteed* to have all seven lines per isotope (or
whatever we have to imagine it contains)?

Alternately, are some 'isotopes' described with fewer than seven lines
of data? In which case, each line must be read and 'understood' - plus
any missing data must be handled, presumably with some default value or
an indicator that such data is not available.


The first option seems more likely. Note how (first line, above) the
problem was expressed, perhaps 'backwards'! This is because it is easier
to understand that way around - and possibly a source of the problem
described.

So, here is a suggested approach - with the larger-problem broken-down
into smaller (and more-easily understood) units:-


The first component to code is a Python-generator which opens the file
(use a Context Manager if you want to be 'advanced'/an excuse to learn
such), reads a line, 'cleans' the data, and "yield"s the data-value;
'rinse and repeat'.

Next, (up to) seven 'basic-functions', representing each of the
dictionary entries/lines in the data file. These will be very similar to
each-other, but each is solely-devoted to creating one dictionary entry
from the data generated by the generator. If they are called in the
correct sequence, each call will correspond to the next (expected)
record being read-in from the data-file.

I'm assuming (in probable ignorance) that some data-items are
collated/collected together as nested dictionaries. In which case,
another 'level' of subroutine may be required - an 'assembly-function'.
This/these will call 'however many' of the above 'basic-functions' in
order to assemble a dictionary-entry which contains a dictionary as its
"value" (dicts are "key"-"value" pairs - in case you haven't met this
terminology before).

Those 'assembly-functions' will return that more complex dictionary
entry. We can now 'see' that the one-to-one relationship between a
dictionary sub-structure is more important than any one-to-one
relationship with the input file! Thus, given that the objective is to
build "a dictionary" of "unique triplet[s]", each function should return
a sub-component of that 'isotope's' entry in the dictionary - some
larger sub-components and others a single value or key-value pair!

Finally then, the 'top level' is a loop-forever until the generator
is exhausted and raises an 'end of file' exception (StopIteration).
The loop calls each basic-function or assembly-function in-turn, and
either gradually or 'at the bottom of each loop' assembles the
dictionary-entry for that 'isotope' and adds it to the dictionary.


Try a main-loop which looks something like:

# init dict

while "there's data":
  atomic_number = get_atomic_number()
  atomic_symbol = get_atomic_symbol()
  atomic_mass = assemble_atomic_mass()
  # etc
  assemble_dict_entry( atomic_number, atomic_symbol, ... )

  # probably only need a try...except around the first call
  # which will break out of the while-loop

# dict is now fully assembled and ready for use...


# sample 'assembly-function'
def assemble_atomic_mass():
  # init sub-dict
  mass_number = get_mass_number()
  relative_atomic_mass = get_relative_atomic_mass()
  #etc
  # assemble sub-dict entry with atomic mass data
  return sub_dict

# repeat above with function for each sub-dict/sub-collection of data

# which brings us to the individual data-items. These, it is implied,
# appear on separate lines of the data file, but in sets of seven
# data-lines (am ignoring lines of dashes, but if present, then eight-line
# sets). Accordingly:

def get_atomic_number():
  get_next_line()
  # whatever checks/processing
  return atomic_number

# and repeat for each of the seven data-items
# if necessary, add read-and-discard for line of dashes

# all the input functionality has been devolved to:

def get_next_line():
  # a Python generator which
  # opens the file
  # loop-forever
    # reads single line/record
    # (no need for more - indeed no point in reading the whole and then
    # having to break that down!)
    # strip, split, etc
    # yield data-value
  # until eof and an exception will be 'returned' and ripple 'up' the
  # hierarchy of functions to the 'top-level'.
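
One possible shape for that generator, as a sketch only. The name
get_lines and the choice to yield (name, value) pairs are assumptions made
for illustration. It accepts any iterable of text lines, so an
io.StringIO stands in for the data file here; with a real file you would
pass the object returned by open() inside a "with" block. Lines without an
"=" (blank lines, rows of dashes) are skipped.

```python
import io

def get_lines(source):
    """Yield (name, value) pairs from an iterable of text lines.

    Blank lines and lines without an '=' (such as rows of dashes)
    are skipped.
    """
    for raw in source:
        name, sep, value = raw.partition("=")
        if not sep:
            continue
        yield name.strip(), value.strip()

sample = io.StringIO(
    "--------\n"
    "\n"
    "Atomic Number = 1\n"
    "    Atomic Symbol = H\n"
    "    Mass Number = 1\n"
)

lines = get_lines(sample)
print(next(lines))   # prints: ('Atomic Number', '1')
print(list(lines))   # prints: [('Atomic Symbol', 'H'), ('Mass Number', '1')]
```

Once the data runs out, a further next(lines) raises StopIteration, which
is the exception that ripples up to the top-level loop.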


Here is another question: having assembled this dictionary, what will be
done with it? Always start at that back-end - we used to mutter the
mantra "input - process - output" and start 'backwards' (you've probably
already noted that!)


Another elegant feature is that each of the functions (starting from the
lowest level) can be developed and tested individually (or tested and
developed if you practice "TDD"). By testing that the generator returns
the data-file's records appropriately, the complexity of writing and
testing the next 'layer' of subroutine/function becomes easier - because
you will know that at least half of it 'already works'! Each (working)
small module can be built-upon and more-easily assembled into a working
whole - and if/when something 'goes wrong', it will most likely be
contained (only) within the newly-developed code!

(of course, if a fault is found to be caused by 'lower level code' (draw
conclusion here), then, provided the tests have been retained, the test
for that lower-level can be expanded with the needed check, the tests
re-run, and one's attention allowed to rise back 'up' through the
layers...)

"Divide and conquer"!

-- 
Regards,
=dn

From wlfraed at ix.netcom.com  Tue Jul 26 23:11:02 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Tue, 26 Jul 2022 23:11:02 -0400
Subject: [Tutor] Building dictionary from large txt file
References: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
Message-ID: <o4a1ehluua7fgt433ucga81bainrb5vuir@4ax.com>

On Tue, 26 Jul 2022 22:58:06 +0200, bobx ander <bobxander87 at gmail.com>
declaimed the following:

>
>Atomic Number = 1
>    Atomic Symbol = H
>    Mass Number = 1
>    Relative Atomic Mass = 1.00782503223(9)
>    Isotopic Composition = 0.999885(70)
>    Standard Atomic Weight = [1.00784,1.00811]
>    Notes = m
>--------
>
>My goal is to extract the content into a dictionary that displays each
>unique triplet as indicated below
>{'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
>              'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
>               ...} etc

	First thing I'd want to know is how each entry in your source data MAPS
to each item in your desired dictionary.

>My code that I have attempted is as follows:
>
>filename='ex.txt'
>
>afile=open(filename,'r') #opens the file
>content=afile.readlines()
>afile.close()

	I'd probably run a loop inside the open/close section, collecting the
items for ONE entry. I presume "Atomic Number" starts each entry. Then,
when the next "Atomic Number" line is reached you process the collected
lines to make your dictionary entry.
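
A sketch of that collect-then-process loop, under several assumptions:
every entry starts with an "Atomic Number" line, the outer key is the
Atomic Symbol joined to the Mass Number, and the parenthesized part of the
mass is discarded. The function name build_isotopes is hypothetical, and
the second sample record is reconstructed from the desired output in the
original post rather than copied from the real file.

```python
def build_isotopes(lines):
    """Return {symbol+mass_number: {'Z': ..., 'A': ..., 'm': ...}}."""
    isotopes = {}
    record = {}

    def store(rec):
        # Only store records that collected every field used below.
        needed = ("Atomic Number", "Atomic Symbol", "Mass Number",
                  "Relative Atomic Mass")
        if all(name in rec for name in needed):
            key = rec["Atomic Symbol"] + rec["Mass Number"]
            isotopes[key] = {
                "Z": int(rec["Atomic Number"]),
                "A": int(rec["Mass Number"]),
                "m": float(rec["Relative Atomic Mass"].split("(")[0]),
            }

    for line in lines:
        name, sep, value = line.partition("=")
        if not sep:
            continue                      # blank line or row of dashes
        name, value = name.strip(), value.strip()
        if name == "Atomic Number" and record:
            store(record)                 # previous entry is complete
            record = {}
        record[name] = value
    store(record)                         # do not forget the last entry
    return isotopes

sample = """\
Atomic Number = 1
    Atomic Symbol = H
    Mass Number = 1
    Relative Atomic Mass = 1.00782503223(9)

Atomic Number = 1
    Atomic Symbol = D
    Mass Number = 2
    Relative Atomic Mass = 2.01410177812
""".splitlines()

print(build_isotopes(sample))
# prints: {'H1': {'Z': 1, 'A': 1, 'm': 1.00782503223}, 'D2': {'Z': 1, 'A': 2, 'm': 2.01410177812}}
```

Because a fresh "record" dictionary is started at each "Atomic Number"
line, earlier entries are not overwritten by later ones.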

>isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
>each case of atoms with its unique keys and values

	Usually not needed as addressing a key to add a value doesn't need
predefined keys or values. The only reason to initialize is if you expect
to have blocks that DON'T define all key/value pairs.

>for line in content:
>    data=line.strip().split()
>

	Drop the .split() at this level... IF you don't mind some loss in
processing speed to allow...

>    if len(data)<1:

	if not data: #empty string
		pass

see:
>>> str1 = "Atomic Number = 1"
>>> str2 = " "
>>> bool(str1)
True
>>> bool(str2)
True
>>> bool(str1.strip())
True
>>> bool(str2.strip())		<<<<
False
>>> 


>        pass
>    elif data[0]=="Atomic" and data[1]=="Number":
>        atomic_number=data[3]
>

	elif data.startswith("Atomic Number"):
		atomic_number = data.split()[-1]


>
>     elif data[0]=="Mass" and data[1]=="Number":
>        mass_number=data[3]
>
>
>
>    elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>        relative_atomic_mass=data[4]
>

	Ditto for all those.
>
>isotope_data['Z']=atomic_number
>isotope_data['A']=mass_number
>isotope_data['A']=relative_atomic_mass

	This REPLACES any previous value of the key "A". To store multiple
values for a single key you need to put the values into a list... Presuming
you will always have both "mass_number" and "relative_atomic_mass"

	isotope_data["A"] = [mass_number, relative_atomic_mass]


	You don't show the outer dictionary in the example. The same list
concern may apply; you may need to do something like

	dict["key"] = []

	if term_1:
		dict["key"].append(term_1_value)
	if term_2:
		dict["key"].append(term_2_value)

etc.


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


From avi.e.gross at gmail.com  Tue Jul 26 20:47:06 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Tue, 26 Jul 2022 20:47:06 -0400
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
References: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
Message-ID: <006801d8a152$65a37130$30ea5390$@gmail.com>

There is so much work to be done based on what you show and questions to
answer.

The short answer is you made ONE dictionary and overwrote it. You want an
empty dictionary that you keep inserting this dictionary you made into.
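
To make the one-dictionary problem concrete, here is a small sketch with
two illustrative (symbol, Z, A) tuples taken from the desired output in
the original post. The variable names are made up. The first loop reuses a
single inner dictionary, so both keys end up pointing at the same object;
the second loop creates a new inner dictionary for every entry.

```python
records = [("H", 1, 1), ("D", 1, 2)]   # (symbol, Z, A), illustrative values

# Reusing ONE inner dictionary: every key ends up pointing at the same object.
shared = {}
wrong = {}
for symbol, z, a in records:
    shared["Z"] = z
    shared["A"] = a
    wrong[symbol + str(a)] = shared

# Creating a NEW inner dictionary per entry keeps the values separate.
right = {}
for symbol, z, a in records:
    right[symbol + str(a)] = {"Z": z, "A": a}

print(wrong)   # prints: {'H1': {'Z': 1, 'A': 2}, 'D2': {'Z': 1, 'A': 2}}
print(right)   # prints: {'H1': {'Z': 1, 'A': 1}, 'D2': {'Z': 1, 'A': 2}}
```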

You need to recognize when a section of lines is complete. When you see a
blank line now, you PASS. 

Your goal seems to be to read in a multi-line entry perhaps between dividers
like this "--------" so your readlines may not be doing what you want as each
line has a single item and some may have none.

Whatever you read in seems to be in content. Your code wrapped funny on MY
screen so I did not see this line:

isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for each case
of atoms with its unique keys and values 
for line in content:
    data=line.strip().split()

OK, without comment, the rest of your code seems to suggest there is perhaps
a blank line between a series of lines containing info.

It is of some concern that two entries now start with Atomic but you only
look for one.

And are you aware that the split leaves these as single entities:
1.00782503223(9), 0.999885(70), [1.00784,1.00811]

Those are all TEXT and you seem to want to remove things in parentheses from
your output. Do you really want to generally store text or various forms of
numbers?

You have lots of work to make the details work, such as concatenating the
Atomic Symbol "H" (which your code does NOT READ) with the mass number "1"
to make the key H1.

It is best to consider reading in a small sample of ONE and making the code
work to extract what is needed and combine it into the form you want and
make a single entry. Then add a loop. If your data is guaranteed to always
have the same N lines, other methods may work as well or better. 



From avi.e.gross at gmail.com  Tue Jul 26 21:11:01 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Tue, 26 Jul 2022 21:11:01 -0400
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: <0aa4daf4-6c45-ab98-ed5f-f9d381ca6299@yahoo.co.uk>
References: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
 <0aa4daf4-6c45-ab98-ed5f-f9d381ca6299@yahoo.co.uk>
Message-ID: <006d01d8a155$bcdbc080$36934180$@gmail.com>

Sorry, Alan, I found that part quite clear. Then again, one of my degrees is a B.S. in Chemistry. No idea why I ever did that as I have never had any use for it, well, except in Med School, which I also wonder why ...

Bob may not have been clear, but what he is reading in is basically a table of atomic elements starting with the Atomic Number (number of Protons so Hydrogen is 1 and Helium is 2 and so on). Many elements come in a variety of isotopes meaning the number of neutrons varies so Hydrogen can be a single proton or have one neutron too for Deuterium or 2 for tritium. The mass number is loosely how many nucleons it has as they are almost the same mass. 

He wants the key to be the concatenation of the Atomic Symbol (which is always one or two letters like H or He) with, I believe, the mass number; thus he should make H1 in his one and only example (albeit you can look up things like a periodic table to see others in whatever format). That field clearly should be text and used as a unique key.

He then wants the values to be another dictionary, where each of the many second-level dictionaries contains keys of 'Z', 'A', 'm' and whatever his etc. is. The Atomic mass seems obvious but is not really, as it depends on the mix of isotopes in a sample. Hydrogen normally is predominantly the single proton version so the atomic weight is very close to one. But if you extracted out almost pure Deuterium, it would be about double. Whatever it is, he needs to extract out a value like this "1.00782503223(9)" and either toss the trailing digit, parens and all, or include it and then convert it into a FLOAT of some kind.

Isotopic Composition is sort of clear(as mud) as I mentioned there are more than two isotopes, albeit tritium does not last long before breaking down, so here it means the H-1 version is 0.999885(70) or way over 99% of the sample, with a tiny bit of H-2 or deuterium. (Sorry, hard to write anything in text mode when there are subscripts and superscripts used on both the left and right of symbols). I am not sure how this scales up when many other elements have many isotopes including stable ones, but assume it means the primary is some percent of the total. Chlorine, for example, has over two dozen known isotopes and an Atomic Weight in the neighborhood of 35 1/2 as nothing quite dominates. 

And since samples vary in percentage composition, the Atomic Weight is shown as some kind of range in:

Standard Atomic Weight = [1.00784,1.00811]


He needs to extract the two numbers using whatever techniques he wants and either record both as a tuple or list (after converting perhaps to float) or take an average or whatever the assignment requires.
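
A small sketch of pulling the two numbers out of the bracketed range. The
function name parse_range is hypothetical, and it assumes the field always
looks like "[low,high]"; the real file may contain entries in other shapes
that would need extra handling. Since there is no space inside the
brackets, line.split() leaves the whole range as a single token, which is
why it is taken apart separately here.

```python
def parse_range(text):
    """Turn '[1.00784,1.00811]' into the tuple (1.00784, 1.00811)."""
    inner = text.strip().strip("[]")
    return tuple(float(part) for part in inner.split(","))

low, high = parse_range("[1.00784,1.00811]")
print(low, high)          # prints: 1.00784 1.00811
print((low + high) / 2)   # midpoint, if a single figure is wanted
```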

I have no idea if Notes matters as he stopped explaining what he wants his output to be BUT he should know it may be a pain to deal with the split text as it may show up as multiple items in his list of tokens.

But as I wrote earlier, his main request was to ask why his badly formatted single dictionary gets overwritten and the answer is because he does that instead of adding it to an outer dictionary first and then starting over.

So the rest of your comments do apply. Just satisfying your curiosity and if I am wrong, someone feel free to correct me.

-----Original Message-----
From: Tutor <tutor-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of Alan Gauld
Sent: Tuesday, July 26, 2022 8:34 PM
To: tutor at python.org
Subject: Re: [Tutor] Building dictionary from large txt file


On 26/07/2022 21:58, bobx ander wrote:
> Hi all,
> I'm trying to build a dictionary from a rather large file of following 
> format after it has being read into a list(excerpt from start of list 
> below)
> --------
>
> Atomic Number = 1
>     Atomic Symbol = H
>     Mass Number = 1
>     Relative Atomic Mass = 1.00782503223(9)
>     Isotopic Composition = 0.999885(70)
>     Standard Atomic Weight = [1.00784,1.00811]
>     Notes = m
> --------
>
> My goal is to extract the content into a dictionary that displays each 
> unique triplet as indicated below
> {'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
>               'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
>                ...} etc

Unfortunately to those of us unfamiliar with your data that is as clear as mud.

You refer to a triplet but your sample file entry has 7 fields, some of which have multiple values. Where is the triplet among all that data?

Then you show us a dictionary with keys that do not correspond to any of the fields in your data sample. How do the fields correspond - the only "obvious" one is the mass which evidently corresponds with the key 'm'.

But what are H1 and D2? Another file record or some derived value from the record shown above? Similarly for Z, A and m. How do they relate to the data?

You need to specify your requirement more explicitly for us to be sure we are giving valid advice.


> My code that I have attempted is as follows:
>
> filename='ex.txt'
>
> afile=open(filename,'r') #opens the file
> content=afile.readlines()
> afile.close()

You probably don't need to read the file into a list if you are going to process it line by line. Just read the lines from the file and process them as you go.


> isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for 
> each case of atoms with its unique keys and values for line in 
> content:
>     data=line.strip().split()
>
>     if len(data)<1:
>         pass
>     elif data[0]=="Atomic" and data[1]=="Number":
>         atomic_number=data[3]
>
>
>      elif data[0]=="Mass" and data[1]=="Number":
>         mass_number=data[3]
>
>
>
>     elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>         relative_atomic_mass=data[4]
>
Rather than split the line then compare each field it might be easier (and more readable) to compare the full strings using the startswith() method then split the string:

for line in file:
    if line.startswith("Atomic Number"):
        atomic_number = line.strip().split()[3]
    # etc...

> isotope_data['Z']=atomic_number
> isotope_data['A']=mass_number
> isotope_data['A']=relative_atomic_mass
> isotope_data
>
> the output from the programme is only
>
> {'Z': '118', 'A': '295', 'm': '295.21624(69#)'}
>
> I seem to be owerwriting each dictionary

Yes, you never detect the end of a record - you never explain how records are separated in the file either!

You need something like


master = {}   # empty dict

for line in file:
    if line.startswith("Atomic Number"):
        create variable....
    if line.startswith(....):....etc
    if <record separator detected>:   # we don't know what this is...
        # save variables in a dictionary
        record = {key1: variable1, key2: variable2....}
        # insert dictionary into master dictionary
        master[key] = record

How you generate the keys is a mystery to me but presumably you know.


You could write the values directly into the master dictionary if you prefer.

Also note that you are currently storing strings. If you want the numeric data you will need to convert it with int() or float() as appropriate.
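
A minimal sketch of that conversion, using the strings from your own output. I am assuming that the part in parentheses is an uncertainty you want to discard; strip_uncertainty is just a name I made up for the helper:

```python
def strip_uncertainty(text):
    """Return the part of text before any "(...)" suffix."""
    return text.split("(", 1)[0]


# The three strings below are the values shown in the output above.
atomic_number = int("118")
mass_number = int("295")
relative_atomic_mass = float(strip_uncertainty("295.21624(69#)"))

print(atomic_number, mass_number, relative_atomic_mass)
```

float() would raise a ValueError on the raw string "295.21624(69#)", which is why the suffix is removed first.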

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

_______________________________________________
Tutor maillist  -  Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


From roel at roelschroeven.net  Wed Jul 27 03:39:04 2022
From: roel at roelschroeven.net (Roel Schroeven)
Date: Wed, 27 Jul 2022 09:39:04 +0200
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: <o4a1ehluua7fgt433ucga81bainrb5vuir@4ax.com>
References: <CAMNwu9Hp-nP_FB3sEwAhOFSQw_H2KnTzYOoEqotd72Q5k0qwkw@mail.gmail.com>
 <o4a1ehluua7fgt433ucga81bainrb5vuir@4ax.com>
Message-ID: <e5b01024-7532-550a-e209-c52985677043@roelschroeven.net>

Op 27/07/2022 om 5:11 schreef Dennis Lee Bieber:

> [...]
> 	I'd probably run a loop inside the open/close section, collecting the
> items for ONE entry. I presume "Atomic Number" starts each entry. Then,
> when the next "Atomic Number" line is reached you process the collected
> lines to make your dictionary entry.
I'd probably do something like that too. But don't forget to also 
process the collected lines at the end of the file! After the last entry 
there is no "Atomic Number" line anymore, so it's easy to inadvertently 
skip that last entry (speaking from experience here... ).
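
Something along these lines, as a sketch only. I am assuming that every entry starts with an "Atomic Number" line, that the wanted key is the atomic symbol followed by the mass number (that is my guess at how 'H1' and 'D2' are formed), and that a parenthesised suffix on the mass can be dropped. The function name parse_isotopes and the second sample entry are made up for illustration:

```python
def parse_isotopes(lines):
    """Build {key: {'Z': ..., 'A': ..., 'm': ...}} from "Name = value" lines."""
    master = {}
    record = {}

    def flush():
        # Store the collected entry, if there is one, then start afresh.
        if record:
            key = f"{record['symbol']}{record['A']}"
            master[key] = {"Z": record["Z"], "A": record["A"], "m": record["m"]}
            record.clear()

    for line in lines:
        name, sep, value = line.partition("=")
        if not sep:
            continue  # blank lines and anything else without "="
        name, value = name.strip(), value.strip()
        if name == "Atomic Number":
            flush()  # a new entry begins, so save the previous one
            record["Z"] = int(value)
        elif name == "Atomic Symbol":
            record["symbol"] = value
        elif name == "Mass Number":
            record["A"] = int(value)
        elif name == "Relative Atomic Mass":
            record["m"] = float(value.split("(", 1)[0])

    flush()  # the last entry is not followed by an "Atomic Number" line
    return master


sample = """\
Atomic Number = 1
    Atomic Symbol = H
    Mass Number = 1
    Relative Atomic Mass = 1.00782503223(9)

Atomic Number = 1
    Atomic Symbol = D
    Mass Number = 2
    Relative Atomic Mass = 2.01410177812
""".splitlines()

isotopes = parse_isotopes(sample)
print(isotopes)
```

Without the final flush() call the 'D2' entry would silently go missing, which is exactly the mistake described above.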

-- 
"Most of us, when all is said and done, like what we like and make up
reasons for it afterwards."
         -- Soren F. Petersen


From avi.e.gross at gmail.com  Fri Jul 29 12:22:48 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Fri, 29 Jul 2022 12:22:48 -0400
Subject: [Tutor] POSIT and QUARTO
Message-ID: <011e01d8a367$715933e0$540b9ba0$@gmail.com>

This is not a question. Just a fairly short comment about changes that may
impact some Python users.

 
I have long used both R and python to do things but had to use different
development environments. The company formerly called RSTUDIO has been
increasingly supporting python as well as R and now has been renamed to
POSIT, presumably by adding a P for Python and keeping some letter from
STudIO:

 
https://www.r-bloggers.com/2022/07/posit-why-rstudio-is-changing-its-name/?utm_source=phpList&utm_medium=email&utm_campaign=R-bloggers-daily&utm_content=HTML

 
I mention this as I have already been doing some python work in RSTUDIO as
well as anaconda and for small bits in IDLE and even in a CYGWIN environment
and my machine is a tad confused at the multiple downloads of various
versions of python.

 
Also, I have been using tools to make live documents that run code and
interleave it with text and the new company also supports a sort of vastly
improved and merged version of a product that will now also work with python
and other languages called QUARTO that some might be interested in.

 
https://www.r-bloggers.com/2022/07/announcing-quarto-a-new-scientific-and-technical-publishing-system/?utm_source=phpList&utm_medium=email&utm_campaign=R-bloggers-daily&utm_content=HTML

 
I have no personal connection with the company except as a happy user for
many years who has been interested much more broadly than their initial
product and will happily use the abilities they provide that let me mix and
match what I do in an assortment of languages. Time for me to revisit Julia
and Javascript that are now supported.

 
I am NOT saying there is anything wrong with python, just a new option on
how to work with python in a nice well-considered GUI that many already have
found very useful. In many fields of use, many programmers and projects
often choose among various programming languages and environments so you
often end up having to be, in a sense, multilingual and multicultural. So it
can be nice to work toward an environment where many people can be
comfortable and even work together while remaining somewhat unique. As an
example, I have used the above to write documents in which R reads in a
file, converts it and saves another file while producing some statistics in
the text, and then, in the same document, a snippet of python opens that
output file, does more and shows the result.

 
From sjeik_appie at hotmail.com  Sun Jul 31 12:19:42 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sun, 31 Jul 2022 18:19:42 +0200
Subject: [Tutor] Unittest question
Message-ID: <DB6PR01MB38956952B35C867DDF255F6E839B9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>

   Hi,
   I am trying to run the same unittests against two different
   implementations of a module. Both use C functions in a dll, but one uses
   ctypes and the other that I just wrote uses CFFI. How can do this nicely?
   I don't understand why the code below won't run both flavours of tests. It
   says "Ran 1 test", I expected two!
   Any ideas?
   Thanks!
   Albert-Jan

   import unittest

   class MyTest(unittest.TestCase):

       def __init__(self, *args, **kwargs):
           self.implementation = kwargs.pop("implementation", None)
           if self.implementation == "CFFI":
               import my_module_CFFI as xs
           else:
               import my_module as xs
           super().__init__(*args, **kwargs)

       def test_impl(self):
           print(self.implementation)
           self.assertEqual(self.implementation, "")

   if __name__ == "__main__":
       suite = unittest.TestSuite()
       suite.addTests(MyTest(implementation="CFFI"))
       suite.addTests(MyTest(implementation="ctypes"))
       runner = unittest.TextTestRunner(verbosity=3)
       runner.run(suite)

   ###Output:

   AssertionError: None != ''
   -------------------- >> begin captured stdout << ---------------------
   None

   --------------------- >> end captured stdout << ----------------------

   ----------------------------------------------------------------------

From sjeik_appie at hotmail.com  Sun Jul 31 16:53:33 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sun, 31 Jul 2022 22:53:33 +0200
Subject: [Tutor] Unittest question
In-Reply-To: <DB6PR01MB38956952B35C867DDF255F6E839B9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
Message-ID: <DB6PR01MB3895763CE9606F444F603396839B9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>

   On Jul 31, 2022 18:19, Albert-Jan Roskam <sjeik_appie at hotmail.com> wrote:

     suite.addTests(MyTest(implementation="CFFI"))
     suite.addTests(MyTest(implementation="ctypes"))

   Hmm, maybe addTest and not addTests.
   I found this SO post with a very similar approach. Will try this
   tomorrow. https://stackoverflow.com/questions/32899/how-do-you-generate-dynamic-parameterized-unit-tests-in-python
   And unittest has "subtests", which may also work
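
   For the record, here is a minimal sketch of one common way to run the same tests against two implementations: put the tests in a mixin and make one small TestCase subclass per implementation. This is not necessarily what the linked posts do. All class names are made up, and the "implementation" string is only a stand-in for importing my_module_CFFI or my_module, which this sketch does not do.

```python
import unittest


class ImplementationTestsMixin:
    """Shared tests. Not a TestCase itself, so it is never collected alone."""

    implementation = None  # each concrete subclass overrides this

    def test_impl(self):
        print(self.implementation)
        self.assertIn(self.implementation, {"CFFI", "ctypes"})


class CFFITests(ImplementationTestsMixin, unittest.TestCase):
    implementation = "CFFI"  # stand-in for "import my_module_CFFI as xs"


class CtypesTests(ImplementationTestsMixin, unittest.TestCase):
    implementation = "ctypes"  # stand-in for "import my_module as xs"


def run_suite():
    """Load the tests of both subclasses into one suite and run it."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (CFFITests, CtypesTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=2).run(suite)


result = run_suite()
```

   Here addTests is given the suite returned by the loader, which is iterable, rather than a single TestCase instance.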

From sjeik_appie at hotmail.com  Sun Jul 31 17:02:20 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sun, 31 Jul 2022 23:02:20 +0200
Subject: [Tutor] Unittest question
In-Reply-To: <DB6PR01MB3895763CE9606F444F603396839B9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>
Message-ID: <DB6PR01MB3895211CEB8B2F9031E14865839B9@DB6PR01MB3895.eurprd01.prod.exchangelabs.com>

   Me likey :-) https://bugs.python.org/msg151444
   Looks like a clean approach
   On Jul 31, 2022 22:53, Albert-Jan Roskam <sjeik_appie at hotmail.com> wrote:

     On Jul 31, 2022 18:19, Albert-Jan Roskam <sjeik_appie at hotmail.com>
     wrote:

       suite.addTests(MyTest(implementation="CFFI"))
       suite.addTests(MyTest(implementation="ctypes"))

     Hmm, maybe addTest and not addTests.
     I found this SO post with a very similar approach. Will try this
     tomorrow. https://stackoverflow.com/questions/32899/how-do-you-generate-dynamic-parameterized-unit-tests-in-python
     And unittest has "subtests", which may also work