From manpritsinghece at gmail.com Sat Jul 2 22:49:43 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 08:19:43 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
Message-ID:

Dear sir,

I have tried writing a program in which I am calculating the population standard deviation of two columns X1 & X2 of a table in an sqlite3 in-memory database.

import sqlite3
import statistics

class StdDev:
    def __init__(self):
        self.lst = []

    def step(self, value):
        self.lst.append(value)

    def finalize(self):
        return statistics.pstdev(self.lst)

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table table1(X1 int, X2 int)")
ls = [(2, 4), (3, 5), (4, 7), (5, 8)]
cur.executemany("insert into table1 values(?, ?)", ls)
con.commit()
con.create_aggregate("stddev", 1, StdDev)
cur.execute("select stddev(X1), stddev(X2) from table1")
print(cur.fetchone())
cur.close()
con.close()

prints the output as:

(1.118033988749895, 1.5811388300841898)

which is correct.

My question is, as you can see I have used a list inside the class StdDev, which I think is an inefficient way to do this kind of problem because there may be a large number of values in a column and it can take a huge amount of memory.

Can this problem be solved with the use of iterators? What would be the best approach to do it?

Regards
Manprit Singh

From manpritsinghece at gmail.com Sun Jul 3 00:23:05 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 09:53:05 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To: <015201d88e91$13d44470$3b7ccd50$@gmail.com>
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com>
Message-ID:

Yes, it is obviously a homework kind of thing. I do create problems for myself, try to solve them, and try to find better ways.

I was trying to learn the use of create_aggregate().

Thank you for the hint you gave. Let me try.

My purpose is not to find the std dev. It is actually to learn how to use the functions.

On Sun, 3 Jul, 2022, 09:28 , wrote:

> Maybe a dumb question but why the need to do a calculation of a standard deviation so indirectly in SQL and in the database but starting from Python?
>
> R has a built-in function that calculates a standard deviation. You can easily save it where you want after.
>
> As for memory use in general, there are several ways to calculate a standard deviation but there is a tradeoff. You could read in an entry at a time and add it to a continuing sum while keeping track of the number of entries. You then calculate the mean. Then you can read it all in AGAIN and calculate the difference between each number and the mean, and do the rest of the calculation by squaring that and so on as you sum that and finally play with a division and a square root.
>
> But that may not be needed except with large amounts of data.
>
> What am I missing? Is this an artificial HW situation?

From manpritsinghece at gmail.com Sun Jul 3 03:54:50 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 13:24:50 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com>
Message-ID:

Dear Sir,

I have chosen this standard deviation as an exercise because there are two steps: first you have to find the mean, then subtract the mean from each value of the column.

Writing an aggregate function for this using Python's sqlite3 seems a little difficult, as there is only a single step function inside the class used to make it. Kindly shed some light.

This is just an exercise to understand how this create_aggregate() works. Kindly help.

Regards
Manprit Singh

From __peter__ at web.de Sun Jul 3 05:02:19 2022
From: __peter__ at web.de (Peter Otten)
Date: Sun, 3 Jul 2022 11:02:19 +0200
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com>
Message-ID: <9fd97b5f-67cf-0e29-e7eb-1398eea478db@web.de>

On 03/07/2022 06:23, Manprit Singh wrote:

>> My question is, as you can see i have used list inside the class StdDev, which I think is an inefficient way to do this kind of problem because there may be a large number of values in a column and it can take a huge amount of memory.
>>
>> Can this problem be solved with the use of iterators ?

I don't think that you can convert the callback into a generator here; and if you could it probably wouldn't help as the statistics module uses a two-pass algorithm.

You could switch to a simpler algorithm like the one used by some old calculators that only keep track of sum(xi), sum(xi*xi) and count, and calculate stddev from these in the finalize() method.

>> What would be the best approach to do it ?

From alan.gauld at yahoo.co.uk Sun Jul 3 08:30:51 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 3 Jul 2022 13:30:51 +0100
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References:
Message-ID:

On 03/07/2022 03:49, Manprit Singh wrote:

> con.create_aggregate("stddev", 1, StdDev)
> cur.execute("select stddev(X1), stddev(X2) from table1")

I just wanted to say thanks for posting this. I have never used, nor seen anyone else use, the ability to create a user defined aggregate function in SQLite - usually I just extract the data into python and use python to do the aggregation. But your question made me read up on how that all worked so it has taught me something new. (It also makes me appreciate how the Python API is much easier to use than the raw C API to SQLite!)

> My question is, as you can see i have used list inside the class StdDev, which
> I think is an inefficient way to do this kind of problem because there may be
> a large number of values in a column and it can take a huge amount of memory.
> Can this problem be solved with the use of iterators ? What would be the best
> approach to do it ?

If I'm working with so much data that this would be a problem I'd use the database itself to store the intermediate data. That would be much slower but much less memory dependent. But as others have said, with aggregate functions you don't usually need to store data from all rows; you just store a few intermediate results which you combine at the end.

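One way to read the point about keeping only a few intermediate results is to let SQLite's built-in avg() aggregate do the accumulation. The sketch below is an illustration and was not posted in the thread: it reuses the table1 data from the original post and assumes the identity that population variance equals mean(x*x) - mean(x)**2. That identity can lose precision when the mean is large compared with the spread, so treat this as a demonstration rather than a recommendation.

```python
import math
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])

# avg() is a built-in SQLite aggregate, so no Python list is ever built.
var_x1, var_x2 = con.execute(
    "select avg(X1 * X1) - avg(X1) * avg(X1),"
    "       avg(X2 * X2) - avg(X2) * avg(X2) from table1"
).fetchone()
con.close()

# max(..., 0.0) guards against a tiny negative value caused by rounding.
pstdev_x1 = math.sqrt(max(var_x1, 0.0))
pstdev_x2 = math.sqrt(max(var_x2, 0.0))
print(pstdev_x1, pstdev_x2)
```

The square root is taken in Python here because not every SQLite build ships the optional math functions.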
If you are trying to use an in-memory function - like the stddev function here - then you need to fit all the data in memory anyway so the function will simply not work if you can't store the data in RAM. In that case you need to find (or write) another function that doesn't use memory for storage or uses less storage.

It is also worth pointing out that most industrial strength SQL databases come with a far richer set of aggregate functions than SQLite. So if you do have to work with large volumes of data you should probably switch to something like Oracle, DB2, SQLServer(*), etc and just use the functions built into the server. If they don't have such a function they also have a much simpler way of defining stored procedures. As ever, choose the appropriate tool for the job.

(*)These are just the ones I know, I assume MySQL, Postgres etc have similarly broad libraries.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From manpritsinghece at gmail.com Sun Jul 3 09:01:20 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 18:31:20 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References:
Message-ID:

Sir,

I am just going through all the functionality available in the sqlite3 module, to see whether I can use sqlite3 as a good data analysis tool or not. Up to this point I have figured out that an sqlite database file can be an excellent replacement for data stored in files. You can preserve data in a structured form, email it to someone who needs it, and so on.

But for good data analysis I found pandas is superior. I use pandas for data analysis and visualization.

Btw, this is true: you should use the right tool for your task.

Regards
Manprit Singh

From manpritsinghece at gmail.com Sun Jul 3 12:59:41 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 22:29:41 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To: <007001d88ef3$3e032420$ba096c60$@gmail.com>
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com> <007001d88ef3$3e032420$ba096c60$@gmail.com>
Message-ID:

Dear Sir,

Leaving aside the standard deviation itself, I would like to say something about Python's sqlite3 module. There is a create_aggregate(name, n_arg, aggregate_class) function there, which can create a user defined aggregate function. This function requires an aggregate class as an argument; this class must contain a step method and a finalize method, as written in the Python documentation.

I just want to learn about using this create_aggregate(). So far what I have concluded is that the step method of the class is called for each element of the column, and finalize returns the final result of the aggregate.

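The step/finalize behaviour described above can be observed directly. The following is a minimal sketch using a hypothetical TracedSum class (the name and the counters are illustrative, not from the thread or the documentation). It counts how many times sqlite3 instantiates the class and calls step(), on the assumption that one instance is created per aggregate expression in the query.

```python
import sqlite3

class TracedSum:
    """Sum aggregate that records how sqlite3 drives it (illustrative only)."""
    instances = 0   # how many times the class was instantiated
    steps = 0       # how many times step() was called in total

    def __init__(self):
        TracedSum.instances += 1
        self.total = 0

    def step(self, value):
        # Called once per row for the column passed to the aggregate.
        TracedSum.steps += 1
        self.total += value

    def finalize(self):
        # Called once after the last row; its return value is the result.
        return self.total

con = sqlite3.connect(":memory:")
con.create_aggregate("tracedsum", 1, TracedSum)
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])
result = con.execute(
    "select tracedsum(X1), tracedsum(X2) from table1").fetchone()
con.close()

print(result, TracedSum.instances, TracedSum.steps)
```

Because each aggregate expression keeps its own state, anything stored on self in step() is private to that column's calculation.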
Again coming to the example given in the documentation:

import sqlite3

class MySum:
    def __init__(self):
        self.count = 0

    def step(self, value):
        self.count += value

    def finalize(self):
        return self.count

con = sqlite3.connect(":memory:")
con.create_aggregate("mysum", 1, MySum)
cur = con.cursor()
cur.execute("create table test(i)")
cur.execute("insert into test(i) values (1)")
cur.execute("insert into test(i) values (2)")
cur.execute("select mysum(i) from test")
print(cur.fetchone()[0])

con.close()

The answer is 3, which is correct: it is the sum of the values in the column named i in the table test. It was easy to implement, as the only need is to add all values of the column. For each value of the column i, step is called and the value gets added to self.count. At the end, self.count represents the sum of all values of the column and is returned through the finalize method.

Now, just as sum is an aggregate function, population standard deviation is also an aggregate function. We should be able to make a user defined function to find the population standard deviation of a column or multiple columns of a sqlite3 database. Hopefully you agree?

Now if I am going to write the class for this, in the step method I can only count the values in the column and find the sum of all values to get the mean. I am not seeing the mechanism to subtract the mean from each value of the column in the same step method, or in any other way in the class.

Hopefully my question is clearer now.

Btw, I would like to write a short snippet to calculate the population std deviation of a list:

lst = [2, 5, 7, 9, 10]
mean = sum(lst)/len(lst)
std_dev = (sum((ele-mean)**2 for ele in lst)/len(lst))**0.5
print(std_dev)

prints 2.870540018881465, which is the right answer.

Regards
Manprit Singh

On Sun, Jul 3, 2022 at 9:10 PM wrote:

> Manprit,
>
> At some point, questions are not about Python but are about a specialized package or even about SQL statements.
>
> I can understand your wanting to learn by experimentation and the gist of your programming ideas seems to be to use Python and a set of functions to drive an SQL database when, as has been pointed out, it can be done directly in the database.
>
> But I think you have made your question clearer and you seem focused on ways to minimize memory use such as you might find on a device like Raspberry Pi.
>
> So if you asked without the SQL part, people might get into the question.
>
> If I understand it, you wish to make your own class, StdDev, to somehow manage getting the data incrementally and calculating a standard deviation, rather than reading it all at once. Is that the question?
>
> Your code, of course, makes absolutely no sense as written as all it does is create a few items and stores them in a new table, just so it can get them again! So of course you already fill your memory with the list you made. The code you want to share with us does not need any of these steps. It needs to start with a database out there that you want to read from. There is nothing wrong with your code just that it gets in the way of seeing what you want to do.
>
> You then leave some of us (meaning me) having to do research to look up what the heck create_aggregate() does and when I found out, I stopped wanting to continue.
>
> Someone else may want to help you but I am heading out for a long drive and already answered you in a way that I find reasonable.
>
> Look up the definition for a variance and then standard deviation. Decide which version to use and note some divide by N and some by N-1 depending on whether you have all of the population or a sample. Overall the scheme is to take each number minus the mean and square it and sum that and finally divide by N-1 for the variance and then take the square root of that for the standard deviation. It can be done trivially using functions already available but that does use memory all at once.
>
> If you want to fetch one pair of numbers at a time, the algorithm is fairly simple.
>
> Start with two accumulator variables, one for each of the numbers you want to calculate the standard deviation for.
>
> In a loop, read one row at a time, or some small number of rows like 100. Add the right numbers to the right accumulator, handling any NA values if needed, and keeping track of how many valid items in each you processed.
>
> When done, calculate the mean of each.
>
> Restart a new query and again in a loop get your numbers one at a time (or a hundred) and this time use a new accumulator in which you keep adding the current number minus the mean calculated and squared.
>
> When all the data is calculated, you have the sums. Divide by N-1 then take the square root.
>
> So you want to know how to use something that only gets one row of data at a time. That is here not really a Python issue as you are able to call some function in your module that presumably gets the next row of an active query. If you can call that directly, fine. If you want to hide it in an iterator, also fine.
>
> Given your current class, your question for this part of the exercise seems to be how to rewrite the class. But my perspective may not match yours as you seem to want to hand the object to a function that calls on it somehow repeatedly to get the task done. You show no code as to how you might do it in a way I am thinking of.
>
> The dunder init section creates an empty list. You no longer want the list. So what do you want? My guess is this could be the place you open a connection to the database or a specific query.
>
> Your step() function presumably no longer wants to append a value to the no longer existent list. What makes sense for you here? Is the calculation happening inside the class? If so, the step not only gets a row of data but is perhaps working on summing to eventually calculate a mean. As noted above, what I am talking about requires two passes through the data so two sets of steps like this. Maybe you want a step1() that is summing the original data and another function called step2() that is in some ways similar and sums the second part once it has a mean.
>
> And what does your finalize() method do? Right now it returns the calculation on the list that you no longer want to use. In the outline I am sketching, you might want to use it just to return the computed values and maybe close the database connection.
>
> Maybe I am confused but you seem to ask to calculate the standard deviation for TWO columns of data but your code seems to also work on one set of numbers. You need to get that straight.
>
> Have fun.
>
> I am dropping out of this one.

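The two-pass outline in the quoted message can be sketched outside of create_aggregate(), by running the query twice and reading rows in small batches with fetchmany(). This is only an illustration under stated assumptions: two_pass_stdev is a hypothetical helper name, the query must return a single numeric column, NULL values are skipped, and the result is the sample standard deviation (division by N-1), as in the outline, so it needs at least two valid rows.

```python
import math
import sqlite3

def two_pass_stdev(con, query, batch_size=100):
    """Sample standard deviation of a one-column query, read in batches."""
    # Pass 1: count the valid values and sum them to get the mean.
    count, total = 0, 0.0
    cur = con.execute(query)
    while rows := cur.fetchmany(batch_size):
        for (value,) in rows:
            if value is not None:
                count += 1
                total += value
    mean = total / count

    # Pass 2: rerun the query and accumulate squared deviations from the mean.
    squared = 0.0
    cur = con.execute(query)
    while rows := cur.fetchmany(batch_size):
        for (value,) in rows:
            if value is not None:
                squared += (value - mean) ** 2

    # Divide by N-1 (sample variance), then take the square root.
    return math.sqrt(squared / (count - 1))

con = sqlite3.connect(":memory:")
con.execute("create table table1(X1 int, X2 int)")
con.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])
sd_x1 = two_pass_stdev(con, "select X1 from table1", batch_size=2)
sd_x2 = two_pass_stdev(con, "select X2 from table1", batch_size=2)
con.close()
print(sd_x1, sd_x2)
```

Only one batch of rows is held in memory at a time, at the cost of executing the query twice.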
From wlfraed at ix.netcom.com Sun Jul 3 13:05:56 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sun, 03 Jul 2022 13:05:56 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
References:
Message-ID:

On Sun, 3 Jul 2022 08:19:43 +0530, Manprit Singh declaimed the following:

{Seems gmane wasn't updating yesterday}

>My question is, as you can see i have used list inside the class StdDev, which
>I think is an inefficient way to do this kind of problem because there may be
>a large number of values in a column and it can take a huge amount of memory.
>Can this problem be solved with the use of iterators ? What would be the best
>approach to do it ?

First off... What do you consider a "huge amount of memory"?

>>> import sys
>>> lrglst = list(range(100000000))
>>> sys.getsizeof(lrglst)
800000056
>>>

That is a list of 100 MILLION integers. In Python, the list itself consumes 800 Mbytes (sys.getsizeof() counts only the list's array of references; the int objects it refers to take additional memory).

Unless you are running your application on a Raspberry-Pi you shouldn't have any concern for memory (and even then, if you have a spinning disk on USB with a swap file it should only slow you down, not crash -- R-PI 4B can be had with up to 8GB of RAM, so might not need swap at all). Who is running a computer these days with less than 8GB (my decade old system has 12GB, and rarely activates swap)?

Granted, each invocation (you have two in the example) will add another such list... Still, the lists themselves would only come to 1.6GB of RAM.
Which, ideally, would be freed up just as soon as the SQL statement finished processing and returned the results.

	SQLite3 is already doing iteration -- it invokes the .step() method for each record in the data set. The only way to avoid collecting that (large?) list is to change the algorithm for standard deviation itself -- and not use the one provided by the statistics module.

	The "naive" algorithm has been mentioned (in association with calculators, which work with single data point entry at a time; well, the better ones handled X and Y on each step).

https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Na%C3%AFve_algorithm

UNTESTED

from math import sqrt

class sd_Pop:
    def __init__(self):
        self.cnt = 0
        self.sum = 0.0
        self.sum_squares = 0.0

    def step(self, x):
        self.cnt += 1
        self.sum += x
        self.sum_squares += (x * x)

    def finalize(self):
        # population variance = (sum of squares - (sum ** 2) / n) / n
        return sqrt((self.sum_squares -
                     ((self.sum * self.sum) / self.cnt)) / self.cnt)

	There... No accumulation of a long list, just one integer and two floating point values.

	Note that the "Computing shifted data" section of that Wikipedia page is a hypothetical improvement over the pure "naive" algorithm, and (if you made the sample code a class so all those "global" references change to self.###) could also fit into the SQLite3 aggregate.

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

From manpritsinghece at gmail.com Sun Jul 3 14:12:26 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sun, 3 Jul 2022 23:42:26 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References:
Message-ID:

Dear Sir,

Many, many thanks to Dennis Lee Bieber. (Not for the code he wrote.) But for the last lines after the code, in which he mentions various ways to calculate Std. Dev (Naive method, two pass method & Welford's Algo). That clearly shows I need to study more. I found out various ways to calculate Standard deviation.
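
For illustration only, here is a rough sketch of how Welford's algorithm, one of the methods named above, might be plugged into create_aggregate(). It is not code from this thread: the class name WelfordStdDev and its attribute names are made up for the example, and skipping NULL values is an assumption about the desired behaviour.

```python
import math
import sqlite3


class WelfordStdDev:
    """Population standard deviation, updated one value at a time."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def step(self, value):
        if value is None:          # assumption: ignore SQL NULLs
            return
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def finalize(self):
        if self.n == 0:
            return None            # no rows seen: hand back NULL
        return math.sqrt(self.m2 / self.n)


con = sqlite3.connect(":memory:")
con.create_aggregate("stddev", 1, WelfordStdDev)
cur = con.cursor()
cur.execute("create table table1(X1 int, X2 int)")
cur.executemany("insert into table1 values(?, ?)",
                [(2, 4), (3, 5), (4, 7), (5, 8)])
cur.execute("select stddev(X1), stddev(X2) from table1")
row = cur.fetchone()
print(row)   # roughly (1.118, 1.581), in line with statistics.pstdev
con.close()
```

Only three numbers are kept per column, so memory use does not grow with the number of rows, and the running update tends to be less sensitive to rounding than the sum-of-squares form.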
My Question is finally answered. Will be back with the implementation once gone through all.

Regards
Manprit Singh

From wlfraed at ix.netcom.com Sun Jul 3 16:41:59 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sun, 03 Jul 2022 16:41:59 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com> <007001d88ef3$3e032420$ba096c60$@gmail.com>
Message-ID:

On Sun, 3 Jul 2022 22:29:41 +0530, Manprit Singh declaimed the following:

>Now as sum is an aggregate function, same way population standard
>deviation is also an aggregate function. We should be able to make a
>user defined function
>

	It is also superfluous: SQLite3 already has count(), sum() and even avg() built-in (though it lacks many of the bigger statistical computations -- variance, std.
dev, covariance, correlation, linear regression -- that many of the bigger client/server RDBMSs support).

>
>for getting mean.I am not getting the mechanism to subtract the mean
>from each value of the column in the same step method or by any other
>way
>

	Note that the definition for creating aggregates includes something for number of arguments. Figure out how to specify multiple arguments and you might be able to have SQLite3 provide "current item" and "mean" (avg) to the step() method. I'm not going to take the time to experiment (for the most part, I'd consider it simpler to just grab the entire dataset from the database, and run the number crunching in Python, rather than the overhead of having SQLite3 invoke a Python "callback" method for each item, just to be able to have SQLite3 return a single computed value).

>Btw I would like to write a one liner to calculate population std
>deviation of a list:
>lst = [2, 5, 7, 9, 10]
>mean = sum(lst)/len(lst)
>std_dev = (sum((ele-mean)**2 for ele in lst)/len(lst))**0.5
>print(std_dev)

	Literally, except for the imports, that is just...

	print(statistics.pstdev(lst))

>>> import math as m
>>> import statistics as s
>>> lst = [2, 5, 7, 9, 10]
>>> print(s.pstdev(lst))
2.870540018881465
>>>

	Going up a level in complexity (IE -- not using the imported pstdev())

>>> print("Population Std. Dev.: %s" % m.sqrt( s.mean( (ele - s.mean(lst)) ** 2 for ele in lst)))
Population Std. Dev.: 2.870540018881465
>>>

	This has the problem that it invokes mean(lst) for each element, so may be slower for large data sets (that problem will also exist if you manage a multi-argument step() for SQLite3).

	Anytime you have

	sum(equation-with-elements-of-data) / len(data)

you can replace it with just

	mean(equation...)
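
As a small sketch of both points (mean() standing in for sum()/len(), and not recomputing the mean for every element), the mean can be computed once and reused. The names mu, std_dev and check are only illustrative.

```python
import math
import statistics

lst = [2, 5, 7, 9, 10]

# Compute the mean once instead of once per element.
mu = statistics.mean(lst)

# mean(...) over the squared deviations replaces sum(...) / len(lst).
std_dev = math.sqrt(statistics.mean((ele - mu) ** 2 for ele in lst))
print(std_dev)

# statistics.pstdev() also accepts a precomputed mean as its second argument.
check = statistics.pstdev(lst, mu)
print(check)
```

Both values should agree with the 2.8705... shown in the session above, up to floating point rounding.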
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

From anirudh.tamsekar at gmail.com Sun Jul 3 17:05:57 2022
From: anirudh.tamsekar at gmail.com (Anirudh Tamsekar)
Date: Sun, 3 Jul 2022 14:05:57 -0700
Subject: [Tutor] Consecutive_zeros
Message-ID:

Hello All,

Any help on this function below is highly appreciated.
Goal: analyze a binary string consisting of only zeros and ones. Your code should find the biggest number of consecutive zeros in the string.

For example, given the string:
Its failing on below test case

print(consecutive_zeros("0"))
It should return 1. Returns 0

I get the max(length) as 1, if I print it separately

def consecutive_zeros(string):
    zeros = []
    length = []
    result = 0
    for i in string:
        if i == "0":
            zeros.append(i)
        else:
            length.append(len(zeros))
            zeros.clear()
            result = max(length)
    return result

-Thanks,
Anirudh Tamsekar

From mats at wichmann.us Sun Jul 3 18:10:46 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Sun, 3 Jul 2022 16:10:46 -0600
Subject: [Tutor] Consecutive_zeros
In-Reply-To:
References:
Message-ID:

On 7/3/22 15:05, Anirudh Tamsekar wrote:
> Hello All,
>
> Any help on this function below is highly appreciated.
> Goal: analyze a binary string consisting of only zeros and ones. Your code
> should find the biggest number of consecutive zeros in the string.
>
> For example, given the string:
> Its failing on below test case
>
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
>
> I get the max(length) as 1, if I print it separately
>
> def consecutive_zeros(string):
>     zeros = []
>     length = []
>     result = 0
>     for i in string:
>         if i == "0":
>             zeros.append(i)
>         else:
>             length.append(len(zeros))
>             zeros.clear()
>             result = max(length)
>     return result

you can certainly shorten that function, and even use a few tricks, but that function looks like it would work for anything except the boundary case you've given it.
Of course, boundary cases are one of the challenges for programming - and writing good unit tests: "it works, but what if you do something unexpected?" Your problem is you only process the previous information if you see a character that isn't zero, and since there's only a single zero in the string it never triggers, so the saved count never gets collected. Btw, there's no particular reason to use an array that you append zero characters to and then take the length of, just use a counter. From hilarycarris at yahoo.com Sun Jul 3 15:46:10 2022 From: hilarycarris at yahoo.com (Hilary Carris) Date: Sun, 3 Jul 2022 14:46:10 -0500 Subject: [Tutor] (no subject) References: Message-ID: I am unable to get my python correctly loaded onto my Mac. I have to have a 3.9 version for a class and can not get it to function. Can you help me? From wlfraed at ix.netcom.com Sun Jul 3 20:37:25 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Sun, 03 Jul 2022 20:37:25 -0400 Subject: [Tutor] Consecutive_zeros References: Message-ID: <17c4chdfiou5fip8d34dhvvu706hqr51kf@4ax.com> On Sun, 3 Jul 2022 14:05:57 -0700, Anirudh Tamsekar declaimed the following: >Hello All, > >Any help on this function below is highly appreciated. >Goal: analyze a binary string consisting of only zeros and ones. Your code >should find the biggest number of consecutive zeros in the string. > This sounds very much like the core of run-length encoding (https://en.wikipedia.org/wiki/Run-length_encoding), with a filter stage to determine the longest run of 0s... >def consecutive_zeros(string): ... which makes the name somewhat misleading. I'd generalize to something like longest_run(source, item) where item is just one of the potential values within source data. 
That would permit calls in the nature of:

	longest_run(somedatasource, "0")
	longest_run(somedatasource, "A")
	longest_run(somedatasource, 3.141592654)	#source is floats

> result = max(length)

	You are only updating "result" when you encounter a non-"0" element. If there is no non-"0" following any amount of "0" you will not update.

	If not forbidden by the assignment/class, recommend you look at the itertools module -- in particular groupby().

>>> import itertools as it
>>> source = "000100111010000122"
>>> for k, g in it.groupby(source):
...     print("key %s: length %s:" % (k, len(list(g))))
...
key 0: length 3:
key 1: length 1:
key 0: length 2:
key 1: length 3:
key 0: length 1:
key 1: length 1:
key 0: length 4:
key 1: length 1:
key 2: length 2:
>>>

	.takewhile() and .filterfalse() might also be of use.

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

From manpritsinghece at gmail.com Sun Jul 3 21:44:11 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Mon, 4 Jul 2022 07:14:11 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com> <007001d88ef3$3e032420$ba096c60$@gmail.com>
Message-ID:

Dear Sir,

In Pandas, handling an sql query is so simple as given below:

import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table test(i, j)")
ls = [(2, 4), (3, 5), (4, 2), (7, 9)]
cur.executemany("insert into test(i, j) values (?, ?)", ls)
pd.read_sql("select i, j from test", con).std(ddof=0)

will give the desired result:

i    1.870829
j    2.549510
dtype: float64

From martin at linux-ip.net Sun Jul 3 22:15:58 2022
From: martin at linux-ip.net (Martin A. Brown)
Date: Sun, 3 Jul 2022 19:15:58 -0700
Subject: [Tutor] Consecutive_zeros
In-Reply-To:
References:
Message-ID: <693a92f-afa-eeb4-a56-164a2b78fe27@wonderfrog.net>

Hello there,

> For example, given the string:
> Its failing on below test case
>
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
> I get the max(length) as 1, if I print it separately

There are many ways to do this sort of thing, and I think Mats has already responded that your function (below) looks good aside from your boundary case. Of course, boundary cases are one of the things that software quality assurance and test tooling are there to help you discover.

If your code never hits the boundary condition, then you'll always sleep soundly. But, as an excellent systems programmer I worked with used to say: "I like to think that my software inhabits a hostile universe." It's a defensive programming mindset that I have also tried to adopt.

> def consecutive_zeros(string):
>     zeros = []
>     length = []
>     result = 0
>     for i in string:
>         if i == "0":
>             zeros.append(i)
>         else:
>             length.append(len(zeros))
>             zeros.clear()
>             result = max(length)
>     return result

> Goal: analyze a binary string consisting of only zeros and ones. Your code
> should find the biggest number of consecutive zeros in the string.

I took a look at the problem statement and I thought immediately of the itertools.groupby function [0].
While this is not generally one of the Python modules that you'd encounter at your first exposure to Python, I thought I'd mention it to illustrate that many languages provide advanced tooling that allows you to solve these sorts of computer science problems.

There's no substitute for knowing how to write or apply this yourself. There's also value in knowing what other options are available to you in the standard library. (Like learning when to use a rasp, chisel, smoothing plane or sandpaper when working with wood: All work. Some work better in certain situations.)

The itertools module is fantastic for dealing with streams and infinite sequences. This function (and others in this module) is useful for stuff like your specific question, which appears to be a string that probably fits neatly into memory.

  import itertools

  def consecutive_zeros(s):
      zcount = 0
      for char, seq in itertools.groupby(s):
          if char != '0':
              continue
          t = len(list(seq))
          zcount = max((t, zcount))
      return zcount, s

I am mentioning this particular function not because I have any specific deep experience with the itertools module, but to share another way to think about approaching any specific problem. I have never regretted reading in the standard libraries of any language nor the documentation pages of any system I've ever worked on. In this case, your question sounded to me closer to a "pure" computer science question and in the vein of functional programming, and itertools jumped directly into my memory.

One exercise I attempt frequently is to move a specific question, e.g. "count the number of zeroes in this string sequence", to a more general case of "give me the size of the longest repeated subsequence of items". This often requires keeping extra data (e.g. the collections.defaultdict) but means you sometimes have other answers at the ready. In this case, you could also easily answer the question of 'what is the longest subsequence of "1"s' as well.
And, in keeping with the notion of testing* (as Mats has suggested) and worrying about boundaries, I include a toy program (below) that demonstrates the above function as well as an alternate, generalized example, as well as testing the cases in which the result is known ahead of time.

Best of luck,

-Martin

 * Note, what I did is not quite proper testing, but is a simple illustration of how one could do it. Using assert is a convenient way to work with test tooling like py.test.

 [0] https://docs.python.org/3/library/itertools.html#itertools.groupby


#! /usr/bin/python
#
# -- response to Consecutive_zeros question

import os
import sys
import random
import itertools
import collections


def consecutive_items(s):
    counts = collections.defaultdict(int)
    for item, seq in itertools.groupby(s):
        counts[item] = max(sum(1 for _ in seq), counts[item])
    return counts, s


def consecutive_char_of_interest(s, item):
    counts, _ = consecutive_items(s)
    return counts[item], s


def consecutive_zeros(s):
    return(consecutive_char_of_interest(s, '0'))


# def consecutive_zeros(s):
#     zcount = 0
#     for char, seq in itertools.groupby(s):
#         if char != '0':
#             continue
#         t = len(list(seq))
#         zcount = max((t, zcount))
#     return zcount, s


def report(count, sample):
    sample = (sample[:50] + '...') if len(sample) > 75 else sample
    print('{:>9d} {}'.format(count, sample))


def fail_if_wrong(sample, correct):
    count, _ = consecutive_zeros(sample)
    assert count == correct
    report(count, sample)
    return count, sample


def cli(fin, fout, argv):
    fail_if_wrong('a', 0)          # - feed it "garbage"
    fail_if_wrong('Q', 0)          # - of several kinds
    fail_if_wrong('^', 0)          # - punctuation! What kind of planet?!
    fail_if_wrong('1', 0)          # - check boundary
    fail_if_wrong('0', 1)          # - check boundary
    fail_if_wrong('01', 1)         # - oh, let's be predictable
    fail_if_wrong('001', 2)
    fail_if_wrong('0001', 3)
    fail_if_wrong('00001', 4)
    fail_if_wrong('000001', 5)
    fail_if_wrong('0000001', 6)
    fail_if_wrong('00000001', 7)
    fail_if_wrong('000000001', 8)
    n = 37
    fail_if_wrong('0' * n, n)      # - favorite number?
    n = 1024 * 1024                # - some big number
    fail_if_wrong('0' * n, n)
    n = random.randint(9, 80)
    fail_if_wrong('0' * n, n)      # - let the computer pick
    report(*consecutive_zeros(''.join(random.choices('01', k=60))))
    report(*consecutive_zeros(''.join(random.choices('01', k=60))))
    report(*consecutive_zeros(''.join(random.choices('01', k=1024))))
    report(*consecutive_zeros(''.join(random.choices('01', k=1024*1024))))
    return os.EX_OK


if __name__ == '__main__':
    sys.exit(cli(sys.stdin, sys.stdout, sys.argv[1:]))

# -- end of file

-- 
Martin A. Brown
http://linux-ip.net/

From manpritsinghece at gmail.com Mon Jul 4 01:49:54 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Mon, 4 Jul 2022 11:19:54 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To: <009401d88f65$b102a8c0$1307fa40$@gmail.com>
References: <015201d88e91$13d44470$3b7ccd50$@gmail.com> <007001d88ef3$3e032420$ba096c60$@gmail.com> <009401d88f65$b102a8c0$1307fa40$@gmail.com>
Message-ID:

Dear Sir,

My target is still to do this task in pure python. without using an iterable. Just shown you using pandas, how easy it is in the last mail . There are several easy ways also. I will come up with a solution given by Dennis Lee bieber.
I am exploring other options also. The one using numpy was also explored by me, as given below:

import sqlite3
import numpy as np

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table table1(X1, X2)")
lst = [(2, 5), (4, 3), (5, 2), (7, 1)]
cur.executemany("insert into table1 values(?, ?)", lst)
cur.execute("select * from table1")
print(np.std(cur.fetchall(), axis=0))

[1.80277564 1.47901995]

which is the right answer

For column names :

col_names = [cur.description[i][0] for i in (0, 1)]
print(col_names)

['X1', 'X2']

My target is still to do this task in pure python, without using an iterable.

From __peter__ at web.de Mon Jul 4 03:51:01 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 4 Jul 2022 09:51:01 +0200
Subject: [Tutor] Consecutive_zeros
In-Reply-To:
References:
Message-ID: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de>

On 03/07/2022 23:05, Anirudh Tamsekar wrote:
> Hello All,
>
> Any help on this function below is highly appreciated.
> Goal: analyze a binary string consisting of only zeros and ones. Your code
> should find the biggest number of consecutive zeros in the string.
>
> For example, given the string:
> Its failing on below test case

In case you haven't already fixed your function here's a hint that is a bit more practical than what already has been said.

>
> print(consecutive_zeros("0"))
> It should return 1. Returns 0
>
> I get the max(length) as 1, if I print it separately
>
>
> def consecutive_zeros(string):
>     zeros = []
>     length = []
>     result = 0
>     for i in string:
>         if i == "0":
>             zeros.append(i)
>         else:
>             length.append(len(zeros))
>             zeros.clear()
>             result = max(length)

At this point in the execution of your function what does zeros look like for the succeeding cases, and what does it look like for the failing ones? Add a print(...) call if you aren't sure and run consecutive_zeros() for examples with trailing ones, trailing runs of zeros that have or don't have the maximum length for that string.
How can you bring result up-to-date?

>     return result

PS: Because homework problems are often simpler than what comes up in the "real world" some programmers tend to come up with solutions that are less robust or general. In that spirit I can't help but suggest

>>> max(map(len, "011110000110001000001111100".split("1")))
5

which may also be written as

>>> max(len(s) for s in "011110000110001000001111100".split("1"))
5

Can you figure out how this works? What will happen if there is a character other than "1" or "0" in the string?

From PythonList at DancesWithMice.info Mon Jul 4 04:21:24 2022
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 4 Jul 2022 20:21:24 +1200
Subject: [Tutor] (no subject)
In-Reply-To:
References:
Message-ID: <63eb83e1-c28a-5d61-2882-2b8929e0d823@DancesWithMice.info>

On 04/07/2022 07.46, Hilary Carris via Tutor wrote:
> I am unable to get my python correctly loaded onto my Mac. I have to have a 3.9 version for a class and can not get it to function.
> Can you help me?

Please start with the documentation. If the necessary answer is not featured, please mention where in the docs things diverge and provide more information.

5. Using Python on a Mac
https://docs.python.org/3/using/mac.html

-- 
Regards,
=dn

From learn2program at gmail.com Mon Jul 4 04:45:29 2022
From: learn2program at gmail.com (Alan Gauld)
Date: Mon, 4 Jul 2022 09:45:29 +0100
Subject: [Tutor] (no subject)
In-Reply-To:
References:
Message-ID:

On 03/07/2022 20:46, Hilary Carris via Tutor wrote:
> I am unable to get my python correctly loaded onto my Mac. I have to have a 3.9 version for a class and can not get it to function.
> Can you help me?

What have you done so far?
Where did you download from?
What exactly is going wrong?
    Is the download failing?
    Is the install process failing?
    Does it start with an error - what error?

Please be as specific as possible about what you have done, and what is going wrong. Simply saying it doesn't work is not of much help.
-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From avi.e.gross at gmail.com Sun Jul 3 23:09:30 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 3 Jul 2022 23:09:30 -0400
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References:
Message-ID: <003c01d88f53$7a7e3a60$6f7aaf20$@gmail.com>

Once explained, the request makes some sense and I withdraw my earlier suggestions as not responding to what you want.

You do NOT want to use Python almost at all except as a way to test manipulating some database. Fine, do that!

As has been pointed out, many versions of SQL come pre-built with functions you can call from within your SQL directly and also remotely that do things like calculate means and perhaps even standard deviations using queries like:

mysql> SELECT STDDEV_SAMP (salary) FROM employee;
mysql> SELECT STD(salary) FROM employee;
mysql> SELECT STDDEV(salary) FROM employee;
mysql> SELECT STDDEV_POP(salary) FROM employee;

Depending on which one you want.

So if the above are SUPPORTED then your issue of not using much memory in Python is quite irrelevant.

If you want to learn to use a particular function that probably sends the above command, or something similar, fine. Figure it out but my impression is the function you are using may not be using a local Python function.

I know this group is for learning but I seem to not appreciate being asked to think about a problem in ways that keep turning out to be very different than what is wanted as it is usually a waste of time for all involved. People with more focused and understandable needs may be a better use of any time I devote here.

I have negligible interest personally in continuing to work on manipulating a database remotely at this time. When things get frustrating volunteers are mobile.
-----Original Message----- From: Tutor On Behalf Of Manprit Singh Sent: Sunday, July 3, 2022 9:01 AM To: tutor at python.org Subject: Re: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database Sir, I am just going through all the functionalities available in sqlite3 module , just to see if I can use sqlite3 as a good data analysis tool or not . Upto this point I have figured out that and sqlite data base file can be an excellent replacement for data stored in files . You can preserve data in a structured form, email to someone who need it etc etc . But for good data analysis ....I found pandas is superior . I use pandas for data analysis and visualization . Btw ....this is true . You should use right tool for your task . Regards Manprit Singh On Sun, 3 Jul, 2022, 18:02 Alan Gauld via Tutor, wrote: > On 03/07/2022 03:49, Manprit Singh wrote: > > > con.create_aggregate("stddev", 1, StdDev) cur.execute("select > > stddev(X1), stddev(X2) from table1") > > I just wanted to say thanks for posting this. I have never used, nor > seen anyone else use, the ability to create a user defined aggregate > function in SQLite - usually I just extract the data into python and > use python to do the aggregation. But your question made me read up on > how that all worked so it has taught me something new. > (It also makes me appreciate how the Pyhon API is much easier to use > than the raw C API to SQLite!) > > > My question is, as you can see i have used list inside the class > > StdDev, > which > > I think is an inefficient way to do this kind of problem because > > there > may be > > a large number of values in a column and it can take a huge amount > > of > memory. > > Can this problem be solved with the use of iterators ? What would be > > the > best > > approach to do it ? > > If I'm working with so much data that this would be a problem I'd use > the database itself to store the intermediate data. 
That would be much > slower but much less memory dependant. But as others have said, with > aggregate functions you don't usually need to store data from all rows > you just store a few inermediate results which you combine at the end. > > If you are trying to use an in-memory function - like the stddev > function here - then you need to fit all the data in memory anyway so > the function will simply not work if you can't store the data in RAM. > In that case you need to find(or write) another function that doesn't > use memory for storage or uses less storage. > > It is also worth pointing out that most industrial strength SQL > databases come with a far richer set of aggregate functions than > SQLite. So if you do have to work with large volumes of data you > should probably switch to someting like Oracle, DB2, SQLServer(*), etc > and just use the functions built into the server. If they don't have > such a function they also have amuch simpler way of defining stored > procedures. As ever, choose the appropriate tool for the job. > > (*)These are just the ones I know, I assume MySql, Postgres etc have > similarly broad libraries. 
> > -- > Alan G > Author of the Learn to Program web site http://www.alan-g.me.uk/ > http://www.amazon.com/author/alan_gauld > Follow my photo-blog on Flickr at: > http://www.flickr.com/photos/alangauldphotos > > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 4 00:24:51 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 4 Jul 2022 00:24:51 -0400 Subject: [Tutor] Consecutive_zeros In-Reply-To: References: Message-ID: <005c01d88f5e$01937b50$04ba71f0$@gmail.com> I read through all the suggestions sent in to date and some seem a bit beyond the purpose of the exercise. Why not go all the way and use a regular expression to return all matches of "0+" and then choose the length of the longest one! But seriously, the problem is actually fairly simple as long as you start by stating the maximum found so far at the start is 0 long. If no longer one is found, then the answer will be 0. And no need to play with lists. Keep a counter. Start at 0. When you see a 1, if the current count is greater that the current maximum, reset the maximum. Either way reset the count to zero. When you see a zero, increment the count. And critically, when you reach the end, check the current count and if needed increment the maximum. This is an exercise in a fairly small state machine with just a few states as in a marking automaton or Turing machine. Is there a guarantee for the purposes of this assignment that there is nothing else in the string and that it terminates? 
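
A minimal sketch of the counter approach described above might look like the following. The function name longest_zero_run is invented for this illustration (it is not the name used in the assignment), and treating every non-"0" character as the end of a run is an assumption.

```python
def longest_zero_run(text):
    """Return the length of the longest run of "0" characters in text."""
    longest = 0   # best run seen so far; stays 0 if there are no zeros
    current = 0   # length of the run being counted right now
    for ch in text:
        if ch == "0":
            current += 1
        else:
            # any other character ends the current run
            longest = max(longest, current)
            current = 0
    # a run that reaches the end of the string has not been compared yet
    return max(longest, current)


print(longest_zero_run("0"))        # 1
print(longest_zero_run("1111"))     # 0
print(longest_zero_run("0100100"))  # 2
```

The final max() after the loop is the step the original function was missing: without it, a run of zeros at the very end of the string is never counted.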
If there may be other values or it terminates in a NULL of some kind, you may have to adjust the algorithm in one of many ways but I suspect the assignment is straightforward. JUST FOR FUN --- DO NOT USE THIS: import re def consecutive_zeros(string): if (matches := re.findall("0+", string)) == [] : return 0 else: return (max([len(it) for it in matches])) print(consecutive_zeros("1010010001000010000001")) #6 print(consecutive_zeros("00000000000000000000000000")) #26 print(consecutive_zeros("1111")) #0 print(consecutive_zeros("begin1x00x110001END")) #3 print(consecutive_zeros("Boy did you pick the wrong string!")) #0 prints out: 6 26 0 3 0 REPEAT: This is not a valid way for a normal assignment and would not have been shown if I had not just finished a huge tome on using regular expressions for everything imaginable and with very different implementations in all kinds of programs and environments. But it is a tad creative if also wasteful and does handle some edge cases. And note this may not work in older versions of python as it uses the walrus operator. That could easily be avoided with slightly longer code or different code but it does present a different viewpoint on what a stretch of zeroes means. -----Original Message----- From: Tutor On Behalf Of Anirudh Tamsekar Sent: Sunday, July 3, 2022 5:06 PM To: tutor at python.org Subject: [Tutor] Consecutive_zeros Hello All, Any help on this function below is highly appreciated. Goal: analyze a binary string consisting of only zeros and ones. Your code should find the biggest number of consecutive zeros in the string. For example, given the string: Its failing on below test case print(consecutive_zeros("0")) It should return 1. 
Returns 0 I get the max(length) as 1, if I print it separately def consecutive_zeros(string): zeros = [] length = [] result = 0 for i in string: if i == "0": zeros.append(i) else: length.append(len(zeros)) zeros.clear() result = max(length) return result -Thanks, Anirudh Tamsekar _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 4 01:20:45 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 4 Jul 2022 01:20:45 -0400 Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database References: <015201d88e91$13d44470$3b7ccd50$@gmail.com> <007001d88ef3$3e032420$ba096c60$@gmail.com> Message-ID: <009601d88f65$d0403f90$70c0beb0$@gmail.com> Manprit, Did you just violate your condition to not keep the entire list of results in memory with your use of pandas? Yes, what you want is straightforward when you are not trying to do it some other way. Your earlier comment though suggested you were more interested in just using a database as some kind of portable format. Pandas can equally trivially open a CSV file which strikes me as even more portable. Other formats with more flexibility that you can bring in fairly portably can be in JSON format or XML and so on. No reason not to use a database but a great strength of a database is that it can be easy to do much more complex queries such as only getting the data selectively such as the numbers for group A or those with dates in some range. You can of course do the same thing with data you load into pandas or numpy or any other format, but would need to do it within python. As mentioned earlier, if the size of memory is a concern, then asking the database to calculate things like a standard deviation simply pushes the memory usage elsewhere. 
If it is inside your own computer, that does not sound like serious savings unless SQLLITE uses a method that does not keep everything in memory. -----Original Message----- From: Tutor On Behalf Of Manprit Singh Sent: Sunday, July 3, 2022 9:44 PM To: tutor at python.org Subject: Re: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database Dear Sir, In Pandas, handling an sql query is so simple as given below: import sqlite3 import pandas as pd con = sqlite3.connect(":memory:") cur = con.cursor() cur.execute("create table test(i, j)") ls = [(2, 4), (3, 5), (4, 2), (7, 9)] cur.executemany("insert into test(i, j) values (?, ?)", ls) pd.read_sql("select i, j from test", con).std(ddof=0) will give the desired result: i 1.870829 j 2.549510 dtype: float64 On Mon, Jul 4, 2022 at 2:13 AM Dennis Lee Bieber wrote: > On Sun, 3 Jul 2022 22:29:41 +0530, Manprit Singh > declaimed the following: > > >Now as sum is an aggregate function, same way population standard > >deviation is also an aggregate function. We should be able to make a > >user defined function > > > > It is also superfluous: SQLite3 already has count(), sum() and > even > avg() built-in (though it lacks many of the bigger statistical > computations > -- variance, std. dev, covariance, correlation, linear regression -- > that many of the bigger client/server RDBMs support). > > > > >for getting mean.I am not getting the mechanism to subtract the mean > >from each value of the column in the same step method or by any other > >way > > > Note that the definition for creating aggregates includes > something for number of arguments. Figure out how to specify multiple > arguments and you might be able to have SQLite3 provide "current item" > and "mean" (avg) to the step() method. 
I'm not going to take the time > to experiment (for the most part, I'd consider it simpler to just grab > the entire dataset from the database, and run the number crunching in > Python, rather than the overhead of having SQLite3 invoke a Python > "callback" method for each item, just to be able to have the SQLite3 > return a single computed value. > > > >Btw I would like to write a one liner to calculate population std > >deviation of a list: > >lst = [2, 5, 7, 9, 10] > >mean = sum(lst)/len(lst) > >std_dev = (sum((ele-mean)**2 for ele in lst)/len(lst))**0.5 > >print(std_dev) > > Literally, except for the imports, that is just... > > print(statistics.pstdev(lst)) > > >>> import math as m > >>> import statistics as s > >>> lst = [2, 5, 7, 9, 10] > >>> print(s.pstdev(lst)) > 2.870540018881465 > >>> > > Going up a level in complexity (IE -- not using the imported > pstdev()) > > >>> print("Population Std. Dev.: %s" % m.sqrt( s.mean( (ele - > >>> s.mean(lst)) > ** 2 for ele in lst))) > Population Std. Dev.: 2.870540018881465 > >>> > > This has the problem that it invokes mean(lst) for each > element, so may be slower for large data sets (that problem will also > exist if you manage a multi-argument step() for SQLite3). > > Anytime you have > > sum(equation-with-elements-of-data) / len(data) > > you can replace it with just > > mean(equation...) 
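A minimal sketch of the point quoted above, assuming only the standard library: computing the mean once, outside the generator, avoids calling mean(lst) for every element, and wrapping the squared deviations in statistics.mean() replaces the sum(...)/len(...) pattern. The helper name pstdev_by_hand is made up for this illustration and is not from the thread.

```python
import math
import statistics


def pstdev_by_hand(data):
    """Population standard deviation, computing the mean only once."""
    data = list(data)             # allow any iterable; we need two passes
    mu = statistics.mean(data)    # hoisted out of the generator below
    # mean(...) of the squared deviations replaces sum(...)/len(...)
    return math.sqrt(statistics.mean((x - mu) ** 2 for x in data))


lst = [2, 5, 7, 9, 10]
print(pstdev_by_hand(lst))        # should agree with statistics.pstdev(lst)
print(statistics.pstdev(lst))
```

This still keeps the whole list in memory, so it only addresses the repeated mean() calls, not the storage question raised earlier in the thread.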
> > > -- > Wulfraed Dennis Lee Bieber AF6VN > wlfraed at ix.netcom.com > http://wlfraed.microdiversity.freeddns.org/ > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From lawrencefdunn at gmail.com Sun Jul 3 20:36:50 2022 From: lawrencefdunn at gmail.com (Lawrence Dunn) Date: Sun, 3 Jul 2022 19:36:50 -0500 Subject: [Tutor] (no subject) In-Reply-To: References: Message-ID: Just wondering, is it the M1 Scilicon ? On Sun, Jul 3, 2022 at 7:09 PM Hilary Carris via Tutor wrote: > I am unable to get my python correctly loaded onto my Mac. I have to have > a 3.9 version for a class and can not get it to function. > Can you help me? > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From alan.gauld at yahoo.co.uk Mon Jul 4 06:46:02 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Mon, 4 Jul 2022 11:46:02 +0100 Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database In-Reply-To: References: Message-ID: On 03/07/2022 14:01, Manprit Singh wrote: > Sir, > I am just going through all the functionalities available in sqlite3 module > , just to see if I can use sqlite3 as a good data analysis tool or not . SQLite is a good storage and retrieval system. It's not aimed at data analysis, thats where tools like Pandas and R come into play. SQLite will do a better job in pulling out specific subsets of data and of organising your data with relationships etc. 
But it makes no attempt to be a fully featured application environment (unlike the bigger client/server databases like Oracle or DB2) > Upto this point I have figured out that and sqlite data base file can be an > excellent replacement for data stored in files . > > You can preserve data in a structured form, email to someone who need it > etc etc . Yes, that is its strong point. Everything is stored in a single file that can be easily shared by email or by storing it on a cloud server. > But for good data analysis ....I found pandas is superior . I use pandas > for data analysis and visualization . And that's good because that is what Pandas (and SciPy in general) is designed for. > > Btw ....this is true . You should use right tool for your task . Absolutely. One of the key skills of a software engineer is recognising which tools are best suited to which part of the task and how to glue them together. There is no universally best tool. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From manpritsinghece at gmail.com Mon Jul 4 13:14:29 2022 From: manpritsinghece at gmail.com (Manprit Singh) Date: Mon, 4 Jul 2022 22:44:29 +0530 Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database In-Reply-To: References: Message-ID: Dear Sir, Finally I came up with a solution which seems more good to me, rather than using the previous approach. In this solution I have used shortcut method for calculating the standard deviation. 
import sqlite3 class StdDev: def __init__(self): self.cnt = 0 self.sumx = 0 self.sumsqrx = 0 def step(self, x): self.cnt += 1 self.sumx += x self.sumsqrx += x**2 def finalize(self): return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5 conn = sqlite3.connect(":memory:") cur = conn.cursor() cur.execute("create table table1(X1 int, X2 int)") ls = [(2, 5), (3, 7), (4, 2), (5, 1), (8, 6)] cur.executemany("insert into table1 values(?, ?)", ls) conn.commit() conn.create_aggregate("stdev", 1, StdDev) std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1") print(std_dev) cur.close() conn.close() gives output (2.0591260281974, 2.315167380558045) That's all. This is what I was looking for .So what will be the best solution to this problem ? This one or the previous one posted by me ? The whole credit goes to Dennis lee bieber & avi.e.gross at gmail.com Regards Manprit Singh On Mon, Jul 4, 2022 at 4:17 PM Alan Gauld via Tutor wrote: > On 03/07/2022 14:01, Manprit Singh wrote: > > Sir, > > I am just going through all the functionalities available in sqlite3 > module > > , just to see if I can use sqlite3 as a good data analysis tool or not . > > SQLite is a good storage and retrieval system. It's not aimed at data > analysis, thats where tools like Pandas and R come into play. > > SQLite will do a better job in pulling out specific subsets of > data and of organising your data with relationships etc. But it > makes no attempt to be a fully featured application environment > (unlike the bigger client/server databases like Oracle or DB2) > > > Upto this point I have figured out that and sqlite data base file can be > an > > excellent replacement for data stored in files . > > > > You can preserve data in a structured form, email to someone who need it > > etc etc . > > Yes, that is its strong point. Everything is stored in a single file > that can be easily shared by email or by storing it on a cloud server. 
> > > But for good data analysis ....I found pandas is superior . I use pandas > > for data analysis and visualization . > > And that's good because that is what Pandas (and SciPy in general) > is designed for. > > > > Btw ....this is true . You should use right tool for your task . > > Absolutely. One of the key skills of a software engineer is > recognising which tools are best suited to which part of the > task and how to glue them together. > There is no universally best tool. > > -- > Alan G > Author of the Learn to Program web site > http://www.alan-g.me.uk/ > http://www.amazon.com/author/alan_gauld > Follow my photo-blog on Flickr at: > http://www.flickr.com/photos/alangauldphotos > > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From avi.e.gross at gmail.com Mon Jul 4 12:49:20 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 4 Jul 2022 12:49:20 -0400 Subject: [Tutor] Consecutive_zeros In-Reply-To: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de> References: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de> Message-ID: <504401d88fc6$02af9e70$080edb50$@gmail.com> Thanks for reminding me Peter. This may be a bit off topic for the original question if it involved learning how to apply simple algorithms in Python but is a good exercise as well for finding ways to use various built-in tools perhaps in a more abstract way. What Peter did was a nice use of the built-in split() function which does indeed allow the removal of an arbitrary number of ones but, as he notes, his solution, as written, depends on there not being anything except zeroes and ones in the string. 
So I went back to my previous somewhat joking suggestion and thought of a way of shortening it as the "re" module also has a regular-expression version called re.split() that has the nice side effect of including an empty string when I ask it to split on all runs of non-zero. re.split("[^0]", "1010010001000010000001") ['', '0', '00', '000', '0000', '000000', ''] That '' at the end of the resulting list takes care of the edge condition for a string with no zeroes at all: re.split("[^0]", "No zeroes") ['', '', '', '', '', '', '', '', '', ''] Yes, lots of empty string but when you hand something like that to get lengths, you get at least one zero which handles always getting a number, unlike my earlier offering which needed to check if it matched anything. So here is a tad shorter and perhaps more direct version of the requested function which is in effect a one-liner: def consecutive_zeros(string): return(max([len(zs) for zs in re.split("[^0]", string)])) The full code with testing would be: import re def consecutive_zeros(string): return(max([len(zs) for zs in re.split("[^0]", string)])) def testink(string, expected): print(str(consecutive_zeros(string)) + ": " + string + "\n" + str(expected) + ": expected\n") print("testing the function and showing what was expected:\n") testink("1010010001000010000001", 6) testink("00000000000000000000000000", 26) testink("1111", 0) testink("begin1x00x110001END", 3) testink("Boy did you pick the wrong string!", 0) The output is: testing the function and showing what was expected: 6: 1010010001000010000001 6: expected 26: 00000000000000000000000000 26: expected 0: 1111 0: expected 3: begin1x00x110001END 3: expected 0: Boy did you pick the wrong string! 
0: expected -----Original Message----- From: Tutor On Behalf Of Peter Otten Sent: Monday, July 4, 2022 3:51 AM To: tutor at python.org Subject: Re: [Tutor] Consecutive_zeros On 03/07/2022 23:05, Anirudh Tamsekar wrote: > Hello All, > > Any help on this function below is highly appreciated. > Goal: analyze a binary string consisting of only zeros and ones. Your > code should find the biggest number of consecutive zeros in the string. > > For example, given the string: > Its failing on below test case In case you haven't already fixed your function here's a hint that is a bit more practical than what already has been said. > > print(consecutive_zeros("0")) > It should return 1. Returns 0 > > I get the max(length) as 1, if I print it separately > > > def consecutive_zeros(string): > zeros = [] > length = [] > result = 0 > for i in string: > if i == "0": > zeros.append(i) else: > length.append(len(zeros)) > zeros.clear() > result = max(length) At this point in the execution of your function what does zeros look like for the succeeding cases, and what does it look like for the failing ones? Add a print(...) call if you aren't sure and run consecutive_zeros() for examples with trailing ones, trailing runs of zeros that have or don't have the maximum length for that string. How can you bring result up-to-date? > return result PS: Because homework problems are often simpler than what comes up in the "real world" some programmers tend to come up with solutions that are less robust or general. In that spirit I can't help but suggest >>> max(map(len, "011110000110001000001111100".split("1"))) 5 which may also be written as >>> max(len(s) for s in "011110000110001000001111100".split("1")) 5 Can you figure out how this works? What will happen if there is a character other than "1" or "0"in the string? 
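For anyone following the thread who wants to check their own fix against something, here is a minimal sketch of the counter-based approach described earlier (keep a running count, and remember to compare once more after the loop ends). It is one possible solution, not necessarily the one the exercise expects, and it treats every character other than "0" as ending a run.

```python
def consecutive_zeros(string):
    """Return the length of the longest run of "0" characters in string."""
    longest = 0   # longest completed run seen so far
    current = 0   # length of the run currently being scanned
    for ch in string:
        if ch == "0":
            current += 1
        else:
            # Any other character ends the current run.
            longest = max(longest, current)
            current = 0
    # The string may end in the middle of a run, so compare once more.
    return max(longest, current)


print(consecutive_zeros("0"))                            # 1
print(consecutive_zeros("011110000110001000001111100"))  # 5
print(consecutive_zeros("1111"))                         # 0
```

The final max() after the loop is the step the original function was missing: a trailing run of zeros was never compared against the best length found so far.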
From sjeik_appie at hotmail.com Mon Jul 4 18:31:07 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Tue, 05 Jul 2022 00:31:07 +0200
Subject: [Tutor] __debug__ and PYTHONOPTIMIZE
Message-ID:

Hi,

I am using PYTHONOPTIMIZE=1 (equivalent to start-up option "-O"). As I understood, assert statements are "compiled away" then. But how about __debug__? Does it merely evaluate to False when using "-O"? Or is it also really gone? More generally, how do I decompile a .pyc or .pyo file to verify this? With the "dis" module? This is a relevant code snippet:

for record in billionrecords:
    if __debug__:  # evaluating this still takes some time!
        logger.debug(record)

Thanks!
Albert-Jan

From __peter__ at web.de Tue Jul 5 03:20:41 2022
From: __peter__ at web.de (Peter Otten)
Date: Tue, 5 Jul 2022 09:20:41 +0200
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References:
Message-ID:

On 04/07/2022 19:14, Manprit Singh wrote:
> Dear Sir,
>
> Finally I came up with a solution which seems more good to me, rather than
> using the previous approach. In this solution I have used shortcut method
> for calculating the standard deviation.
> > import sqlite3 > > class StdDev: > > def __init__(self): > self.cnt = 0 > self.sumx = 0 > self.sumsqrx = 0 > > def step(self, x): > self.cnt += 1 > self.sumx += x > self.sumsqrx += x**2 > > def finalize(self): > return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5 > > conn = sqlite3.connect(":memory:") > cur = conn.cursor() > cur.execute("create table table1(X1 int, X2 int)") > ls = [(2, 5), > (3, 7), > (4, 2), > (5, 1), > (8, 6)] > cur.executemany("insert into table1 values(?, ?)", ls) > conn.commit() > > conn.create_aggregate("stdev", 1, StdDev) > std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1") > print(std_dev) > cur.close() > conn.close() > > > gives output > > (2.0591260281974, 2.315167380558045) > > That's all. This is what I was looking for .So what will be the best > solution to this problem ? This one or the previous one posted by me ? As always -- it depends. I believe the numerical error for the above algorithm tends to be much higher than for the one used in the statistics module. I'd have to google for the details, though, and I am lazy enough to leave that up to you. > The whole credit goes to Dennis lee bieber & avi.e.gross at gmail.com I think I mentioned it first ;) From __peter__ at web.de Tue Jul 5 03:55:11 2022 From: __peter__ at web.de (Peter Otten) Date: Tue, 5 Jul 2022 09:55:11 +0200 Subject: [Tutor] Consecutive_zeros In-Reply-To: <504401d88fc6$02af9e70$080edb50$@gmail.com> References: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de> <504401d88fc6$02af9e70$080edb50$@gmail.com> Message-ID: <6adf2afe-475d-2fc7-6051-867ceddd2015@web.de> On 04/07/2022 18:49, avi.e.gross at gmail.com wrote: > So I went back to my previous somewhat joking suggestion and thought of a > way of shortening it as the "re" module also has a regular-expression > version called re.split() that has the nice side effect of including an > empty string when I ask it to split on all runs of non-zero. 
> > re.split("[^0]", "1010010001000010000001") > ['', '0', '00', '000', '0000', '000000', ''] > > That '' at the end of the resulting list takes care of the edge condition > for a string with no zeroes at all: > > re.split("[^0]", "No zeroes") > ['', '', '', '', '', '', '', '', '', ''] > > Yes, lots of empty string but when you hand something like that to get > lengths, you get at least one zero which handles always getting a number, > unlike my earlier offering which needed to check if it matched anything. > > So here is a tad shorter and perhaps more direct version of the requested > function which is in effect a one-liner: > > def consecutive_zeros(string): > return(max([len(zs) for zs in re.split("[^0]", string)])) Python's developers like to tinker as much as its users -- but they disguise it as adding syntactic sugar or useful features ;) One of these features is a default value for max() which allows you to stick with re.findall() while following the cult of the one-liner: >>> max(map(len, re.findall("0+", "1010010001000010000001")), default=-1) 6 >>> max(map(len, re.findall("0+", "ham spam")), default=-1) -1 From __peter__ at web.de Tue Jul 5 04:07:16 2022 From: __peter__ at web.de (Peter Otten) Date: Tue, 5 Jul 2022 10:07:16 +0200 Subject: [Tutor] __debug__ and PYTHONOPTIMIZE In-Reply-To: References: Message-ID: <505052f8-6697-c275-8124-f4f7809a4291@web.de> On 05/07/2022 00:31, Albert-Jan Roskam wrote: > Hi, > I am using PYTHONOPTIMIZE=1 (equivalent to start-up option "-o"). As I > understood, assert statements are "compiled away" then. But how about > __debug__. Does it merely evaluate to False when using "-o"? Or is it also > really gone? Why so complicated? Just try and see: PS C:\> py -c "print(__debug__)" True PS C:\> py -Oc "print(__debug__)" False More generally; how do I decompile a .pyc or .pyo file to > verify this? With the "dis" module? This is a relevant code snippet: > for record in billionrecords: > ? ? if __debug__:? 
#evaluating this still takes some time!
>         logger.debug(record)
> Thanks!

Again: stop worrying and start experimenting ;)

PS C:\tmp> type tmp.py
def f():
    if __debug__: print(42)
    assert False

PS C:\tmp> py -O
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:04:37) [MSC v.1929 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tmp, dis
>>> dis.dis(tmp.f)
  3           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE
>>> ^Z

PS C:\tmp> py
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:04:37) [MSC v.1929 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tmp, dis
>>> dis.dis(tmp.f)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 (42)
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_CONST               2 (False)
             10 POP_JUMP_IF_TRUE        16
             12 LOAD_ASSERTION_ERROR
             14 RAISE_VARARGS            1
        >>   16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

From sjeik_appie at hotmail.com Tue Jul 5 08:59:18 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Tue, 05 Jul 2022 14:59:18 +0200
Subject: [Tutor] __debug__ and PYTHONOPTIMIZE
In-Reply-To: <505052f8-6697-c275-8124-f4f7809a4291@web.de>
Message-ID:

On Jul 5, 2022 10:07, Peter Otten <__peter__ at web.de> wrote:

On 05/07/2022 00:31, Albert-Jan Roskam wrote:
> Hi,
> I am using PYTHONOPTIMIZE=1 (equivalent to start-up option "-O"). As I
> understood, assert statements are "compiled away" then. But how about
> __debug__. Does it merely evaluate to False when using "-O"? Or is it also
> really gone?

Why so complicated? Just try and see:

PS C:\> py -c "print(__debug__)"
True
PS C:\> py -Oc "print(__debug__)"
False

> More generally; how do I decompile a .pyc or .pyo file to
> verify this? With the "dis" module? This is a relevant code snippet:
> for record in billionrecords:
>     if __debug__:  #evaluating this still takes some time!
>         logger.debug(record)
> Thanks!
Again: stop worrying and start experimenting ;)

PS C:\tmp> type tmp.py
def f():
    if __debug__: print(42)
    assert False

PS C:\tmp> py -O
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:04:37) [MSC v.1929 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tmp, dis
>>> dis.dis(tmp.f)
  3           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE
>>> ^Z

PS C:\tmp> py
Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:04:37) [MSC v.1929 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tmp, dis
>>> dis.dis(tmp.f)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 (42)
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_CONST               2 (False)
             10 POP_JUMP_IF_TRUE        16
             12 LOAD_ASSERTION_ERROR
             14 RAISE_VARARGS            1
        >>   16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

======

Thanks Peter, that makes sense. It was half past midnight when I sent that mail from my phone. But I agree I should have tried a bit first.

Best wishes,
Albert-Jan

From manpritsinghece at gmail.com Tue Jul 5 10:32:18 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Tue, 5 Jul 2022 20:02:18 +0530
Subject: [Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
In-Reply-To:
References:
Message-ID:

Dear Sir,

Actually it started with how to write an aggregate function in Python for sqlite3. Of that I have got a fair idea now. A task can be done in many ways, and each way has its own merits and demerits; I have got a fair idea of this too with this example. This series of mails has taught me a lot of things. I am thankful to wonderful people like Dennis Lee Bieber, Peter Otten and avi.e.gross at gmail.com.
Regards Manprit Singh On Tue, Jul 5, 2022 at 1:10 PM Peter Otten <__peter__ at web.de> wrote: > On 04/07/2022 19:14, Manprit Singh wrote: > > Dear Sir, > > > > Finally I came up with a solution which seems more good to me, rather > than > > using the previous approach. In this solution I have used shortcut method > > for calculating the standard deviation. > > > > import sqlite3 > > > > class StdDev: > > > > def __init__(self): > > self.cnt = 0 > > self.sumx = 0 > > self.sumsqrx = 0 > > > > def step(self, x): > > self.cnt += 1 > > self.sumx += x > > self.sumsqrx += x**2 > > > > def finalize(self): > > return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5 > > > > conn = sqlite3.connect(":memory:") > > cur = conn.cursor() > > cur.execute("create table table1(X1 int, X2 int)") > > ls = [(2, 5), > > (3, 7), > > (4, 2), > > (5, 1), > > (8, 6)] > > cur.executemany("insert into table1 values(?, ?)", ls) > > conn.commit() > > > > conn.create_aggregate("stdev", 1, StdDev) > > std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1") > > print(std_dev) > > cur.close() > > conn.close() > > > > > > gives output > > > > (2.0591260281974, 2.315167380558045) > > > > That's all. This is what I was looking for .So what will be the best > > solution to this problem ? This one or the previous one posted by me ? > > As always -- it depends. I believe the numerical error for the above > algorithm tends to be much higher than for the one used in the > statistics module. I'd have to google for the details, though, and I am > lazy enough to leave that up to you. 
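On the numerical-error remark quoted above: a commonly cited alternative to the sum-of-squares shortcut is Welford's online algorithm, which also needs only a few running values per column. What follows is a hedged sketch, not the method used by the statistics module; the class name WelfordStdDev is made up for this illustration, and skipping NULL values and returning None for an empty group are assumptions about the desired behaviour.

```python
import sqlite3


class WelfordStdDev:
    """Population standard deviation via Welford's running update."""

    def __init__(self):
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def step(self, x):
        if x is None:     # assumption: ignore SQL NULLs
            return
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def finalize(self):
        if self.n == 0:   # assumption: no rows gives NULL
            return None
        return (self.m2 / self.n) ** 0.5


conn = sqlite3.connect(":memory:")
conn.execute("create table table1(X1 int, X2 int)")
conn.executemany("insert into table1 values(?, ?)",
                 [(2, 5), (3, 7), (4, 2), (5, 1), (8, 6)])
conn.create_aggregate("stdev", 1, WelfordStdDev)
row = conn.execute("select stdev(X1), stdev(X2) from table1").fetchone()
print(row)   # roughly (2.059, 2.315), matching the figures posted above
conn.close()
```

For small integer data like this the two approaches give the same answer to many digits; the difference is expected to show up mainly when the values are large relative to their spread.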
> > > The whole credit goes to Dennis lee bieber & avi.e.gross at gmail.com > > I think I mentioned it first ;) > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From avi.e.gross at gmail.com Tue Jul 5 12:43:09 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Tue, 5 Jul 2022 12:43:09 -0400 Subject: [Tutor] Consecutive_zeros In-Reply-To: <6adf2afe-475d-2fc7-6051-867ceddd2015@web.de> References: <9a041b69-9a71-098b-ee8f-2da4bcb51d75@web.de> <504401d88fc6$02af9e70$080edb50$@gmail.com> <6adf2afe-475d-2fc7-6051-867ceddd2015@web.de> Message-ID: <006601d8908e$50026680$f0073380$@gmail.com> Peter, It may be tinkering but is a useful feature. Had I bothered to find the manual page for max() and seen a way to specify a default of zero, indeed I would have been quite happy. There are many scenarios where you could argue a feature is syntactic sugar but when something is done commonly and can result in a bad result, it can be quite nice to have a way of avoiding it with lots of code. For example, I have seen lots of functions that can be told to remove NA values in R, or not to drop a dimension when consolidating info so the result has a know size even if not an optimal size. In this case, I would think it to be a BUG in the implementation of max() for it to return an error like this: max([]) Traceback (most recent call last): File "", line 1, in max([]) ValueError: max() arg is an empty sequence Basically it was given nothing and mathematically the maximum and minimum and many other such functions are meaningless when given nothing. So obviously a good approach is to test what you are working on before trying such a call, or wrapping it in something like a try(...) and catch the error and deal with it then. 
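The two defensive styles just mentioned (test first, or catch the error) might look like the sketch below. In Python the construct is try/except rather than try(...); the function names are invented for this illustration.

```python
import re


def longest_zero_run_checked(string):
    """Test for an empty result before calling max()."""
    runs = re.findall("0+", string)
    if not runs:          # nothing matched, so there is no run at all
        return 0
    return max(len(run) for run in runs)


def longest_zero_run_caught(string):
    """Call max() anyway and handle the empty-sequence error."""
    try:
        return max(len(run) for run in re.findall("0+", string))
    except ValueError:    # raised by max() on an empty sequence
        return 0


print(longest_zero_run_checked("1010010001"))  # 3
print(longest_zero_run_caught("ham spam"))     # 0
```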
Those are valid approaches BUT we have many scenarios like the one being solved that have a concept that zero is a floor for counting numbers and is appropriate also as a description of the length of a pattern that does not exist. In this case, an empty list still means no instances of the pattern where found so the longest pattern is 0 units long. So it is perfectly reasonable to be able to add a default so the function does not blow up the program: max([], default=0) 0 I hasten to add that in many situations this approach is not valid and the code should blow up unless you take steps to avoid it as further work can be meaningless. I am curious how you feel about various other areas such as defaults when fetching an item from a dictionary. Yes, you can write elaborate code that prevents a mishap. Realistically, some people would create a slew of short functions like safe_max() that probably would use one of the techniques internally and return something more gently, so what is wrong with making a small change in max() itself that does this upon request? I know max() was not necessarily designed with regular expression matches (and their length) in mind. But many real world problems share the same concept albeit not always with a default of 0. -----Original Message----- From: Tutor On Behalf Of Peter Otten Sent: Tuesday, July 5, 2022 3:55 AM To: tutor at python.org Subject: Re: [Tutor] Consecutive_zeros On 04/07/2022 18:49, avi.e.gross at gmail.com wrote: > So I went back to my previous somewhat joking suggestion and thought > of a way of shortening it as the "re" module also has a > regular-expression version called re.split() that has the nice side > effect of including an empty string when I ask it to split on all runs of non-zero. 
> > re.split("[^0]", "1010010001000010000001") ['', '0', '00', '000', > '0000', '000000', ''] > > That '' at the end of the resulting list takes care of the edge > condition for a string with no zeroes at all: > > re.split("[^0]", "No zeroes") > ['', '', '', '', '', '', '', '', '', ''] > > Yes, lots of empty string but when you hand something like that to get > lengths, you get at least one zero which handles always getting a > number, unlike my earlier offering which needed to check if it matched anything. > > So here is a tad shorter and perhaps more direct version of the > requested function which is in effect a one-liner: > > def consecutive_zeros(string): > return(max([len(zs) for zs in re.split("[^0]", string)])) Python's developers like to tinker as much as its users -- but they disguise it as adding syntactic sugar or useful features ;) One of these features is a default value for max() which allows you to stick with re.findall() while following the cult of the one-liner: >>> max(map(len, re.findall("0+", "1010010001000010000001")), default=-1) 6 >>> max(map(len, re.findall("0+", "ham spam")), default=-1) -1 _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Tue Jul 5 22:31:11 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Tue, 5 Jul 2022 22:31:11 -0400 Subject: [Tutor] this group and one liners Message-ID: <008a01d890e0$756336a0$6029a3e0$@gmail.com> Peter mentioned the cult of the one-liner and I admit I am sometimes one of the worshippers. But for teaching a relatively beginner computer class, which in theory includes the purpose of this group to provide hints and some help to students, or help them when they get stuck even on a more advanced project, the goal often is to do things in a way that makes an algorithm clear, often using lots of shorter lines of code. 
Few one-liners of any complexity are trivial to understand, and they often consist of highly nested code. As an example, lots of problems could look like f(g(h(arg, arg, arg, i(j(arg), arg)))) and I may not have matched my parentheses well. Throw in some pythonic constructs like [z**2 for z in ...] and it may take a while for a newcomer even to parse it, let alone debug it. And worse, much such code almost has to be read from inside to outside.

What is wrong with code like:

Var1 = p(...)
Var2 = q(Var1, ...)
...

I mean sure, there are lots of extra variables sitting around, maybe some if statement with an else and maybe some visible loops. But they show how the problem is being approached and perhaps allow some strategic debugging or testing with print statements and the like.

An interesting variation I appreciate in languages with some form of pipeline is to write the algorithm forwards rather than nested. Again, this is easier for some to comprehend. I mean using pseudocode, something like:

Do_this(args) piped to
Do_that(above, args) piped to
Do_some_more(above, args) placed in
Result

I do this all the time in R with code like:

data.frame |>
  select(which columns to keep) |>
  filter(rows with some condition) |>
  mutate(make some new columns with calculations from existing columns) |>
  group_by(one or more columns) |>
  summarize(apply various calculations by group generating various additional columns in a report) |>
  arrange(sort in various ways) |>
  ggplot(make a graph) + ... + ... -> variable

The above can be a set of changes, one at a time, that build on each other and save the result in a variable (or just print it immediately). You can build this in stages, run just up to some point and test the results, and of course intervene in many other ways. Functionally it is almost the same as using temporary variables, and it can implement an algorithm in bite-sized pieces. The ggplot line is a tad complex as it was created ages ago and has its own pipeline of sorts, as "adding"
another command lets you refine your graph and add layers to it by effectively passing a growing data structure around to be changed.

The point I am making is whether, when some of us do one-liners, we are really helping or just enjoying solving our puzzles.

Of course, if someone asks for a clever or obfuscated method, we can happily oblige.

From learn2program at gmail.com Wed Jul 6 04:21:44 2022
From: learn2program at gmail.com (Alan Gauld)
Date: Wed, 6 Jul 2022 09:21:44 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com>
Message-ID: <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>

On 06/07/2022 03:31, avi.e.gross at gmail.com wrote:
> But for teaching a relatively beginner computer class, ...
> the goal often is to do things in a way that makes an algorithm clear,

Never mind a beginner. The goal should *always* be to make the algorithm clear! I spent many years leading a maintenance team, and we spent many hours disentangling "clever" one-liners. They almost never provided any benefit and just slowed down comprehension and made debugging nearly impossible.

The only thing one-liners do is make the code shorter. But the compiler doesn't care and there are no benefits for short code except a tiny bit of saved storage! (I accept that some ancient interpreters did work slightly faster when interpreting a one liner but I doubt that any such beast is still in production use today!)

> I mean sure, there are lots of extra variables sitting around, maybe some if statement with an else and maybe some visible loops. But they show how the problem is being approached and perhaps allow some strategic debugging or testing with print statements and the like.

They also allow interspersed comments to explain what's being done, and make it easier to insert print statements etc.
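To make the nested-versus-stepwise contrast concrete, here is a small Python sketch based on the consecutive-zeros problem from the earlier thread. The two function names are made up for this illustration, and the default of 0 is an assumption about what "no zeros" should return; it is one way to lay the steps out, not the only one.

```python
import re


def longest_zero_run_nested(text):
    # Everything in one nested expression; it has to be read inside-out.
    return max(map(len, re.findall("0+", text)), default=0)


def longest_zero_run_stepwise(text):
    # The same steps, one per line, each with a name that can be printed.
    runs = re.findall("0+", text)         # every run of consecutive zeros
    lengths = [len(run) for run in runs]  # the length of each run
    # print(runs, lengths)                # uncomment while debugging
    return max(lengths, default=0)
```

Both versions give the same answers; the second simply leaves room for comments and a print statement between the steps.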
> An interesting variation I appreciate in languages with some form of pipeline is to write the algorithm forwards rather than nested. Again, this is easier for some to comprehend. I mean using pseudocode, something like:

Smalltalk programmers do this all the time (in a slightly different way) and it makes code easy to read and debug. And in the Smalltalk case easier for the optimiser to speed up.

> The point I am making is whether when some of us do one-liners, are we really helping or just enjoying solving our puzzles?

In my experience one-liners are nearly always the result of:
a) an ego-trip by the programmer (usually by an "expert")
b) a misguided assumption that short code is somehow better (usually by a beginner)

Very, very occasionally they are an engineering choice because they do in fact give a minor speed improvement inside a critical loop. But that's about 0.1% of the one-liners I've seen!

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From avi.e.gross at gmail.com Wed Jul 6 11:50:26 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Wed, 6 Jul 2022 11:50:26 -0400
Subject: [Tutor] this group and one liners
In-Reply-To: <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk>
Message-ID: <007601d89150$1cf50320$56df0960$@gmail.com>

Alan,

Well said.

I differentiate between code, often what we call one-liners, that is written to impress or to solve a puzzle such as how compact you can make your code, and fairly valid constructs that allow programming to happen at a higher level of abstraction.

As has often been discussed, quite a bit is done in languages like python using constructs like:

[ x+2 for x in iterator ]

In general, all of that can be written with one or more loops.
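As a quick illustration of that equivalence, here is the comprehension next to the loop it stands for. The function names are invented for this sketch; only the expression x+2 comes from the message.

```python
def add_two_comprehension(values):
    # The comprehension form quoted above.
    return [x + 2 for x in values]


def add_two_loop(values):
    # The same result built up with an explicit loop.
    result = []
    for x in values:
        result.append(x + 2)
    return result
```

A newcomer from another language can usually read the second form at once, which is a fair argument for showing it first when teaching.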
Someone who started off programming in other languages may hit a wall reading that code embedded in a one-liner. I have written code with multiple nested constructs like that where I had trouble parsing it just a few days later. Functional programming techniques are another example of replacing a loop construct over multiple lines by an often short construct like do_this(function, iterable), again sometimes nested. You can even find versions that accept a list of functions to apply to a list of arguments either pairwise or more exhaustively. The same arguments apply to the way some people do object-oriented techniques when not everything needs to be an object. So, yes, you can make a fairly complicated object that encapsulates a gigantic amount of code, then use that in a one-liner where everything is done magically. I will say this though. Code that is brief and fits say on one screen, is far easier for many people to read and understand. But that can often be arranged by moving parts of the problem into other functions you create that each may also be easy to read and understand. A one-liner that simply calls one or two such functions with decent design and naming, may not qualify as using the smallest amount of code, but can be easy to read IN PARTS and still feel pleasing. Specifically with some of the code I shared recently, I pointed out that if max() failed for empty lists as an argument, you could simply create a safe_max() function of your own that encapsulates one of several variations and returns a 0 when you pass it something without a maximum. 
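One possible shape for such a helper, as a sketch only. The name safe_max comes from the message above, but the `fallback` parameter and the try/except approach are assumptions chosen for illustration, and the built-in max(items, default=0) does the same job without any helper.

```python
def safe_max(items, fallback=0):
    """Return the largest item, or `fallback` when there are no items."""
    try:
        return max(items)
    except ValueError:  # max() raises ValueError for an empty iterable
        return fallback
```

For example, safe_max([]) gives 0 and safe_max([3, 1, 2]) gives 3.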
BUT, I think it must be carefully pointed out to students that there is a huge difference between using functions guaranteed to exist in any Python program unless shadowed, and those that can be imported from some standard module, and those that a person creates for themselves that may have to be brought in some way each time you use them and that others may have no access to so sharing code using them without including those functions is not a good thing. - Avi
From avi.e.gross at gmail.com Wed Jul 6 13:02:39 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Wed, 6 Jul 2022 13:02:39 -0400 Subject: [Tutor] this group and one liners In-Reply-To: <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> Message-ID: <008b01d8915a$337e77c0$9a7b6740$@gmail.com> The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense. Although that can be true, I think it is reasonable to say that often the exact opposite is true. In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on.
I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length, and then max() processed a longer list with lots of zeroes.

I am not saying the above slowed things down seriously, but that other methods that could solve the problem, though not on one line, were ignored.

The pipelining methods I mentioned vary. Some languages have a variation using object-oriented techniques where you can chain calls by using methods or contents of objects as in:

a.b().c(arg).d

and so on. Some languages encourage writing that in a more obvious pipeline over multiple lines, where you may even be able to intersperse comments. But is that more efficient than doing it line by line? Consider this code:

A = something
B = f(A)
C = g(B)
rm(B)
...

In many pipelines, there are lots of intermediate variables that are referenced once and meant to be deleted. In languages with garbage collection, that may happen eventually. But often the implementation of a pipeline may be such that only after all of it is completed are the parts deleted on purpose, or no longer held onto so that garbage collection can see them as eligible.

I will end by mentioning a language like PERL, where people take things to extremes and beyond. The language encourages you to be so compact that normal people reading it often get lost as they fail to see how it does anything. Without going into great detail, the language maintains all sorts of variables with names like $_ that hold things like the item currently being processed. Many of the verbs that accept arguments will default to using these automatic hidden ones as the target or source of something.
So you can read a line in with something like

my $mystring = <STDIN>;
print "$mystring \n";
chomp($mystring);
$mystring =~ s/[hH]el+o/goodbye/g;
print $mystring;

The above (with possible errors on my part) is supposedly going to read in a line from Standard Input, echo it, remove any trailing newline (chomp changes the variable in place without an explicit assignment), replace every instance of "hello" with "goodbye" and print the result.

A variation that copies the string into a second variable before substituting looks like:

$mystring = <STDIN>;
chomp($mystring);
($mysub = $mystring) =~ s/[hH]el+o/goodbye/g;
print $mysub;

The above waits for a line of text and I can enter something like "Hello my friends and hello Heloise." (without the quotes) and it spits out a result that matches "hello" with "h" either upper or lower case, and any number of copies of the letter ell in a row:

$ perl -w prog.pl
Hello my friends and hello Heloise.
goodbye my friends and goodbye goodbyeise.

Well, yes, I deliberately used a regular expression that matches a bit more than is reasonable. BUT since everything can be hidden in $_, the much shorter and somewhat mysterious code below does the same thing:

$_ = <STDIN>;
chomp;
s/[hH]el+o/goodbye/g;
print;

A line like "chomp;" is interpreted as if I had written "chomp($_)", the substitution as if I had written "$_ =~ s/[hH]el+o/goodbye/g", and the print as "print $_".

And can it be a one liner? The semicolons are there to make something like this work fine:

$_ = <STDIN>; chomp; s/[hH]el+o/goodbye/g; print;

But WHY be so terse and cryptic? As Alan points out, your code may be used and maintained by others.

And the brevity above also comes with a cost of sorts. The PERL interpreter has to maintain a slew of special variables just in case you want them, many with cryptic names like $< and $0, and the other overhead is that many functions have to check their arguments and, when they are missing, supply the hidden ones.
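For readers on this list who do not know Perl, a rough Python counterpart of that substitution might look like the sketch below. The function name replace_hellos is made up here, and str.rstrip("\n") is only an approximation of chomp, since it strips every trailing newline rather than a single one.

```python
import re


def replace_hellos(line):
    # Drop the trailing newline (roughly Perl's chomp), then replace every
    # match of the deliberately greedy pattern from the Perl example.
    return re.sub(r"[hH]el+o", "goodbye", line.rstrip("\n"))
```

Nothing is hidden here: the pattern, the replacement and the string being changed are all named in the one call, which is the point being made about $_.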
For beginning students, I think it wise to start with more standard methods and they can use more advanced techniques, or in my example perhaps more primitive techniques, after they have a firm understanding. And my apologies if I bring in examples from other programming languages. Python has plenty of interesting usages that may seem equally mysterious to some. - s/a.e.gross/Avi/
From sjeik_appie at hotmail.com Thu Jul 7 10:19:44 2022 From: sjeik_appie at hotmail.com (Albert-Jan Roskam) Date: Thu, 07 Jul 2022 16:19:44 +0200 Subject: [Tutor] this group and one liners In-Reply-To: <007601d89150$1cf50320$56df0960$@gmail.com> Message-ID: On Jul 6, 2022 17:50, avi.e.gross at gmail.com As has often been discussed, quite a bit is done in languages like python using constructs like: [ x+2 for x in iterator ] ====== My rule of thumb is: if black converts a list/set/dict comprehension into a multiline expression, I'll probably refactor it. I try to avoid nested loops in them. Nested comprehensions may be ok. Nested ternary expressions are ugly, even more so inside comprehensions!
Specifically with some of the code I shared recently, I pointed out that if max() failed for empty lists as an argument, you could simply create a safe_max() function of your own that encapsulates one of several variations and returns a 0 when you pass it something without a maximum. ==== Or this? max(items or [0]) From mats at wichmann.us Thu Jul 7 11:38:20 2022 From: mats at wichmann.us (Mats Wichmann) Date: Thu, 7 Jul 2022 09:38:20 -0600 Subject: [Tutor] this group and one liners In-Reply-To: <008b01d8915a$337e77c0$9a7b6740$@gmail.com> References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> <008b01d8915a$337e77c0$9a7b6740$@gmail.com> Message-ID: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us> On 7/6/22 11:02, avi.e.gross at gmail.com wrote: Boy, we "old-timers" do get into these lengthy discussions... having read several comments here, of which this is only one: > The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense. > > Although that can be true, I think it is reasonable to say that often the exact opposite is true. > > In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes. After a period when they felt awkward because lots of my programming background was in languages that didn't have these, I now use simple one-liners extensively - comprehensions and ternary expressions. If they start nesting, probably not. Someone mentioned a decent metric - if a code formatter like Black starts breaking your comprehension into four lines, it probably got too complex. 
My take on these is you can write a more compact function this way - you're more likely to have the meat of what's going on right there together in a few lines, rather than building mountain ranges of indentation - and this can actually *improve* readability, not obscure it. I agree "clever" one-liners may be a support burden, but anyone with a reasonable amount of Python competency (which I'd expect of anyone in a position to maintain my code at a later date) should have no trouble recognizing the intent of simple ones.

Sometimes thinking about how to write a concise one-liner exposes a failure to have thought through what you're doing completely. So, unlike what is mentioned above (twisting a problem around unnaturally, and no argument that happens too), you might actually realize that there's a simpler way to structure a step.

Just one more opinion.

From avi.e.gross at gmail.com Thu Jul 7 18:01:13 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Thu, 7 Jul 2022 18:01:13 -0400
Subject: [Tutor] this group and one liners
In-Reply-To:
References: <007601d89150$1cf50320$56df0960$@gmail.com>
Message-ID: <006a01d8924d$133dbd60$39b93820$@gmail.com>

Albert,

It would be a great idea to use:

max(item or 0)

But if item == [] then

max([] or 0)

breaks down with an error and the [] does not evaluate to false.

Many things in python are truthy but on my setup, an empty list does not seem to be evaluated and return false before max() gets to it.

Am I doing anything wrong?

As noted, max has an argument allowed of default=0 perhaps precisely as there isn't such an easy way around it.
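The two spellings can be compared directly. In the sketch below the function names are invented for illustration. The detail that matters is the fallback: `[] or [0]` evaluates to the list [0], which max() can iterate over, whereas `[] or 0` evaluates to the bare int 0, which it cannot.

```python
def longest(lengths):
    # `lengths or [0]` becomes [0] when `lengths` is an empty list, so
    # max() always receives something it can iterate over.
    return max(lengths or [0])


def longest_with_default(lengths):
    # The keyword argument copes with empty lists and empty iterators alike.
    return max(lengths, default=0)
```

One caveat with the `or [0]` form: an exhausted iterator or an empty map() object is still truthy, so the fallback is never reached and max() raises ValueError. The default= form does not have that problem.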
Avi

From wlfraed at ix.netcom.com Thu Jul 7 18:50:29 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Thu, 07 Jul 2022 18:50:29 -0400
Subject: [Tutor] this group and one liners
References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com>
Message-ID:

On Thu, 7 Jul 2022 18:01:13 -0400, declaimed the following:

>
>max([] or 0)
>
>breaks down with an error and the [] does not evaluate to false.
>

That's because the 0 is not an iterable, not that [] didn't evaluate...

>>> [] or 0
0
>>> max([] or 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> max(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>>

max() requires something it can iterate over. An "or" expression returns one of its operands, not true/false, and does NOT override max()'s normal operation of trying to compare objects in an iterable.
>>>
>>> [] or "anything"
'anything'
>>> max([] or "anything")
'y'
>>>

The "default" value option applies only if max() would otherwise raise an exception for an empty sequence.

>>> max([], default=0)
0
>>> max([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: max() arg is an empty sequence
>>>

--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

From alan.gauld at yahoo.co.uk Thu Jul 7 18:59:00 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Thu, 7 Jul 2022 23:59:00 +0100
Subject: [Tutor] this group and one liners
In-Reply-To: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> <008b01d8915a$337e77c0$9a7b6740$@gmail.com> <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us>
Message-ID:

On 07/07/2022 16:38, Mats Wichmann wrote:
> background was in languages that didn't have these, I now use simple
> one-liners extensively - comprehensions and ternary expressions.

Let me clarify. When I'm talking about one-liners I mean the practice of putting multiple language expressions into a single line. I don't include language features like comprehensions, generators or ternary operators. These are all valid language features. But when a comprehension is used for its side effects, or ternary operators are used with multiple comprehensions and so on, that's when it becomes a problem.

Maintenance is the most expensive part of any piece of long lived software, and making maintenance cheap is the responsibility of any programmer. Complex one-liners are the antithesis of cheap. But use of regular idiomatic expressions is fine.

It's also worth noting that comprehensions can be written on multiple lines too, although that still loses debug potential...

var = [expr
       for item in seq
       if condition]

So if you do have to use more complex expressions or conditions you can at least make them more readable.
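A concrete version of that layout, as a small sketch. The function name and the even-squares task are made up for illustration; only the one-clause-per-line arrangement comes from the message.

```python
def even_squares(numbers):
    # One clause per line: the expression, the for-clause, the condition.
    return [
        n * n
        for n in numbers
        if n % 2 == 0
    ]
```

It is still a single expression, so a print statement cannot be dropped between the clauses, but each clause can at least be read and commented on its own line.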
The same applies to regex. They can be cryptic or readable. If they must be complex (and sometimes they must) they can be built up in stages with each group clearly commented.

> My take on these is you can write a more compact function this way -
> you're more likely to have the meat of what's going on right there
> together in a few lines, rather than building mountain ranges of
> indentation

True to an extent, although recent studies suggest that functions can be up to 100 lines long before they become hard to maintain (they used to say 25!) But if we are getting to 4 or more levels of indentation it's usually a sign that some refactoring needs to be done.

> anyone in a position to maintain my code at a later date) should have no
> trouble recognizing the intent of simple ones.

That's true, although in many organisations the maintenance team is the first assignment for new recruits. So they may have limited experience. But that usually affects their ability to deal with lots of code rather than the code within a single function. (And it's to learn that skill that they get assigned to maintenance first! - Along with learning the house style, if such exists)

Of course, much software maintenance is now off-shored rather than kept in-house and the issue there is that the cheapest programmers are used and these also tend to be the least experienced or those with "limited career prospects" - aka old or mediocre.

> Sometimes thinking about how to write a concise one-liner exposes a
> failure to have thought through what you're doing completely

That's true too, and many pieces of good quality code start off as one-liners. But in the interests of maintenance they should be deconstructed and/or refactored once the solution is understood.

Good engineering is all about cost reduction, so reducing maintenance cost is the primary objective of good software engineering because maintenance is by far the biggest cost of most software projects.
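On the earlier remark about regexes built up in stages with each group commented: Python's re.VERBOSE flag allows exactly that. The sketch below is only an illustration; the date pattern, the name DATE_RE and the helper parse_date are all invented here and are not taken from anyone's message.

```python
import re

# Hypothetical example: an ISO-style date, one commented group per line.
DATE_RE = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return (year, month, day) as ints, or None if text does not match."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None
    return tuple(int(match[name]) for name in ("year", "month", "day"))
```

In verbose mode whitespace inside the pattern is ignored and # starts a comment, so the same expression that would be cryptic on one line can carry its own explanation.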
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From PythonList at DancesWithMice.info Thu Jul 7 19:14:29 2022 From: PythonList at DancesWithMice.info (dn) Date: Fri, 8 Jul 2022 11:14:29 +1200 Subject: [Tutor] this group and one liners In-Reply-To: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us> References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> <008b01d8915a$337e77c0$9a7b6740$@gmail.com> <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us> Message-ID: <44cd1e9c-487e-4bc5-f452-8c05d92b342c@DancesWithMice.info> On 08/07/2022 03.38, Mats Wichmann wrote: > On 7/6/22 11:02, avi.e.gross at gmail.com wrote: > Boy, we "old-timers" do get into these lengthy discussions... having > read several comments here, of which this is only one: Who is being called "old"? [this conversation is operating at an 'advanced' level. If you are more of a Beginner and would like an explanation of any of what I've written here, please don't hesitate to request expansion or clarification!] >> The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense. Rather than "excuse", is it a (misplace?) sense of pride or ego? That said, sometimes there are efficiency-gains. The problem though, is that such may only apply on the particular system. So, when the code is moved to my PC, an alternate approach is actually 'better' (for whichever is/are the pertinent criteria). >> Although that can be true, I think it is reasonable to say that often the exact opposite is true. >> >> In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. 
Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes. Is this a good point to talk 'testability'? Normally, I would mention "TDD" - my preferred dev-approach. However, at this time I'm working with a bunch of 'Beginners' and showing the use of Python's Interactive Mode, aka the REPL - so, experiment-first rather than test-first - or, more testing of code-constructs than the data processing. The problem with dense units of code is that they are, by-definition, a 'black box'. We can test 'around' them, but not 'through' them/within them. Accordingly, the TDD approach would be to develop step-by-step, assuring each step in the process, as it is built (as it should be built). This produces a line-by-line result. From there, some tools will suggest that a for-loop be turned into a comprehension (for example). Would it be part of one's skill in refactoring to decide if such is actually a good idea - and because 'testing' has been integral from the start, that may provide a discouragement from too much integration. Trainees, using the REPL, particularly those with caution as an attribute (cf arrogance (?) ), are also more inclined to develop multi-stage processes (such as the R example given earlier), stage-by-stage. They will similarly test-as-you-go, even if after writing each line of code (cf 'pure' TDD). Once a single worked-example has been developed, the code is likely to be copied into an editor/IDE, and now that the coder's confidence is raised, the code (one hopes) will be subjected to a wider range of test-cases. In the former case, there is some caution before consolidation - tests may need to be removed/rendered impossible. In the latter, it is less likely such thinking will apply, because the "confidence" and code-first thinking may lead to over-confidence, without regard for 'testability'. Maybe? 
> After a period when they felt awkward because lots of my programming > background was in languages that didn't have these, I now use simple > one-liners extensively - comprehensions and ternary expressions. If > they start nesting, probably not. Someone mentioned a decent metric - > if a code formatter like Black starts breaking your comprehension into > four lines, it probably got too complex. Agreed. There is nothing inherently 'wrong' with the likes of comprehensions, generators, ternary operators, etc. So, why not use them? In short, they are idioms within the Python language, and would not have been made available if they hadn't been deemed useful and appropriate. (see PEP process) Any argument that they are difficult to understand is going to be correct - at least, on the face of it. This applies to natural-languages as well. For example, the American "yeah, right" is an exclamation consisting of two positive words - apparently a statement of agreement ("yes"), and an affirmation of correctness ("right", ie correct). Yet it is actually used as an expression of disagreement through derision, eg someone says "dn is the best-looking person in the room" but another person disputes with feeling by sarcastically intoning "yeah, right!". Idioms are learned (and often can't be easily or literally translated), and part of the necessity for ("deliberate") practice in the use of Python. Sure, someone who is only a few chapters into an intro-book will not readily recognise a list-comprehension. However, that is not a good reason why those of us who have finished the whole book should not use these facilities! 
    [ x+2 for x in iterator ]
- is a reasonable Python idiom

    max(items or [0])
- appears to be a reasonable use of a short-circuit 'or', except that

    max( items, default=0 )
- is more 'pythonic' (a feature built-into the language for this express purpose/situation) and thus idiomatic Python

How about the popular 'gotcha' of using a mutable-collection as a function-parameter, eg

    def function_name( mutable_collection=None ):
        if mutable_collection is None:
            mutable_collection = []

or is this commonly-used idiom more pythonic?

    def function_name( mutable_collection=None ):
        mutable_collection = mutable_collection if mutable_collection else []

Can concise-forms be over-used? Yes!

Rather than relying upon some external tool, I find that the IDE (or my own) formatting of the various clauses provides immediate feedback that things are becoming too complex (for my future-self to handle).

A longer ternary operator, or one that is contained within a more complex construct could be spread over two lines (which would highlight the two 'choices'). A list comprehension could split its expression, its for-clause, and its conditional-clause over three lines.

Spreading a single such construct over more than three lines starts to look 'complex' (to my "old" eyes). The indentation requires more thought than I'd care to devote (I need my brain-power for problem-solving rather than 'art work'!).

Thus, these lead to the idea that 'simple is better than complex...complicated' - regardless of one's interpretation of "beautiful" [Zen of Python].

> My take on these is you can write a more compact function this way -
> you're more likely to have the meat of what's going on right there
> together in a few lines, rather than building mountain ranges of
> indentation - and this can actually *improve* readability, not obscure
> it.
I agree "clever" one-liners may be a support burden, but anyone > with a reasonable amount of Python competency (which I'd expect of > anyone in a position to maintain my code at a later date) should have no > trouble recognizing the intent of simple ones. Could we also posit that there is more than one definition of "complex"? As well as "testability" being lost as groups of steps in a multi-stage process are combined and/or compressed, there is a longer-term impact. When (not "if") the code needs to be changed, how easy would you rate that task? I guess the answer can't escape 'testability' in that some code which comes with (good) tests already in-place, can be changed with a greater sense of confidence - the idea that the code can be altered and those alterations will not cause 'breakage' (because the tests still 'pass') - regression testing. However, the main point here, is that someone charged with changing a piece of code will take a certain amount of time to read and understand it (before (s)he starts to make changes). Even assuming that-someone is a Python-Master, comprehending existing code requires an understanding of the domain and the algorithm (etc). Accordingly, the more 'dense' the code, the harder it is likely to become to make such a task. If the algorithm is a common platform within the domain, the 'level' at which things will be deemed 'complex' will change. For example, a recent thread on the list dealt with calculating a standard deviation. Statisticians will readily recognise such a calculation (more likely there is a library routine/function to call, but...). Accordingly, the domain doesn't require such code to be 'simplified' - even though 'mere mortals' might scratch their heads trying to decipher its purpose, the steps within it, and its place in the overall routine. 
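Coming back to the two default-argument idioms quoted earlier in this message: a minimal sketch suggesting they are not quite interchangeable. The function names append_is_none and append_truthy are made up for illustration. The 'is None' form keeps a caller's own empty list, whereas the truthiness form quietly swaps it for a fresh one.

```python
def append_is_none(item, mutable_collection=None):
    # Only a missing argument (None) triggers creation of a new list.
    if mutable_collection is None:
        mutable_collection = []
    mutable_collection.append(item)
    return mutable_collection


def append_truthy(item, mutable_collection=None):
    # Any falsy argument, including a caller's own empty list, is replaced.
    mutable_collection = mutable_collection if mutable_collection else []
    mutable_collection.append(item)
    return mutable_collection


shared = []
append_is_none(1, shared)
print(shared)   # [1] : the caller's list was updated

shared = []
append_truthy(1, shared)
print(shared)   # []  : the caller's empty list was silently replaced
```

Whether that difference matters depends on whether callers expect their own list to be modified in place.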
> Sometimes thinking about how to write a concise one-liner exposes a
> failure to have thought through what you're doing completely - so unlike
> what is mentioned above - twisting a problem around unnaturally (no
> argument that happens too), you might actually realize that there's a
> simpler way to structure a step.

Unnatural twisting sounds like something to avoid!
Twist again, like we did last summer...
Play Twister again, like we...

I prefer to break complex calculations into their working parts (not just as a characteristic of TDD, as above). It can also be useful to separate working-parts into their own function. In both cases, we need 'names', either for intermediate-results (see also the use of _ as a placeholder-identifier) or to be able to call the function. Taking care to choose a 'good' name improves readability and decreases apparent complexity.

A 'problem' I see trainees often evidencing (aka 'the rush to code', 'I'm not working if I'm not coding', etc), is that a name must be chosen when the identifier is first defined ("LHS"). However, its use may not be properly appreciated until that value is subsequently used ("RHS"). It is at this later step that the importance of the name becomes most obvious (to the writer). If that is so, the assessment goes-double for a subsequent reader attempting to divine the code's meaning and workings!

In the past, realising that the first choice of name might not be the best may have led us to say 'oh well', sigh, and quietly (try to) carry-on, because of the (considerable) effort of changing a name (without introducing regression errors).

These days, such "technical debt" is quite avoidable. Capable IDEs enable one to quickly and easily refactor a choice of name and to (find-replace) update the code to utilise a better name, everywhere it is mentioned, with minimal manual effort!
Thus, the effort of ensuring the future-reader/maintainer has competent implicit and fundamental documentation loses yet another 'excuse' and reaches the level of professional expectation. -- Regards, =dn From PythonList at DancesWithMice.info Thu Jul 7 19:47:31 2022 From: PythonList at DancesWithMice.info (dn) Date: Fri, 8 Jul 2022 11:47:31 +1200 Subject: [Tutor] this group and one liners In-Reply-To: References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> <008b01d8915a$337e77c0$9a7b6740$@gmail.com> <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us> Message-ID: On 08/07/2022 10.59, Alan Gauld via Tutor wrote: > On 07/07/2022 16:38, Mats Wichmann wrote: > >> background was in languages that didn't have these, I now use simple >> one-liners extensively - comprehensions and ternary expressions. > > Let me clarify. When I'm talking about one-liners I mean the practice > of putting multiple language expressions into a single line. I don't > include language features like comprehensions, generators or ternary > operators. These are all valid language features. +1 > But when a comprehension is used for its sidfe effects, or > ternary operators are used with multiple comprehensions and > so on, that's when it becomes a problem. Maintenance is the > most expensive part of any piece of long lived softrware, > making maintenance cheap is the responsibility of any programmer. > Complex one-liners are the antithesis of cheap. But use of > regular idiomatic expressions are fine. +1 > It's also worth noting that comprehensions can be written > on multiple lines too, although that still loses debug > potential... > > var = [expr > for item in seq > if condition] > > So if you do have to use more complex expressions or conditions > you can at least make them more readable. Many texts introduce comprehensions with comments like "shorter" and "more efficient". 
In this example, the comprehension will run at C-speed, whereas the long-form for-loop will run at interpreter-speed. Thus, one definition of "efficient".

Yes, it would *appear* shorter if written on a single line. However, it is not, when formatted for reading. Here's the long-form:

    var = list()
    for item in seq:
        if condition:
            var.append( expr )

NB I have a muscle-memory that inserts a blank-line before (and after) a for-loop (for readability). However, in this example, the 'declaration' of the list would become physically/vertically-separated from the loop which has the sole purpose of initialising same. Negative readability!

Also, my IDE will prefer to format a multi-line comprehension such as this, by placing the last square-bracket on the following line (and probably inserting a line-break after the opening bracket). Accordingly, there is no 'shorter' because the 'length' of each alternative becomes exactly the same, or the comprehension is 'longer' (counting as 'lines of code')!

> The same applies to regex. They can be cryptic or readable. If
> they must be complex (and sometimes they must) they can be built
> up in stages with each group clearly commented.

+1

I've seen people tying themselves in knots to explain a complex RegEx. Even to the point of trying to fit each 'operation' on its own line, followed by a # explanation.

>> My take on these is you can write a more compact function this way -
>> you're more likely to have the meat of what's going on right there
>> together in a few lines, rather than building mountain ranges of
>> indentation
>
> True to an extent, although recent studies suggest that functions can be
> up to 100 lines long before they become hard to maintain (they used to
> say 25!) But if we are getting to 4 or more levels of indentation it's
> usually a sign that some refactoring needs to be done.

Didn't we used to say ~60 lines - the number of lines on a page of green-striped, continuous, line-flo[w], stationery? Those were the days!
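The comprehension-versus-loop comparison above can be sketched as follows. The names seq, condition and expr are stand-ins (here invented as a range, an even-number filter and a squaring step), so treat this only as an illustration that both forms build the same list.

```python
seq = range(10)


def condition(item):
    # hypothetical filter: keep even numbers
    return item % 2 == 0


def expr(item):
    # hypothetical transformation: square the item
    return item * item


# comprehension, spread over three lines for reading
var_comp = [expr(item)
            for item in seq
            if condition(item)]

# long-form equivalent: create the list, then append inside the loop
var_loop = []
for item in seq:
    if condition(item):
        var_loop.append(expr(item))

print(var_comp == var_loop)   # True
print(var_comp)               # [0, 4, 16, 36, 64]
```

Counted as formatted lines, the two versions come out at much the same length, which is the point being made about "shorter".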
>> anyone in a position to maintain my code at a later date) should have no >> trouble recognizing the intent of simple ones. > > That's true, although in many organisations the maintence team is the > first assignment for new recruits. So they may have limited experience. > But that usually affects their ability to deal with lots of code rather > than the code within a single function. (And it's to learn that skill > that they get assigned to maintenance first! - Along with learning the > house style, if such exists) Of course, much software maintenance is now > off-shored rather than kept in-house and the issue there is that the > cheapest programmers are used and these also tend to be the least > experienced or those with "limited career prospects" - aka old or mediocre. OK, so now we're not just observing the grey in my beard? Doesn't the root meaning of mediocrity come from observations of how standards in professional journalism are steadily and markedly declining? Jokes(?) aside, the observation is all-too correct though. Thus, the added-virtue of providing tests alongside any code - or code not being 'complete' unless there is also a related test suite. The problem is that many maintenance-fixes are performed under time-pressure. Worst case: the company is at a standstill until you find this bug... The contention then, is that these 'learners-of-their-trade' should be given *more* time. Time to look-up the docs and the reference books, to see how various constructs work, eg list comprehensions. Time to learn! Although when off-shoring, one is (imagines that you are) paying for competence. So, the above scenario should not exist... Oops! >> Sometimes thinking about how to write a concise one-liner exposes a >> failure to have thought through what you're doing completely > > That's true too and many pieces of good quality code start of as > one-liners. 
But in the interests of maintenance should be deconstructed > and/or refactored once the solution is understood. Good engineering > is all about cost reduction, so reducing maintenance cost is the > primary objective of good software engineering because maintenance > is by far the biggest cost of most software projects. +1 Sadly an observation that is seldom experienced by students and hobbyists, and only becomes apparent - indeed relevant - when the complexity of one's projects increases. Such (only) 'in your own head' behavior is philosophically-discouraged in the practice of TDD. (just sayin'...) -- Regards, =dn From alan.gauld at yahoo.co.uk Thu Jul 7 20:33:16 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Fri, 8 Jul 2022 01:33:16 +0100 Subject: [Tutor] this group and one liners In-Reply-To: <006a01d8924d$133dbd60$39b93820$@gmail.com> References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> Message-ID: On 07/07/2022 23:01, avi.e.gross at gmail.com wrote: > max([] or 0) > > breaks down with an error and the [] does not evaluate to false. max([] or [0]) a sequence is all thats needed. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From avi.e.gross at gmail.com Thu Jul 7 19:28:04 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Thu, 7 Jul 2022 19:28:04 -0400 Subject: [Tutor] this group and one liners In-Reply-To: <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us> References: <008a01d890e0$756336a0$6029a3e0$@gmail.com> <7143f0d4-0fdf-88eb-22d9-391065b28044@yahoo.co.uk> <008b01d8915a$337e77c0$9a7b6740$@gmail.com> <9badc5d6-b94c-0a96-2149-d1a1c1e989d9@wichmann.us> Message-ID: <02e601d89259$35982e20$a0c88a60$@gmail.com> What sometimes amazes me, Max, is how some functions people write get adjusted to support more and more options. 
It can be as simple as setting a default value for missing options, or the ability to support more kinds of data and converting it to the type needed, or an argument to specify how many significant digits you want in the output and lots more you can imagine.

Now if you built a one-liner or even a very condensed multi-line version, fitting in additional options becomes a challenge. You may introduce errors by shoehorning in something without enough consideration and testing. Building your code more loosely and more step by step, can make it far easier to modify cleanly and maybe easier to document as you have some room for comments and so on.

What you are talking about may well be a different idea at times, that a function that is too long and complex might better be re-written as a collection of smaller functions so each part of the task is done at a level of abstraction and comprehension that makes sense. That can be taken too far, as well, especially if the names and ideas are not at the level people think at. And, of course, it can lead to errors if someone just copies a function without the parts it depends on.

So a dumb question is whether you approve of defining functions within a function just to be used ONCE within the function but that makes it easier to read when you finally get to the meat and are able to say something somewhat meaningful like:

    if is_X(data) and is_Y(data) and not is_empty(data):
        pass

If the functions are not that useful for anything else, this keeps them all together, albeit there may be some overhead if the main function is called repeatedly, or will there be if it is properly interpreted and pseudo-compiled?

I think it is high time I dropped this diversion and waited for a person asking for actual help.
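For what it's worth, a minimal sketch of the nested-helper idea. The predicate names is_X, is_Y and is_empty come from the pseudo-code above; what they actually test here (numeric items, ascending order) and the outer name process() are invented purely for illustration.

```python
def process(data):
    # Helpers used only inside process(); their bodies are invented examples.
    def is_empty(d):
        return len(d) == 0

    def is_X(d):
        # hypothetical test: every item is a number
        return all(isinstance(v, (int, float)) for v in d)

    def is_Y(d):
        # hypothetical test: items are in ascending order
        return list(d) == sorted(d)

    if is_X(data) and is_Y(data) and not is_empty(data):
        return "ok"
    return "rejected"


print(process([1, 2, 3]))   # ok
print(process([3, 1, 2]))   # rejected
print(process([]))          # rejected
```

On the overhead question: as far as I know, CPython compiles the inner function bodies once, together with the enclosing function. Each call to process() re-executes the def statements and so creates fresh function objects, which is usually cheap but not free.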
-----Original Message----- From: Tutor On Behalf Of Mats Wichmann Sent: Thursday, July 7, 2022 11:38 AM To: tutor at python.org Subject: Re: [Tutor] this group and one liners On 7/6/22 11:02, avi.e.gross at gmail.com wrote: Boy, we "old-timers" do get into these lengthy discussions... having read several comments here, of which this is only one: > The excuse for many one-liners is often that they are in some ways better as in use less memory or are faster in some sense. > > Although that can be true, I think it is reasonable to say that often the exact opposite is true. > > In order to make a compact piece of code, you often twist the problem around in a way that makes it possible to leave out some cases or not need to handle errors and so on. I did this a few days ago until I was reminded max(..., default=0) handled my need. Some of my attempts around it generated lots of empty strings which were all then evaluated as zero length and then max() processed a longer list with lots of zeroes. After a period when they felt awkward because lots of my programming background was in languages that didn't have these, I now use simple one-liners extensively - comprehensions and ternary expressions. If they start nesting, probably not. Someone mentioned a decent metric - if a code formatter like Black starts breaking your comprehension into four lines, it probably got too complex. My take on these is you can write a more compact function this way - you're more likely to have the meat of what's going on right there together in a few lines, rather than building mountain ranges of indentation - and this can actually *improve* readability, not obscure it. I agree "clever" one-liners may be a support burden, but anyone with a reasonable amount of Python competency (which I'd expect of anyone in a position to maintain my code at a later date) should have no trouble recognizing the intent of simple ones. 
Sometimes thinking about how to write a concise one-liner exposes a failure to have thought through what you're doing completely - so unlike what is mentioned above - twisting a problem around unnaturally (no argument that happens too), you might actually realize that there's a simpler way to structure a step. Just one more opinion. _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Thu Jul 7 19:36:45 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Thu, 7 Jul 2022 19:36:45 -0400 Subject: [Tutor] this group and one liners In-Reply-To: References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> Message-ID: <02f301d8925a$6c0ddf30$44299d90$@gmail.com> Thanks, Dennis. Indeed I should be using a list that contains a single 0 (albeit many works too), max([] or [0]) Amusingly, I played around and max() takes either a comma separated list as in max(1,2) or it takes any other iterable as in [1,2] and I think what is happening is it sees anything with a comma as a tuple which is iterable. I mean this works: max(1,2 or 2,3) returning 3 oddly enough and so does this: max([] or 0,0) returning a 0. -----Original Message----- From: Tutor On Behalf Of Dennis Lee Bieber Sent: Thursday, July 7, 2022 6:50 PM To: tutor at python.org Subject: Re: [Tutor] this group and one liners On Thu, 7 Jul 2022 18:01:13 -0400, declaimed the following: > >max([] or 0) > > > >breaks down with an error and the [] does not evaluate to false. > That's because the 0 is not an iterable, not that [] didn't evaluate... 
>>> [] or 0
0
>>> max([] or 0)
Traceback (most recent call last):
  File "", line 1, in
TypeError: 'int' object is not iterable
>>> max(0)
Traceback (most recent call last):
  File "", line 1, in
TypeError: 'int' object is not iterable
>>>

max() requires something it can iterate over. The expression "x or y" returns one of its operands, not true/false, and does NOT override max() normal operation of trying to compare objects in an iterable.

>>>
>>> [] or "anything"
'anything'
>>> max([] or "anything")
'y'
>>>

The "default" value option applies only if max() would otherwise raise an exception for an empty sequence.

>>> max([], default=0)
0
>>> max([])
Traceback (most recent call last):
  File "", line 1, in
ValueError: max() arg is an empty sequence
>>>

--
Wulfraed                 Dennis Lee Bieber         AF6VN
wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

_______________________________________________
Tutor maillist - Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

From avi.e.gross at gmail.com Thu Jul 7 21:52:45 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Thu, 7 Jul 2022 21:52:45 -0400
Subject: [Tutor] this group and one liners
In-Reply-To:
References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com>
Message-ID: <005101d8926d$6ba5fb00$42f1f100$@gmail.com>

OK, Alan, in the interest of tying several annoying threads together, I make a one-line very brief generator to return a darn zero so I can set a default of zero on an empty list to max!

    def zeroed(): yield(0)

    max([] or zeroed())
    0

    max([6, 66, 666] or zeroed())
    666

    max([] or zeroed())
    0

Kidding aside, although any iterable will do such as [0] or (0,) or {0} it does sound like it would be useful to have a silly function like the above that makes anything such as a scalar into an iterable just to make programs that demand an iterable happy.
There probably is something out there with some strange name like list(numb) but this way is harder for anyone maintaining it to figure out WHY ...

    def gener8r(numb): yield(numb)

    max([] or gener8r(3.1415926535))
    3.1415926535

But oddly although brackets work, an explicit call to list() generates an error! Ditto for {number} working and set(number) failing. Is this an anomaly with a meaning?

    max([] or [3.1415926535])
    3.1415926535

    max([] or list(3.1415926535))
    Traceback (most recent call last):
      File "", line 1, in
        max([] or list(3.1415926535))
    TypeError: 'float' object is not iterable

    max([] or list(3.1415926535, 0))
    Traceback (most recent call last):
      File "", line 1, in
        max([] or list(3.1415926535, 0))
    TypeError: list expected at most 1 argument, got 2

    max([] or set(3.1415926535))
    Traceback (most recent call last):
      File "", line 1, in
        max([] or set(3.1415926535))
    TypeError: 'float' object is not iterable

    max([] or {3.1415926535})
    3.1415926535

-----Original Message-----
From: Tutor On Behalf Of Alan Gauld via Tutor
Sent: Thursday, July 7, 2022 8:33 PM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On 07/07/2022 23:01, avi.e.gross at gmail.com wrote:
> max([] or 0)
>
> breaks down with an error and the [] does not evaluate to false.

max([] or [0])

a sequence is all thats needed.
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From alan.gauld at yahoo.co.uk Fri Jul 8 06:53:43 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Fri, 8 Jul 2022 11:53:43 +0100 Subject: [Tutor] this group and one liners In-Reply-To: <005101d8926d$6ba5fb00$42f1f100$@gmail.com> References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com> Message-ID: On 08/07/2022 02:52, avi.e.gross at gmail.com wrote: > But oddly although brackets work, an explicit call to list() generates an > error! Ditto for {number} working and set(number) failing. Is this an > anomaly with a meaning? list() and set() require iterables. 
They won't work with single values:

>>> set(4)
Traceback (most recent call last):
  File "", line 1, in
TypeError: 'int' object is not iterable
>>> set((1,))
{1}
>>> list(3)
Traceback (most recent call last):
  File "", line 1, in
TypeError: 'int' object is not iterable
>>> list([3])
[3]
>>> list((3,))
[3]
>>>

The error you are getting is not from max(), it's from list()/set().

What is slightly annoying is that, unlike max(), you cannot just pass a sequence of values:

>>> set(1,2)
Traceback (most recent call last):
  File "", line 1, in
TypeError: set expected at most 1 argument, got 2

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From wlfraed at ix.netcom.com Fri Jul 8 10:03:14 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Fri, 08 Jul 2022 10:03:14 -0400
Subject: [Tutor] this group and one liners
References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <02f301d8925a$6c0ddf30$44299d90$@gmail.com>
Message-ID:

On Thu, 7 Jul 2022 19:36:45 -0400, declaimed the following:

>
>max(1,2 or 2,3)
>
>returning 3 oddly enough and so does this:
>

Well, that evaluates, if I recall the operator precedence, as

    1, (2 or 2), 3 => 1, 2, 3

>max([] or 0,0)
>
>returning a 0.
>

    ([] or 0), 0 => 0, 0

--
Wulfraed                 Dennis Lee Bieber         AF6VN
wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

From avi.e.gross at gmail.com Fri Jul 8 12:54:02 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Fri, 8 Jul 2022 12:54:02 -0400
Subject: [Tutor] this group and one liners
In-Reply-To:
References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com>
Message-ID: <00c901d892eb$53fd03d0$fbf70b70$@gmail.com>

Alan,

That shakes me up a bit as I do not recall ever needing to use iterables in the context of set() or list() back when I was learning the language. Then again, I mainly would use other notations or convert to that data type from another.

But this can be a teaching moment.

Part of the assumption is based on my experience with quite a few other languages and each has its own rules and even peculiarities. In some languages, you use list() to create a list and something like as.list() to coerce another object that is not a list. Further confusing the issue is that lists and sets use a special notation of [] or {}.

So, it seems that

    var = [5]

And

    var = list(5)

are far from the same thing.

So it seems one way to make a list is to use list() with no arguments to make an empty list, or just use [] and then use list methods on the object to append or insert or extend. But there is no trivial way to do that inside the argument list of max() as described.

Similarly for set() which either makes an empty set (the only way to do that as {} makes an empty dictionary) or coerces some iterable argument into a set.

I am not convinced this is fantastic design. The concept of an iterable is very powerful but this brings us back to the question we began with. Why not have a version of max() that accepts numbers like 1,2,3 as individual arguments?
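As an aside, a short sketch of how the built-in max() treats its arguments, sticking to documented behaviour: several positional arguments are compared directly, a single argument has to be an iterable, and default= covers the empty case. The helper as_iterable() is hypothetical, just one possible way to wrap a scalar so that it can be handed to anything expecting an iterable.

```python
from collections.abc import Iterable


def as_iterable(value):
    # Hypothetical helper: wrap a non-iterable (e.g. a number) in a tuple.
    # Note that strings are iterable, so they pass through unchanged.
    if isinstance(value, Iterable):
        return value
    return (value,)


print(max(1, 2, 3))                 # 3  (individual arguments)
print(max([1, 2, 3]))               # 3  (one iterable argument)
print(max([], default=0))           # 0  (default used for an empty iterable)
print(max([] or [0]))               # 0  ('or' falls through to [0])
print(max(as_iterable(3.14)))       # 3.14
print(max(as_iterable([6, 66])))    # 66

try:
    max(3)                          # a single non-iterable argument
except TypeError as exc:
    print("TypeError:", exc)
```

So the only case that needs any wrapping is a lone scalar, and [value] or (value,) already does that without a helper.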
In my view, just like a nonexistent stretch of zeros in a string is considered to be of length zero for our purposes, not unknown, I can imagine many scenarios where a single value should be useful in the context of many functions as if it was an iterable that returned one value just once. Heck, I can see scenarios where a null value that returned nothing would be an iterable. Python allows you to build your own iterables in several ways that do exactly that! So, yes, you can get around these rules when needed but outside of one-liners, that may rarely be an issue if you know the rules. Every language has plusses and minuses from the perspective of a user and you either deal with it or try to use another. But if people seem to want to do things a certain way, the language often is added to in ways that force things to be doable, albeit with more work and since [3.14] is enough to make the change, perhaps not necessary. Still for a flexible dynamic language, this way of doing things looks like a throwback to me. Many programming paradigms are like that. They are great when used as designed and a source of lots of frustration when not. -----Original Message----- From: Tutor On Behalf Of Alan Gauld via Tutor Sent: Friday, July 8, 2022 6:54 AM To: tutor at python.org Subject: Re: [Tutor] this group and one liners On 08/07/2022 02:52, avi.e.gross at gmail.com wrote: > But oddly although brackets work, an explicit call to list() generates > an error! Ditto for {number} working and set(number) failing. Is this > an anomaly with a meaning? list() and set() require iterables. 
They won't work with single values: >>> set(4) Traceback (most recent call last): File "", line 1, in TypeError: 'int' object is not iterable >>> set((1,)) {1} >>> list(3) Traceback (most recent call last): File "", line 1, in TypeError: 'int' object is not iterable >>> list([3]) [3] >>> list((3,)) [3] >>> The error you are getting is not from max() its from list()/set() What is slightly annoying is that, unlike max(), you cannot just pass a sequence of values: >>> set(1,2) Traceback (most recent call last): File "", line 1, in TypeError: set expected at most 1 argument, got 2 -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Fri Jul 8 17:52:59 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Fri, 8 Jul 2022 17:52:59 -0400 Subject: [Tutor] this group and one liners In-Reply-To: References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com> Message-ID: <001501d89315$174b8090$45e281b0$@gmail.com> Alan, I get annoyed at myself when I find I do not apparently understand something or it makes no sense to me. So I sometimes try to do something about it. My first thought was to see if I could make a wrapper for max() that allowed the use of a non-iterable. My first attempt was to have this function that accepts any number of arguments and places them in an iterable concept before calling maxim and I invite criticism or improvements or other approaches. 
We know max(1,2,3) fails as it demands an iterable like [1,2,3] so I tried this:

    def maxim(*args): return(max(args))

    maxim(1,2,3)
    3

    maxim([1,2,3])
    [1, 2, 3]

Well, clearly this needs work to be more general and forgiving! But it sort of works on an empty list with a sort of default:

    maxim([] or 0)
    0

To make this more general would take quite a bit of work. For example, if you want to make sure max(iterator) works, you may need to check first if you have a simple object and if that object is a list or set or perhaps quite a few other things, may need to convert it so it remains intact rather than a list containing a list which max() returns as the maximum.

As always, I have to suspect that someone has already done this, and more, and created some function that is a Swiss Army Knife and works on (almost) anything.

-----Original Message-----
From: Tutor On Behalf Of Alan Gauld via Tutor
Sent: Friday, July 8, 2022 6:54 AM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On 08/07/2022 02:52, avi.e.gross at gmail.com wrote:

> But oddly although brackets work, an explicit call to list() generates
> an error! Ditto for {number} working and set(number) failing. Is this
> an anomaly with a meaning?

list() and set() require iterables.
They won't work with single values: >>> set(4) Traceback (most recent call last): File "", line 1, in TypeError: 'int' object is not iterable >>> set((1,)) {1} >>> list(3) Traceback (most recent call last): File "", line 1, in TypeError: 'int' object is not iterable >>> list([3]) [3] >>> list((3,)) [3] >>> The error you are getting is not from max() its from list()/set() What is slightly annoying is that, unlike max(), you cannot just pass a sequence of values: >>> set(1,2) Traceback (most recent call last): File "", line 1, in TypeError: set expected at most 1 argument, got 2 -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From alan.gauld at yahoo.co.uk Fri Jul 8 18:08:30 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Fri, 8 Jul 2022 23:08:30 +0100 Subject: [Tutor] this group and one liners In-Reply-To: <001501d89315$174b8090$45e281b0$@gmail.com> References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <001501d89315$174b8090$45e281b0$@gmail.com> Message-ID: On 08/07/2022 22:52, avi.e.gross at gmail.com wrote: > We know max(1,2,3) fails as it demands an iterable like [1,2,3] Nope. max(1,2,3) works just fine for me. 
Its only a single value that fails: >>> max(1,2,3) 3 >>> max(3) Traceback (most recent call last): File "", line 1, in TypeError: 'int' object is not iterable >>> -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From wlfraed at ix.netcom.com Sat Jul 9 01:23:11 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Sat, 09 Jul 2022 01:23:11 -0400 Subject: [Tutor] this group and one liners References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <00c901d892eb$53fd03d0$fbf70b70$@gmail.com> Message-ID: On Fri, 8 Jul 2022 12:54:02 -0400, declaimed the following: >I am not convinced this is fantastic design. The concept of an iterable is >very powerful but this brings us back to the question we began with. Why not >have a version of max() that accepts numbers like 1,2,3 as individual >arguments? > >>> max(1, 2, 3) 3 >>> In the absence of keyword arguments, the non-keyword arguments are gathered up as a tuple -- which is an iterable. >>> tpl = (1, 2, 3) >>> max(tpl) 3 >>> max(*tpl) 3 >>> First passes single arg tuple. Second /unpacks/ the tuple, passing three args, which get gathered back into a tuple by max(). -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ From avi.e.gross at gmail.com Fri Jul 8 19:49:36 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Fri, 8 Jul 2022 19:49:36 -0400 Subject: [Tutor] this group and one liners In-Reply-To: References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <001501d89315$174b8090$45e281b0$@gmail.com> Message-ID: <004201d89325$61afe4e0$250faea0$@gmail.com> You are right, Alan, so max mainly does what I expect. 
Since no arguments generally makes no sense for a maximum, and one argument that is really just one value also makes little sense as it is by default the maximum (and minimum, median, mode, ...), it stands to reason to treat a single argument as a kind of collection. However, since max([1]) works fine and max(1) fails, it seems a tad inconsistent.

I note in my experiments that max("a") works, but only because, like many things in Python, it sees a character string as an iterable of sorts, and max("max") returns 'x' of course. Unfortunately, when that is supplied as a list:

max(["max"]) --> 'max'

Time to move on from this topic, except to say that debugging some Python constructs may need to be part of what I do if I am not careful.

-----Original Message-----
From: Tutor On Behalf Of Alan Gauld via Tutor
Sent: Friday, July 8, 2022 6:09 PM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On 08/07/2022 22:52, avi.e.gross at gmail.com wrote:
> We know max(1,2,3) fails as it demands an iterable like [1,2,3]

Nope. max(1,2,3) works just fine for me.
It's only a single value that fails:

>>> max(1,2,3)
3
>>> max(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>>

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From nathan-tech at hotmail.com Mon Jul 11 06:58:47 2022
From: nathan-tech at hotmail.com (nathan tech)
Date: Mon, 11 Jul 2022 10:58:47 +0000
Subject: [Tutor] Question about python decorators
Message-ID:

As I understand it, decorators are usually created as:

@object.event
def function_to_be_executed():
    do_stuff

My question is, is there a way to create this after the function is created? So:

def function():
    print("this is interesting stuff")

@myobject.event=function

Thanks
Nathan

From __peter__ at web.de Mon Jul 11 12:57:17 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 11 Jul 2022 18:57:17 +0200
Subject: [Tutor] Question about python decorators
In-Reply-To:
References:
Message-ID:

On 11/07/2022 12:58, nathan tech wrote:
> As I understand it, decorators are usually created as:
> @object.event
> def function_to_be_executed():
>     do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> def function():
>     print("this is interesting stuff")
>
> @myobject.event=function
>
> Thanks
> Nathan

@deco
def fun(): ...

is basically syntactic sugar for

def fun(): ...
fun = deco(fun)

In your case you would write

function = object.event(function)

As this is just an ordinary function call followed by an ordinary assignment, you can of course use different names for the decorated and undecorated versions of your function and thus keep both easily accessible.
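The equivalence described above can be checked with a small sketch. The names shout, greet, greet_plain and greet_loud are made up for illustration; they do not come from the thread or from any library.

```python
import functools


def shout(func):
    """Hypothetical decorator: upper-cases whatever the wrapped function returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


# The usual @ form: the name greet is rebound to the wrapped version.
@shout
def greet(name):
    return f"hello {name}"


# The same decorator applied after the function already exists,
# under a different name, so both versions stay reachable.
def greet_plain(name):
    return f"hello {name}"


greet_loud = shout(greet_plain)

print(greet("tutor"))
print(greet_plain("tutor"))
print(greet_loud("tutor"))
```

Binding the result to a new name, as in the last assignment, is what keeps the undecorated function usable alongside the decorated one.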
From mats at wichmann.us Mon Jul 11 13:23:13 2022
From: mats at wichmann.us (Mats Wichmann)
Date: Mon, 11 Jul 2022 11:23:13 -0600
Subject: [Tutor] Question about python decorators
In-Reply-To:
References:
Message-ID: <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>

On 7/11/22 04:58, nathan tech wrote:
> As I understand it, decorators are usually created as:
> @object.event
> def function_to_be_executed():
>     do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> def function():
>     print("this is interesting stuff")
>
> @myobject.event=function

You don't have to use the @ form at all, but your attempt to assign won't work.

def function():
    print("This is interesting stuff")

function = object.event(function)

Now the new version of function is the wrapped version; the original version (from your def statement) is held by a reference in the instance of the wrapper.

If that's what you are asking...

From nathan-tech at hotmail.com Mon Jul 11 17:29:33 2022
From: nathan-tech at hotmail.com (Nathan Smith)
Date: Mon, 11 Jul 2022 22:29:33 +0100
Subject: [Tutor] Question about python decorators
In-Reply-To: <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>
References: <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us>
Message-ID:

Hiya,

I figured this is how they work, so it's nice to have that confirmed!

For some reason though I have a library that is not behaving this way, for example:

@object.attr.method
def function():
    do stuff

works, but

object.attr.method=func

does not.

I'm likely missing something, but I'd be curious to hear experts' opinions.

The library I am working with is:
https://github.com/j-hc/Reddit_ChatBot_Python

On 11/07/2022 18:23, Mats Wichmann wrote:
> On 7/11/22 04:58, nathan tech wrote:
>> As I understand it, decorators are usually created as:
>> @object.event
>> def function_to_be_executed():
>>     do_stuff
>>
>> My question is, is there a way to create this after the function is created?
>> So: >> Def function(): >> Print("this is interesting stuff") >> >> @myobject.event=function > You don't have to use the @ form at all, but your attempt to assign > won't work. > > def function(): > print("This is interesting stuff") > > function = object.event(function) > > Now the new version of function is the wrapped version; the original > version (from your def statement) is held by a reference in the instance > of the wrapper. > > If that's what you are asking... > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Ftutor&data=05%7C01%7C%7C8707660103e44e7ce3c908da6362eac3%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637931573732838946%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=xfszgTLAm%2BXPKzIfJBwk6NqRWA%2Fmh4QpfUNlHwX97Po%3D&reserved=0 -- Best Wishes, Nathan Smith, BSC My Website: https://nathantech.net From cs at cskk.id.au Mon Jul 11 19:09:43 2022 From: cs at cskk.id.au (Cameron Simpson) Date: Tue, 12 Jul 2022 09:09:43 +1000 Subject: [Tutor] Question about python decorators In-Reply-To: References: Message-ID: On 11Jul2022 22:29, nathan tech wrote: >For some reason though I have a library that is not behaving this way, >to example: > >@object.attr.method >def function(): >?do stuff > >works, but > >object.attr.method=func > >does not That should be: func = object.attr.method(func) Remember, a decorator takes a function and returns a function (often a new function which calls the old function). Cheers, Cameron Simpson From nathan-tech at hotmail.com Tue Jul 12 01:44:01 2022 From: nathan-tech at hotmail.com (Nathan Smith) Date: Tue, 12 Jul 2022 06:44:01 +0100 Subject: [Tutor] Question about python decorators In-Reply-To: References: Message-ID: Aha! I am with you. Thanks a lot! 
:) On 12/07/2022 00:09, Cameron Simpson wrote: > On 11Jul2022 22:29, nathan tech wrote: >> For some reason though I have a library that is not behaving this way, >> to example: >> >> @object.attr.method >> def function(): >> ?do stuff >> >> works, but >> >> object.attr.method=func >> >> does not > That should be: > > func = object.attr.method(func) > > Remember, a decorator takes a function and returns a function (often a > new function which calls the old function). > > Cheers, > Cameron Simpson > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Ftutor&data=05%7C01%7C%7C1ed565f9951245e8f3fc08da6393ba6a%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637931783391597551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=1Y9J0uqLL66FgSohUUNtkOtIbE6wNOunUVKlHl9pZSg%3D&reserved=0 -- Best Wishes, Nathan Smith, BSC My Website: https://nathantech.net From alexanderrhodis at gmail.com Sun Jul 10 02:51:19 2022 From: alexanderrhodis at gmail.com (alexander-rodis) Date: Sun, 10 Jul 2022 09:51:19 +0300 Subject: [Tutor] Implicit passing of argument select functions being called Message-ID: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com> I'm working on a project, where accessibility is very important as it's addressed to non - specialists, who may even have no knowledge of coding of Python. In a specific section, I've come up with this API to make data transformation pipeline: make_pipeline( ??? load_data(fpath,...), ??? transform1(arg1,arg2,....), ??? ...., ??? transform2(arg1,arg2,....), ??? transform3(arg1,arg2,....), ??? transformN(arg1,arg2,....), ) transformN are all callables (writing this in a functional style), decorated with a partial call in the background. 
make_pipeline just uses a for loop to call the functions successively, like so: e = arg(e). Which transforms, from the many available in the library I'm writing, are used will vary between uses. load_data returns a pandas DataFrame. Some transforms may need to operate on sub-arrays. I've written functions (also decorated with a partial call) to select the sub-array and return it to its original shape.

The problem is, I currently have to pass the data filtering function explicitly as an argument to each function that needs it, but this seems VERY error-prone, even if I cautiously document it. I want the filter function to be specified in one place and made automatically available to all transforms that need it. Something roughly equivalent to:

make_pipeline(
    load_data(fpath,...),
    transform1(arg1,arg2,....),
    ....,
    transform2(arg1,arg2,....),
    transform3(arg1,arg2,....),
    transformN(arg1,arg2,....),
    data_filter = simple_filter(start=0,)
)

I thought all names local to the caller, make_pipeline, become automatically available to the called functions. This seems to work, but only on some small toy examples, not in this case. global is usually considered bad practice, so I'm trying to avoid it, and I'm not using any classes, so not OOP. I've also tried using inspect.signature to check if each callable accepts a certain argument and pass it if that's the case; however, this raises an "incorrect signature error" which I could not find documented anywhere. I've also considered passing it to all functions with a try/except and ignoring thrown errors, but it seems this could also be error-prone, namely it could catch other errors too.

So my question is, is there an elegant and Pythonic way to specify data_filter in only one place and implicitly pass it to all functions that need it without global or classes?
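One possible shape for this, offered only as a rough sketch under assumptions: the real transforms are not shown in the thread, so simple_filter, scale and drop_empty below are hypothetical stand-ins that work on plain lists rather than a pandas DataFrame. The idea is that make_pipeline inspects each step with inspect.signature and passes data_filter only to steps that declare a parameter of that name.

```python
import inspect


def make_pipeline(source, *steps, data_filter=None):
    """Feed each step the previous result; pass data_filter only to steps
    whose signature declares a parameter named 'data_filter'."""
    data = source
    for step in steps:
        params = inspect.signature(step).parameters
        if "data_filter" in params:
            data = step(data, data_filter=data_filter)
        else:
            data = step(data)
    return data


# Hypothetical stand-ins for the real filter and transforms.
def simple_filter(start=0):
    def select(row):
        # Select the part of the row the transforms should operate on.
        return row[start:]
    return select


def scale(factor):
    def step(rows, data_filter):
        out = []
        for row in rows:
            selected = data_filter(row)
            keep = len(row) - len(selected)
            # Untouched prefix plus the transformed selection.
            out.append(row[:keep] + [x * factor for x in selected])
        return out
    return step


def drop_empty():
    def step(rows):
        return [row for row in rows if row]
    return step


rows = [["a", 1, 2], [], ["b", 3, 4]]
result = make_pipeline(
    rows,
    scale(10),
    drop_empty(),
    data_filter=simple_filter(start=1),
)
print(result)
```

Here the filter is named once, in the make_pipeline call, and a step opts in simply by declaring a data_filter parameter. Steps built with functools.partial should be inspectable in the same way, though that has not been checked against the poster's actual decorators.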
Thanks From avi.e.gross at gmail.com Sat Jul 9 18:08:56 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Sat, 9 Jul 2022 18:08:56 -0400 Subject: [Tutor] maxed out In-Reply-To: References: <007601d89150$1cf50320$56df0960$@gmail.com> <006a01d8924d$133dbd60$39b93820$@gmail.com> <005101d8926d$6ba5fb00$42f1f100$@gmail.com> <00c901d892eb$53fd03d0$fbf70b70$@gmail.com> Message-ID: <002c01d893e0$7bfd12d0$73f73870$@gmail.com> Now that Alan and Dennis have pointed things out, I want to summarize some of the perspective I have. The topic is why the max() (and probably min() and perhaps other) python functions made the design choices they did and that they are NOT the only possible choices. It confused me as I saw parts of the elephant without seeing the entire outline. So lesson one is READ THE MANUAL PAGE before making often unwarranted assumptions. https://docs.python.org/3/library/functions.html#max The problem originally presented was that using max() to measure the maximum length of multiple instances of 0's interspersed with 1's resulted in what I considered anomalies. Specifically, it stopped with an error on an empty list which my algorithm would always emit if there were no zero's. My understanding is that in normal use, there are several views. The practical view is that you should not be calling max() unless you have multiple things to compare of the same kind. Thus a common use it should deal with is: max(5, 3, 2.4, 8.6, -2.3) This being python, a related use is to hand it a SINGLE argument that is a collection of it's own with the qualification that the items be compatible with being compared to each other and are ordered. So tuples, lists and sets and separately the keys and values of a dictionary come to mind. 
max(1,2,3) 3 max((1,2,3)) 3 max([1,2,2]) 2 max({1,3,4,3,5,2}) 5 max({1:1.1, 2:1.2, 3:1.3}.keys()) 3 max({1:1.1, 2:1.2, 3:1.3}.values()) 1.3 There may well be tons of other things it can handle albeit I suspect constructs from numpy or pandas may better be evaluated using their own versions of functions like max that are built-in. And it seems max now should handle iterables like this function: def lowering(starting=100): using = int(starting) while (using > 0): yield(using) using //= 2 max(lowering(100)) 100 min(lowering(100)) 1 import statistics statistics.mean(lowering(666)) 132.7 len(list(lowering(42))) 6 But it is inconsistent as len will not see [lowering(42)] as more than one argument as it sees an iterable function but does not iterate it. Back to the point, from a practical point of view, you have to conditions. - Everything being measured should be of the same type or perhaps coercible to the same type easily. - You either have 2 or more things being asked about directly as in max(1,2,3) OR you have a single argument that is an iterator. So the decision they made almost makes sense to me except that it does not! In a mathematical sense, a maximum also makes perfect sense for a single value. The maximum for no arguments is clearly undefined. Following that logic, the max() function would need to test if the single object it gets is SIMPLE or more complex (and I do not mean imaginary numbers. I mean ANYTHING that evaluates to a single value, be it an integer, floating point, Boolean or character string and maybe more, should return it as the maximum. If it can be evaluated as a container that ends up holding NO values, then it should fail unless a default is provided. If it returns a single value, again, that is the result. If it returns multiple values ALL OF THE SAME KIND, compare those. Not sure if that is at all easy to implement, but does that overall design concept make a tad more sense? The problem Dennis hinted out is darn TUPLES in Python. 
They sort of need a comma after a value if it is alone, as in

a = 1,
a
(1,)

And the darn tuple() function stupidly has no easy way to make a tuple from a single argument, even if followed by a trailing comma. Who designed that? I mean it works for a string because it is actually a compound object, as in tuple("a"), but not for tuple(1), so you need something like tuple([1]) ...

My GUESS from what Dennis wrote is the creators of max() may have said that a singleton argument should be expanded only by coercing it to a tuple, and that makes some singleton arguments FAIL!

I looked at another design element, in that this version of max also supports other kinds of ordering for selected larger collections, such as lists containing objects of the same type:

max( [ 1, 1, 1], [1, 2, 1])
[1, 2, 1]

max( [ 1, 1, 1], [1, 2, 1], [3, 0])
[3, 0]

It won't take my iterator unless I expand it in a list like this:

max(list(lowering(42)), list(lowering(666)))
[666, 333, 166, 83, 41, 20, 10, 5, 2, 1]

So, overall, I understand why max does what it wants, BUT I am not happy with the fact that programs often have no control over the size of a list they generate, as in my case where no 0's means an empty list. So the default=0 argument there helps if used, albeit the single-argument form of max() fails when used as max([] or 0) and requires something like max([] or [0]), since the darn thing insists a single argument must be a sort of container or iterable.

You learn something new every day if you look, albeit I sometimes wish I hadn't!

-----Original Message-----
From: Tutor On Behalf Of Dennis Lee Bieber
Sent: Saturday, July 9, 2022 1:23 AM
To: tutor at python.org
Subject: Re: [Tutor] this group and one liners

On Fri, 8 Jul 2022 12:54:02 -0400, declaimed the following:

>I am not convinced this is fantastic design. The concept of an iterable
>is very powerful but this brings us back to the question we began with.
>Why not have a version of max() that accepts numbers like 1,2,3 as >individual arguments? > >>> max(1, 2, 3) 3 >>> In the absence of keyword arguments, the non-keyword arguments are gathered up as a tuple -- which is an iterable. >>> tpl = (1, 2, 3) >>> max(tpl) 3 >>> max(*tpl) 3 >>> First passes single arg tuple. Second /unpacks/ the tuple, passing three args, which get gathered back into a tuple by max(). -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 11 13:26:30 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 11 Jul 2022 13:26:30 -0400 Subject: [Tutor] Python one-liners Message-ID: <005e01d8954b$5c5f39a0$151dace0$@gmail.com> Based on the recent discussion, I note with amusement that I just ordered this book but note there is nothing wrong with solving simple problems in a few lines of code, or even complex problems using code that calls functionality already written elsewhere with just a few lines of your code. Anyone read this yet? Should anyone? Title: Python one-liners : write concise, eloquent Python like a professional Author: Mayer, Christian (Computer Scientist), ISBN: 9781718500501 Personal Author: Mayer, Christian (Computer Scientist), author. Language: English Custom PUBDATE: 2020 Place of publication: San Francisco : No Starch Press, [2020] Publication Date: 2020 Summary: "Shows how to perform useful tasks with one line of Python code. Begins with a brief overview of Python, then moves on to specific problems that deal with essential topics such as regular expressions and lambda functions, providing a concise one-liner Python solution for each"-- Abstract: "Shows how to perform useful tasks with one line of Python code. 
Begins with a brief overview of Python, then moves on to specific problems that deal with essential topics such as regular expressions and lambda functions, providing a concise one-liner Python solution for each"-- Provided by publisher. Subject Term: Python (Computer program language) Subject: Python (Computer program language) From avi.e.gross at gmail.com Mon Jul 11 13:51:01 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 11 Jul 2022 13:51:01 -0400 Subject: [Tutor] Question about python decorators In-Reply-To: References: Message-ID: <00ab01d8954e$c92bfac0$5b83f040$@gmail.com> Nathan, It depends on what you want to do. Decorators are sort of syntactic sugar that you can sort of do on your own if you wish. I am answering in general but feel free to look at some online resources such as this: https://realpython.com/primer-on-python-decorators/ The bottom line is that a function is just an object like anything else in Python and you can wrap a function inside of another function so the new function is called instead of the old one. This function, with the same name, sort of decorates the original by having the ability to do something using the arguments provided before calling the function, such as logging the call, or making changes to the arguments or just doing validation and then calling the function. After the inner (sort-of) function returns, it can do additional things and return what it wishes to the original caller. I have no idea what your @object.event decorator is and what it does for you, but presumably you can find one of several ways to apply it to a completed function, including slightly indirect methods albeit there may be minor details that may not be quite right such as docstrings. Say you already have a function called original() and you want to wrap it in @interior as a decorator. Could you imagine something like this: Copy "original" to "subordinate" so you have a new function name. 
Then do this: @interior def original(args): return(subordinate(args)) Again, some details are needed such as matching the args one way or another but the above is by your definition a new function definition, albeit with some overhead! LOL! Yes, I know what you are going to say. This is re-decoration! But kidding aside, as I said, it may be even simpler and you may be able to do something as simple as: func = myobject.event(func) Experiment with it. Or read a bit and then experiment. -----Original Message----- From: Tutor On Behalf Of nathan tech Sent: Monday, July 11, 2022 6:59 AM To: Tutor at python.org Subject: [Tutor] Question about python decorators As I understand it, decorators are usuallyed created as: @object.event Def function_to_be_executed(): Do_stuff My question is, is there a way to create this after the function is created? So: Def function(): Print("this is interesting stuff") @myobject.event=function Thanks Nathan _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 11 17:42:39 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 11 Jul 2022 17:42:39 -0400 Subject: [Tutor] Question about python decorators In-Reply-To: References: <1570f91e-edb0-70d2-8b3a-350eac3a4514@wichmann.us> Message-ID: <01cb01d8956f$24ab3c60$6e01b520$@gmail.com> Nathan. I don't think anyonbe suggested you type what you say you did: object.attr.method=func You are making a second pointer to your function which leaves everything unchanged. The suggestion I think several offered was to call a function called object.attr.method and give it an object which is the function you want decorated, meaning "func" above, and CAPTURE the result as either a new meaning for the variable NAME called "func" or perhaps use a new handle. 
The way it works is that not only does object.attr.method accept a function name as an argument but also creates brand new function that encapsulates it and RETURNS it. You want the brand new function that is an enhancement over the old one. The old one is kept alive in the process, albeit maybe not reachable directly. You may be confused by the way decorating is done as syntactic sugar where it is indeed hard to see or understand what decorating means, let alone multiple levels as all you SEE is: @decorator Function definition. So until you try the format several have offered, why expect your way to work ...? Again, you want to try: result = object.attr.method(func) or to re-use the name func: func = object.attr.method(func) -----Original Message----- From: Tutor On Behalf Of Nathan Smith Sent: Monday, July 11, 2022 5:30 PM To: tutor at python.org Subject: Re: [Tutor] Question about python decorators Hiya, I figured this is how they work, so it's nice to have that confirmed! For some reason though I have a library that is not behaving this way, to example: @object.attr.method def function(): do stuff works, but object.attr.method=func does not I'm likely missing something, but I'd be curious to here experts opinions. The library I am working with is: https://github.com/j-hc/Reddit_ChatBot_Python On 11/07/2022 18:23, Mats Wichmann wrote: > On 7/11/22 04:58, nathan tech wrote: >> As I understand it, decorators are usuallyed created as: >> @object.event >> Def function_to_be_executed(): >> Do_stuff >> >> My question is, is there a way to create this after the function is created? >> So: >> Def function(): >> Print("this is interesting stuff") >> >> @myobject.event=function > You don't have to use the @ form at all, but your attempt to assign > won't work. 
> > def function(): > print("This is interesting stuff") > > function = object.event(function) > > Now the new version of function is the wrapped version; the original > version (from your def statement) is held by a reference in the instance > of the wrapper. > > If that's what you are asking... > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.python.org%2Fmailman%2Flistinfo%2Ftutor&data=05%7C01%7C%7C8707660103e44e7ce3c908da6362eac3%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637931573732838946%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=xfszgTLAm%2BXPKzIfJBwk6NqRWA%2Fmh4QpfUNlHwX97Po%3D&reserved=0 -- Best Wishes, Nathan Smith, BSC My Website: https://nathantech.net _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From roel at roelschroeven.net Mon Jul 11 07:08:42 2022 From: roel at roelschroeven.net (Roel Schroeven) Date: Mon, 11 Jul 2022 13:08:42 +0200 Subject: [Tutor] Question about python decorators In-Reply-To: References: Message-ID: Op 11/07/2022 om 12:58 schreef nathan tech: > As I understand it, decorators are usuallyed created as: > @object.event > Def function_to_be_executed(): > Do_stuff > > My question is, is there a way to create this after the function is created? > So: > Def function(): > Print("this is interesting stuff") > > @myobject.event=function You can write it as ??? function = myobject.event Actually the syntax using @object.event in front of the function is syntactic sugar for that notation. Example: ??? import time ??? import functools ??? def slow(): ??????? time.sleep(1) ??? 
    slow = functools.cache(slow)

Now slow() will only be slow on the first call, because subsequent calls will be served from the cache.

--
"Peace cannot be kept by force. It can only be achieved through understanding."
        -- Albert Einstein

From roel at roelschroeven.net Mon Jul 11 07:11:45 2022
From: roel at roelschroeven.net (Roel Schroeven)
Date: Mon, 11 Jul 2022 13:11:45 +0200
Subject: [Tutor] Question about python decorators
In-Reply-To:
References:
Message-ID: <21d2e0e6-a572-0ab6-811b-b6d229b51330@roelschroeven.net>

On 11/07/2022 at 12:58, nathan tech wrote:
> As I understand it, decorators are usually created as:
> @object.event
> def function_to_be_executed():
>     do_stuff
>
> My question is, is there a way to create this after the function is created?
> So:
> def function():
>     print("this is interesting stuff")
>
> @myobject.event=function
>

Oops, sorry, there is an error in my other mail! It should be:

    function = myobject.event(function)

instead of just

    function = myobject.event

--
"Peace cannot be kept by force. It can only be achieved through understanding."
        -- Albert Einstein

From __peter__ at web.de Thu Jul 14 03:30:33 2022
From: __peter__ at web.de (Peter Otten)
Date: Thu, 14 Jul 2022 09:30:33 +0200
Subject: [Tutor] Implicit passing of argument select functions being called
In-Reply-To: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
Message-ID:

On 10/07/2022 08:51, alexander-rodis wrote:
> I'm working on a project, where accessibility is very important as it's
> addressed to non - specialists, who may even have no knowledge of coding
> of Python.
>
> In a specific section, I've come up with this API to make data
> transformation pipeline:
>
> make_pipeline(
>
>     load_data(fpath,...),
>
>     transform1(arg1,arg2,....),
>
>     ....,
>
>     transform2(arg1,arg2,....),

Frankly, I have no idea what your actual scenario might be.
If you are still interested in a comment it would help if you provide a more concrete scenario with two or three actual transformations working together on a few rows of toy data, with a filter and one or two globals that you are hoping to avoid. Actual code is far easier to reshuffle and improve than an abstraction with overgeneralized function names and signatures where you cannot tell the necessary elements from the artifacts of the generalization. From avi.e.gross at gmail.com Wed Jul 13 12:32:29 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Wed, 13 Jul 2022 12:32:29 -0400 Subject: [Tutor] Implicit passing of argument select functions being called In-Reply-To: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com> References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com> Message-ID: <012d01d896d6$259725e0$70c571a0$@gmail.com> It would be helpful to understand a little more, Alexander. Just FYI, although I know you are trying to make something simple for people, I hope you are aware of the sklearn pipeline as it may help you figure out how to make your own. Pandas has its own way too. And of course we have various ways to create a pipeline in TensorFlow and modules build on top of it like Keras and at least 4 others including some that look a bit like what you are showing. Your application may not be compatible and you want to avoid complexity but how others do things can give you ideas. Your question though seems to be focused not on how to make a pipeline, but how you can evaluate one function after another and first load initial data and then pass that data as an argument to each successive function, replacing the data for each input with the output of the previous, and finally return the last output. So, are you using existing functions or supplying your own? If you want a generic data_filter to be available within all these functions, there seem to be quite a few ways you might do that if you control everything. 
For example, you could pass it as an argument to each function directly. Or have it available in some name space.

So could you clarify what exactly is not working now and perhaps give a concrete example?

-----Original Message-----
From: Tutor On Behalf Of alexander-rodis
Sent: Sunday, July 10, 2022 2:51 AM
To: tutor at python.org
Subject: [Tutor] Implicit passing of argument select functions being called

I'm working on a project, where accessibility is very important as it's addressed to non - specialists, who may even have no knowledge of coding of Python.

In a specific section, I've come up with this API to make data transformation pipeline:

make_pipeline(
    load_data(fpath,...),
    transform1(arg1,arg2,....),
    ....,
    transform2(arg1,arg2,....),
    transform3(arg1,arg2,....),
    transformN(arg1,arg2,....),
)

transformN are all callables (writing this in a functional style), decorated with a partial call in the background. make_pipeline just uses a for loop to call the functions successively like so e = arg(e). Which transforms, from the many available in the library I'm writing, are used will vary between uses. load_data returns a pandas DataFrame. Some transforms may need to operate on sub-arrays. I've written functions (also decorated with a partial call) to select the sub array and return it to its original shape. The problem is, I currently have to pass the data filtering function explicitly as an argument to each function that needs it, but this seems VERY error prone, even if I cautiously document it. I want the filter function to be specified in one place and made automatically available to all transforms that need it. Something roughly equivalent to:

make_pipeline(
    load_data(fpath,...),
    transform1(arg1,arg2,....),
    ....,
    transform2(arg1,arg2,....),
    transform3(arg1,arg2,....),
    transformN(arg1,arg2,....),
    data_filter = simple_filter(start=0,)
)

I thought all aliases local to the caller make_pipeline become automatically available to the called functions.
This seems to work but only on some small toy examples, not in this case. global is usually considered bad practice, so I'm trying to avoid it and I'm not using any classes so not OOP. I've also tried using inspect.signature to check if each callable accepts a certain argument and pass it if that's the case, however this raises an "incorrect signature error" which I couldn't find documented anywhere. I've also considered passing it to all functions with a try/except and ignoring thrown errors, but it seems this could also be error prone, namely it could catch other errors too.

So my question is, is there an elegant and Pythonic way to specify data_filter in only one place and implicitly pass it to all functions that need it without global or classes?

Thanks

_______________________________________________
Tutor maillist - Tutor at python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

From alexanderrhodis at gmail.com Thu Jul 14 03:56:31 2022
From: alexanderrhodis at gmail.com (Αλέξανδρος Ρόδης)
Date: Thu, 14 Jul 2022 10:56:31 +0300
Subject: [Tutor] Implicit passing of argument select functions being called
In-Reply-To:
References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com>
Message-ID:

I've finally figured out a solution using inspect.signature; I'll leave it here in case it helps someone else.

The actual functions would look like this:

`
dfilter = simple_filter(start_col=6)
make_pipeline(
    load_dataset(fpath="some/path.xlsx", header=[0, 1]),
    apply_internal_standard(target_col="EtD5"),
    export_to_sql(fname="some/name"),
    data_filter=dfilter,
)
`

Though I doubt the actual code cleared things up, if make_pipeline is a for loop and let's say apply_internal_standard needs access to dfilter (along with others not shown here), a pretty good solution is

`
import inspect

def make_pipeline(*args, **kwargs):
    D = args[0]
    for arg in args[1:]:
        # pass data_filter only to the steps whose signature accepts it
        if "data_filter" in inspect.signature(arg).parameters:
            D = arg(D, data_filter=kwargs["data_filter"])
        else:
            D = arg(D)
    return D
`

From manpritsinghece at gmail.com Sat Jul 16 05:26:22 2022
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Sat, 16 Jul 2022 14:56:22 +0530
Subject: [Tutor] Ways of removing consequtive duplicates from a list
Message-ID:

Dear Sir ,

I was just doing an experiment of removing consecutive duplicates from a list . Did it in the following ways and it all worked . Just need to know which one should be preferred ? which one is more good ?

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]

# Ways of removing consecutive duplicates
[ele for i, ele in enumerate(lst) if i==0 or ele != lst[i-1]]
[2, 3, 2, 5, 6, 3]

val = object()
[(val := ele) for ele in lst if ele != val]
[2, 3, 2, 5, 6, 3]

import itertools
[val for val, grp in itertools.groupby(lst)]
[2, 3, 2, 5, 6, 3]

Is there anything else more efficient ?
Regards
Manprit Singh

From avi.e.gross at gmail.com Sat Jul 16 11:45:54 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sat, 16 Jul 2022 11:45:54 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References:
Message-ID: <005001d8992b$227b77b0$67726710$@gmail.com>

Manprit,

Your message is not formatted properly in my email and you just asked any women present to not reply to you, nor anyone who has not been knighted by a Queen. I personally do not expect such politeness but clearly some do.

What do you mean by most efficient? Seriously. For a list this size, almost any method runs fast enough. Efficiency considerations may still apply but mainly consist of startup costs that can differ quite a bit and even change when some of the underlying functionality is changed such as to fix bugs, or deal with special cases or added options.

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]

So are you going to do the above ONCE or repeatedly in your program? There are modules and methods available to do testing by say running your choice a million times that might provide you with numbers. Asking people here probably will get you mostly opinions or guesses. And it is not clear why you need to know what is more efficient unless the assignment asks you to think analytically and the thinking is supposed to be by you.

Here are your choices that I hopefully formatted in a way that lets them be seen. But first, this is how your original looked:

[ele for i, ele in enumerate(lst) if i==0 or ele != lst[i-1]] [2, 3, 2, 5, 6, 3] val = object() [(val := ele) for ele in lst if ele != val] [2, 3, 2, 5, 6, 3] import itertools [val for val, grp in itertools.groupby(lst)] [2, 3, 2, 5, 6, 3]

The first one looks like a list comprehension albeit it is not easy to see where it ends. I stopped when I hit an "or" but the brackets were not finished:

[ele for i, ele in enumerate(lst) if i==0

And even with a bracket, it makes no sense!
So I read on:

#-----Choice ONE:
[ele for i, ele in enumerate(lst) if i==0 or ele != lst[i-1]]

OK, that worked and returned:

[2, 3, 2, 5, 6, 3]

But your rendition shows the answer "[2, 3, 2, 5, 6, 3]" which thus is not code so I remove that and move on:

val = object() [(val := ele) for ele in lst if ele != val]

This seems to be intended as two lines:

#-----Choice TWO:
val = object()
[(val := ele) for ele in lst if ele != val]

And yes it works and produces the same output I can ignore. By now I know to make multiple lines as needed:

#-----Choice THREE:
import itertools
[val for val, grp in itertools.groupby(lst)]

So how would you analyze the above three choices, once unscrambled? I am not going to tell you what I think. What do they have in common?

What I note is they are all the SAME in one way. All use a list comprehension. If one had used explicit loops, for example, that might be a factor as they tend to be less efficient in python. But they are all the same. So what else may be different?

Choice THREE imports a module. There is a cost involved especially if you import the entire module, not just the part you want, so the import method adds a somewhat constant cost. But if the module is already used elsewhere in your program, it is sort of a free cost to use it here and if you use this method on large lists or many times, the cost per unit drops. How much this affects efficiency is something you might need to test and even then may vary.

Do you know what "enumerate()" does in choice ONE? It can really matter in deciding what is efficient. If I have a list a million or billion units long, will enumerate make another list of numbers from 0 to N-1 that long in memory, or will it make an iterator that is called repeatedly to make the next pair for you?

Choices ONE and TWO both have visible IF clauses but the first one has an OR with two parts to test.
In general, the more tests or other things done in a loop, compared to a loop of the same number of iterations, the more expensive it can be. But a careful study of the code

if i==0 or ele != lst[i-1]

suggests the first condition is only true the very first time but is evaluated all N times so the second condition is evaluated N-1 times. Basically, both are done with no real savings.

Choice TWO has a single test in the if, albeit it is for an arbitrary object which can be more or less expensive depending on the object. The first condition in choice ONE was a fairly trivial integer comparison and the second, again, could be for any object. So these algorithms should work on things other than integers. Consider this list containing tuples and sets:

obj_list = [ (1,2), (1,2,3), (1,2,3), {"a", 1}, {"b", 2}, {"b", 2} ]

Should this work?

[ele for i, ele in enumerate(obj_list) if i==0 or ele != obj_list[i-1]]
[(1, 2), (1, 2, 3), {1, 'a'}, {'b', 2}]

I think it worked but the COMPARISONS between objects had to be more complex and thus less efficient than for your initial example. So the number and type of comparisons can be a factor in your analysis depending on how you want to use each algorithm.

For completeness, I also tried the other two algorithms using this alternate test list:

[(val := ele) for ele in obj_list if ele != val]
[(1, 2), (1, 2, 3), {1, 'a'}, {'b', 2}]

And

[val for val, grp in itertools.groupby(obj_list)]
[(1, 2), (1, 2, 3), {1, 'a'}, {'b', 2}]

Which brings us to the latter. What exactly does the groupby() function do? If it is an iterator, and it happens to be, it may use less memory but for small examples, the iterator overhead may be more than just using a short list, albeit lists are iterables too.

You can look at these examples analytically and find more similarities and differences but at some point you need benchmarks to really know.
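
As a rough illustration of the kind of benchmark meant here, the three choices could be wrapped in functions and timed with the standard timeit module. This is only a sketch: the function names choice_one/choice_two/choice_three and the enlarged input "big" are made up for the example, and the timings will differ between machines and Python versions, so no numbers are shown.

```python
import itertools
import timeit

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]


def choice_one(seq):
    # index-based comparison with the previous element (needs a sequence)
    return [ele for i, ele in enumerate(seq) if i == 0 or ele != seq[i - 1]]


def choice_two(seq):
    # sentinel object plus assignment expression (Python 3.8+)
    val = object()
    return [(val := ele) for ele in seq if ele != val]


def choice_three(seq):
    # groupby with the default key groups runs of equal items
    return [val for val, _grp in itertools.groupby(seq)]


# hypothetical larger input, so that per-call overhead matters less
big = lst * 1000

for func in (choice_one, choice_two, choice_three):
    # sanity check first: all three must agree on the small example
    assert func(lst) == [2, 3, 2, 5, 6, 3]
    seconds = timeit.timeit(lambda: func(big), number=10)
    print(f"{func.__name__}: {seconds:.4f} s for 10 runs")
```

Running it a few times and on inputs shaped like the real data (long runs versus few repeats, ints versus larger objects) is probably more informative than any single figure.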
The itertools module is often highly optimized, meaning in many cases being done mostly NOT in interpreted python but in C or C++ or whatever. If you wrote a python version of the same idea, it might be less efficient. And in this case, it may be overkill. I mean do you know what is returned by groupby? A hint is that it returns TWO things and you are only using one. The second is nonsense for your example as you are using the default function that generates keys based on a sort of equality so all the members of the group are the same. But the full power of groupby is if you supply a function that guides choices such as wanting all items that are the same if written in all UPPER case.

So my guess is the itertools module chosen could be more than you need. But if it is efficient and the defaults click right in and do the job, who knows? My guess is there is more cost than the others for simple things but perhaps not for more complex things. As far as I can tell it compares each key with the previous one for equality rather than hashing, much like the others.

I hope my thoughts are helpful even if they do not provide a single unambiguous answer. They all seem like reasonable solutions and probably NONE of them would be expected if this was homework for a class just getting started. That class would expect a solution for a single type of object such as small integers and a fairly trivial implementation in a loop that may be an unrolled variant of perhaps close to choice TWO. Efficiency might be a secondary concern, if at all.

And for really long lists, weirdly, I might suggest a variant that starts by adding a unique item in front of the list and then removing it from the results at the end.

From alexkleider at gmail.com Sat Jul 16 18:01:24 2022
From: alexkleider at gmail.com (Alex Kleider)
Date: Sat, 16 Jul 2022 15:01:24 -0700
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <005001d8992b$227b77b0$67726710$@gmail.com>
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID:

On Sat, Jul 16, 2022 at 12:54 PM wrote:
>
> Manprit,
>
> Your message is not formatted properly in my email and you just asked any
> women present to not reply to you, nor anyone who has not been knighted by a
> Queen. I personally do not expect such politeness but clearly some do.
>

I confess it took me longer than it should have to figure out to what you were referring in the second half of the above but eventually the light came on and the smile blossomed! My next thought was that it wouldn't necessarily have had to have been a Queen although anyone knighted (by a King) prior to the beginning of our current Queen's reign is unlikely to be even alive let alone interested in this sort of thing.

Thanks for the morning smile!
a

PS My (at least for me easier to comprehend) solution:

def rm_duplicates(iterable):
    last = ''
    for item in iterable:
        if item != last:
            yield item
            last = item

lst = [2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3]

if __name__ == '__main__':
    res = [res for res in rm_duplicates(lst)]
    print(res)
    assert res == [2, 3, 2, 5, 6, 3]

-- 
alex at kleider.ca (sent from my current gizmo)

From wlfraed at ix.netcom.com Sat Jul 16 20:17:15 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Sat, 16 Jul 2022 20:17:15 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID: <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>

On Sat, 16 Jul 2022 11:45:54 -0400, declaimed the following:

>Your message is not formatted properly in my email and you just asked any

Just a comment: Might be your client -- it did come in as correctly "broken" lines in Gmane's news-server gateway to the mailing list.

OTOH: I had problems with a genealogy mailing list (not available as a "news group" on any server) with some posts. They are formatted properly when reading, but become one snarled string when quoted in a reply. But only from one or two posters -- so it is a combination of posting client vs reading client...
-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

From bouncingcats at gmail.com Sat Jul 16 20:37:25 2022
From: bouncingcats at gmail.com (David)
Date: Sun, 17 Jul 2022 10:37:25 +1000
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
Message-ID:

On Sun, 17 Jul 2022 at 10:18, Dennis Lee Bieber wrote:
> On Sat, 16 Jul 2022 11:45:54 -0400, declaimed the following:
>
> >Your message is not formatted properly in my email and you just asked any
>
> Just a comment: Might be your client -- it did come in as correctly
> "broken" lines in Gmane's news-server gateway to the mailing list.

In case it is helpful for avi.e.gross in future, the original message from Manprit Singh is formatted correctly in the mailing list archive when I view it in my web browser at
https://mail.python.org/pipermail/tutor/2022-July/119936.html

From __peter__ at web.de Sun Jul 17 04:26:37 2022
From: __peter__ at web.de (Peter Otten)
Date: Sun, 17 Jul 2022 10:26:37 +0200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID:

On 17/07/2022 00:01, Alex Kleider wrote:

> PS My (at least for me easier to comprehend) solution:
>
> def rm_duplicates(iterable):
>     last = ''
>     for item in iterable:
>         if item != last:
>             yield item
>             last = item

The problem with this is the choice of the initial value for 'last':

>>> list(rm_duplicates(["", "", 42, "a", "a", ""]))
[42, 'a', '']   # oops, we lost the initial empty string

Manprit avoided that in his similar solution by using a special value that will compare false except in pathological cases:

> val = object()
> [(val := ele) for ele in lst if ele != val]

Another fix is to yield the first item unconditionally:

def rm_duplicates(iterable):
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        return
    yield last
    for item in it:
        if item != last:
            yield item
            last = item

If you think that this doesn't look very elegant you may join me in the https://peps.python.org/pep-0479/ haters' club ;)

From avi.e.gross at gmail.com Sun Jul 17 00:52:20 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 17 Jul 2022 00:52:20 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com>
Message-ID: <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>

Dennis,

I did not blame the sender as much as say what trouble I had putting together in proper order what THIS mailer showed me. I am using Microsoft Outlook getting mail with IMAP4 from gmail and also forwarding a copy to AOL mail which was also driving me nuts by making my text look like that when I sent. Oddly, that one received the lines nicely!!!

Standards annoy some people but some standards really would be helpful if various email vendors agreed to implement things as much the same as possible. Realistically, many allow all kinds of customization. For some things it matters less but for code and ESPECIALLY code in python where indentation is part of the language, it is frustrating.

I may be getting touchy without the feely, but I am having trouble listening to the way some people with cultural differences, or far left/right attitudes, try to address me/us in forums like this. Alex may have been amused by my retort, and there is NOTHING wrong with saying "Dear Sirs" when done in many contexts, just like someone a while ago was writing to something like "Esteemed Professors" but it simply rubs me wrong here.

Back to topic, if I may, sometimes things set our moods.
I am here partially to be helpful and partially for my own amusement and education as looking at some of the puzzles presented presents opportunities to think and investigate. But before I could get to the reasonable question here, I was perturbed at the overly formulaic politeness and wrongness of the greeting from my perhaps touchy perspective for the reasons mentioned, including the way it seemingly assumes no Ladies are present and we are somehow Gentlemen, but also by the mess I saw on one wrapped line that was a pain to take apart.

Then I wondered why the question was being asked. Yes, weirdly, it is a question you and I have discussed before when wondering which way of doing something worked better, was more efficient, or showed a more brilliant way to use the wrong method to do something nobody designed it for!

But as this is supposed to be a TUTORIAL or HELP website, even if Alan rightfully may disagree and it is his forum, I am conscious of not wanting to make this into a major discussion group where the people we want to help just scratch their heads.

I am not sure who read my longish message, but I hope the main point is that sometimes you should just TEST it. This is not long and complex code. However, there cannot be any one test everyone will agree on and it often depends on factors other than CPU cycles. A robust implementation that can handle multiple needs may well be slower and yet more cost effective in some sense.

I have mentioned I do lots of my playing around with other languages too. Many have a minor variant of the issue here as in finding unique items in some collection such as a vector or data.frame. The problem here did not say whether the data being used can be in random order or already has all instances of the same value in order or is even in sorted order. Has anyone guessed if that is the case?
Because if something is already sorted as described, such as [0,0,1,1,1,1,2,4,4,4,4,4,4,5,5], then there are even more trivial solutions by using something like numpy.unique() using just a few lines and I wonder how efficient this is:

>>> import numpy as np
>>> np.unique([0,0,1,1,1,1,2,4,4,4,4,4,4,5,5])
array([0, 1, 2, 4, 5])

Admittedly this is a special case. But only the one asking the question can tell us if that is true.

It also works with character data and probably much more:

>>> np.unique(["a", "a", "b", "b", "b", "c"])
array(['a', 'b', 'c'], dtype='<U1')

But this was not offered as one of his three choices, so never mind!

From avi.e.gross at gmail.com Sun Jul 17 12:59:15 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 17 Jul 2022 12:59:15 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID: <00b001d899fe$8c3cb000$a4b61000$@gmail.com>

You could make the case, Peter, that you can use anything as a start that will not likely match in your domain. You are correct if an empty string may be in the data.

Now an object returned by object() is pretty esoteric and ought to be rare, and indeed each new object seems to be individual.

val=object()
[(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
-->> [1, 2, <object object at 0x...>, 3]

So the only way to trip this up is to use the same object or another reference to it, where it is silently ignored.

[(val := ele) for ele in [1,1,2,val,3,3,3] if ele != val]
-->> [1, 2, 3]

valiant = val
[(val := ele) for ele in [1,1,2,valiant,3,3,3] if ele != val]
-->> [1, 2, 3]

But just about any out-of-band value will presumably do. I mean if you are comparing just numbers, all you need do is slip in something else like "The quick brown fox jumped over the lazy dog" or float("inf") or even val = (math.inf, -math.inf) and so on.

I would have thought also of using the special value of None and it works fine unless the list has a None!

So what I see here is a choice between a heuristic solution that can fail to work quite right on a perhaps obscure edge case, or a fully deterministic algorithm that knows which is the first and treats it special.

The question asked was about efficiency, so let me ask a dumb question.
Is there a difference in efficiency of comparing to different things over and over again in the loop? I would think so. Comparing to None could turn out to be trivial. math.inf as implemented in python is just the floating-point infinity, as is float("inf"), and I have no idea what an object() looks like but assume it is an instance of the parent class of all other objects and thus has no content but some odd methods attached. Clearly the simplest comparison might be variable depending on what the data you are working on is.

So, yes, an unconditional way of dealing with the first item often is needed. It is very common in many algorithms for the first and perhaps last item to have no neighbor on one side.

From avi.e.gross at gmail.com Sun Jul 17 14:02:23 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Sun, 17 Jul 2022 14:02:23 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID: <015d01d89a07$5e34b230$1a9e1690$@gmail.com>

I was thinking of how expensive it is to push a copy of the first item in front of a list to avoid special casing in this case. I mean convert [first, ...] to [first, first, ...]

That neatly can deal with some algorithms such as, say, calculating a moving average if you want the first N items to simply be the same as the start item or a pre-calculated mean of the cumulative sum to that point, rather than empty or an error.

But I realized that with the emphasis on iterables in python, there probably is no easy way to push items into a sort of queue. Maybe you can somewhat do it with a decorator that intercepts your first calls by supplying the reserved content and only afterwards calls the iterator.

But as there are so many reasonable solutions, not an avenue needed to explore.
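
For what it is worth, the standard library's itertools.chain can lazily put an item in front of any iterable, which is one way to try the "push something in front" idea above without copying a list. The following is only a sketch of that idea, not code from this thread: the helper name dedupe_adjacent is made up, and it pushes a fresh object() sentinel rather than a copy of the first item so that an empty input needs no special case.

```python
import itertools


def dedupe_adjacent(iterable):
    """Yield items, skipping any item equal to the one just before it."""
    # A fresh object() compares unequal to ordinary values, so the first
    # real item is always kept (same trick as the val = object() version).
    sentinel = object()
    # chain() lazily "pushes" the sentinel in front; nothing is copied.
    padded = itertools.chain([sentinel], iterable)
    # Two views of the same stream, the second one shifted by one item.
    previous, current = itertools.tee(padded)
    next(current, None)
    for prev, item in zip(previous, current):
        if item != prev:
            yield item


print(list(dedupe_adjacent([2, 2, 3, 3, 3, 2, 2, 5, 5, 6, 3, 3, 3, 3])))
print(list(dedupe_adjacent(["", "", 42, "a", "a", ""])))
```

Because it only relies on iteration, this should also accept generators and other one-shot iterables, and an empty input simply yields nothing.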
-----Original Message----- From: Tutor On Behalf Of Peter Otten Sent: Sunday, July 17, 2022 4:27 AM To: tutor at python.org Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list On 17/07/2022 00:01, Alex Kleider wrote: > PS My (at least for me easier to comprehend) solution: > > def rm_duplicates(iterable): > last = '' > for item in iterable: > if item != last: > yield item > last = item The problem with this is the choice of the initial value for 'last': >>> list(rm_duplicates(["", "", 42, "a", "a", ""])) [42, 'a', ''] # oops, we lost the initial empty string Manprit avoided that in his similar solution by using a special value that will compare false except in pathological cases: > val = object() > [(val := ele) for ele in lst if ele != val] Another fix is to yield the first item unconditionally: def rm_duplicates(iterable): it = iter(iterable) try: last = next(it) except StopIteration: return yield last for item in it: if item != last: yield item last = item If you think that this doesn't look very elegant you may join me in the https://peps.python.org/pep-0479/ haters' club ;) _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From wlfraed at ix.netcom.com Sun Jul 17 21:21:30 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Sun, 17 Jul 2022 21:21:30 -0400 Subject: [Tutor] Ways of removing consequtive duplicates from a list References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com> <003701d89998$ffcfd940$ff6f8bc0$@gmail.com> Message-ID: On Sun, 17 Jul 2022 00:52:20 -0400, declaimed the following: >I did not blame the sender as much as say what trouble I had putting >together in proper order what THIS mailer showed me. 
I am using Microsoft >Outlook getting mail with IMAP4 from gmail and also forwarding a copy to AOL >mail which was also driving me nuts by making my text look like that when I >sent. Oddly, that one received the lines nicely!!! > Personally, I don't trust anything gmail and/or GoogleGroups produces ... And, hopefully without offending, I find M$ Outlook to be the next offender. My experience (when I was employed, and the companies used Outlook as the only email client authorized) was that it went out of its way to make it almost impossible to respond in accordance with RFC1855 """ - If you are sending a reply to a message or a posting be sure you summarize the original at the top of the message, or include just enough text of the original to give a context. This will make sure readers understand when they start to read your response. Since NetNews, especially, is proliferated by distributing the postings from one host to another, it is possible to see a response to a message before seeing the original. Giving context helps everyone. But do not include the entire original! """ Outlook, in my experience, attempts to replicate corporate mail practices of ages past. Primarily by treating "quoted content" as if it were a photocopy being attached as a courtesy copy/reminder (I've seen messages that had something like 6 or more levels of indentation and font-size reductions as it just took the content of a post, applied an indent (not a standard > quote marker) shift and/or font reduction). Attempting to do a trim to relevant content with interspersed reply content was nearly impossible as one had to figure out how to under the style at the trim point -- otherwise one's inserted text ended up looking just like the quoted text and not as new content. I'll admit that I've seen configuration options to allow for something closer to RFC1855 format... But they were so buried most people never see them. Instead we get heavily styled HTML for matters which only need simple text. 
>
>I may be getting touchy without the feely, but I am having trouble listening
>to the way some people with cultural differences, or far left/right
>attitudes, try to address me/us in forums like this. Alex may have been
>amused by my retort, and there is NOTHING wrong with saying "Dear Sirs" when
>done in many contexts, just like someone a while ago was writing to
>something like "Esteemed Professors" but it simply rubs me wrong here.
>

The ones that most affect me are those that start out with:
"I have doubt..." where "doubt" is being used in place of "question".

-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

From PythonList at DancesWithMice.info Sun Jul 17 23:34:16 2022
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 18 Jul 2022 15:34:16 +1200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID:

On 17/07/2022 20.26, Peter Otten wrote:
> On 17/07/2022 00:01, Alex Kleider wrote:
>
>> PS My (at least for me easier to comprehend) solution:
>>
>> def rm_duplicates(iterable):
>>     last = ''
>>     for item in iterable:
>>         if item != last:
>>             yield item
>>             last = item
>
> The problem with this is the choice of the initial value for 'last':

Remember "unpacking", eg

>> def rm_duplicates(iterable):
>>     current, *the_rest = iterable
>>     for item in the_rest:

Then there is the special case, which (assuming it is possible) can be
caught as an exception - which will likely need to 'ripple up' through
the function-calls because the final collection of 'duplicates' will be
empty/un-process-able.
(see later comment about "unconditionally")

Playing in the REPL:

>>> iterable = [1,2,3]
>>> first, *rest = iterable
>>> first, rest
(1, [2, 3])
# rest is a list
>>> iterable = [1,2]
>>> first, *rest = iterable
>>> first, rest
(1, [2])
# rest is (technically) a list
>>> iterable = [1]
>>> first, *rest = iterable
>>> first, rest
(1, [])
# rest is an empty list
>>> iterable = []
>>> first, *rest = iterable
Traceback (most recent call last):
  File "", line 1, in <module>
ValueError: not enough values to unpack (expected at least 1, got 0)
# nothing to see here: no duplicates - and no 'originals' either!

> >>> list(rm_duplicates(["", "", 42, "a", "a", ""]))
> [42, 'a', '']   # oops, we lost the initial empty string
>
> Manprit avoided that in his similar solution by using a special value
> that will compare false except in pathological cases:
>
>> val = object()
>> [(val := ele) for ele in lst if ele != val]
>
> Another fix is to yield the first item unconditionally:
>
> def rm_duplicates(iterable):
>     it = iter(iterable)
>     try:
>         last = next(it)
>     except StopIteration:
>         return
>     yield last
>     for item in it:
>         if item != last:
>             yield item
>             last = item
>
> If you think that this doesn't look very elegant you may join me in the
> https://peps.python.org/pep-0479/ haters' club ;)

This does indeed qualify as 'ugly'. However, it doesn't need to be
expressed in such an ugly fashion!

-- 
Regards,
=dn

From PythonList at DancesWithMice.info Mon Jul 18 01:34:08 2022
From: PythonList at DancesWithMice.info (dn)
Date: Mon, 18 Jul 2022 17:34:08 +1200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com> <003701d89998$ffcfd940$ff6f8bc0$@gmail.com>
Message-ID: <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>

>> ...
you just asked any >> women present to not reply to you, nor anyone who has not been knighted by a >> Queen. I personally do not expect such politeness but clearly some do. >> > > I confess it took me longer than it should have to figure out to what > you were referring in the second half of the above but eventually the > light came on and the smile blossomed! > My next thought was that it wouldn't necessarily have had to have been > a Queen although anyone knighted (by a King) prior to the beginning of > our current Queen's reign is unlikely to be even alive let alone > interested in this sort of thing. > Thanks for the morning smile! I've not been knighted, but am frequently called "Sir". Maybe when you too have a grey beard? (hair atop same head, optional) It is not something that provokes a positive response, and if in a grumpy-mood may elicit a reply describing it as to be avoided "unless we are both in uniform". Thus, the same words hold different dictionary-meanings and different implications for different people, in different contexts, and between cultures! I'm not going to invoke any Python community Code of Conduct terms or insist upon 'politically-correctness', but am vaguely-surprised that someone has not... The gender observation is appropriate, but then how many of the OP's discussions feature responses from other than males? (not that such observation could be claimed as indisputable) Writing 'here', have often used constructions such as "him/her", with some question about how many readers might apply each form... The dislocation in response to the OP is cultural. (In this) I have advantage over most 'here', having lived and worked in India. (also having been brought-up, way back in the last century, in an old-English environment, where we were expected to address our elders using such titles. 
Come to think of it, also in the US of those days) In India, and many others parts of Asia, the respectful address of teachers, guides, and elders generally, is required-behavior. In the Antipodes, titles of almost any form are rarely used, and most will exchange first-names at an introduction. Whereas in Germany (for example), the exact-opposite applies, and one must remember to use the Herr-s, Doktor-s, du-forms etc. How to cope when one party is at the opposite 'end' of the scale from another? I'm reminded of 'Postel's law': "Be liberal in what you accept, and conservative in what you send". Is whether someone actually knows what they're talking-about more relevant (and more telling) than their qualifications, rank, title, whatever - or does that only apply in the tech-world where we seem to think we can be a technocracy? Living in an 'immigrant society' today, and having gone through such a process (many times, in many places) I'm intrigued by how quickly - or how slowly, some will adapt to the local culture possibly quite alien to them. Maybe worst of all are the ones who observe, but then react by assuming (or claiming) superiority - less an attempt to 'fit in', but perhaps an intent to be 'more equal'... > I may be getting touchy without the feely, but I am having trouble listening > to the way some people with cultural differences, or far left/right > attitudes, try to address me/us in forums like this. Alex may have been > amused by my retort, and there is NOTHING wrong with saying "Dear Sirs" when Disagree: when *I* read the message, I am me. I am in the singular. When *you* write, you (singular) are writing to many of us (plural). Who is the more relevant party to the communication? Accordingly, "Dear Sir" not "Sirs" - unless you are seeking a collective or corporate reply, eg from a firm of solicitors. 
(cf the individual replies (plural, one might hope) you expect from multiple individuals - who happen to be personal-members of the (collective) mailing-list). > done in many contexts, just like someone a while ago was writing to > something like "Esteemed Professors" but it simply rubs me wrong here. Like it appears do you, I quickly lose respect for 'esteemed professors/professionals' who expect this, even revel in it. However, if one is a student or otherwise 'junior', it is a career-limiting/grade-reducing move not to accede! That said, two can play at that game: someone wanting to improve his/her grade (or seeking some other favor) will attempt ingratiation through more effusive recognition and compliment ("gilding the lily"). whither 'respect'? I recall a colleague, on an International Development team assigned to a small Pacific country, who may have been junior or at most 'equal' to myself in 'rank'. Just as in India, he would introduce himself formally as "Dr Chandrashekar" plus full position and assignment. In a more relaxed situation, his informal introduction was "Dr Chandra". It was amusing to watch the reactions both 'westerners' and locals had to this. Seeing how it didn't 'fit' with our host-culture, we took sardonic delight in referring to him as "Chandra". (yes, naughty little boys!) One day my (local, and non-tech, and female) assistant, visibly shaking, requested a private meeting with another member of the team and myself. Breaking-down into tears she apologised for interrupting the urgent-fix discussion we'd been having with senior IT staff the day before, even as we knew we were scheduled elsewhere. Her 'apology' was that Chandra was (twice) insistent for our presence and demanded that meeting be interrupted, even terminated - and that she had to obey, she said, "because he is Doctor". (we tried really hard not to laugh) For our part, knowing the guy, we knew that she should not be the recipient of any 'blow-back'. 
After plentiful reassurance that she was not 'in trouble' with either of us, and a talk (similar to 'here') about the [ab]use of 'titles', she not only understood, but paid us both a great compliment saying something like: I call you (first-name) because we all work together, but I call him "Doctor" because he expects me to do things *for* him! Being called by my given-name, unadorned, always proved a 'buzz' thereafter! > Back to topic, if I may, sometimes things set our moods. I am here partially > to be helpful and partially for my own amusement and education as looking at > some of the puzzles presented presents opportunities to think and > investigate. > > But before I could get to the reasonable question here, I was perturbed at > the overly formulaic politeness and wrongness of the greeting from my > perhaps touchy perspective for the reasons mentioned including the way it > seeming assumes no Ladies are present and we are somehow Gentlemen, but also > by the mess I saw on one wrapped line that was a pain to take apart. Then I > wondered why the question was being asked. Yes, weirdly, it is a question > you and I have discussed before when wondering which way of doing something > worked better, was more efficient, or showed a more brilliant way to use the > wrong method to do something nobody designed it for! Yep, rubs me the wrong way too! (old grumpy-guts is likely to say "no!", on principle - and long before they've even finished their wind-up!) BTW such is not just an Asian 'thing' either - I recall seeing, and quickly avoiding, the latest version of a perennial discussion about protocol. Specifically, the sequence of email-addresses one should use in the To: and Cc: fields of email-messages (and whether or not Bcc: is "respectful"). Even today, in the US and UK, some people and/or organisations demand that the more 'important' names should precede those of mere-minions. "We the people" meets "some, more equal than others"! 
Yes, and the OP does irritate by not answering questions from 'helpers'. He does publish (for income/profit). I don't know if he has ever used/repeated any of the topics discussed 'here' - nor if in doing-so he attributes and credits appropriately (by European, UK, US... standards). > I am not sure who read my longish message, but I hope the main point is that > sometimes you should just TEST it. This is not long and complex code. > However, there cannot be any one test everyone will agree on and it often > depends on factors other than CPU cycles. A robust implementation that can > handle multiple needs may well be slower and yet more cost effective in some > sense. Another source of irritation: define terms-used, eg what is the metric for "better" or "best"? Frankly, the succession of 'academic questions' with dubious application in the real world (CRC-checks notwithstanding) have all the flavor of someone writing an old-fashioned text-book - emphasis on facts, with professional application relegated to lesser (if any) import, and perhaps more than a little "I'm so much smarter than you". NB the Indian and many Asian education systems use techniques which are regarded as 'old', yet at the same time they are apparently effective! -- Regards, =dn From __peter__ at web.de Mon Jul 18 02:49:21 2022 From: __peter__ at web.de (Peter Otten) Date: Mon, 18 Jul 2022 08:49:21 +0200 Subject: [Tutor] Implicit passing of argument select functions being called In-Reply-To: References: <931b5db0-047f-90c3-ee6c-35e6a34c01f2@gmail.com> Message-ID: <915d5595-34bd-fd58-0819-4fb13a09a50b@web.de> On 14/07/2022 09:56, ?????????? ????? 
wrote:
> I've finally figured out a solution, I'll leave it here in case it helps
> someone else, using inspect.signature
>
>
> The actual functions would look like this:
>
> dfilter = simple_filter(start_col = 6,)
> make_pipeline(
>     load_dataset(fpath="some/path.xlsx",
>                  header=[0,1]),
>     apply_internal_standard(target_col = "EtD5"),
>     export_to_sql(fname = "some/name"),
>     data_filter = dfilter
> )

Whatever works ;)

I think I would instead ensure that all transformations have the same
signature. functools.partial() could be helpful to implement this.
Simple example:

>>> def add(items, value):
    return [item + value for item in items]

>>> def set_value(items, value, predicate):
    return [value if predicate(item) else item for item in items]

>>> def transform(items, *transformations):
    for trafo in transformations:
        items = trafo(items)
    return items

>>> from functools import partial
>>> transform(
    [-3, 7, 5],
    # add 5 to each item
    partial(add, value=5),
    # set items > 10 to 0
    partial(set_value, value=0, predicate=lambda x: x > 10)
)
[2, 0, 10]

From __peter__ at web.de Mon Jul 18 03:15:14 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 18 Jul 2022 09:15:14 +0200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To: <00b001d899fe$8c3cb000$a4b61000$@gmail.com>
References: <005001d8992b$227b77b0$67726710$@gmail.com> <00b001d899fe$8c3cb000$a4b61000$@gmail.com>
Message-ID: <28dee502-7274-b2cb-26f5-89010761f42e@web.de>

On 17/07/2022 18:59, avi.e.gross at gmail.com wrote:
> You could make the case, Peter, that you can use anything as a start that
> will not likely match in your domain. You are correct if an empty string may
> be in the data.
>
> Now an object returned by object is pretty esoteric and ought to be rare and
> indeed each new object seems to be individual.
>
> val=object()
>
> [(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val]
> -->> [1, 2, <object object at 0x...>, 3]
>
> So the only way to trip this up is to use the same object or another
> reference to it where it is silently ignored.

When you want a general solution for removal of consecutive duplicates
you can put the line

val = object()

into the deduplication function which makes it *very* unlikely that val
will also be passed as an argument to that function.

To quote myself:

> Manprit avoided that in his similar solution by using a special value
> that will compare false except in pathological cases:
>
>> val = object()
>> [(val := ele) for ele in lst if ele != val]

What did I mean with "pathological"?

One problematic case would be an object that compares equal to everything,

class A:
    def __eq__(self, other): return True
    def __ne__(self, other): return False

but that is likely to break the algorithm anyway.

Another problematic case: objects that only implement comparison for
other objects of the same type.
For these deduplication will work if you avoid the out-of-band value:

>>> class A:
    def __init__(self, name):
        self.name = name
    def __eq__(self, other): return self.name == other.name
    def __ne__(self, other): return self.name != other.name
    def __repr__(self):
        return f"A(name={self.name})"

>>> prev = object()
>>>
>>> [(prev:=item) for item in map(A, "abc") if item != prev]
Traceback (most recent call last):
  File "", line 1, in <module>
    [(prev:=item) for item in map(A, "abc") if item != prev]
  File "", line 1, in <listcomp>
    [(prev:=item) for item in map(A, "abc") if item != prev]
  File "", line 5, in __ne__
    def __ne__(self, other): return self.name != other.name
AttributeError: 'object' object has no attribute 'name'

>>> def rm_duplicates(iterable):
    it = iter(iterable)
    try:
        last = next(it)
    except StopIteration:
        return
    yield last
    for item in it:
        if item != last:
            yield item
            last = item

>>> list(rm_duplicates(map(A, "aabccc")))
[A(name=a), A(name=b), A(name=c)]
>>>

From avi.e.gross at gmail.com Mon Jul 18 00:22:32 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Mon, 18 Jul 2022 00:22:32 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID: <012301d89a5e$00863b70$0192b250$@gmail.com>

Dennis,

Unpacking is an interesting approach. Your list example seems to return a
shorter list which remains iterable. But what does it mean to unpack other
iterables like a function that yields? Does the unpacking call it as often
as needed to satisfy the first variables you want filled and then pass a
usable version of the iterable to the last argument?

Since the question asked was about what approach is in some way better,
unpacking can be a sort of hidden cost or it can be done very efficiently.
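
The unpacking question can be checked with a small experiment. What follows is
only a minimal sketch: numbers() and its log argument are hypothetical names
made up for illustration. It suggests that starred unpacking does not hand
back a lazy remainder; it runs the generator to exhaustion and stores the tail
in a list, whereas next() pulls a single item and leaves the iterator usable.

```python
def numbers(log):
    """Hypothetical generator that records every item it produces."""
    for n in (1, 2, 3, 4):
        log.append(n)
        yield n

# Starred unpacking: the whole generator is consumed immediately.
eager_log = []
first, *rest = numbers(eager_log)
print(first, rest)    # 1 [2, 3, 4]
print(type(rest))     # <class 'list'>
print(eager_log)      # [1, 2, 3, 4]

# next(): only the first item is produced; the rest stays lazy.
lazy_log = []
it = numbers(lazy_log)
head = next(it)
print(head, lazy_log)  # 1 [1]
```

So for a large or unbounded iterable the starred form carries the hidden cost
asked about: the tail is copied into a new list before the loop even starts.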

From __peter__ at web.de Mon Jul 18 04:25:19 2022
From: __peter__ at web.de (Peter Otten)
Date: Mon, 18 Jul 2022 10:25:19 +0200
Subject: [Tutor] Ways of removing consequtive duplicates from a list
In-Reply-To:
References: <005001d8992b$227b77b0$67726710$@gmail.com>
Message-ID:

On 18/07/2022 05:34, dn wrote:
> On 17/07/2022 20.26, Peter Otten wrote:
>> On 17/07/2022 00:01, Alex Kleider wrote:
>>
>>> PS My (at least for me easier to comprehend) solution:
>>>
>>> def rm_duplicates(iterable):
>>>     last = ''
>>>     for item in iterable:
>>>         if item != last:
>>>             yield item
>>>             last = item
>>
>> The problem with this is the choice of the initial value for 'last':
>
> Remember "unpacking", eg

Try

first, *rest = itertools.count()

;)

If you want the full generality of the iterator-based approach unpacking
is right out.

>> Another fix is to yield the first item unconditionally:
>>
>> def rm_duplicates(iterable):
>>     it = iter(iterable)
>>     try:
>>         last = next(it)
>>     except StopIteration:
>>         return
>>     yield last
>>     for item in it:
>>         if item != last:
>>             yield item
>>             last = item
>>
>> If you think that this doesn't look very elegant you may join me in the
>> https://peps.python.org/pep-0479/ haters' club ;)

> This does indeed qualify as 'ugly'. However, it doesn't need to be
> expressed in such an ugly fashion!

On second thought, I'm not exception-shy, and compared to the other options

last = next(it, out_of_band)
if last is out_of_band:
    return

or

for last in it:
    break
else:
    return

I prefer the first version. I'd probably go with

[key for key, _value in groupby(items)]

though. However, if you look at the Python equivalent to groupby()

https://docs.python.org/3/library/itertools.html#itertools.groupby

you'll find that this just means that the "ugly" parts have been written
by someone else. I generally like that strategy -- if you have an "ugly"
piece of code, stick it into a function with a clean interface, add some
tests to ensure it works as advertised, and move on.

From marcus.luetolf at bluewin.ch Mon Jul 18 05:59:51 2022
From: marcus.luetolf at bluewin.ch (marcus.luetolf at bluewin.ch)
Date: Mon, 18 Jul 2022 11:59:51 +0200
Subject: [Tutor] problem solving with lists: final (amateur) solution
In-Reply-To:
References: <000f01d888c3$c32e6eb0$498b4c10$@bluewin.ch>
Message-ID: <000b01d89a8d$1f6fda80$5e4f8f80$@bluewin.ch>

Hello Experts, hello dn,
after having studied your valuable critiques I revised my code as below.

A few remarks:
The terms "pythonish" and "dummy_i" I got from a Rice University's online
lecture on python style.
If there is concern about a reader having to "switch gears" between reading
a for loop and a list comprehension then the concern should be even greater
when reading a nested list comprehension coding the 4 flights for day 1.
I've read that a list comprehension should not exceed one line of code,
otherwise a for loop should be used in favor of readability.

I am quite sure that my revised code does not meet all critique points;
in particular there are still separate code parts (snippets?) for day_1 and
day_2 to day_5, but not separate functions, because at present I am unable
to "fold them in".

I also updated my code on github accordingly and adjusted docstrings and
comments:
https://github.com/luemar/player_startlist/blob/main/start_list.py

def start_list(all_players, num_in_flight, num_days_played):
    print('...............day_1................')
    history = {'a':[], 'b':[],'c':[],'d':[],'e':[],'f':[],'g':[],'h':[],
               'i':[],'j':[],'k':[],'l':[],'m':[],'n':[],'o':[],'p':[]}
    # day 1: consecutive blocks of num_in_flight players form the flights
    for lead_player_index in range(0, len(all_players), num_in_flight):
        players = all_players[lead_player_index: lead_player_index + num_in_flight]
        [history[pl_day_1].extend(players) for pl_day_1 in players]
        print(all_players[lead_player_index] + '_flight_day_1:',players)

    # remaining days: i = 0 corresponds to day 2
    for i in range(num_days_played - 1):
        flights = {}
        c_all_players = all_players[:]
        print('...............day_' + str(i+2)+'................')
        flights['a_flight_day_'+str(i+2)]= []
        flights['b_flight_day_'+str(i+2)]= []
        flights['c_flight_day_'+str(i+2)]= []
        flights['d_flight_day_'+str(i+2)]= []
        lead = list('abcd')
        flight_list = [flights['a_flight_day_'+str(i+2)],
                       flights['b_flight_day_'+str(i+2)],
                       flights['c_flight_day_'+str(i+2)],
                       flights['d_flight_day_'+str(i+2)]]

        for j in range(len(flight_list)):
            def flight(cond, day):
                # add every player the lead has not yet played with
                for player in all_players:
                    if player not in cond:
                        day.extend(player)
                        cond.extend(history[player])
                        history[lead[j]].extend(player)
                day.extend(lead[j])
                day.sort()
                [history[pl_day_2_5].extend(day) for pl_day_2_5 in day[1:]]
                return lead[j]+'_flight_day_'+str(i+2)+ ': ' + str(flight_list[j])

            conditions = [history[lead[j]],
                          history[lead[j]] + flights['a_flight_day_'+str(i+2)],
                          history[lead[j]] + flights['a_flight_day_'+str(i+2)] +
                          flights['b_flight_day_'+str(i+2)],
                          history[lead[j]] + flights['a_flight_day_'+str(i+2)] +
                          flights['b_flight_day_'+str(i+2)] +
                          flights['c_flight_day_'+str(i+2)]]
            print(flight(list(set(conditions[j])), flight_list[j]))

num_in_flight = 4
if num_in_flight != 4:
    raise ValueError('out of size of flight limit')
num_days_played = 5
if num_days_played >5 or num_days_played <2:
    raise ValueError('out of playing days limit')
all_players = list('abcdefghijklmnop')
start_list(all_players,num_in_flight , num_days_played)

Many thanks and regards, Marcus.

-----Original Message-----
From: dn
Sent: Sunday, 26 June 2022 02:35
To: marcus.luetolf at bluewin.ch; tutor at python.org
Subject: Re: AW: [Tutor] problem solving with lists: final (amateur) solution

On 26/06/2022 06.45, marcus.luetolf at bluewin.ch wrote:
> Hello Experts, hello dn,
> it's a while since I - in terms of Mark Lawrence - bothered you with
> my problem.
> Thanks to your comments, especially to dn's structured guidance I've
> come up with the code below, based on repeatability.
> I am sure there is room for improvement concerning pythonish style
> but for the moment the code serves my purposes.
> A commented version can be found on
> https://github.com/luemar/player_startlist.
>
> def startlist(all_players, num_in_flight):
>     c_all_players = all_players[:]
>     history = {'a':[], 'b':[],'c':[],'d':[],'e':[],'f':[],'g':[],'h':[],
>                'i':[],'j':[],'k':[],'l':[],'m':[],'n':[],'o':[],'p':[]}
>     print('...............day_1................')
>     def day_1_flights():
>         key_hist = list(history.keys())
>         c_key_hist = key_hist[:]
>         for dummy_i in c_key_hist:
>             print('flights_day_1: ', c_key_hist[:num_in_flight])
>             for key in c_key_hist[:num_in_flight]:
>                 [history[key].append(player)for player in c_all_players[0:num_in_flight]]
>             del c_key_hist[:num_in_flight]
>             del c_all_players[0:num_in_flight]
>     day_1_flights()
>
>     def day_2_to_5_flights():
>         flights = {}
>         for i in range(2,6):
>             print('...............day_' + str(i)+'................')
>             flights['a_flight_day_'+str(i)]= []
>             flights['b_flight_day_'+str(i)]= []
>             flights['c_flight_day_'+str(i)]= []
>             flights['d_flight_day_'+str(i)]= []
>             lead = list('abcd')
>             flight_list = [flights['a_flight_day_'+str(i)],
>                            flights['b_flight_day_'+str(i)],
>                            flights['c_flight_day_'+str(i)],
>                            flights['d_flight_day_'+str(i)]]
>
>             for j in range(len(flight_list)):
>                 def flight(cond, day):
>                     for player in all_players:
>                         if player not in cond:
>                             day.extend(player)
>                             cond.extend(history[player])
>                             history[lead[j]].extend(player)
>                     day.extend(lead[j])
>                     day.sort()
>                     [history[pl].extend(day) for pl in day[1:]]
>                     return lead[j]+'_flight_day_'+str(i)+ ': ' + str(flight_list[j])
>
>                 conditions = [history[lead[j]],
>                               history[lead[j]] + flights['a_flight_day_'+str(i)],
>                               history[lead[j]] + flights['a_flight_day_'+str(i)] +
>                               flights['b_flight_day_'+str(i)],
>                               history[lead[j]] + flights['a_flight_day_'+str(i)] +
>                               flights['b_flight_day_'+str(i)] +
>                               flights['c_flight_day_'+str(i)]]
>                 print(flight(list(set(conditions[j])), flight_list[j]))
>     day_2_to_5_flights()
> startlist(list('abcdefghijklmnop'), 4)
>
> Many thanks, Marcus.
...

> The word "hardcoded" immediately stopped me in my tracks!
>
> The whole point of using the computer is to find 'repetition' and have
> the machine/software save us from such boredom (or nit-picking detail
> in which we might make an error/become bored).
...

> The other 'side' of both of these code-constructs is the data-construct.
> Code-loops require data-collections! The hard-coded "a" and "day_1"
> made me shudder.
> (not a pretty sight - the code, nor me shuddering!)
...

> Sadly, the 'hard-coded' parts may 'help' sort-out week-one, but (IMHO)
> have made things impossibly-difficult to proceed into week-two (etc).
...

It works. Well done!
What could be better than that?

[Brutal] critique:

- "pythonish" in German becomes "pythonic" in English (but I'm sure we
all understood)

- position the two inner-functions outside and before startlist()

- whereas the "4", ie number of players per flight (num_in_flight), is
defined as a parameter in the call to startlist(), the five "times or
days" is a 'magic constant' (worse, it appears in day_2_to_5_flights()
as part of "range(2,6)" which 'disguises' it due to Python's way of
working)

- the comments also include reference to those parameters as if they are
constants (which they are - if you only plan to use the algorithm for
this 16-4-5 configuration of the SGP). Thus, if the function were called
with different parameters, the comments would be not only wrong but have
the potential to mislead the reader

- in the same vein (as the two points above), the all_players (variable)
argument is followed by the generation of history as a list of constants
("constant" cf "variable")

- on top of which: day_1_flights() generates key_hist from history even
though it already exists as all_players

- the Python facility for a 'dummy value' (that will never be used, or
perhaps only 'plugged-in' to 'make things happen') is _ (the
under-score/under-line character), ie

    for _ in c_key_hist:

- an alternative to using a meaningless 'placeholder' with no
computational-purpose, such as _ or dummy_i, is to choose an identifier
which aids readability, eg

    for each_flight in c_key_hist

- well done for noting that a list-comprehension could be used to
generate history/ies. Two thoughts:

  1 could the two for-loops be combined into a single nested
  list-comprehension?

  2 does the reader's mind have to 'change gears' between reading the
  outer for-loop as a 'traditional-loop' structure, and then the
  inner-loop as a list-comprehension? ie would it be better to use the
  same type of code-construct for both?

- both the code- and data-structures of day_1_flights() seem rather
tortured (and tortuous), and some are unused and therefore unnecessary.
Might it be possible to simplify, if the control-code commences with:

    for first_player_index in range( 0, len( all_players ), num_in_flight ):
        print( first_player_index,
            all_players[ first_player_index: first_player_index+num_in_flight ]
            )

NB the print() is to make the methodology 'visible'.

- the docstring for day_1_flights() is only partly-correct. Isn't the
function also creating and building the history set?

- that being the case, should the initial set-definition be moved inside
the function?

- functions should not depend upon global values. How does the history
'pass' from one function to another - which is allied to the question:
how do the functions know about values such as _all_players and
num_in_flight? To make the functions self-contained and "independent",
these values should be passed-in as parameters/arguments and/or return-ed

- much of the above also applies to day_2_to_5_flights()

- chief concern with day_2_to_5_flights() is: what happens to
d_flight_day_N if there are fewer/more than four players per flight, or
what if there are fewer/more than 5 flights?

- the observation that the same players would always be the 'lead' of a
flight, is solid. Thus, could the lead-list be generated from a
provided-parameter, rather than stated as a constant? Could that
construct (also) have been used in the earlier function?

- we know (by definition) that flight() is an unnecessary set of
conditions to apply during day_1, but could it be used nonetheless? If
so, could day_1_flights() be 'folded into' day_2_to_5_flights() instead
of having separate functions? (yes, I recall talking about the essential
differences in an earlier post - and perhaps I'm biased because this was
how I structured the draft-solution)

[More than] enough for now?
-- 
Regards,
=dn

From wlfraed at ix.netcom.com Mon Jul 18 11:17:43 2022
From: wlfraed at ix.netcom.com (Dennis Lee Bieber)
Date: Mon, 18 Jul 2022 11:17:43 -0400
Subject: [Tutor] Ways of removing consequtive duplicates from a list
References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com> <003701d89998$ffcfd940$ff6f8bc0$@gmail.com> <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info>
Message-ID:

On Mon, 18 Jul 2022 17:34:08 +1200, dn
declaimed the following:

>Yes, and the OP does irritate by not answering questions from 'helpers'.
>He does publish (for income/profit). I don't know if he has ever
>used/repeated any of the topics discussed 'here' - nor if in doing-so he
>attributes and credits appropriately (by European, UK, US... standards).
>

You managed to pique my interest -- so I hit Google. I don't
have a LinkedIn account, so I can't get beyond the intro page, but...
https://in.linkedin.com/in/manprit-singh-87961a1ba has recent "activity" posts that appear to be derived from the SQLite3 Aggregate function thread (LinkedIn blocks my cut&pasting the heading text). Presuming this IS the same person, I begin to have /my/ doubts: a "technical trainer" who seems to be using the tutor list for his own training? Constantly posting toy examples with the question "which is better/more efficient/etc." yet never (apparently) bothering to learn techniques for profiling/timing these examples (nor making examples with real-world data quantities for which profiling would show differences). -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ From avi.e.gross at gmail.com Mon Jul 18 12:20:11 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 18 Jul 2022 12:20:11 -0400 Subject: [Tutor] Ways of removing consequtive duplicates from a list In-Reply-To: References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com> <003701d89998$ffcfd940$ff6f8bc0$@gmail.com> <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info> Message-ID: <006801d89ac2$41311be0$c39353a0$@gmail.com> Dennis, You may be changing the topic from pythons to lions as an extremely common Sikh name is variations on Singh and it is also used by some Hindus and others. I, too, have had no incentive to join LinkedIn but can see DOZENS of people with a name that might as well be John Smith. How can you zoom in on the right one? They are literally everywhere including working at LinkedIn, Microsoft and so on and plenty of them are in a search with SQL as part of it. Your link fails for me. However, I had not considered some possibilities of how someone might use a group like this. I mean it can be to collect mistakes people make or where people get stuck or how the people posting replies think, make assumptions, suggest techniques and so on. But so what? 
Some questions may well be reasonable even if for a purpose. And yes, some people abuse things and can make it worse for others. I know that some people/questions do after a while motivate me to ignore them. The fact is that within a few days of people discussing something here, it is considered decent for the one who started it to chime in and answer our questions or comment on what we said or simply say the problem is solved and we can move on. I will say this, I have had what I wrote ending up published when it still contained a spelling error and I was not thrilled as my informal posts are not meant to be used this way without my permission. If someone here told us up-front what they wanted to do with our work, I might be way more careful or opt out. -----Original Message----- From: Tutor On Behalf Of Dennis Lee Bieber Sent: Monday, July 18, 2022 11:18 AM To: tutor at python.org Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list On Mon, 18 Jul 2022 17:34:08 +1200, dn declaimed the following: >Yes, and the OP does irritate by not answering questions from 'helpers'. >He does publish (for income/profit). I don't know if he has ever >used/repeated any of the topics discussed 'here' - nor if in doing-so >he attributes and credits appropriately (by European, UK, US... standards). > You managed to pique my interest -- so I hit Google. I don't have a LinkedIn account, so I can't get beyond the intro page, but... https://in.linkedin.com/in/manprit-singh-87961a1ba has recent "activity" posts that appear to be derived from the SQLite3 Aggregate function thread (LinkedIn blocks my cut&pasting the heading text). Presuming this IS the same person, I begin to have /my/ doubts: a "technical trainer" who seems to be using the tutor list for his own training? Constantly posting toy examples with the question "which is better/more efficient/etc."
yet never (apparently) bothering to learn techniques for profiling/timing these examples (nor making examples with real-world data quantities for which profiling would show differences). -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 18 12:45:19 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 18 Jul 2022 12:45:19 -0400 Subject: [Tutor] Ways of removing consequtive duplicates from a list In-Reply-To: <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info> References: <005001d8992b$227b77b0$67726710$@gmail.com> <41l6dhhjki03i05uue62o0a35vnd6n0q6d@4ax.com> <003701d89998$ffcfd940$ff6f8bc0$@gmail.com> <5b42eea3-f886-bf85-1ec3-884b1fd2694c@DancesWithMice.info> Message-ID: <008001d89ac5$c49bcf90$4dd36eb0$@gmail.com> I will try a short answer to the topic of why some of us (meaning in this case ME) react to what we may see not so much as cultural differences but to feeling manipulated. I have had people in my life who say things like "You are so smart you can probably do this for me in just a few minutes" and lots of variations on such a theme. As it happens, that is occasionally true when I already happen to know how to do it. But sometimes I have to do lots of research and experimentation or ask others with specific expertise. Being smart or over-educated in one thing is not the same as being particularly good at other things. How would you feel if asked to write a program (for no pay) and after spending lots of time and showing the results, the other guy says that this is pretty much how they already did it and they just wanted to see if it was right or the best way so they asked you? What a waste of time for no real result! 
I have found there are people in this world who use techniques ranging from flattery to guilt to get you to do things for them. One of these finally really annoyed me by asking if I knew of a good lawyer to do real estate for a friend of his about 25 miles from where I live. He lives perhaps a hundred miles away and neither of us is particularly knowledgeable about the area and I don't even know lawyers in my area. We both can use the darn internet. So why ask me? The answer is because they are users. Note many people ask serious questions here and we ask the same question. Did you do any kind of search before asking? Did you write any code and see where it fails? So next time we get a question like this one, how about we reply with a request that they provide their own thoughts FIRST and also spell out what the meaning of words like "best" is and only once they convince us they have tried and really need help, do we jump in. I am not necessarily talking about everyone with a question, but definitely about repeaters., -----Original Message----- From: Tutor On Behalf Of dn Sent: Monday, July 18, 2022 1:34 AM To: tutor at python.org Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list >> ... you just asked any >> women present to not reply to you, nor anyone who has not been >> knighted by a Queen. I personally do not expect such politeness but clearly some do. >> > > I confess it took me longer than it should have to figure out to what > you were referring in the second half of the above but eventually the > light came on and the smile blossomed! > My next thought was that it wouldn't necessarily have had to have been > a Queen although anyone knighted (by a King) prior to the beginning of > our current Queen's reign is unlikely to be even alive let alone > interested in this sort of thing. > Thanks for the morning smile! I've not been knighted, but am frequently called "Sir". Maybe when you too have a grey beard? 
(hair atop same head, optional) It is not something that provokes a positive response, and if in a grumpy-mood may elicit a reply describing it as to be avoided "unless we are both in uniform". Thus, the same words hold different dictionary-meanings and different implications for different people, in different contexts, and between cultures! I'm not going to invoke any Python community Code of Conduct terms or insist upon 'politically-correctness', but am vaguely-surprised that someone has not... The gender observation is appropriate, but then how many of the OP's discussions feature responses from other than males? (not that such observation could be claimed as indisputable) Writing 'here', have often used constructions such as "him/her", with some question about how many readers might apply each form... The dislocation in response to the OP is cultural. (In this) I have advantage over most 'here', having lived and worked in India. (also having been brought-up, way back in the last century, in an old-English environment, where we were expected to address our elders using such titles. Come to think of it, also in the US of those days) In India, and many others parts of Asia, the respectful address of teachers, guides, and elders generally, is required-behavior. In the Antipodes, titles of almost any form are rarely used, and most will exchange first-names at an introduction. Whereas in Germany (for example), the exact-opposite applies, and one must remember to use the Herr-s, Doktor-s, du-forms etc. How to cope when one party is at the opposite 'end' of the scale from another? I'm reminded of 'Postel's law': "Be liberal in what you accept, and conservative in what you send". Is whether someone actually knows what they're talking-about more relevant (and more telling) than their qualifications, rank, title, whatever - or does that only apply in the tech-world where we seem to think we can be a technocracy? 
Living in an 'immigrant society' today, and having gone through such a process (many times, in many places) I'm intrigued by how quickly - or how slowly, some will adapt to the local culture possibly quite alien to them. Maybe worst of all are the ones who observe, but then react by assuming (or claiming) superiority - less an attempt to 'fit in', but perhaps an intent to be 'more equal'... > I may be getting touchy without the feely, but I am having trouble > listening to the way some people with cultural differences, or far > left/right attitudes, try to address me/us in forums like this. Alex > may have been amused by my retort, and there is NOTHING wrong with > saying "Dear Sirs" when Disagree: when *I* read the message, I am me. I am in the singular. When *you* write, you (singular) are writing to many of us (plural). Who is the more relevant party to the communication? Accordingly, "Dear Sir" not "Sirs" - unless you are seeking a collective or corporate reply, eg from a firm of solicitors. (cf the individual replies (plural, one might hope) you expect from multiple individuals - who happen to be personal-members of the (collective) mailing-list). > done in many contexts, just like someone a while ago was writing to > something like "Esteemed Professors" but it simply rubs me wrong here. Like it appears do you, I quickly lose respect for 'esteemed professors/professionals' who expect this, even revel in it. However, if one is a student or otherwise 'junior', it is a career-limiting/grade-reducing move not to accede! That said, two can play at that game: someone wanting to improve his/her grade (or seeking some other favor) will attempt ingratiation through more effusive recognition and compliment ("gilding the lily"). whither 'respect'? I recall a colleague, on an International Development team assigned to a small Pacific country, who may have been junior or at most 'equal' to myself in 'rank'. 
Just as in India, he would introduce himself formally as "Dr Chandrashekar" plus full position and assignment. In a more relaxed situation, his informal introduction was "Dr Chandra". It was amusing to watch the reactions both 'westerners' and locals had to this. Seeing how it didn't 'fit' with our host-culture, we took sardonic delight in referring to him as "Chandra". (yes, naughty little boys!) One day my (local, and non-tech, and female) assistant, visibly shaking, requested a private meeting with another member of the team and myself. Breaking-down into tears she apologised for interrupting the urgent-fix discussion we'd been having with senior IT staff the day before, even as we knew we were scheduled elsewhere. Her 'apology' was that Chandra was (twice) insistent for our presence and demanded that meeting be interrupted, even terminated - and that she had to obey, she said, "because he is Doctor". (we tried really hard not to laugh) For our part, knowing the guy, we knew that she should not be the recipient of any 'blow-back'. After plentiful reassurance that she was not 'in trouble' with either of us, and a talk (similar to 'here') about the [ab]use of 'titles', she not only understood, but paid us both a great compliment saying something like: I call you (first-name) because we all work together, but I call him "Doctor" because he expects me to do things *for* him! Being called by my given-name, unadorned, always proved a 'buzz' thereafter! > Back to topic, if I may, sometimes things set our moods. I am here > partially to be helpful and partially for my own amusement and > education as looking at some of the puzzles presented presents > opportunities to think and investigate. 
> > But before I could get to the reasonable question here, I was > perturbed at the overly formulaic politeness and wrongness of the > greeting from my perhaps touchy perspective for the reasons mentioned > including the way it seeming assumes no Ladies are present and we are > somehow Gentlemen, but also by the mess I saw on one wrapped line that > was a pain to take apart. Then I wondered why the question was being > asked. Yes, weirdly, it is a question you and I have discussed before > when wondering which way of doing something worked better, was more > efficient, or showed a more brilliant way to use the wrong method to do something nobody designed it for! Yep, rubs me the wrong way too! (old grumpy-guts is likely to say "no!", on principle - and long before they've even finished their wind-up!) BTW such is not just an Asian 'thing' either - I recall seeing, and quickly avoiding, the latest version of a perennial discussion about protocol. Specifically, the sequence of email-addresses one should use in the To: and Cc: fields of email-messages (and whether or not Bcc: is "respectful"). Even today, in the US and UK, some people and/or organisations demand that the more 'important' names should precede those of mere-minions. "We the people" meets "some, more equal than others"! Yes, and the OP does irritate by not answering questions from 'helpers'. He does publish (for income/profit). I don't know if he has ever used/repeated any of the topics discussed 'here' - nor if in doing-so he attributes and credits appropriately (by European, UK, US... standards). > I am not sure who read my longish message, but I hope the main point > is that sometimes you should just TEST it. This is not long and complex code. > However, there cannot be any one test everyone will agree on and it > often depends on factors other than CPU cycles. A robust > implementation that can handle multiple needs may well be slower and > yet more cost effective in some sense. 
Another source of irritation: define terms-used, eg what is the metric for "better" or "best"? Frankly, the succession of 'academic questions' with dubious application in the real world (CRC-checks notwithstanding) have all the flavor of someone writing an old-fashioned text-book - emphasis on facts, with professional application relegated to lesser (if any) import, and perhaps more than a little "I'm so much smarter than you". NB the Indian and many Asian education systems use techniques which are regarded as 'old', yet at the same time they are apparently effective! -- Regards, =dn _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 18 13:42:54 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 18 Jul 2022 13:42:54 -0400 Subject: [Tutor] Ways of removing consequtive duplicates from a list In-Reply-To: <28dee502-7274-b2cb-26f5-89010761f42e@web.de> References: <005001d8992b$227b77b0$67726710$@gmail.com> <00b001d899fe$8c3cb000$a4b61000$@gmail.com> <28dee502-7274-b2cb-26f5-89010761f42e@web.de> Message-ID: <008d01d89acd$cf82d220$6e887660$@gmail.com> Peter, I studied Pathology in school but we used human bodies rather than the pythons you are abusing. The discussion we are having is almost as esoteric and has to do with theories of computation and what algorithms are in some sense provable and which are probabilistic to the point of them failing being very rare and which are more heuristic and tend to work and perhaps get a solution that is close enough to optimal and which ones never terminate and so on. My point was that I played with your idea and was convinced it should work as long as you only create the object once and never copy it or in any way include it in the list or iterable. That seems very doable. But your new comment opens up another door. 
Turn your class A around: class A: def __eq__(self, other): return True def __ne__(self, other): return False make it: class all_alone: def __eq__(self, other): return False def __ne__(self, other): return True If you made an object of that class, it won't even report being equal to itself! That is very slightly better but not really an important distinction. But what happens when A is compared to all_alone may depend on which is first. Worth a try? You can always flip the order of the comparison as needed. I do note back in my UNIX days, we often needed a guaranteed unique ID as in a temporary filename and often used a process ID deemed to be unique. But processes come and go and eventually that process ID is re-used and odd things can happen if it finds files already there or ... -----Original Message----- From: Tutor On Behalf Of Peter Otten Sent: Monday, July 18, 2022 3:15 AM To: tutor at python.org Subject: Re: [Tutor] Ways of removing consequtive duplicates from a list On 17/07/2022 18:59, avi.e.gross at gmail.com wrote: > You could make the case, Peter, that you can use anything as a start > that will not likely match in your domain. You are correct if an empty > string may be in the data. > > Now an object returned by object is pretty esoteric and ought to be > rare and indeed each new object seems to be individual. > > val=object() > > [(val := ele) for ele in [1,1,2,object(),3,3,3] if ele != val] > -->> [1, 2, , 3] > > So the only way to trip this up is to use the same object or another > reference to it where it is silently ignored. When you want a general solution for removal of consecutive duplicates you can put the line val = object() into the deduplication function which makes it *very* unlikely that val will also be passed as an argument to that function.
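A minimal sketch of that suggestion, assuming Python 3.8+ for the := operator; the function name dedupe_consecutive is made up here for illustration and is not from the thread. The sentinel is created fresh on every call and never leaves the function, so a caller cannot hand it in by accident:

```python
def dedupe_consecutive(iterable):
    """Return a list with runs of equal neighbours collapsed to one item."""
    # Private sentinel: a brand-new object that no caller can already hold,
    # so the first real item always compares unequal to it.
    val = object()
    # Same comprehension as in the thread; the walrus rebinds val in this
    # function's scope each time an item is kept.
    return [(val := ele) for ele in iterable if ele != val]


print(dedupe_consecutive([1, 1, 2, 3, 3, 3, 1]))  # [1, 2, 3, 1]
```

This still relies on the items answering != sensibly when compared with a plain object(), so it does not cover the pathological classes discussed in this thread.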
To quote myself: > Manprit avoided that in his similar solution by using a special value > that will compare false except in pathological cases: > >> val = object() >> [(val := ele) for ele in lst if ele != val] What did I mean with "pathological"? One problematic case would be an object that compares equal to everything, class A: def __eq__(self, other): return True def __ne__(self, other): return False but that is likely to break the algorithm anyway. Another problematic case: objects that only implement comparison for other objects of the same type. For these deduplication will work if you avoid the out-of-band value: >>> class A: def __init__(self, name): self.name = name def __eq__(self, other): return self.name == other.name def __ne__(self, other): return self.name != other.name def __repr__(self): return f"A(name={self.name})" >>> prev = object() >>> >>> [(prev:=item) for item in map(A, "abc") if item != prev] Traceback (most recent call last): File "", line 1, in [(prev:=item) for item in map(A, "abc") if item != prev] File "", line 1, in [(prev:=item) for item in map(A, "abc") if item != prev] File "", line 5, in __ne__ def __ne__(self, other): return self.name != other.name AttributeError: 'object' object has no attribute 'name' >>> def rm_duplicates(iterable): it = iter(iterable) try: last = next(it) except StopIteration: return yield last for item in it: if item != last: yield item last = item >>> list(rm_duplicates(map(A, "aabccc"))) [A(name=a), A(name=b), A(name=c)] >>> _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From siddharthsatish93 at gmail.com Wed Jul 20 19:23:02 2022 From: siddharthsatish93 at gmail.com (Siddharth Satishchandran) Date: Wed, 20 Jul 2022 18:23:02 -0500 Subject: [Tutor] Need help installing a program on my computer using python (Bruker2nifti) Message-ID: Hi I am a new user to Python. 
I am interested in installing a program on Python (Bruker2nifti). I am having trouble writing out the appropriate code to install the program. I know pip install bruker is needed but it is not working on my console. I would like to know the correct code I need to install the program. I can provide the GitHub if needed. -Sidd From alan.gauld at yahoo.co.uk Wed Jul 20 20:25:29 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Thu, 21 Jul 2022 01:25:29 +0100 Subject: [Tutor] Need help installing a program on my computer using python (Bruker2nifti) In-Reply-To: References: Message-ID: On 21/07/2022 00:23, Siddharth Satishchandran wrote: > I know pip install bruker is needed but it is not working on my console. Please be specific. "Not working" is not helpful. What exactly did you type? What exactly was the result? Cut and paste from your console into the message, do not paraphrase the errors or output. Also tell us the OS you are using and the Python version. The actual command needed, according to the PyPI page, is: pip install bruker2nifti And you can use the copy-to-clipboard link to paste it into your terminal. This needs to be run from your OS prompt (after installing Python, if necessary). That should download and install all necessary files. Most folks usually prefer to use the following however, because it gives more reliable results and better debugging data... python3 -m pip install bruker2nifti The package maintainer seems to have an issues page, you should try contacting him/her directly if pip does not succeed.
https://github.com/SebastianoF/bruker2nifti/issues -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From ahires99 at gmail.com Thu Jul 21 02:58:58 2022 From: ahires99 at gmail.com (Sagar Ahire) Date: Thu, 21 Jul 2022 12:28:58 +0530 Subject: [Tutor] PermissionError: [WinError 32] Message-ID: Hello sir, I am getting below error while I install "pip install sasl" in cmd Error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\tmpxlfme666' I tried multiple things to resolve it but no luck, below is the list I tried 1. Uninstall python and re-install 2. Uninstall python and re-install with other directory folder 3. Delete the temp folder 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\' 4. Net Stop http and net start http in command prompt (getting error to stop http) Any help to resolve this will be greatly appreciated. Thank you Sagar Ahire From Sagar.Ahire at asurion.com Thu Jul 21 02:54:25 2022 From: Sagar.Ahire at asurion.com (Ahire, Sagar) Date: Thu, 21 Jul 2022 06:54:25 +0000 Subject: [Tutor] WinError 32 Message-ID: Hello sir, I am getting below error while I install "pip install sasl" in cmd Error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\tmpxlfme666' I tried multiple things to resolve it but no luck, below is the list I tried 1. Uninstall python and re-install 2. Uninstall python and re-install with other directory folder 3. Delete the temp folder 'C:\\Users\\SAGAR~1.AHI\\AppData\\Local\\Temp\\' 4. Net Stop http and net start http in command prompt (getting error to stop http) Any help to resolve this will be greatly appreciated.
Thank you Sagar Ahire ________________________________ This message (including any attachments) contains confidential and/or privileged information. It is intended for a specific individual and purpose and is protected by law. If you are not the intended recipient, please notify the sender immediately and delete this message. Any disclosure, copying, or distribution of this message, or the taking of any action based on it, is strictly prohibited. Asurion_Internal_Use_Only From connectsachit at gmail.com Fri Jul 22 10:36:24 2022 From: connectsachit at gmail.com (Sachit Murarka) Date: Fri, 22 Jul 2022 20:06:24 +0530 Subject: [Tutor] Error while connecting to SQL Server using Python Message-ID: Hello Users, Facing below error while using pyodbc to connect to SQL Server. conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)") Can anyone pls suggest what could be done? Kind Regards, Sachit Murarka From wlfraed at ix.netcom.com Fri Jul 22 13:27:14 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Fri, 22 Jul 2022 13:27:14 -0400 Subject: [Tutor] Error while connecting to SQL Server using Python References: Message-ID: On Fri, 22 Jul 2022 20:06:24 +0530, Sachit Murarka declaimed the following: >Facing below error while using pyodbc to connect to SQL Server. > >conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver >Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) >(SQLDriverConnect)") > >Can anyone pls suggest what could be done? > What OS? (the "unixODBC" suggests it may be some variant of Linux, but...). What is the connect string (obscure any passwords, maybe hostname, but leave the rest). Your "error" doesn't look like a normal Python traceback, and/or newlines have been stripped. Have you installed the required ODBC module? Or some other DB-API... 
For example (from a stale Debian "apt search"): python3-pymssql/oldstable 2.1.4+dfsg-1 amd64 Python database access for MS SQL server and Sybase - Python 3 https://docs.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server?view=sql-server-ver16 https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver16 (actual current version is 18, not 17) -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ From mats at wichmann.us Fri Jul 22 13:32:30 2022 From: mats at wichmann.us (Mats Wichmann) Date: Fri, 22 Jul 2022 11:32:30 -0600 Subject: [Tutor] Error while connecting to SQL Server using Python In-Reply-To: References: Message-ID: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us> On 7/22/22 11:27, Dennis Lee Bieber wrote: > On Fri, 22 Jul 2022 20:06:24 +0530, Sachit Murarka > declaimed the following: > >> Facing below error while using pyodbc to connect to SQL Server. >> >> conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] [unixODBC][Driver >> Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) >> (SQLDriverConnect)") >> >> Can anyone pls suggest what could be done? >> > > What OS? (the "unixODBC" suggests it may be some variant of Linux, > but...). If it *is* Linux, check that the driver you're trying to use is listed in /etc/odbcinst.ini. There are a bunch of defaults there, but I think the Microsoft ODBC drivers, of which there about two zillion different versions, are not among them. There should be documentation on that topic, if that is indeed the issue. 
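A rough sketch of that check using only the standard library, assuming the file lives at /etc/odbcinst.ini and parses as plain INI. The function name list_odbc_drivers is made up for illustration. It reports each registered driver section and whether the file named by its Driver= entry is actually on disk:

```python
import configparser
import os


def list_odbc_drivers(ini_path="/etc/odbcinst.ini"):
    """Map each section name to (Driver path or None, does that path exist?)."""
    # strict=False tolerates repeated keys; interpolation=None leaves any
    # '%' characters in paths alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case as written (Driver, Description)
    parser.read(ini_path)     # a missing file just yields no sections
    found = {}
    for section in parser.sections():
        driver = parser.get(section, "Driver", fallback=None)
        found[section] = (driver, bool(driver) and os.path.exists(driver))
    return found


if __name__ == "__main__":
    for name, (path, exists) in list_odbc_drivers().items():
        print(f"{name!r}: {path} ({'found' if exists else 'MISSING'})")
```

The section name has to match the DRIVER={...} part of the connection string exactly. If pyodbc is importable, pyodbc.drivers() should list the names the driver manager itself can see, which is a useful cross-check against the file.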
From mats at wichmann.us Fri Jul 22 14:12:45 2022 From: mats at wichmann.us (Mats Wichmann) Date: Fri, 22 Jul 2022 12:12:45 -0600 Subject: [Tutor] Error while connecting to SQL Server using Python In-Reply-To: References: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us> Message-ID: <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us> On 7/22/22 12:08, Sachit Murarka wrote: > Hi Mats/Dennis, > > Following is the output. > > $cat /etc/odbcinst.ini > [ODBC Driver 17 for SQL Server] > Description=Microsoft ODBC Driver 17 for SQL Server > Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1 > UsageCount=1 > > We are using Ubuntu and we have installed odbc driver as well. >> If it *is* Linux, check that the driver you're trying to use is listed >> in /etc/odbcinst.ini. There are a bunch of defaults there, but I think >> the Microsoft ODBC drivers, of which there about two zillion different >> versions, are not among them. Well, there goes my one idea, already taken care of (assuming all is correct with that - it did say "file not found")... hope somebody else has some ideas!
From bouncingcats at gmail.com Fri Jul 22 19:03:40 2022 From: bouncingcats at gmail.com (David) Date: Sat, 23 Jul 2022 09:03:40 +1000 Subject: [Tutor] Error while connecting to SQL Server using Python In-Reply-To: <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us> References: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us> <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us> Message-ID: On Sat, 23 Jul 2022 at 04:14, Mats Wichmann wrote: > On 7/22/22 12:08, Sachit Murarka wrote: > > $cat /etc/odbcinst.ini > > [ODBC Driver 17 for SQL Server] > > Description=Microsoft ODBC Driver 17 for SQL Server > > Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1 > > UsageCount=1 Another easy thing to check could be to find out what user the Python process is running as, and confirm that this user can read the file at /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1 and all its parent directories. From connectsachit at gmail.com Fri Jul 22 14:08:34 2022 From: connectsachit at gmail.com (Sachit Murarka) Date: Fri, 22 Jul 2022 23:38:34 +0530 Subject: [Tutor] Error while connecting to SQL Server using Python In-Reply-To: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us> References: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us> Message-ID: Hi Mats/Dennis, Following is the output. $cat /etc/odbcinst.ini [ODBC Driver 17 for SQL Server] Description=Microsoft ODBC Driver 17 for SQL Server Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1 UsageCount=1 We are using Ubuntu and we have installed odbc driver as well. 
Have referred following documentation:- https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver16#ubuntu17 Kind Regards, Sachit Murarka On Fri, Jul 22, 2022 at 11:04 PM Mats Wichmann wrote: > On 7/22/22 11:27, Dennis Lee Bieber wrote: > > On Fri, 22 Jul 2022 20:06:24 +0530, Sachit Murarka > > declaimed the following: > > > >> Facing below error while using pyodbc to connect to SQL Server. > >> > >> conn = pyodbc.connect( pyodbc.Error: ('01000', "[01000] > [unixODBC][Driver > >> Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found > (0) > >> (SQLDriverConnect)") > >> > >> Can anyone pls suggest what could be done? > >> > > > > What OS? (the "unixODBC" suggests it may be some variant of Linux, > > but...). > > If it *is* Linux, check that the driver you're trying to use is listed > in /etc/odbcinst.ini. There are a bunch of defaults there, but I think > the Microsoft ODBC drivers, of which there about two zillion different > versions, are not among them. > > There should be documentation on that topic, if that is indeed the issue. > > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From trent.shipley at gmail.com Fri Jul 22 11:27:57 2022 From: trent.shipley at gmail.com (trent shipley) Date: Fri, 22 Jul 2022 08:27:57 -0700 Subject: [Tutor] Volunteer teacher Message-ID: I've volunteered to do some informal Python teaching. What are some useful online resources and tutorials? What are some good, introductory books--whether ebooks or dead tree. I'm thinking of something very reader friendly, like the "teach yourself in 24 hours", or "for dummies series", but with good exercises. Has anyone used https://exercism.org/tracks/python? 
I've had good luck with the four JavaScript exercises I did, but I did one Scala exercise and the grader was broken (I confirmed it with a live mentor). Trent From connectsachit at gmail.com Fri Jul 22 23:15:10 2022 From: connectsachit at gmail.com (Sachit Murarka) Date: Sat, 23 Jul 2022 08:45:10 +0530 Subject: [Tutor] Error while connecting to SQL Server using Python In-Reply-To: References: <89b1fad7-f97f-753b-6960-380f13d742c2@wichmann.us> <40461bf5-5f96-efb7-1946-17e928cd2a0d@wichmann.us> Message-ID: Hey David, Thanks for the response. The user that runs Python has access to the file below. Kind Regards, Sachit Murarka On Sat, Jul 23, 2022 at 4:35 AM David wrote: > On Sat, 23 Jul 2022 at 04:14, Mats Wichmann wrote: > > On 7/22/22 12:08, Sachit Murarka wrote: > > > > $cat /etc/odbcinst.ini > > > [ODBC Driver 17 for SQL Server] > > > Description=Microsoft ODBC Driver 17 for SQL Server > > > Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1 > > > UsageCount=1 > > Another easy thing to check could be to find out what user the Python > process is running as, and confirm that this user can read the file at > /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.1.1 > and all its parent directories. > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From leamhall at gmail.com Sat Jul 23 05:53:22 2022 From: leamhall at gmail.com (Leam Hall) Date: Sat, 23 Jul 2022 04:53:22 -0500 Subject: [Tutor] Volunteer teacher In-Reply-To: References: Message-ID: Trent, Two variables are the ages of the students, and their existing coding skills. Are they teens who need a first language or grizzled Assembler veterans in need of recovery?
For the former, I'd recommend the combination of "Practical Programming" (https://www.amazon.com/Practical-Programming-Introduction-Computer-Science/dp/1680502689) and the Coursera courses by the authors (https://www.coursera.org/learn/learn-to-program and https://www.coursera.org/learn/program-code). That gives an easy introduction into doing stuff with computers, and hits both visual and book learners. For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). There's a new book out, Python Distilled (https://www.amazon.com/Python-Essential-Reference-Developers-Library/dp/0134173279) by David Beazley. I liked David's work on the "Python Cookbook", but can't speak about "Python Distilled" from experience. The Exercism stuff is usually good, but I'd suggest you go through it first. There are quirks in the problem explanations, and some bugs. Leam On 7/22/22 10:27, trent shipley wrote: > I've volunteered to do some informal Python teaching. > > What are some useful online resources and tutorials? > > What are some good, introductory books--whether ebooks or dead tree. I'm > thinking of something very reader friendly, like the "teach yourself in 24 > hours", or "for dummies series", but with good exercises. > > Has anyone used https://exercism.org/tracks/python? I've had good luck > with the four JavaScript exercises I did, but I did one Scala exercise and > the grader was broken (I confirmed it with a live mentor.) 
> > > Trent > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor -- Automation Engineer (reuel.net/resume) Scribe: The Domici War (domiciwar.net) General Ne'er-do-well (github.com/LeamHall) From wlfraed at ix.netcom.com Sat Jul 23 15:14:47 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Sat, 23 Jul 2022 15:14:47 -0400 Subject: [Tutor] Volunteer teacher References: Message-ID: On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall declaimed the following: > >For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). > A rather unfortunate name... Acronym POOP... "Object Oriented Programming in Python" avoids such Of course -- my view is that, if one is going to focus on OOP, one should precede it with an introduction to a language-neutral OOAD textbook. -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ From leamhall at gmail.com Sat Jul 23 15:24:31 2022 From: leamhall at gmail.com (Leam Hall) Date: Sat, 23 Jul 2022 14:24:31 -0500 Subject: [Tutor] Volunteer teacher In-Reply-To: References: Message-ID: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> On 7/23/22 14:14, Dennis Lee Bieber wrote: > On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall > declaimed the following: >> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). >> > > > > A rather unfortunate name... Acronym POOP... "Object Oriented > Programming in Python" avoids such > > Of course -- my view is that, if one is going to focus on OOP, one > should precede it with an introduction to a language-neutral OOAD textbook. Worse, the book is published by Packt; so it's "Packt POOP". 
:) I disagree on the "OOAD first" opinion, though. Programming is about exploration, and we learn more by exploring with fewer third-party constraints. Those OOAD tomes are someone else's opinion on how we should do things, and until we have a handle on what we're actually able to do then there's no frame of reference for the OOAD to stick to. I'm a prime example of "needs to read less and code more". It's an incredibly bad habit: I see a good book and buy it before really understanding the last half-dozen or so books I already have on that topic. Well, with Python I'm over a dozen, but other languages not so much. -- Automation Engineer (reuel.net/resume) Scribe: The Domici War (domiciwar.net) General Ne'er-do-well (github.com/LeamHall) From mats at wichmann.us Sat Jul 23 15:27:09 2022 From: mats at wichmann.us (Mats Wichmann) Date: Sat, 23 Jul 2022 13:27:09 -0600 Subject: [Tutor] Volunteer teacher In-Reply-To: References: Message-ID: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> On 7/23/22 13:14, Dennis Lee Bieber wrote: > On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall > declaimed the following: > >> >> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). >> > > > > A rather unfortunate name... Acronym POOP... "Object Oriented > Programming in Python" avoids such > > Of course -- my view is that, if one is going to focus on OOP, one > should precede it with an introduction to a language-neutral OOAD textbook. Maybe... I haven't looked at one for so long, but I'd worry that they'd nod too much to existing implementations like Java which enforce a rather idiotic "everything must be a class even if it isn't, like your main() routine".
From learn2program at gmail.com Sat Jul 23 19:25:06 2022 From: learn2program at gmail.com (Alan Gauld) Date: Sun, 24 Jul 2022 00:25:06 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> Message-ID: <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> On 23/07/2022 20:24, Leam Hall wrote: > >> Of course -- my view is that, if one is going to focus on OOP, one >> should precede it with an introduction to a language-neutral OOAD textbook. > I disagree on the "OOAD first" opinion, though. Programming is about exploration, It's one view. But not a universal one and certainly not what the founding fathers thought. And definitely not what the originators of "software engineering" thought. To them programming was akin to bricklaying. The final part of the process after you had analyzed the system and designed the solution. Then you got your materials and followed the design. And, agile theories notwithstanding, it's still how many large organisations view things. > Those OOAD tomes are someone else's opinion on how we should do things, That's true, albeit based on a lot of data-driven science rather than the gut-feel and "personal experience" theory that drives much of modern software development. But especially OOP is a style of programming that needs understanding of the principles before programming constructs like classes etc make sense. OOP came about before classes as we know them. Classes were borrowed from Simula as a convenient mechanism for building OOP systems. > until we have a handle on what we're actually able to do then there's no > frame of reference for the OODA to stick to. I'd turn that around and say without the OOAD frame of reference you can't make sense of OOP constructs. Sadly many students today are not taught OOP but only taught how to build classes, as if classes were OOP.
Then they call themselves OOP programmers but in reality build procedural programs using quasi abstract-data- types implemented as classes. And many never do understand the difference between programming with objects and building genuinely object-oriented programs. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From leamhall at gmail.com Sat Jul 23 20:53:07 2022 From: leamhall at gmail.com (Leam Hall) Date: Sat, 23 Jul 2022 19:53:07 -0500 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> Message-ID: <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> On 7/23/22 18:25, Alan Gauld wrote: > > On 23/07/2022 20:24, Leam Hall wrote: >> >>> Of course -- my view is that, if one is going to focus on OOP, one >>> should precede it with an introduction to a language-neutral OOAD textbook. >> I disagree on the "OOAD first" opinion, though. Programming is about exploration, > > Its one view. But not a universal one and certainly not what the > founding fathers thought. > > And defiitely not what the originators of "software enginering" thought. > To them > programming was akin to bricklaying. The final part of the process after > you had > analyzed the system and designed the solution. Then you got your > materials and > followed the design. And, agile theories not withstanding, it's still > how many large > organisations view things. > >> Those OOAD tomes are someone else's opinion on how we should do things, > > That's true, albeit based on a lot of data driven science rather than > the gut-feel > and "personal experience" theory that drives much of modern software > development. 
> > But especially OOP is a style of programming that needs understanding of > the > principles before programming constructs like classes etc make sense. OOP > came about before classes as we know them. Classes were borrowed from > Simula as a convenient mechanism for building OOP systems. > >> until we have a handle on what we're actually able to do then there's no | >> frame of reference for the OODA to stick to. > > I'd turn that around and say without the OOAD frame of reference you > can't make sense of OOP constructs. Sadly many students today are not > taught OOP but only taught how to build classes, as if classes were OOP. > > Then they call themselves OOP programmers but in reality build procedural > programs using quasi abstract-data- types implemented as classes. And many > never do understand the difference between programming with objects and > building genuinely object-oriented programs. > About the only truly universal things are hydrogen and paperwork, most everything else is contextual. I'd be surprised that the founding fathers couldn't code in anything before they came up with OOP. It seems odd to design a building before you know how bricks or electrical systems work. The building architects and civil engineers I know really do have a handle on the nuts and bolts of things, and then they spend years as underlings before they ever get to be lead designer. Design isn't code, it won't run on the computer. It is a nice skill to have, and large organizations often spend a considerable time on design. And they spend a lot of resources on failed projects and no-longer-useful designs. Does anyone not have at least one experience where the designers cooked up something that wouldn't work? I feel one of Python's strengths is that it can do OOP, as well as other styles of programming. That lets people create actual working "stuff", and then evaluate how to improve the system as new environmental data and requirements come in. 
What people call themselves, and what paradigms they use is irrelevant; working code wins. -- Automation Engineer (reuel.net/resume) Scribe: The Domici War (domiciwar.net) General Ne'er-do-well (github.com/LeamHall) From PythonList at DancesWithMice.info Sat Jul 23 20:54:32 2022 From: PythonList at DancesWithMice.info (dn) Date: Sun, 24 Jul 2022 12:54:32 +1200 Subject: [Tutor] Volunteer teacher In-Reply-To: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> Message-ID: <29035bba-b1b5-dc58-7379-f009eeae12a5@DancesWithMice.info> On 24/07/2022 07.27, Mats Wichmann wrote: > On 7/23/22 13:14, Dennis Lee Bieber wrote: >> On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall >> declaimed the following: >> >>> >>> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). >>> >> >> >> >> A rather unfortunate name... Acronym POOP... "Object Oriented >> Programming in Python" avoids such >> >> Of course -- my view is that, if one is going to focus on OOP, one >> should precede it with an introduction to a language-neutral OOAD textbook. > > > Maybe... I haven't looked at one for so long, but I'd worry that they'd > nod too much to existing implementations like Java which enforce a > rather idiotic "everything must be a class even if it isn't, like your > main() routine". +1 (here, and +1 with @Alan's post) and +1 to the 'paralysis through analysis' syndrome @Leam mentions. The best way to 'learn' is to 'do'! Reading the same facts in similar books/sources is unlikely to improve 'learning' as much as you might hope! As a trainer (my $job is not in Python) my colleagues and I struggle with trying to find a balance between 'doing stuff' with code, and learning the more academic theory 'behind' ComSc and/or software-development. It's a lot easier if one only covers 'Python', and not the theory behind OOP (or example). 
However, professionals will (usually) benefit from both. Worse, there are a number of topics, eg Source Code Management, Test-Driven Development, O-O, even 'modules' and (the infamous) 'comments'; which are/will always be problematic topics at the 'Beginner level' - because if one is only writing short 'toy' program[me]s, there is no *apparent* need or purpose - which makes it very much an exercise in theory! We are running something of a (pedagogical) experiment at our Python Users' Group. The PUG consists of many 'groups' of folks, including hobbyists and professionals, ranging from 101-students and 'Beginners', through to 'Python Masters' - so it is quite a challenge to find meeting-topics which will offer 'something for everyone'. Currently, we have Olaf running a Presentation-based series on "Software Craftsmanship". Accordingly, he talks of systems at a 'high level'. For example 'inversion of control' is illustrated with 'Uncle Bob's' very theoretical "The Clean Architecture" diagram (which I call 'the circles diagram': https://image.slidesharecdn.com/cleanarchitectureonandroid-160528160204/95/clean-architecture-on-android-11-638.jpg?cb=1464451395 In complementary (and complimentary) fashion, I am working 'bottom up' on a series of 'Coding Evenings' called "Crafting Software" using the ultimate practical 'code-along-at-home' approach. This has started with using the REPL (PyCharm's Python Console - our meetings feature a 'door prize', sponsored by JetBrains - thanks guys!). We commenced implementing a very (very) simple Business Rule/spec, and immediately launched into grabbing some input data. Coders who 'dive straight in' make me wince, so you can only imagine the required manful self-discipline... Newcomers were learning how to use input() - that's how 'low' we started! The object of this series is to build a routine and then gradually expand and improve same. 
Along the way we will quickly discover the hassles of changing from a single (constant) price (for 1KG/2lb of apples), to (say) an in-code price-list, to having product detail stored in a database. Hopefully there will be realisation(s): 'we should have asked that question before warming-up the keyboard', as well as 'if we design with change in-mind, our later-lives will be a lot easier'... 'SOLID by stealth'! So, the series' aim is to show that a bit of thought (essentially the implementation of U.Bob's diagram showing 'inversion', independence, cohesion, and coupling) up-front is a worthwhile investment - as well as demonstrating 'how to do it' in Python, and a bunch of paradigms and principles, etc, along-the-way. Relevance to the OP: is to get-going, and realise that any 'later' "refactoring" is not a crime/sin - indeed may be a valuable part of the learning-experience. Relevance to you/more details about the two series: - Olaf's series runs bi-monthly (s/b mid-August*), but Coding Evenings monthly * KiwiPyCon will be held (in-person and also on-line) 19-21 August (after three postponements! https://kiwipycon.nz) - the PUG gathers for two meetings pcm - meeting-details are published through https://www.meetup.com/nzpug-auckland/ - although the labels say "New Zealand" and "Auckland", we are no-longer 'local', running in the virtual world - accordingly, relative time-zones are the deciding-factor, so we often refer to ourselves as 'UTC+12' (or the PUG at the end of the universe?) - all welcome - learners and contributors alike! PS are looking at introducing git (or...) as part of the "Crafting Software" series, and UML to ease Olaf's 'load'. Would you please volunteer a 'lightning talk' and demo on one/the other/both subject(s)? 
-- Regards, =dn From PythonList at DancesWithMice.info Sat Jul 23 21:01:49 2022 From: PythonList at DancesWithMice.info (dn) Date: Sun, 24 Jul 2022 13:01:49 +1200 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> Message-ID: <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info> On 24/07/2022 12.53, Leam Hall wrote: ... > I feel one of Python's strengths is that it can do OOP, as well as other > styles of programming. That lets people create actual working "stuff", > and then evaluate how to improve the system as new environmental data > and requirements come in. What people call themselves, and what > paradigms they use is irrelevant; working code wins. Agree: at one level. However, consider the statement saying "code is read more often than it is written". Thereafter, the meaning of "working"... Yes, it is likely that if we both tackled the same requirement, our delivered-code would be different. We may have used different paradigms or structures to 'get there'. This is not particularly at-issue if, as you say, they are 'working code'. However, software is subject to change. At which time, the ability with which we could read each-other's code, or code from our 6-months-ago self; becomes a major contributor to the success of the new project! Thus, my code *works* on the computer, but does it *work* for you - will you be able to take it and *win*? 
-- Regards, =dn From leamhall at gmail.com Sun Jul 24 07:23:43 2022 From: leamhall at gmail.com (Leam Hall) Date: Sun, 24 Jul 2022 06:23:43 -0500 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info> Message-ID: <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com> On 7/23/22 20:01, dn wrote: > On 24/07/2022 12.53, Leam Hall wrote: > ... > >> I feel one of Python's strengths is that it can do OOP, as well as other >> styles of programming. That lets people create actual working "stuff", >> and then evaluate how to improve the system as new environmental data >> and requirements come in. What people call themselves, and what >> paradigms they use is irrelevant; working code wins. > > Agree: at one level. > > However, consider the statement saying "code is read more often than it > is written". Thereafter, the meaning of "working"... > > Yes, it is likely that if we both tackled the same requirement, our > delivered-code would be different. We may have used different paradigms > or structures to 'get there'. This is not particularly at-issue if, as > you say, they are 'working code'. > > However, software is subject to change. > > At which time, the ability with which we could read each-other's code, > or code from our 6-months-ago self; becomes a major contributor to the > success of the new project! Thus, my code *works* on the computer, but > does it *work* for you - will you be able to take it and *win*? Totally agree, on two levels. First, Python is a lot easier to read than many languages. My first Python task, years ago, was to try and do something with Twisted. I was successful, and I didn't even know Python at the time. The language was just that clear. 
Secondly, you could argue that the Twisted code was particularly well written and that's an argument for good design. I would take you at your word, I don't know the quality of Twisted code. I would very much agree with you that a good design, implemented well, beats a lot of the code I have seen. It beats a lot of code I have written, too. I chuckled as I read your earlier response. Imagine the dev team trying to work through a spaghetti of undesigned codebase, and the design person saying "Now do you believe me that design is important?" I'm all for good design, and if you or Alan look at my code, swear under your breath, and then ask me if I'd consider fixing things with a good design, I'm going to listen. Unfortunately, we can't just open our skulls up, drop in the GoF or Booch's OOAD, and magically do good design. To learn how to implement good design, our brains need to play. First with the language itself, and Python is becoming the language of choice for many on-line college courses (Coursera, EdX). This play will be just like learning a human language; we'll sound awful and not make a lot of sense, but learning takes time. In both computer and human language, a lot of people can't get past the early learning failures, never realizing that failure is implicit in play, and play is mandatory for learning. Too often we burden them with rules and expectations that kill the joy of play. Once we have the basics, hopefully a mentor shows up that can take us to the next level. Either a strong base in verb declensions or an introduction to design concepts. Then we have to play with those new toys. Play will integrate the concepts into our skills, but it takes lots of time and lots of play. Having the toys to play "design this" with engages our brains and gives us a chance to deeply learn. We agree that good design is good. My opinion, even if it's mine alone, is that design is not the first thing to learn. 
Leam -- Automation Engineer (reuel.net/resume) Scribe: The Domici War (domiciwar.net) General Ne'er-do-well (github.com/LeamHall) From alan.gauld at yahoo.co.uk Sun Jul 24 07:47:24 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Sun, 24 Jul 2022 12:47:24 +0100 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> Message-ID: On 24/07/2022 01:53, Leam Hall wrote: >> programming was akin to bricklaying. The final part of the process after >> you had >> analyzed the system and designed the solution. Then you got your >> materials and >> followed the design. And, agile theories not withstanding, it's still >> how many large >> organisations view things. >> >>> Those OOAD tomes are someone else's opinion on how we should do things, >> >> That's true, albeit based on a lot of data driven science rather than >> the gut-feel >> and "personal experience" theory that drives much of modern software >> development. >> >> But especially OOP is a style of programming that needs understanding of >> the >> principles before programming constructs like classes etc make sense. OOP >> came about before classes as we know them. Classes were borrowed from >> Simula as a convenient mechanism for building OOP systems. >> >>> until we have a handle on what we're actually able to do then there's no | >>> frame of reference for the OODA to stick to. >> >> I'd turn that around and say without the OOAD frame of reference you >> can't make sense of OOP constructs. Sadly many students today are not >> taught OOP but only taught how to build classes, as if classes were OOP. >> >> Then they call themselves OOP programmers but in reality build procedural >> programs using quasi abstract-data- types implemented as classes. 
And many >> never do understand the difference between programming with objects and >> building genuinely object-oriented programs. >> > > About the only truly universal things are hydrogen and paperwork, most everything else is contextual. > > I'd be surprised that the founding fathers couldn't code in anything before they came up with OOP. Not OOP. But analysis and design. Remember the founding fathers of programming didn't have any interactive capability, they couldn't easily experiment. They had to write their code in assembler and transfer it to punch cards or tape. So they made sure to design their code carefully before they started writing it. They only had limited tools such as flow charts and pseudo code but the structure had to be clear. Code was a means of translating a design idea into something the machine understood and could execute. It was only after the development of teletypes and interpreters that experimental programming came about or even became possible. But it was still the exception. Even in the mid 80's a friend working for a large insurance company was limited to one compile run per day, anything beyond that needed formal sign-off from his boss. And in the early 90s I was working on a project where we could only build our own module, we weren't allowed to build the entire system (in case we broke it!). System builds ran overnight (and took several hours). So it is only in relatively recent times that the idea of programming as an experimental/explorative activity has become commonplace. And there is definitely a place for that, it's a good way to learn language features. But if looking at things like OOP which are much higher level it still helps to have a context and an idea of what we are trying to achieve. Focusing on the minutiae of language features to build classes can lead to very bad practices, misusing features. The classic example is the abuse of inheritance as a code reuse tool rather than as part of an OOD.
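One way to picture "inheritance as a code reuse tool" is a stack that subclasses list just to borrow its storage. This is only an illustrative sketch and not an example from the thread; the class names `InheritedStack` and `ComposedStack` are hypothetical.

```python
class InheritedStack(list):
    """Reuses list by inheritance, so every list method leaks through."""

    def push(self, item):
        self.append(item)


class ComposedStack:
    """Holds a list privately and exposes only stack behaviour."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


leaky = InheritedStack()
leaky.push(1)
leaky.push(2)
leaky.insert(0, 99)  # accepted, although it is not a stack operation
bottom = leaky[0]

strict = ComposedStack()
strict.push(1)
strict.push(2)
can_insert = hasattr(strict, "insert")  # the composed class has no insert()
```

The inherited version claims that a stack "is a" list, which the design never intended; the composed version only promises push, pop and a length.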
> It seems odd to design a building before you know how bricks or > electrical systems work. But that is exactly what happens. You design the structure first then choose the appropriate materials. Yes you need to understand about bricks etc, and what they are capable of but you don't need to be proficient in the craft of laying them. > The building architects and civil engineers I know really do > have a handle on the nuts and bolts of things, In my experience (in electrical engineering) very few engineers actually have much practical experience of assembling and maintaining electrical components. They have an army of technicians to do that. They know what the components are for, how they work and may have basic construction skills for prototyping. But they don't normally get involved at that level. Software engineering is one of the few fields where the designer is often the constructor too. > they spend years as underlings before they ever get to be lead designer. Of course, but they still need to know the design principles. > Design isn't code, it won't run on the computer. Neither will code without a design. You can design it in your head as you go along, but it will generally take longer and be less flexible and harder to maintain. Especially when it goes above a few hundred lines. And that's where things like OOP come in because each object is like a small standalone program. Which makes it easier to design minimally. > spend a lot of resources on failed projects and no-longer-useful designs. This is a flat out myth! The vast majority of large-scale projects succeed, very few are cancelled or go badly wrong (they are often over budget and over time, but that's only measuring against the initial budgets and timescales). It's just that when they do fail they attract a lot of attention because they cost a lot of money! If a $10-50k 4-man project goes belly up nobody notices.
But when a 4 year, 1000 man, project costing $100 million goes belly up it is very noticeable. But you can't build 4000 man-year projects using agile - it's been tried and invariably winds up moving to more traditional methods. Usually after a lot of wasted time and money. But the fact is that our modern world is run by large-scale software projects successfully delivered by the Fortune 500 companies. We just don't think about it and take it for granted every time we board a train or plane, turn on the electricity or water, collect our wages, etc. > Does anyone not have at least one experience where the designers > cooked up something that wouldn't work? I've seen stuff that was too ambitious or didn't run fast enough. But I've never seen anything designed that just didn't work at all. (I've seen designs rejected for that reason, but they never got built - thus saving a huge amount of money! That's the purpose of design.) But I've seen dozens of programs that were attempted without design that just didn't run, did the wrong thing, soaked up memory, etc etc. It's just easier to hide when there is limited evidence (documentation etc). > I feel one of Python's strengths is that it can do OOP, You can do OOP in any language, even assembler. OOP is a style of programming. Language features help make it easier, that's all. OOP helps control complexity. Sometimes it's the best approach, sometimes not. But that depends on the nature of the problem not the language used. > what paradigms they use is irrelevant; working code wins. Working code wins over non-working code, for sure. But working code alone is not good enough. It also needs to be maintainable, efficient, and economical. For that you usually need a design. Spaghetti code has caused far more project failures than design failures ever did.
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From alan.gauld at yahoo.co.uk Sun Jul 24 07:52:14 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Sun, 24 Jul 2022 12:52:14 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> Message-ID: On 23/07/2022 20:27, Mats Wichmann wrote: >> should precede it with an introduction to a language-neutral OOAD textbook. > > Maybe... I haven't looked at one for so long, but I'd worry that they'd > nod too much to existing implementations like Java There are very few language neutral OOAD books and most are from the early days of OOP in the 80's and 90's. Sadly OOP has become synonymous with class based programming (with Java a prime example) and as a result has acquired a poor reputation with a generation of programmers who have never really understood what it was about. Partly because they never had to struggle with a world where OOP was not an option. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From avi.e.gross at gmail.com Sat Jul 23 21:16:14 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Sat, 23 Jul 2022 21:16:14 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> Message-ID: <00c801d89efa$f87d6650$e97832f0$@gmail.com> You guys have it all wrong about naming and marketing. It is Snake Language Object Oriented Programming -- SLOOP --- --POOLS-- if using it backward. 
-----Original Message----- From: Tutor On Behalf Of Leam Hall Sent: Saturday, July 23, 2022 3:25 PM To: tutor at python.org Subject: Re: [Tutor] Volunteer teacher On 7/23/22 14:14, Dennis Lee Bieber wrote: > On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall > declaimed the following: >> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). >> > > > > A rather unfortunate name... Acronym POOP... "Object Oriented > Programming in Python" avoids such > > Of course -- my view is that, if one is going to focus on OOP, one > should precede it with an introduction to a language-neutral OOAD textbook. Worse, the book is published by Packt; so it's "Packt POOP". :) I disagree on the "OOAD first" opinion, though. Programming is about exploration, and we learn more by exploring with fewer third party constraints. Those OOAD tomes are someone else's opinion on how we should do things, and until we have a handle on what we're actually able to do then there's no frame of reference for the OOAD to stick to. I'm a prime example of "needs to read less and code more". Incredibly bad habit, see a good book and buy it before really understanding the last half-dozen or so books I already have on that topic. Well, with Python I'm over a dozen, but other languages not so much. 
-- Automation Engineer (reuel.net/resume) Scribe: The Domici War (domiciwar.net) General Ne'er-do-well (github.com/LeamHall) _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Sat Jul 23 21:23:26 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Sat, 23 Jul 2022 21:23:26 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> Message-ID: <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> Dumb Question. Every damn language I have done so-called object-oriented programming in DOES IT DIFFERENT. Some have not quite proper objects and use tricks to fake it to a point. Some have very precise objects you can only access parts of in certain ways, if at all, and others are so free-for-all that it takes ingenuity to hide things so a user can not get around you and so on. If you had a book on generic object-oriented techniques and then saw Python or R or JAVA and others, what would their experience be? And I think things do not always exist in a vacuum. Even when writing a program that uses OO I also use functional methods, recursion and anything else I feel like. Just learning OO may leave them stranded in Python! -----Original Message----- From: Tutor On Behalf Of Mats Wichmann Sent: Saturday, July 23, 2022 3:27 PM To: tutor at python.org Subject: Re: [Tutor] Volunteer teacher On 7/23/22 13:14, Dennis Lee Bieber wrote: > On Sat, 23 Jul 2022 04:53:22 -0500, Leam Hall > declaimed the following: > >> >> For the latter, Python Object Oriented Programming (https://www.amazon.com/Python-Object-Oriented-Programming-maintainable-object-oriented/dp/1801077266). >> > > > > A rather unfortunate name... Acronym POOP... 
"Object Oriented > Programming in Python" avoids such > > Of course -- my view is that, if one is going to focus on OOP, one > should precede it with an introduction to a language-neutral OOAD textbook. Maybe... I haven't looked at one for so long, but I'd worry that they'd nod too much to existing implementations like Java which enforce a rather idiotic "everything must be a class even if it isn't, like your main() routine". _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From bouncingcats at gmail.com Sun Jul 24 09:02:09 2022 From: bouncingcats at gmail.com (David) Date: Sun, 24 Jul 2022 23:02:09 +1000 Subject: [Tutor] Volunteer teacher In-Reply-To: <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> Message-ID: On Sat, 23 Jul 2022 at 09:25, trent shipley wrote: > > I've volunteered to do some informal Python teaching. > > What are some useful online resources and tutorials? The topic of discussion seems to be drifting away from the question asked. This is the Tutor list. As I understand it, the point of this list is to respond to questions. Out of consideration for the OP, I am repeating the question that they asked, in the hope that it might focus replies towards addressing the OP questions. 
From alan.gauld at yahoo.co.uk Sun Jul 24 09:04:57 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Sun, 24 Jul 2022 14:04:57 +0100 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com> References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info> <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com> Message-ID: On 24/07/2022 12:23, Leam Hall wrote: > First, Python is a lot easier to read than many languages. True, but that only really helps when working at the detail level. It does nothing to help with figuring out how a system works - which functions call which other functions, how data structures relate to each other, etc. And that's where design comes in. One of the big failures in design is to go too deep and try to design every line of code. Design should only go to the level where it becomes easier to write code than design. When that stage is reached it is time to write code! > Imagine the dev team trying to work through a spaghetti of > undesigned codebase, and the design person saying "Now do > you believe me that design is important?" I've been there several times. We once received about 1 million lines of C code with no design. We had a team of guys stepping through the code in debuggers for 6 months documenting how it worked and reverse engineering the "design". It took us 3 years to fully document the system. But once we did we could turn around bugs in 24 hours and new starts could be productive within 2 days, rather than the 2 or 3 months it took when we first got the code! When I ran a maintenance team and we took on a new project one of the first tasks was to review the design and if necessary update it (or even rewrite it) in a useful style. A good design is critical to efficient maintenance, even if it has to be retro-fitted. 
> Unfortunately, we can't just open our skulls up, drop in the GoF > or Booch's OOAD, and magically do good design. Absolutely true. You need to start with the basics. Branching, loops, functions, modules, coupling v cohesion. Separation of concerns, data hiding. Clean interface design. Then you build up to higher level concepts like state machines, table driven code, data driven code, data structures and normalisation. And even if using a book like Booch (which is very good) it should be worked through in parallel with the language constructs. But just as reading Booch alone would be useless, so is learning to define functions without understanding their purpose. Or building classes without understanding why. Learning is an iterative process. And this is especially true with software. But you need to understand the why and how equally. Learning how without knowing why leads to bad programming practices - global variables, mixing logic and display, tight coupling etc. > Once we have the basics, hopefully a mentor shows up Ideally we have the mentor in place before we even look at the basics. Even the basics can be baffling without guidance. I've seen so many beginners completely baffled by a line like: x = x + 1 It makes no sense to a non-programmer. It is mathematical nonsense! (It's also why many languages have a distinct assignment operator: x := x+1 is much easier to assimilate.) > We agree that good design is good. My opinion, even if it's mine alone, > is that design is not the first thing to learn. I don't think I'm arguing for design as a skill - certainly not things like UML or SSADM or even flow charts. But rather the rationale behind programming constructs. Why do we have loops? Why so many of them? And why are functions useful? Why not just cut 'n paste the code? For OOP it's about explaining why we want to use classes/objects. What does an OOP program look like - "a set of objects communicating by messages". 
How do we send a message from one object to another? What happens when an object receives a message? What method does the receiver choose to fulfil the message request? How does the receiver reply to the requestor? These ideas can then be translated/demonstrated in the preferred language. A decent OOAD book will describe those concepts better than a programming language tutorial in my experience. (Again, the first section of Booch with its famous cat cartoons is very good at that) -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From alan.gauld at yahoo.co.uk Sun Jul 24 09:15:25 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Sun, 24 Jul 2022 14:15:25 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> Message-ID: On 24/07/2022 02:23, avi.e.gross at gmail.com wrote: > Dumb Question. > > Every damn language I have done so-called object-oriented programming in > DOES IT DIFFERENT. Of course, because OOP is not a language feature. Languages implement tools to facilitate OOP. And each language designer will have different ideas about which features of OOP need support and how best to provide that. In some it will be by classes, in others actors, in others prototyping. Some will try to make OOP look like existing procedural code where others will create a special syntax specifically for objects. > If you had a book on generic object-oriented techniques and then saw Python > or R or JAVA and others, what would their experience be? That's what happens every time I meet a new language. I look to see how that language implements the concepts of OOP. > And I thing things do not always exist in a vacuum. 
Even when writing a > program that uses OO I also use functional methods, recursion and anything > else I feel like. Just learning OO may leave them stranded in Python! OOP doesn't preclude these other programming techniques. OOP is a design idiom that allows for any style of lower level coding. (What is more difficult is taking a high level functional design and introducing OOP into that - those two things don't blend well at all!) I've also never succeeded in doing OOP in Prolog. Maybe somebody has done it, but it beats me! I've also never felt quite comfortable shoe-horning objects into SQL despite the alleged support for the OOP concepts of some database systems/vendors... -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From bouncingcats at gmail.com Sun Jul 24 09:18:39 2022 From: bouncingcats at gmail.com (David) Date: Sun, 24 Jul 2022 23:18:39 +1000 Subject: [Tutor] Volunteer teacher In-Reply-To: References: Message-ID: On Sat, 23 Jul 2022 at 09:25, trent shipley wrote: > > I've volunteered to do some informal Python teaching. > > What are some useful online resources and tutorials? Hi Trent, I don't have much knowledge of this area, but one site that I have noticed seems to be good quality is realpython.com. It looks like most of their tutorials require a fee to be paid, but some of them do not and might give you some ideas. 
Their basic courses are here: https://realpython.com/tutorials/basics/ An example free one that might give some ideas: https://realpython.com/python-dice-roll/ From avi.e.gross at gmail.com Sun Jul 24 12:35:58 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Sun, 24 Jul 2022 12:35:58 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> Message-ID: <004801d89f7b$74b7dc70$5e279550$@gmail.com> dAVId, That happens quite often when a topic comes up and people react to it. You are correct that the original request was about resources for teaching a subset of python. It is then quite reasonable to ask what exactly the purpose was as the right materials need to be chosen based on the level of the students before they enter, and various other goals. My view is that it is harder to appreciate the advantages or uses of object-oriented styles without at least some idea of the alternatives and showing why it is better. I go back to a time it was common to work with multiple variables such as one array to hold lots of names and another for birthdays and another for salaries and how it became advantageous to collect them in a new structure called a "structure" so they could be moved around as a unit and not accidentally get messed up if you deleted from one array but not another, for example. Later "classes" were built atop such structures by adding additional layers such as limiting access to the internals while adding member functions that were specific to the needs and so on. But I will stop my digression here. I have no specific materials to offer. AVI -----Original Message----- From: Tutor On Behalf Of David Sent: Sunday, July 24, 2022 9:02 AM To: Python Tutor Subject: Re: [Tutor] Volunteer teacher On Sat, 23 Jul 2022 at 09:25, trent shipley wrote: > > I've volunteered to do some informal Python teaching. 
> > What are some useful online resources and tutorials? The topic of discussion seems to be drifting away from the question asked. This is the Tutor list. As I understand it, the point of this list is to respond to questions. Out of consideration for the OP, I am repeating the question that they asked, in the hope that it might focus replies towards addressing the OP questions. _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Sun Jul 24 13:00:40 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Sun, 24 Jul 2022 13:00:40 -0400 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> Message-ID: <005301d89f7e$e8066ae0$b81340a0$@gmail.com> At the risk of not answering the question, I am responding to the thoughts others express here. Object-Oriented programming is a fairly meaningless idea as it means a little bit of everything with some people focusing on only some parts. Other ideas like functional programming have similar aspects. As Alan pointed out, one major goal is to solve a much larger problem pretty much in smaller chunks that can each be understood or made by a small team. But it does not end there as implementations have added back too much complexity in the name of generality and things like multiple inheritance can make it a huge challenge to figure out what is inherited in some programs. I would beware books and resources that have an almost religious orientation towards a particular use of things. The reality is that most good programs should not use any one style but mix and match them as makes sense. 
If you never expect to use it in more than one place, why create a new class to encapsulate something just so you can say it is object oriented? I have seen students make a class with a single variable within it and several instance functions that simply set or get the content and NOTHING ELSE. True, you may later want to expand on that and add functionality but why add overhead just in case when you can change it later? I have seen books that brag about how you can do pretty much anything using recursion. But why? Why would I even want to decide if a billion is more or less than 2 billion by recursively subtracting one from each of two arguments until one or the other hits zero? If your goal is to teach the principles of object-oriented approaches, in the abstract, you still end up using a form of pseudo-code. Picking an actual language can be helpful to show some instantiations. And python can be a good choice alongside many others and also can be a bad choice. It now supports pretty much everything as an object including the number 42. I did a quick search for "object oriented programming in python" and then substituted "java", "c++", "R" and others. There are plenty of such matches and the big question is which fits YOUR class and needs. From mayoadams at gmail.com Sun Jul 24 17:10:21 2022 From: mayoadams at gmail.com (Mayo Adams) Date: Sun, 24 Jul 2022 17:10:21 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: <004801d89f7b$74b7dc70$5e279550$@gmail.com> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <004801d89f7b$74b7dc70$5e279550$@gmail.com> Message-ID: It happens quite often, and that is not in itself a reason to let it pass. On Sun, Jul 24, 2022 at 2:00 PM wrote: > dAVId, > > That happens quite often when a topic comes up and people react to it. > > You are correct that the original request was about resources for teaching > a > subset of python. 
> > It is then quite reasonable to ask what exactly the purpose was as the > right > materials need to be chosen based on the level of the students before they > enter, and various other goals. > > My view is that it is harder to appreciate the advantages or uses of > object-oriented styles without at least some idea of the alternatives and > showing why it is better. > > I go back to a time it was common to work with multiple variables such as > one array to hold lots of names and another for birthdays and another for > salaries and how it became advantageous to collect them in a new structure > called a "structure" so they could be moved around as a unit and not > accidentally get messed up if you deleted from one array but not another, > for example. > > Later "classes" were build atop such structures by adding additional layers > such as limiting access to the internals while adding member functions that > were specific to the needs and so on. > > But I will stop my digression here. I have no specific materials to offer. > > AVI > > -----Original Message----- > From: Tutor On Behalf Of > David > Sent: Sunday, July 24, 2022 9:02 AM > To: Python Tutor > Subject: Re: [Tutor] Volunteer teacher > > On Sat, 23 Jul 2022 at 09:25, trent shipley > wrote: > > > > I've volunteered to do some informal Python teaching. > > > > What are some useful online resources and tutorials? > > The topic of discussion seems to be drifting away from the question asked. > > This is the Tutor list. As I understand it, the point of this list is to > respond to questions. > > Out of consideration for the OP, I am repeating the question that they > asked, in the hope that it might focus replies towards addressing the OP > questions. 
> _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > -- Mayo Adams From alan.gauld at yahoo.co.uk Mon Jul 25 07:10:04 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Mon, 25 Jul 2022 12:10:04 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <004801d89f7b$74b7dc70$5e279550$@gmail.com> Message-ID: On 24/07/2022 22:10, Mayo Adams wrote: > It happens quite often, and that is not in itself a reason to let it pass. We should always strive to answer the original question. However, a group like this is designed to foster wider understanding of programming issues, especially with regard to Python, so side discussions are not only permitted but encouraged. Although the OP is looking for a specific answer, the hundreds of other readers are often interested in the other, wider, issues raised. It is the nature of public fora and mailing lists. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From avi.e.gross at gmail.com Mon Jul 25 14:23:54 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 25 Jul 2022 14:23:54 -0400 Subject: [Tutor] teaching for purpose Message-ID: <007501d8a053$b3033050$190990f0$@gmail.com> The recent discussion inspires me to open a new thread. It is about teaching a topic like python or kinds of programming it supports and doing it for a purpose in a way that does not overwhelm students but helps them understand the motivation and use of features. 
The recent topic was OOP and my thought was in several directions. One was bottom-up. Show what the world was like when you had individual variables only (meaning memory locations of sorts) and then some sort of array data structure was added so you can ask for var[5] and then if you wanted to store info about a group of say employees, you had one array for each attribute like name, salary, age. So a user could be identified as name[5] and salary[5] and age[5]. Then show what a pain it was to say fire a user or move them elsewhere without remembering to make changes all over the place. So you show the invention of structures like the C struct and how they now hold an employee together. Then you show how functions that operate on structs had to know what it was in order to be effective and general and that enhancing a struct to contain additional member functions and other features like inheritance, could lead to a concept where an object was central. Yes, there are more rungs on this ladder but this is enough for my purpose. The other way is to work more from top-down. Show them some complex network they all use and say this could not easily be built all at once, or efficiently, if it was not seen as parts and then parts within parts, with some parts re-used as needed. At the level of the programming language, the parts can be made somewhat smart and with goals if designed well. If you want to be able to process a container holding many kinds of parts, you may want each part to know how to do several things to itself so if you want all parts to be able to print some version of what they contain, you want to say object.printYourSelf() and each will do it their own way. 
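A minimal Python sketch of the ladder just described, assuming made-up employee data: first parallel lists that can drift out of step, then a single record type, then objects that each describe themselves in their own way. The names Employee, Manager and describe are purely illustrative (describe stands in for printYourSelf) and do not come from any real codebase.

```python
from dataclasses import dataclass

# Rung 1: parallel lists. Index 1 means "Bob" in every list, by convention only.
names = ["Ann", "Bob", "Cy"]
salaries = [50_000, 42_000, 61_000]
ages = [41, 29, 35]

# Firing Bob means deleting from every list; here the ages list is forgotten.
del names[1]
del salaries[1]
# Cy is now silently paired with Bob's age.
mismatched = list(zip(names, salaries, ages))


# Rung 2: a record keeps all the attributes of one employee together.
@dataclass
class Employee:
    name: str
    salary: int
    age: int

    # Rung 3: the object knows how to describe itself.
    def describe(self) -> str:
        return f"{self.name}, {self.age}, earns {self.salary}"


@dataclass
class Manager(Employee):
    reports: int = 0

    # A subclass answers the same request in its own way.
    def describe(self) -> str:
        return f"{super().describe()}, manages {self.reports}"


staff = [Employee("Ann", 50_000, 41), Manager("Cy", 61_000, 35, reports=4)]
staff.pop(0)  # removing one record removes all of its fields at once

descriptions = [person.describe() for person in staff]
```

The loop at the end never asks what kind of object it holds; it only sends the same request to each one, which is the point of the top-down view as well.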
You may want to invent protocols and insist that any object that wishes to be considered a member of this protocol, must implement the methods needed by the protocol and thus you can pass a collection of such objects to anything that wants the protocol to be used without worrying about errors if the wrong kind of object is there too. Again, I hope you get the idea. You can lead up to objects from basics or lead down as you deconstruct. Of course there are many other ways. I mention this because I personally find some features almost meaningless without the right context. Consider any GUI such as a browser where events seem to happen asynchronously depending on movements of the mouse, clicks, key-presses, and even events coming in from outside. Various parts of the program are not RUN in a deterministic manner but must pop into being, do a few things, and go away till needed again. In a sense, you set event listeners and attach bits of code such as functions or perhaps a method call for any object. In such an environment, all kinds of things get interesting and choices for implementation that do things in what may seem odd or indirect ways, suddenly may make some sense. So imagine first building an almost empty object that holds nothing and has a very few methods. Why would anyone want this? A while later you show how to make one and then more objects that inherit from the same object but BECAUSE they share a base, they have something in common, including in some languages the ability to be in a collection that insists everything be the SAME. Then you show how an object can be made easier and faster if it is only a bit different than another or by combining several others. At some point you want to make sure the students get the ideas behind it, not just some silly localized syntax useful only in that language. Once you have an idea of purpose, you may show them that others had similar purposes in mind but chose other ideas. 
Take the five or more ways python allows formatting text. As soon as you realize they are all variants with some similarities in purpose, you can more easily classify them and not struggle so much as to why anyone would do this. At least that is how my mind works. What I find helpful in motivating using objects when they really are NOT NEEDED is when it is explained that part of the purpose was to make a set of tools that work well together. Modules in python like sklearn stopped looking weird after I got that. I mean you can easily make functions for yourself (well, maybe not easily) that implement taking some data and return N clusters that are mostly closer within each cluster to some centroid than to other clusters. So why ask the user to set a variable to a new object of some sort, then repeatedly ask the object to do things to itself like accept data or add to existing data, set various internal attributes like values telling it to be fast or accurate or which technique to use, or to transform the data by normalizing it some way, and run analyses and supply aspects of what it came up with or predict how new data would be transformed by the internal model? This does not, at first, seem necessary or at all useful. But consider how this scales up if say you want to analyze many files of data and do some comparisons. Each file can be held by its own object and the objects can be kept in something like a list or matrix and can be used almost interchangeably with other objects that implement very different ways to analyze the data if they all take the same overall set of commands, as appropriate. All may be called, say, with new data, and asked to predict results based on previous learning steps. The point in all the above, which may easily be dismissed by the volume, is that I think part of learning is a mix of ideas in the abstract as well as some really concrete programming along the lines of prototyping. 
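The "same overall set of commands" idea can be sketched in plain Python. This is only an illustration under stated assumptions: the Model protocol and the MeanSplit and RangeCheck classes are hypothetical toy names invented here, the fit/predict method names merely echo the style described above, and none of this is sklearn's actual API.

```python
from statistics import mean
from typing import Protocol, Sequence


class Model(Protocol):
    """The shared set of commands every analysis object accepts."""

    def fit(self, data: Sequence[float]) -> "Model": ...
    def predict(self, value: float) -> str: ...


class MeanSplit:
    """Labels a value by comparing it with the mean of the data it learned from."""

    def fit(self, data: Sequence[float]) -> "MeanSplit":
        self.threshold_ = mean(data)
        return self

    def predict(self, value: float) -> str:
        return "high" if value > self.threshold_ else "low"


class RangeCheck:
    """Labels a value by whether it falls inside the range it learned from."""

    def fit(self, data: Sequence[float]) -> "RangeCheck":
        self.low_, self.high_ = min(data), max(data)
        return self

    def predict(self, value: float) -> str:
        return "typical" if self.low_ <= value <= self.high_ else "outlier"


# One fitted object per (hypothetical) data file, kept together in a list.
datasets = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
models: list[Model] = [MeanSplit().fit(datasets[0]), RangeCheck().fit(datasets[1])]

# The calling code sends the same command to every object, whatever it does inside.
predictions = [model.predict(15.0) for model in models]
```

The two classes share no code and no base class; they are interchangeable only because they answer the same requests, so a third technique could be dropped into the list without touching the final loop.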
Learning Object Oriented Programming in the abstract may seem like a good idea, albeit implementations vary greatly. But knowing WHY some people developed those ideas and what they were meant to improve or handle, . I shudder sometimes when a functional programming person tries to sell me on using closures to retain values and so on, until I realize that in a sense, it overlaps object-oriented programming. Or does it? From avi.e.gross at gmail.com Mon Jul 25 15:58:26 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 25 Jul 2022 15:58:26 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> Message-ID: <011301d8a060$e7b62c50$b72284f0$@gmail.com> Alan, I can't say the FIRST thing I look at in a language is OOP. LOL! I stopped counting languages long ago and many did not have anything like OOP then, albeit some may have aspects of it now, if they are still being used. What I look at depends on my needs and how it is being learned. I consider life to be the accumulation of lots of little tricks and techniques that often allow you to use them to solve many kinds of problems, often in association with others. In my UNIX days back at Bell labs, (and still in many places like LINUX distributions) there actually were sets of tools ranging from echo and cat to cut and sed and grep and awk that could be combined in pipelines to do all kinds of rapid prototyping and, if it became commonly used and important, might be rewritten in more than one-liners in C or awk or PERL or whatever. There are many levels of mental tools you can use including some that try to guess which of many tools may be best suited (in combination with others) to get a task done. Smart people are often not so much more gifted overall than those who strive to learn and master some simple tools and find ways to combine them in a semi-optimal way to get jobs done. 
They do not always have to re-invent the wheel! So when I start a new language, I start by getting a feel for the language so I can compare and contrast it with others and see if there are some purposes it is designed for and some it is very much NOT designed for and a royal pain. If so, I look to see if add-ons help. If you look at Python, it has some glaring gaps as they considered heterogeneous lists to be the most abstract and neglected to make something basic in other languages like an array/vector. Hence, many people just load numpy and perhaps pandas and go on from there and suddenly python is friendly for doing things with objects like dataframes that depend on columns having a uniform nature. Perhaps amusingly is how a language like R where a data.frame was a strictly-enforced list of vectors, now allows list columns as well since allowing varied content can sometimes be useful. Languages start with certain premises but some evolve. I do not know why people say python is simple. Compared to what? It may well be that the core has some simplicity but nothing that supports so many ways of programming as well as many ways of doing similar things, can be simple. Not even debatable. It may be that the deliberate use of indentation rather than braces, for grouping, makes it look simpler. I think the opposite and my own code in other languages has lots of such indentation deliberately along with braces and other such structures. Not because it is required, but because I like to see trends at a glance. Copying code from one place to another is trivial and will work without reformatting, albeit tools in editors easily can reformat it. On the other hand, copying python code can be a mess and a source of error if you copy it to a region of different indentation and things dangle. So back to first appearances, I look for themes and differences. 
Does the language require something special such as a semicolon to terminate an instruction, or a colon to denote the beginning of a body of text? What are the meanings of symbols I use elsewhere and are they different? Think of how many differences there are in how some languages use single and double quotes (Python uses them interchangeably and in triples) or what characters may need to be escaped when used differently. I look at issues of scope which vary widely. And I look for idioms, often highly compressed notation like x //= 10 and so on. But overall, there is a kernel in which most languages seem almost identical except for pesky details. Or are they? Yes, everything has an IF statement, often followed by an ELSE (upper case just for emphasis) but some use an ELIF and others an ELSE IF and some provide alternatives like some kind of CASE or SWITCH statement with variations or ternary operations like ?: or operations designed to apply vectorized like ifelse(condition, this, that) and so on. Lots of creative minds have come up with so many variations. You can get looked at strangely if you end up programming in very basic tiny steps using literal translations from your days writing in BASIC until you get stuck with how to translate GOSUB, LOL! In my opinion, to teach OOP using Python in a single class is an exercise in what NOT to teach. Yes, inside an object you create there may lurk recursive method calls or functional programming constructs or who knows what cute method. All you care about is the damn object sorts the contents when asked to or when it feels like it. Do they need to know why or how the following works? 
    a_sorted = [a.pop(a.index(min(a))) for _ in range(len(a))]

Or this quicksort one liner:

    q = lambda l: q([x for x in l[1:] if x <= l[0]]) + [l[0]] + q([x for x in l if x > l[0]]) if l else []

Since the above can easily be done like this more understandable way:

    def quicksort(my_list):
        # recursion base case - empty list
        if not my_list:
            return []
        # first element is pivot
        pivot = my_list[0]
        # break up problem
        smaller = [x for x in my_list[1:] if x < pivot]
        greater = [x for x in my_list[1:] if x >= pivot]
        # recursively solve problem and recombine solutions
        return quicksort(smaller) + [pivot] + quicksort(greater)

The goal is to let them study more python on their own when they feel like it but focus in on OOP in general, unless that is not the full purpose of the course. I actually enjoy courses at times that are heterogeneous and show dozens of ways to solve a particular problem using lots of sides of a language. This forum often gets a question answered many different ways. But a focused course is best not pushed off the track.

After all, a major focus on OOP is to hide how it is done and to allow existing objects to change how they do it as long as the outside view is the same. As an example, an object could re-sort every time an item is added or perhaps changed or deleted, but it could also NOT do that but mark the fact that the data is not currently sorted, and any attempt to use the data would notice that and sort it before handing back anything. In some cases, the latter approach may be more efficient. But the user rarely knows or cares what happens as long as it happens as expected from the outside of a black box.

-----Original Message-----
From: Tutor On Behalf Of Alan Gauld via Tutor
Sent: Sunday, July 24, 2022 9:15 AM
To: tutor at python.org
Subject: Re: [Tutor] Volunteer teacher

On 24/07/2022 02:23, avi.e.gross at gmail.com wrote:
> Dumb Question.
>
> Every damn language I have done so-called object-oriented programming
> in DOES IT DIFFERENT.
Of course, because OOP is not a language feature. Languages implement tools to facilitate OOP. And each language designer will have different ideas about which features of OOP need support and how best to provide that. In some it will be by classes, in others actors, in others prototyping. Some will try to make OOP look like existing procedural code where others will create a special syntax specifically for objects. > If you had a book on generic object-oriented techniques and then saw > Python or R or JAVA and others, what would their experience be? That's what happens every time I meet a new language. I look to see how that language implements the concepts of OOP. > And I thing things do not always exist in a vacuum. Even when writing > a program that uses OO I also use functional methods, recursion and > anything else I feel like. Just learning OO may leave them stranded in Python! OOP doesn't preclude these other programming techniques. OOP is a design idiom that allows for any style of lower level coding. (What is more difficult is taking a high level functional design and introducing OOP into that - those two things don't blend well at all!) I've also never succeeded in doing OOP in Prolog. Maybe somebody has done it, but it beats me! I've also never felt quite comfortable shoe-horning objects into SQL despite the alleged support for the OOP concepts of some database systems/vendors... 
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Mon Jul 25 17:09:54 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 25 Jul 2022 17:09:54 -0400 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> <4da5d32b-7301-3cbf-d0a5-588cd48fd3e0@DancesWithMice.info> <9bff8b0e-bc7d-d26a-738a-1d402748ac1d@gmail.com> Message-ID: <016401d8a06a$e3a8fcf0$aafaf6d0$@gmail.com> Alan, Good points. I find a major reason for a published design is to highlight easily what CANNOT and SHOULD NOT be done. Too often people ask for new features or other changes without knowing (or caring) if it can be done trivially, or not at all, or perhaps would require a set of new designs/requirements followed by a complete rewrite, perhaps in another language. It can be something as simple as pointing out how the code has a function that takes TWO arguments and the new feature would require adding a third. In some languages that can be as simple and in others it might mean searching all existing code and adding some harmless third argument for all cases that do not want or need it, and recompiling everything in sight and hoping you did not miss anything or break something else. Ditto for making one argument optional but with a default. Now in python, some things like this may be easier to change. But my point is asking a program to do something it was not designed to do is easier to refuse to accept when you can show how it clashes with the design. 
Yes, they can still ask for it, but cannot expect to get it soooooon. -----Original Message----- From: Tutor On Behalf Of Alan Gauld via Tutor Sent: Sunday, July 24, 2022 9:05 AM To: tutor at python.org Subject: Re: [Tutor] Pair 'o dimes wuz: Volunteer teacher On 24/07/2022 12:23, Leam Hall wrote: > First, Python is a lot easier to read than many languages. True but that only really helps when working at the detail level. It does nothing to help with figuring out how a system works - which functions call which other functions. How data structures relate to each other etc. And that's where design comes in. One of the big failures in design is to go too deep and try to design every line of code. Design should only go to the level where it becomes easier to write code than design. When that stage is reached it is time to write code! > Imagine the dev team trying to work through a spaghetti of undesigned > codebase, and the design person saying "Now do you believe me that > design is important?" I've been there several times. We once received about 1 million lines of C code with no design. We had a team of guys stepping through the code in debuggers for 6 months documenting how it works and reverse engineering the "design". It took us 3 years to fully document the system. But once we did we could turn around bugs in 24 hours and new starts could be productive within 2 days, rather than the 2 or 3 months it took when we first got the code! When I ran a maintenance team and we took on a new project one of the first tasks was to review the design and if necessary update it (or even rewrite it) in a useful style. A good design is critical to efficient maintenance, even if it has to be retro-fitted. > Unfortunately, we can't just open our skulls up, drop in the GoF or > Booch's OOAD, and magically do good design. Absolutely true. You need to start with the basics. Branching, loops, functions, modules, coupling v cohesion. Separation of concerns, data hiding.
Clean interface design. Then you build up to higher level concepts like state machines, table driven code, data driven code, data structures and normalisation. And even if using a book like Booch (which is very good) it should be worked through in parallel with the language constructs. But just as reading Booch alone would be useless, so is learning to define functions without understanding their purpose. Or building classes without understanding why. Learning is an iterative process. And this is especially true with software. But you need to understand the why and how equally. Learning how without knowing why leads to bad programming practices - global variables, mixing logic and display, tight coupling etc. > Once we have the basics, hopefully a mentor shows up Ideally we have the mentor in place before we even look at the basics. Even the basics can be baffling without guidance. I've seen so many beginners completely baffled by a line like: x = x + 1 It makes no sense to a non-programmer. It is mathematical nonsense! (It's also why many languages have a distinct assignment operator: x := x+1 is much easier to assimilate.) > We agree that good design is good. My opinion, even if it's mine > alone, is that design is not the first thing to learn. I don't think I'm arguing for design as a skill - certainly not things like UML or SSADM or even flow charts. But rather the rationale behind programming constructs. Why do we have loops? Why so many of them? And why are functions useful? Why not just cut 'n paste the code? For OOP it's about explaining why we want to use classes/objects. What does an OOP program look like - "a set of objects communicating by messages". How do we send a message from one object to another? What happens when an object receives a message? What method does the receiver choose to fulfil the message request? How does the receiver reply to the requestor? These ideas can then be translated/demonstrated in the preferred language.
A decent OOAD book will describe those concepts better than a programming language tutorial in my experience. (Again, the first section of Booch with its famous cat cartoons is very good at that) -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From alan.gauld at yahoo.co.uk Mon Jul 25 19:53:58 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Tue, 26 Jul 2022 00:53:58 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: <011301d8a060$e7b62c50$b72284f0$@gmail.com> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> Message-ID: On 25/07/2022 20:58, avi.e.gross at gmail.com wrote: > I can't say the FIRST thing I look at in a language is OOP. LOL! I didn't say it was the first thing I looked at, just one of the things I looked at. Although as someone who learned to design code (and indeed analyse problems) using OOP I probably do look at OOP features earlier than most. > I stopped counting languages long ago and many did not have anything like > OOP then, As I said in an earlier post you can do OOP in most any language. I've used in in Assembler, COBOL and vanilla C as well as many OOPLs. > life to be the accumulation of lots of little tricks and techniques that > often allow you to use them to solve many kinds of problems, often in > association with others. Absolutely and especially at the small scale level. I hardly ever use OOP when I'm programming things with less than 1000 lines of code. But as the code count goes up so does the likelihood of my using OOP. 
But on the little projects I will use any number of tools from Lisp to awk to nroff and SQL. > ranging from echo and cat to cut and sed and grep and awk that could be > combined in pipelines Indeed. And even in Windows NT I wrote a suite of small programs (using awk, cut and sed from the cygwin package for NT) that were combined together via a DOS batch file to create the distribution media for a project which had to be installed in over 30 sites each with very specific network configurations. (And eventually that suite was rewritten entirely in VBA as an Excel spreadsheet macro!) > rewritten in more than one-liners in > C or awk or PERL or whatever. At work my main use of Python (actually Jython) was to write little prototype objects to demonstrate concepts to our offshore developers who then turned them into working strength Java. [Jython had the wonderful feature of allowing me to import our industrial strength Java code and create objects and call Java methods from my Python prototypes.] But I've also worked on large projects where there is no coding done by humans at all. These are mostly real-time projects such as telephone exchange control systems and a speech recognition platform where the design was done using SDL (Specification & Design Language) and the design tool generated C code (that was all but unreadable by humans). But if you didn't like C you could flip a configuration option and it would spew out ADA or Modula3 instead... it didn't matter to the developers they only worked at the SDL level. > If you look at Python, it has some glaring gaps as they considered > heterogeneous lists to be the most abstract and neglected to make something > basic in other languages like an array/vector. But that's something I've never found a need for in my 20 years of using Python. I can fill a list with homogeneous objects as easily as with heterogeneous ones.
It's only when dealing with third party tools (often written in other languages under the hood) that the need for an array becomes useful IME. I'm not saying that nobody has a genuine use for a strictly homogeneous container, just that I've never needed such a thing personally. > I do not know why people say python is simple. Python used to be simple (compared to almost every other general purpose language) but it is not so today. So many bells and whistles and abstract mechanisms have been added that Python is quite a complex language. (In the same way C++ went from a few simple add-ons to C to being a completely different, and vastly complex, animal!) Last time I updated my programming tutorial I seriously considered choosing a different language, but when I went looking I could find nothing simpler that offered all of the real-world usefulness of Python! But it is a much more difficult language to learn now than it was 25 years ago when I first found it. > It may be that the deliberate use of indentation rather > than braces, for grouping, makes it look simpler. Simpler for beginners to understand. Remember Python came about largely as a response to Guido's experience teaching programming with ABC. So it took the features that students found helpful. Block delimiters have always been a major cause of bugs for beginners. > ...Copying code from one place to another is trivial This is one of the strongest arguments for them, and there are others too. However, as a language deliberately targeted at beginners (but with potential use by experts too) Python (ie Guido) chose ease of learning over ease of copying. > So back to first appearances, I look for themes and differences. Does the > language require something special such as a semicolon to terminate an > instruction, or a colon to denote the beginning of a body of text. What are > the meanings of symbols I use elsewhere and are they different. I confess I don't look at those kinds of things at all.
They are just part of the inevitable noise of learning a language but I don't care whether they are there or not. I just accept it whatever it is. But then, I rarely choose a language, I usually have it thrust upon me as a necessity for some project or other. The only languages I've ever chosen to teach myself are Python, Eiffel, Prolog, Haskell and Logo. The other 20 or 30 that I know have been learned to complete some task or other. (However I do intend to teach myself Erlang, Lua and Clojure, and am currently studying Swift so the list will get longer!) > a major focus on OOP is to hide how it is done That's not OOP, it's information hiding and predates OOP by quite a way. But it's a good example of how language implementations of OOP have obscured the whole point of OOP which is purely and simply about objects communicating via messages. Information hiding is a nice thing to include and OOP implementation details like classes can provide a vehicle to enforce it. But its not part of the essence of OOP. > and to allow existing objects to change how they do > it as long as the outside view is the same. But that is polymorphism and a fundamental part of OOP. The fact that you can send a message to an object and the object itself is responsible for figuring out the method of processing that message is absolutely key to OOP. 
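A minimal sketch of the two points quoted above, in Python. The class names `EagerSorted` and `LazySorted` and the helper `load` are hypothetical, made up for illustration; they follow the earlier description of a container that either re-sorts on every change or merely marks itself unsorted and sorts on demand. The caller sends the same two messages (`add` and `items`) to either object, and each object decides for itself which method fulfils them.

```python
import bisect


class EagerSorted:
    """Keeps its items in order at all times by inserting in place."""

    def __init__(self):
        self._items = []

    def add(self, value):
        # Pay the ordering cost on every insertion.
        bisect.insort(self._items, value)

    def items(self):
        return list(self._items)


class LazySorted:
    """Appends cheaply and sorts only when somebody asks for the data."""

    def __init__(self):
        self._items = []
        self._dirty = False  # True when the data may be out of order

    def add(self, value):
        self._items.append(value)
        self._dirty = True

    def items(self):
        if self._dirty:
            self._items.sort()
            self._dirty = False
        return list(self._items)


def load(container, values):
    """Send the same messages to any container, whatever its internals."""
    for value in values:
        container.add(value)
    return container.items()


if __name__ == "__main__":
    for container in (EagerSorted(), LazySorted()):
        print(type(container).__name__, load(container, [3, 1, 2]))
```

Nothing in `load` needs to change if one implementation is swapped for the other, which is the outside-view-stays-the-same property under discussion. Whether that is best called information hiding or polymorphism is exactly the distinction drawn in the reply.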
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From leamhall at gmail.com Mon Jul 25 20:10:06 2022 From: leamhall at gmail.com (Leam Hall) Date: Mon, 25 Jul 2022 19:10:06 -0500 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> Message-ID: On 7/24/22 06:47, Alan Gauld via Tutor wrote: > On 24/07/2022 01:53, Leam Hall wrote: >> spend a lot of resources on failed projects and no-longer-useful designs. > > This is a flat out myth! The vast majority of large scale projects > succeed, very few are cancelled or go badly wrong (they are often > over budget and over time, but that's only measuring against the > initial budgets and timescales). It's just that when they do fail they > attract a lot of attention because they cost a lot of money! > If a $10-50k 4-man project goes belly up nobody notices. But when > a 4 year, 1000 man, project costing $100 million goes belly up it is > very noticeable. If they are over budget or over time, then the project and the design failed. You can still pour resources into something, and stretch the calendar, but I've worked for some of those Fortune 500 companies. "On time and within budget" is the extreme rarity. > But the fact is that our modern world is run by large scale software > projects successfully delivered by the fortune 500 companies. We just > don't think about it and take it for granted every time we board > a train or plane, turn on the electricity or water, collect our > wages, etc. Our modern world is run by large scale, complex software that is buggy and often unsupportable. When weekly or monthly patch updates may or may not work, but they are always needed, then you have another sort of failure.
(Taking this out of context, but in partial agreement) > But you can't build 4000 man-year projects using > agile - it's been tried and invariably winds up moving to more > traditional methods. Usually after a lot of wasted time and money. Yes and no. I wouldn't want to build an aircraft carrier totally with Agile methodology. However, aircraft carriers are mostly known technology, and in theory design shouldn't have too many surprises. Some software is like that; you can have a three tier application (web, middleware, db) in a myriad of variations, but the tiers are pretty standard. If you can nail down how things connect then you can keep the internals fluid and responsive. I've been on those million dollar projects that bust the budget and the schedule, and I've seen some really good project management skills displayed during high visibility projects. Waterfall, or Big Design Up Front, in theory can work when every component is well known by all responsible teams. That is seldom, if ever, the case. That's a lot of what fueled Agile, I would guess. Re-reading Uncle Bob's history on it now. Does tradition, and sticking to what the founding fathers meant, have a place? Again, yes and no. Both can have value, depending on the context. Neither is intrinsically evil, but neither are they universally useful. Follow tradition and stick to the intent when it helps you build better software. Find something when it causes you to fail. Leam -- Automation Engineer (reuel.net/resume) Scribe: The Domici War (domiciwar.net) General Ne'er-do-well (github.com/LeamHall) From alan.gauld at yahoo.co.uk Mon Jul 25 20:31:57 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Tue, 26 Jul 2022 01:31:57 +0100 Subject: [Tutor] teaching for purpose In-Reply-To: <007501d8a053$b3033050$190990f0$@gmail.com> References: <007501d8a053$b3033050$190990f0$@gmail.com> Message-ID: On 25/07/2022 19:23, avi.e.gross at gmail.com wrote: > The recent discussion inspires me to open a new thread.
Probably a good idea! :-) > It is about teaching a topic like python or kinds of programming it supports > and doing it for a purpose in a way that does not overwhelm students but > helps them understand the motivation and use of features. I think motivation is key here. But many of the things we have been discussing cannot be understood by true beginners until they have the elementary programming constructs under their belt (sequence, loop and selection plus input/output and basic data types). Only once you get to the level where you can introduce the creation of functions (and in some languages records or structures) do higher level issues like procedural v functional v OO become even comprehensible, let alone relevant. > One was bottom-up. Show what the world was like... > Then show what a pain it was to say fire a user or move them elsewhere > without remembering to make changes all over the place. That is the usual starting point for OOP, it is certainly where Booch goes. > The other way is to work more from top-down. Show them some complex network > they all use and say this could not easily be built all at once, or > efficiently, if it was not seen as parts and then parts within parts, The Smalltalk community (from whence the term OOP came) took such a high level view and discussed programming as a process of communicating objects. And likened it to real world physical objects like a vehicle etc. You could demonstrate how a vehicle is constructed of parts each of which is an object, then deconstruct those objects further etc. Thus their students came to programming thinking about objects first and the low level "data entities" were just low-level objects, no different to the higher level ones they already used. Seymour Papert took a similar approach with Logo which is not explicitly OOP focused but does share the idea of issuing commands to code objects. He used robots (turtles) to teach the concepts but the ideas were the same.
> program are not RUN in a deterministic manner but must pop into being, They may not run in a sequential manner but they most assuredly are deterministic for any given sequence of events. And from an OOP point of view an OOP program does not normally appear to run sequentially but more like an event driven system. As messages are received from other objects the receiving objects respond. Of course there must be some initial message to start the system off but it's conceptually more like a GUI than a traditional compiler or payroll system. > At some point you want to make sure the students get the ideas behind it, > not just some silly localized syntax useful only in that language. This is exactly the point I've been trying (badly) to make. It is the conceptual understanding that is critical, the syntax is a nice to have and will change with every language. > ...Take the five or more ways python allows formatting text. Hmm, yes that's one of the features of modern python I really dislike. It's simply confusing and a result of the open-source development model. Even with a benevolent dictator too many variations exist to do the same job. The community should pick one, make it fit for all purposes (that already exist) and deprecate the others. But historic code would get broken so we have to endure (and become familiar with) them all! It's the nightmare of legacy code and the same reason that modern C++ is such a mess. Java is showing signs of going the same way. > What I find helpful in motivating using objects when they really are NOT > NEEDED is when it is explained that part of the purpose was to make a set of > tools that work well together. One point that needs to be made clear to students is that code objects are useful and don't need to be used in OOP. In fact most objects are not used in OOP. Coding with objects is not the same as coding in OOP. Objects are a useful extension to records/structures and can be used in procedural code just as well as in OOP code.
It is the same with functions. You can use functions in non functional code, and indeed the vast majority of functions are non functional in nature. But you can't write functional code without functions. And you can't write OOP code without objects(in the general sense, not in the "instance of a class" sense). > So why ask the user to set a variable to a new object of some sort, then > repeatably ask the object to do things to itself like accept data or add to > existing data, set various internal attributes like values telling it to be > fast or accurate or which technique to use, or to transform the data by > normalizing it some way, and run analyses and supply aspects of what it came > up with or predict now new data would be transformed by the internal model? > This does not, at first, seem necessary or at all useful. And certainly not OOP! > But consider how this scales up if say you want to analyze many files of > data and do some comparisons. Each file can be held by it's own object and > the objects can be kept in something like a list or matrix and can be used > almost interchangeably with other objects that implement very different ways > to analyze the data if they all take the same overall set of commands, as > appropriate. All may be called, say, with new data, and asked to predict > results based on previous learning steps. Again, none of which is OOP. > The point in all the above, which may easily be dismissed by the volume, is > that I think part of learning is a mix of ideas in the abstract as well as > some really concrete programming along the lines of prototyping. Learning > Object Oriented Programming in the abstract may seem like a good idea, > albeit implementations vary greatly. But knowing WHY some people developed > those ideas and what they were meant to improve or handle, . Learning anything in the purely abstract rarely works. 
The problem with OOP is that it only makes sense in fairly big projects, certainly much bigger than most programming learners ever experience. Probably bigger than most university students experience too. Arguably only where multiple programmers are involved does it really show value - or where the requirements are very complex. > I shudder sometimes when a functional programming person tries to sell me on > using closures to retain values and so on, until I realize that in a sense, > it overlaps object-oriented programming. Or does it? Most functional adherents view OOP as the very opposite of good functional practice with the implicit retention of state. I've seen arguments to suggest that instance attributes are just special forms of closure but I'm not convinced. I tend to view them as orthogonal views of the world that rarely intersect (but can be usefully combined). -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From alan.gauld at yahoo.co.uk Mon Jul 25 21:01:15 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Tue, 26 Jul 2022 02:01:15 +0100 Subject: [Tutor] Pair 'o dimes wuz: Volunteer teacher In-Reply-To: References: <2118ecff-c37f-c4cf-5f1e-1b706c500c06@gmail.com> <0ef4cf34-21fe-9af1-48d6-c6bc0c699d7a@yahoo.co.uk> <5482fe87-c7a4-db6d-d490-c89b83c2869e@gmail.com> Message-ID: On 26/07/2022 01:10, Leam Hall wrote: >> succeed, very few are cancelled or go badly wrong (they are often >> over budget and over time, but thats only measuring against the >> initial budgets and timescales). > If the are over budget or over time, then the project and the design failed. Not at all. If the requirements changed (and they invariably do) then extra cost and time will be required. Thats true of all large projects not just software. 
Large building projects like the English Channel Tunnel went over budget and took longer than expected. It is impossible to accurately estimate cost and timescales based on an initial requirement spec. That is understood and accepted by anyone working on large projects. It's true of small projects as well but the margin of error is much smaller and therefore typically absorbed by contingency. But accountants don't like "contingencies" of $10 million! But if you dream big you must expect to spend big too. A project only fails if it doesn't deliver the benefits required in a cost effective manner. Judging a multi-year project based on time/cost estimates given before any work is started is foolishness. Most large projects do deliver eventually. > Our modern world is run by large scale, complex software > that is buggy and often unsupportable. That's not my experience, especially in safety-critical systems like air-traffic control, power-station control systems etc. Sure there will be bugs, even formal methods won't find all of them. But our world would simply not function if the software was that bad. > When weekly or monthly patch updates may or may not work, That's bad. Most large mission critical projects I've worked on work on 6-monthly (occasionally quarterly) releases rigorously tested and that rarely go wrong. > Yes and no. I wouldn't want to build an aircraft carrier totally > with Agile methodology. That's the point. I've used Agile on large projects at the component level. Each component team runs an Agile shop. But the overall project is controlled with more traditional techniques and the overall architecture is designed up front (albeit subject to change, see above). > However, aircraft carriers are mostly known technology, Mostly, but each generation has a lot of cutting edge new stuff too! > a three tier application That's what design patterns are for.
> I've been on those million dollar projects that bust the budget > and the schedule, and I've seem some really good project management > skills... Waterfall, or Big Design Up Front, in theory can work > when every component is well known by all responsible teams. Which, as you point out is the case of 80% of the code in a large system. (Also in most small systems too for that matter!) The trick is to identify the 20% that doesn't fit (but that can be a difficult trick up front!) and apply more flexible approaches such as prototyping and Agile in those areas. And then risk manage them like crazy! > Does tradition, and sticking to what the founding fathers meant, have a place? Only to provide context. Unlike much of modern software engineering the early programmers were driven by research and had the time to study the best approaches based on data not anecdote. They couldn't afford to do anything else given the tools available. Our modern tools are so powerful we can afford to waste cycles trying things and if it doesn't work roll back and try again. But the hard-won lessons of the past should not be forgotten because they mostly remain valid. But at the same time we must maximize the benefits of the modern computing power too. Otherwise we'd still all be drawing flowcharts and using teletypes! 
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From avi.e.gross at gmail.com Mon Jul 25 23:03:16 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Mon, 25 Jul 2022 23:03:16 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> Message-ID: <024101d8a09c$40e32540$c2a96fc0$@gmail.com> Alan, I will selectively quote just a few parts where I want to clarify as overall we often mainly agree. As I said in an earlier post you can do OOP in most any language. I've used in in Assembler, COBOL and vanilla C as well as many OOPLs. I differentiate between using IDEAS and code built-in to the language meant to facilitate it. If you mean you can implement a design focusing on objects with serious amounts of work, sure. I mean I can define an object as multiple pieces of data and pass all of them to every function I write that handles them instead of a pointer of sorts to a class. Or if pointers to functions are allowed I can simulate many things. I just mean that languages like FORTRAN and Pascal and BASIC that I used way back when, or PROLOG which you mentioned earlier, came in my timeline BEFORE I had heard much about Object-Oriented. I was programming in plain vanilla C when Bjarne Stroustrup started talking about making an OO version he eventually put out as C++ and was the first one in my area who jumped in and started using it and got the agreement to use it on an existing project where I saw a way to use it for a sort of polymorphism. I may have told this story already but I was SHOCKED they allowed me to do that and a tad suspicious. 
I mean I also had to change the "makefile" to properly distinguish between my files and regular C files and properly call the right compilers with the right arguments and then link them together. Turns out I was right and the product we were building was never used. This being the AT&T Bell Labs that wasted money, it seems management pulled permission for the project but gave them 6 months to find a new project. But rather than telling us and letting use some vacation time or take classes and add to our skills or look for other work, they simply called a meeting to announce that a NEW Project was starting NOW and that management had been working it out for months and was reshuffling staff among supervisors assigned to each part and so on! Sheesh! And, yes, a project or two later, we did start using C++. My point was not that concepts cannot be used, and arguably were sometimes used. If you look at Python, it has some glaring gaps as they considered heterogeneous lists to be the most abstract and neglected to make something basic in other languages like an array/vector. But that's something I've never found a need for in my 20 years of using Python. I can fill a list with homogenous objects as easily as with he[te]r[e]ogenous ones. It's only when dealing with third party tools (often written in other languages under the hood) that the need for an array becomes useful IME. I'm not saying that nobody has a genuine use for a strictly homogenous container, just that I've never needed such a thing personally. I think it goes deeper than that, Alan. Yes, other than efficiency considerations, I fully agree that a list data structure can do a superset of what a uniform data structure can do. So can a tuple if you want immutability. In a sense you could argue that vectors or arrays or whatever name a language uses for some data structures that only can contain one kind of thing are old-school and based on the ideas and technology of an earlier time. 
Mind you, it is not that simple. Old arrays typically were of fixed length upon creation and usually took work to extend by creating a new one and often led to memory faults or leaks if you followed them too far. But when I look at some implementation in say R, they have no fixed length and lots of details are handled behind the scenes for you. What I was talking about may be subtle. There are times you want to guarantee things or perhaps have automatic conversions. You can build a class, of course, that internally maintains both a list and the right TYPE it accepts and access methods that enforce that anything changed or added is only allowed if it either is the right type, or perhaps can be coerced to that type. Or, perhaps, that if a new type is added that cannot mingle, then change everything else to the new type. R is built on something like that. A vector can hold one or more (or zero if you know how) contents of the same type. It can expand and contract and even flip the content of all its parts to a new one. I can start with a vector of integers and add or change a float and they all become floats. I can do the same with a character string and they all become character strings. But they are pretty much guaranteed to all be the same type (or an NA, albeit there are type specific NA variables under the hood). This become useful because R encourages use of data structures like data.frames in which it is necessary (or used to be necessary) for columns in such tables to all reflect the same kind of thing. And, yes, it now allows you to use lists which are technically a kind of vector too. 
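The type-enforcing container described above (a class that keeps a list plus the one type it accepts, and coerces new items where it can) might be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the name TypedList and its methods are hypothetical, and unlike an R vector it coerces the incoming item rather than promoting everything already stored.

```python
class TypedList:
    """List-like container whose items must all be of one type.

    Hypothetical sketch: a value that is not already of the required
    type is coerced by calling the type on it; a value that cannot be
    coerced is rejected with TypeError.
    """

    def __init__(self, item_type, items=()):
        self.item_type = item_type
        self._items = []
        for item in items:
            self.append(item)

    def _coerce(self, value):
        # Accept values of the right type unchanged, otherwise try to convert.
        if isinstance(value, self.item_type):
            return value
        try:
            return self.item_type(value)
        except (TypeError, ValueError) as exc:
            raise TypeError(
                f"cannot store {value!r} as {self.item_type.__name__}"
            ) from exc

    def append(self, value):
        self._items.append(self._coerce(value))

    def __setitem__(self, index, value):
        self._items[index] = self._coerce(value)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

    def __repr__(self):
        return f"TypedList({self.item_type.__name__}, {self._items!r})"


numbers = TypedList(float, [1, 2, 3])   # the ints are coerced to floats
numbers.append("4.5")                   # the string is coerced too
print(numbers)  # TypedList(float, [1.0, 2.0, 3.0, 4.5])
```

Trying numbers.append("abc") raises TypeError, which is the "only allowed if it can be coerced" rule from the paragraph above.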
I have created some rather bizarre data structures this way including taking a larger data.frame, grouping it on selected columns, folding up the remaining columns to be smaller data.frames embedded in such a list column, doing statistical analyses on it and saving the results in another list column, extracting some of the results from that column to explode out into more individual columns, and so on, including sometimes taking the smaller data.frames and re-expanding them back into the main data.frame. The structures can be quite convoluted, and in some sense resemble LISP constructs where you used functions like CADDAR(.) to traverse a complex graph, as many "classes" in R are made of named lists nested within.

I will note that many programming languages that tried to force you to have containers that only held one kind of thing often cheated by techniques like a UNION of structures so everything was in some sense the same type. Languages like JAVA and others can really play the game using tricks like inheriting from a common ancestor, or just claiming to implement the same interface, to allow some generic ideas that let you ride comfortably enough together. In a sense, Python just did away with much of that and allowed you to hold anything in a list or tuple, and perhaps not quite everything in hashable constructs like a dictionary.

I won't quote what you said about Python being simple or complex except to say that it depends on perspective. As an experienced programmer, I do not want something so simple it is hard to do much with it without lots of extra work. I like that you can extend the core language in many ways including with so many modules out there that are tailored to make some chores easier - including oodles of specialized or general objects.

But as a teaching tool, it reminds me a bit of conversations I have had with my kids where everything I said seemed to have a reference or vocabulary word they did not know.
If you said Abyssinia and they asked what that was and you answered it is now mostly Ethiopia across the Red Sea then they want to know what that is or why it is called Red. If you explain it may be more about a misunderstanding of Reed, and get into the Exodus story you find words and idea you have to explain and so on until you wonder when they will learn all this stuff and be able to have a real conversation!!!! So there is always one kid in a class where you are teaching fairly basic stuff who wants to know why you are using a loop rather than a comprehension or not doing it recursively or using itertools and how many times can you keep saying that is more advanced and some will be covered later and others are beyond this intro course so ask me AFTER class or look it up yourself? a major focus on OOP is to hide how it is done That's not OOP, it's information hiding and predates OOP by quite a way. I accept that in a sense data hiding may be partially or even completely independent from OOP and the same for hiding an implementation so if it changes, it does not cause problems. The idea is the object is a sort of black box that only does what it says in the documentation and can only be accessed exactly as it says. Some languages allow you to make all kinds of things private and if they are compiled, it is not easy to make changes. Some languages use fixed slots in classes to hold data while python allows any object to pick up attributes so something like a function can create memory attached to itself sort of on the outside outside and so can any object. Someone posted some Python code recently that did things in ways often discouraged such as changing a variable within an object directly and was shown ways to not allow that in cases where it is a calculated value that should change when another value changes but remain in sync. In that sense, we could argue (and I would lose) about what in a language is OOP and what is something else or optional. 
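Two of the Python behaviours mentioned above can be shown in a few lines: a calculated value that stays in sync with the data it depends on (here via the built-in property decorator), and a function that carries memory as an attribute attached to itself. The names Rectangle and counter are made up for illustration; this is a minimal sketch, not the code from the earlier post.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Recomputed on every access, so it cannot drift out of sync
        # with width and height.
        return self.width * self.height


r = Rectangle(2, 3)
r.width = 10
print(r.area)  # 30

try:
    r.area = 5            # no setter was defined, so this is refused
except AttributeError:
    print("area is read-only")


def counter():
    # The function stores state as an attribute on itself.
    counter.calls += 1
    return counter.calls


counter.calls = 0
counter()
counter()
print(counter.calls)  # 2
```

The property keeps the "black box" promise for area while leaving width and height openly changeable, which is the usual Python compromise between data hiding and convenience.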
My views are not based on one formal course in the abstract but maybe I should read a recent book on the topic, not just see how each language brags it supports OOP. In particular, your idea it involves message passing is true in some arenas but mostly NOT how it is done elsewhere. Calling a member function is not a general message passing method, nor is being called while asleep on a queue and being told nothing except that it is time to wake up.

Perhaps it does make sense to not just teach OOP in Python but also snippets of how it is implemented in pseudocode or in other languages that perhaps do it more purely.

-----Original Message-----
From: Tutor On Behalf Of Alan Gauld via Tutor
Sent: Monday, July 25, 2022 7:54 PM
To: tutor at python.org
Subject: Re: [Tutor] Volunteer teacher

On 25/07/2022 20:58, avi.e.gross at gmail.com wrote:
> I can't say the FIRST thing I look at in a language is OOP.

LOL! I didn't say it was the first thing I looked at, just one of the things I looked at. Although as someone who learned to design code (and indeed analyse problems) using OOP I probably do look at OOP features earlier than most.

> I stopped counting languages long ago and many did not have anything
> like OOP then,

As I said in an earlier post you can do OOP in most any language. I've used it in Assembler, COBOL and vanilla C as well as many OOPLs.

> life to be the accumulation of lots of little tricks and techniques
> that often allow you to use them to solve many kinds of problems,
> often in association with others.

Absolutely and especially at the small scale level. I hardly ever use OOP when I'm programming things with less than 1000 lines of code. But as the code count goes up so does the likelihood of my using OOP. But on the little projects I will use any number of tools from Lisp to awk to nroff and SQL.

> ranging from echo and cat to cut and sed and grep and awk that could
> be combined in pipelines

Indeed.
And even in Windows NT I wrote a suite of small programs (using awk, cut and sed from the cygwin package for NT) that were combined together via a DOS batch file to create the distribution media for a project which had to be installed in over 30 sites each with very specific network configurations. (And eventually that suite was rewritten entirely in VBA as an Excel spreadsheet macro!) > rewritten in more than one-liners in C or awk or PERL or whatever. At work my main use of Python (actually Jython) was to write little prototype objects to demonstrate concepts to our offshore developers who then turned them into working strength Java. [Jython had the wonderful feature of allowing my to import our industrial strength Java code and create objects and call Java methods from my Python prototypes.] But I've also worked on large projects where there is no coding done by humans at all. These are mostly real-time projects such as telephone exchange control systems and a speech recognition platform where the design was done using SDL (Specification & Design Language) and the design tool generated C code (that was all but unreadable by humans). But if you didn't like C you could flip a configuration option and it would spew out ADA or Modula3 instead... it didn't matter to the developers they only worked at the SDL level. > If you look at Python, it has some glaring gaps as they considered > heterogeneous lists to be the most abstract and neglected to make > something basic in other languages like an array/vector. But that's something I've never found a need for in my 20 years of using Python. I can fill a list with homogenous objects as easily as with hereogenous ones. It's only when dealing with third party tools (often written in other languages under the hood) that the need for an array becomes useful IME. I'm not saying that nobody has a genuine use for a strictly homogenous container, just that I've never needed such a thing personally. 
> I do not know why people say python is simple. Python used to be simple (compared to almost every other general purpose language) but it is not so today. So many bells and whistles and abstract mechanisms have been added that Python is quite a complex language. (In the same way C++ went from a few simple add-ons to C to being a compeletly different, and vastly complex, animal!) Last time I updated my programming tutorial I seriously considered choosing a different language, but when I went looking I could find nothing simpler that offered all of the real-world usefulness of Python! But it is a much more difficult language to learn now than it was 25 years ago wen I first found it. > It may be that the deliberate use of indentation rather than braces, > for grouping, makes it look simpler. Simpler for beginners to understand. Remember Python came about largely as a response to Guido's experience teaching programming with ABC. So it took the features that students found helpful. Block delimiters have always been a major cause of bugs for beginners. > ...Copying code from one place to another is trivial This is one of the strongest arguments for them. and there are others too, however, as a language deliberately targeted at beginners (but with potential use by experts too) Python (ie Guido) chose ease of learning over ease of copying. > So back to first appearances, I look for themes and differences. Does > the language require something special such as a semicolon to > terminate an instruction, or a colon to denote the beginning of a body > of text. What are the meanings of symbols I use elsewhere and are they different. I confess I don't look at those kinds of things at all. They are just part of the inevitable noise of learning a language but I don't care whether they are there or not. I just accept it whatever it is. But then, I rarely choose a language, I usually have it thrust upon me as a necessity for some project or other. 
The only languages I've ever chosen to teach myself are Python, Eiffel, Prolog, Haskell and Logo. The other 20 or 30 that I know have been learned to complete some task or other. (However I do intend to teach myself Erlang, Lua and Clojure, and am currently studying Swift so the list will get longer!) > a major focus on OOP is to hide how it is done That's not OOP, it's information hiding and predates OOP by quite a way. But it's a good example of how language implementations of OOP have obscured the whole point of OOP which is purely and simply about objects communicating via messages. Information hiding is a nice thing to include and OOP implementation details like classes can provide a vehicle to enforce it. But its not part of the essence of OOP. > and to allow existing objects to change how they do it as long as the > outside view is the same. But that is polymorphism and a fundamental part of OOP. The fact that you can send a message to an object and the object itself is responsible for figuring out the method of processing that message is absolutely key to OOP. 
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From alan.gauld at yahoo.co.uk Tue Jul 26 05:21:46 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Tue, 26 Jul 2022 10:21:46 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: <024101d8a09c$40e32540$c2a96fc0$@gmail.com> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com> Message-ID: On 26/07/2022 04:03, avi.e.gross at gmail.com wrote: > As I said in an earlier post you can do OOP in most any language. > > I differentiate between using IDEAS and code built-in to the language meant > to facilitate it. If you mean you can implement a design focusing on objects > with serious amounts of work, sure. That's exactly what I mean. My point is that OOP is a style of programming not a set of language features (that's an OOPL). It's the same with functional programming. You can discuss how languages like ML, Lisp, Haskell or Python implement functional features but you cannot fully discuss FP purely by using any one of those languages. And you can write non FP code in any of those languages while still using the FP features. > IME. I'm not saying that nobody has a genuine use for a strictly > homogenous container, just that > I've never needed such a thing personally. > > What I was talking about may be subtle. There are times you want to > guarantee things or perhaps have automatic conversions. ... Sure, and I have used arrays when dealing with code from say C libraries that require type consistency. 
But that's a requirement of the tools I'm using and their interface. Efficiency is another pragmatic reason to use them but I've never needed that much efficiency in my Python projects. > I will note that many programming languages that tried to force you to have > containers that only held on kind of thing, often cheated by techniques like > a UNION... > Languages like JAVA and others inheriting from a common ancestor That's how most OOPL programs work by defining a collection of "object" and all objects inherit from "object". And generics allow you to do the same in non OOPLs like ADA. > I won't quote what you said about Python being simple or complex except to > say that it depends on perspective. As an experienced programmer, I do not > want something so simple it is hard to do much with it without lots of extra > work. Sure, and Guido had to walk the tightrope between ease of use for beginners and experts. In the early days the bias was towards beginners but nowadays it is towards experts, but we have historic legacy that clearly favours the learners. > But as a teaching tool, it reminds me a bit of conversations I have had with > my kids where everything I said seemed to have a reference or vocabulary > word they did not know. That's always a problem teaching programming. I tried very hard when writing my tutorial not to use features before decribing them but it is almost impossible. You can get away with a cursory explanation then go deeper later but you can't avoid it completely. I suspect thats true of any complex topic (music, art etc) > That's not OOP, it's information hiding and predates OOP by quite a > way. > > I accept that in a sense data hiding may be partially or even completely > independent from OOP > In that sense, we could argue (and I would lose) about what in a language is > OOP and what is something else or optional. My point is that language features are never OOP. They are tools to facilitate OOP. 
Data hiding is a curious case because it's a concept that is separate from OOP but often conflated with OOP in the implementation.

> topic, not just see how each language brags it supports OOP. In particular,
> your idea it involves message passing is true in some arenas but mostly NOT
> how it is done elsewhere.

OOP is all about message passing. That's why we have the terminology we do. Why is a method called that? It's because when a message is received by an object the object knows the method it should use to service that message. The fact that the method is implemented as a function with the same name as the message is purely an implementation detail. There are OOPLs that allow you to map messages to methods internally and these (correctly from an OOP standpoint) allow the same method to be used to process multiple messages.

> Calling a member function is not a general message
> passing method,

No, but that's a language feature not OOP. Languages like Lisp Flavours used a different approach where you wrote code like:

(Send (anObject, "messageName", (object list))

The receiving object then mapped the string "messageName" to an internal method function.

So the important point is that programmers writing OOP code in an OOPL should *think* that they are passing messages not calling functions. That's a subtle but very important distinction. And of course, in one sense it's true because, when you have a class hierarchy (something that is also not an essential OOP feature!), and send a message to a leaf node you may not in fact be calling a function in that node, it may well be defined in a totally different class further up the hierarchy.

[ And we can say the same about calling functions. That's a conceptual thing inherited from math. In practice we are doing a goto in the assembler code. But we think of it as a conceptual function invocation like we were taught at high school.
The difference with OOP is that message passing is a new concept to learn (unless coming from a traditional engineering background) whereas function calls we already know about from math class.] > Perhaps it does make sense to not just teach OOP in Python but also snippets > of how it is implemented in pseudocode or in other languages that perhaps do > it more purely. Certainly Booch takes that approach in his first and third editions of his book. But even that leads to confusion between OOP and language features (OOPLs). The essence of OOP is about how you think about the program. Is it composed of objects communicating - object *oriented* - or is it composed of a functional decomposition that uses objects in the mix (the more common case). Objects are a powerful tool that can add value to a procedural solution. But OOP is a far more powerful tool, especially when dealing with larger systems. One of the best books IMHO to describe the difference in the approaches is Peter Coad's book "Object Models, Strategies and Implementations" It's not an especially well written book and makes some controversial claims, but the concepts within are clearly explained. But it is written at the UML level not code. And of course there comes a point in almost all OOPLs where you have to drop out of OOP thinking to use low level raw data types. (Smalltalk and a few others being the exceptions where absolutely everything is an object) Ultimately, pragmatism has to take over from principle and the trick is working with OOP at the program structure level and raw data at the detailed implementation level. Knowing where that line lives is still one of the hardest aspects of using any OOPL. But it is still an implementation issue not an OOP issue. 
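To make the message-versus-method distinction above concrete, here is a small Python sketch in which the caller only sends a named message and the receiving object decides which internal method services it. The Account class, its send() method and the message names are all hypothetical, invented for illustration; it is not how Flavours or any particular OOPL actually implements dispatch. Note that two different messages are mapped to the same method.

```python
class Account:
    def __init__(self, balance=0):
        self.balance = balance
        # The object owns the mapping from message names to methods.
        # "deposit" and "credit" are serviced by the same method.
        self._handlers = {
            "deposit": self._add,
            "credit": self._add,
            "balance": self._report,
        }

    def send(self, message, *args):
        """Receive a message and pick the method that services it."""
        try:
            method = self._handlers[message]
        except KeyError:
            raise AttributeError(
                f"{type(self).__name__} does not understand {message!r}"
            ) from None
        return method(*args)

    def _add(self, amount):
        self.balance += amount
        return self.balance

    def _report(self):
        return self.balance


acct = Account()
acct.send("deposit", 50)
acct.send("credit", 25)
total = acct.send("balance")
print(total)  # 75
```

An ordinary Python call such as acct.deposit(50) does much the same lookup behind the scenes (by attribute name, through the class hierarchy), which is why it can be thought of as a message send even though it looks like a function call.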
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From o1bigtenor at gmail.com Tue Jul 26 06:32:14 2022 From: o1bigtenor at gmail.com (o1bigtenor) Date: Tue, 26 Jul 2022 05:32:14 -0500 Subject: [Tutor] Volunteer teacher In-Reply-To: References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com> Message-ID: On Tue, Jul 26, 2022 at 4:22 AM Alan Gauld via Tutor wrote: > > On 26/07/2022 04:03, avi.e.gross at gmail.com wrote: > Fascinating discussion!!!! Thank you for not only engaging in it but also for allowing it. I leant a long time ago that I could find plenty of nuggets for learning in the digressions - - -in fact sometimes even more things of interest in such than the original topic(s). A lurker (here to learn - - -grin!) From avi.e.gross at gmail.com Tue Jul 26 13:19:23 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Tue, 26 Jul 2022 13:19:23 -0400 Subject: [Tutor] Volunteer teacher In-Reply-To: References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com> Message-ID: <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com> The way I recall it, Alan, is that many language designers and users were once looking for some kind of guarantees that a program would run without crashing AND that every possible path would lead to a valid result. One such attempt was to impose very strict typing. You could not just add two numbers without first consciously converting them to the same type and so on. Programming in these languages rapidly became tedious and lots of people worked around them to get things done. 
Other languages were a bit too far in the other direction and did things like flip a character string into a numeric form if it was being used in a context where that made sense.

My point, perhaps badly made, was that one reason some OOP ideas made it into languages INCLUDED attempts to do useful things when the original languages made them hard. Ideas like allowing anything that CLAIMED to implement an INTERFACE to be allowed in a list whose type was that interface, could let you create some silly interface that did next to nothing and add that interface to just about anything and get around restrictions. But that also meant their programs were not really very safe in one sense.

Languages like python largely did away with some such mathematical restrictions and seem easier to new programmers because of that and other similar things. I remember getting annoyed at having to spell out every variable before using it as being of a particular type, sometimes a rather complex type like address of a pointer to an integer. Many languages now simply make educated guesses when possible so a=1 makes an integer and b=a^2 is also obviously an integer while c=a*2.0 must be a float and so on. Such a typing system may still have the ability to specify things precisely but most people stop using that except when needed and the language becomes easier, albeit one that can develop subtle bugs.

What makes some languages really hard to understand, including some things like JAVA, is their attempt to create sort of generic functions. There is an elusive syntax that declares abstract types that are instantiated as needed and if you use the function many times using the object types allowed, it compiles multiple actual functions with one for each combo. So if a function takes 4 arguments that can each be any of 5 kinds, it may end up compiling as many as 625 functions internally. Python does make something like that much easier.
You just create a function that takes ANYTHING and at run time, the arguments arrive with known types and are simply handled. But the simplicity can cost you as you cannot trivially restrict what can be used and may have to work inside the function to verify you are only getting the types you want to handle. You make some points about efficiency that make me wonder. There are so many obvious tradeoffs but it clearly does not always pay to make something efficient as the original goal for every single project. As noted, parts of python or added modules are often written first in python then strategic bits of compiled code are substituted. But in a world where some software (such as nonsense like making digital money by using lots of electricity) uses up lots of energy, it makes sense to try to cut back on some more grandiose software. And it is not just about efficiency. Does a program need to check if I have new mail in a tight loop or can it check once a second or even every ten minutes? I am getting to understand your viewpoint in focusing on ideas not so much implementations and agree. The method of transmitting a message can vary as long as you have objects communicating and influencing each other. Arguably sending interrupts or generating events and many other such things are all possible implementations. I think the ability to try to catch various errors and interrupts is part of why languages like python can relax some rules as an error need not stop a program. Heading out to the beach.
From alan.gauld at yahoo.co.uk Tue Jul 26 17:19:37 2022 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Tue, 26 Jul 2022 22:19:37 +0100 Subject: [Tutor] Volunteer teacher In-Reply-To: <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com> References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com> Message-ID: On 26/07/2022 18:19, avi.e.gross at gmail.com wrote: > The way I recall it, Alan, is that many language designers and users were > once looking for some kind of guarantees that a program would run without > crashing AND that every possible path would lead to a valid result.
> One such attempt was to impose very strict typing.

Yes, and many still follow that mantra. C++ is perhaps the most obsessive of the modern manifestations.

> Programming in these languages rapidly became tedious and lots of people
> worked around them to get things done.

Yep, I first learned to program in Pascal(*) which was super strict.

####### Pseudo Pascal - too lazy to look up the exact syntax! ###
Type SingleDigit = INTEGER(0..9);

function f(SingleDigit):Boolean
....

var x: INTEGER
var y: SingleDigit
var b: Boolean

begin
   x := 3
   y := 3
   b := f(x)   (* Fails because x is not a SingleDigit *)
   b := f(y)   (* Succeeds because y is... even though both are 3 *)
end.

The problem with these levels of strictness is that people are forced to convert types for trivial reasons like the above. And every type conversion is a potential bug. In my experience of maintaining C code, type conversions (especially "casting") are one of the top 5 causes of production code bugs.

(*)Actually I did a single-term class in programming in BASIC in the early 70s at high school, but the technology meant we didn't go beyond sequences, loops and selection. In the mid 80s at University I did a full two years of Pascal. (While simultaneously studying Smalltalk and C++, having discovered OOP in the famous BYTE magazine article.)

> the other direction and did thigs like flip a character string into a
> numeric form if it was being used in a context where that made sense.

JavaScript, Tcl, et al...

> My point, perhaps badly made, was that one reason some OOP ideas made it
> into languages INCLUDED attempts to do useful things when the original
> languages made them hard. Ideas like allowing anything that CLAIMED to
> implement an INTERFACE to be allowed in a list whose type was that
> interface, could let you create some silly interface that did next to
> nothing and add that interface to just about anything and get around
> restrictions.
But that also meant their programs were not really very safe > in one sense. Absolutely but that's a result of somebody trying to hitch their particular programming irk onto the OOP bandwagon. It has nothing whatsoever to do with OOP. There were a lot of different ideas circulating around the 80s/90s and language implementors used the OOP hype to include their pet notions. So lots of ideas all got conflated into "OOP" and the core principles got lost completely in the noise! > address of a pointer to an integer. Many languages now simply make educated > guesses when possible so a=1 makes an integer and b=a^2 is also obviously an > integer Java does a little of this and Swift is very good at it. > including some things like JAVA, is their attempt to create sort of generic > functions. But generics are another topic again... > ...There is an elusive syntax that declares abstract types that are > instantiated as needed and if you use the function many times using the > object types allowed, it compiles multiple actual functions with one for > each combo. So if a function takes 4 arguments that call all be 5 kinds, it > may end up compiling as many as 625 functions internally. True, that's also what happens in C++. But it is only an issue at assembler level - and if you care about the size of the executable which is rare these days. At the source code level the definitions are fairly compact and clear and still enables the compiler to do strict typing. > trivially restrict what can be used and may have to work inside the > function to verify you are only getting the types you want to handle. Although, if you stick to using the interfaces, then you should be able to trust the objects to "do the right thing". But there is a measure of responsibility on the programmer not to wilfully do stupid things! > I am getting to understand your viewpoint in focusing on ideas not so much > implementations and agree. 
> The method of transmitting a message can vary as
> long as you have objects communicating and influencing each other. Arguably
> sending interrupts or generating events and many other such things are all
> possible implementations.

Absolutely, and in real-time OOP systems it's common to wrap the OS interrupt handling into some kind of dispatcher object which collects the interrupt, determines the correct receiver and then sends the interrupt details as a message to that object. And from a purely theoretical systems engineering viewpoint, where an OOP system is a form of a network of sequential machines, interrupts are about the closest to a pure OOP architecture.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From alan.gauld at yahoo.co.uk Tue Jul 26 17:52:40 2022
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Tue, 26 Jul 2022 22:52:40 +0100
Subject: [Tutor] Volunteer teacher
In-Reply-To: 
References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com>
Message-ID: 

You know things are bad when you reply to your own emails!

On 26/07/2022 10:21, Alan Gauld via Tutor wrote:

> That's exactly what I mean. My point is that OOP is a style of
> programming not a set of language features (that's an OOPL).
...
> OOP is all about message passing.

I realize I may be causing confusion by over-simplifying. In reality the term OOP has come to mean many different things. The reason for this is historical, because what we now call OOP is an amalgam of programming developments during the late 60s-late 80s.
There were basically 3 different groups all working along similar lines but with different goals and interests:

Group 1 - Simula/Modula
This group were focused on improving the implementation of abstract data types and modules and their use in simulating real-world systems. This resulted in the invention of classes as we know them, in the language Simula. Languages influenced by this group include C++, Object Pascal, Java and most of the others that we consider OOPLs today.

Group 2 - Alan Kay and his Dynabook project at Xerox PARC.
Kay was intent on developing a computing device that could be used by the masses. He focused on teaching programming to youngsters as a representative group. The lessons learnt included the fact that youngsters could relate to objects sending messages to each other, and from this he developed the ideas and coined the term "Object Oriented Programming". He built Smalltalk (in 3 different versions culminating in Smalltalk-80) to implement those concepts. Along the way he picked up Simula's class concept. It is he who "defines" OOP as a message passing mechanism and a programming style rather than a set of language features. Object Pascal, Objective-C, Actor and Python all include strong influences from this group.

Group 3 - Lisp and the Academics
At the same time lots of academics were experimenting with different programming styles to try to find a way to accelerate program development. This was driven by the "software crisis" where software development times were increasing exponentially with complexity. They picked up on the activity by groups 1 and 2 and added some spice of their own. Mostly they used Lisp and came up with several OOP systems, best known of which are Flavors and CLOS. CLOS in particular is intended to support multiple styles of object based programming including pure OOP (as defined by Kay).

[Bertrand Meyer developed his Eiffel language in parallel with Group 3 but strongly influenced by Group 1 too.
Eiffel along with CLOS are probably the most complete implementations of all currently existing OO concepts]

[Seymour Papert was working on similar concepts to Kay and was ideologically more closely aligned with group 3, but never espoused objects per se. Instead he developed Logo, which is closely related to Lisp and includes the concept of sending messages but not the encapsulation of the receiver data and functions.]

When I talk about OOP I am firmly in the Group 2 category.

You can do OOP in almost any language. You can use objects in almost any style of programming. But don't make the mistake of thinking that just building and using classes means you are doing OOP. Even in the purest of OOP languages (Smalltalk?) it is entirely possible to write non-OOP code. The manual for Smalltalk/V includes a good example of non-OOP code written in Smalltalk and how it looks when re-written in an OOP style. The point being that simply learning Smalltalk does not mean you are learning OOP! I can probably dig that example out (and maybe even translate it to Python) if anyone is interested.
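As a rough, hypothetical illustration of that distinction (this is not the Smalltalk/V manual example, which is not reproduced here), the Python sketch below computes a total area twice. All class and function names are made up for the example. The first style uses classes only as data records and keeps the logic in an outside function; the second lets each object respond to an "area" message itself.

```python
import math


# Style 1: classes used as plain records. The logic lives in an outside
# function that inspects each object's type and reaches into its data.
class CircleRecord:
    def __init__(self, radius):
        self.radius = radius


class SquareRecord:
    def __init__(self, side):
        self.side = side


def total_area_procedural(shapes):
    total = 0.0
    for shape in shapes:
        if isinstance(shape, CircleRecord):
            total += math.pi * shape.radius ** 2
        elif isinstance(shape, SquareRecord):
            total += shape.side ** 2
    return total


# Style 2: each object knows how to answer the "area" message itself.
class Circle:
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


class Square:
    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side ** 2


def total_area_oo(shapes):
    # The caller just sends the same message to every object.
    return sum(shape.area() for shape in shapes)


print(total_area_procedural([CircleRecord(1), SquareRecord(2)]))
print(total_area_oo([Circle(1), Square(2)]))
```

Both versions use classes, but only the second treats the program as objects responding to messages; adding a new shape there means adding a class, not editing the summing function.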
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From wlfraed at ix.netcom.com Tue Jul 26 19:12:03 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Tue, 26 Jul 2022 19:12:03 -0400 Subject: [Tutor] Volunteer teacher References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com> Message-ID: <3os0ehduim97utbff3ea0teqv6r7h1rj93@4ax.com> On Tue, 26 Jul 2022 13:19:23 -0400, declaimed the following: >The way I recall it, Alan, is that many language designers and users were >once looking for some kind of guarantees that a program would run without >crashing AND that every possible path would lead to a valid result. One such >attempt was to impose very strict typing. You could not just add two numbers >without first consciously converting them to the same type and so on. >Programming in these languages rapidly became tedious and lots of people >worked around them to get things done. Other languages were a bit too far in >the other direction and did thigs like flip a character string into a >numeric form if it was being used in a context where that made sense. > Why pussy-foot? You've essentially described Ada and REXX Ada: Conceptually define a data type for every discrete component (so you can't compare "apples" to "oranges" without transmuting one into the other or to some common type ("fruit"). REXX: Everything is a string unless context says otherwise. 
And statements beginning with unknown keywords (or explicitly quoted strings) are assumed to be commands to an external command processor (in most implementation, the "shell" -- IBM mainframe mostly supported addressing an editor as command processor, Amiga ARexx supported ANY application opening a "RexxPort" to which commands could be sent -- even another ARexx script, or (with Irmen's work) Python. -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ From wlfraed at ix.netcom.com Tue Jul 26 19:22:49 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Tue, 26 Jul 2022 19:22:49 -0400 Subject: [Tutor] Volunteer teacher References: <75553f79-af0e-6fe1-7f6b-9c2e7f509c72@wichmann.us> <00d101d89efb$f9a91fa0$ecfb5ee0$@gmail.com> <011301d8a060$e7b62c50$b72284f0$@gmail.com> <024101d8a09c$40e32540$c2a96fc0$@gmail.com> <006f01d8a113$d9dd28f0$8d977ad0$@gmail.com> Message-ID: On Tue, 26 Jul 2022 22:19:37 +0100, Alan Gauld via Tutor declaimed the following: >(*)Actually I did a single-term class in programming in BASIC in the >early 70s at high school, but the technology meant we didn't go beyond >sequences, loops and selection. In the mid 80s at University I did a >full two years of Pascal. (While simultaneously studying Smalltalk >and C++ having discovered OOP in the famous BYTE magazine article) > Welcome to the club... I needed a final 3 credits for graduation, so took BASIC in my senior year of college. 
No effort class -- considering I'd already used BASIC in the data structures course implementing an assignment using "hashed head multiply linked lists" (and have never seen such used any except for the Amiga disk directory/file structure -- hash into the root directory block to find a pointer to the start of a linked list of file header blocks [file name, followed by pointers to data blocks and a pointer to an overflow list of data blocks] and/or directory blocks [directory name followed by hashed list of pointers to next level of chains]) This (BASIC course) was AFTER FORTRAN (-IV), Advanced FORTRAN, Assembly [sequence A], COBOL, Advanced COBOL, Database [sequence B, Database text covered Hierarchical, Network, and Relational as theoretical -- subsequent editions covered Relational, and treated Hierarchical and Network as historical], the aforesaid data structures, and a language design course. -- Wulfraed Dennis Lee Bieber AF6VN wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ From bobxander87 at gmail.com Tue Jul 26 16:58:06 2022 From: bobxander87 at gmail.com (bobx ander) Date: Tue, 26 Jul 2022 22:58:06 +0200 Subject: [Tutor] Building dictionary from large txt file Message-ID: Hi all, I'm trying to build a dictionary from a rather large file of following format after it has being read into a list(excerpt from start of list below) -------- Atomic Number = 1 Atomic Symbol = H Mass Number = 1 Relative Atomic Mass = 1.00782503223(9) Isotopic Composition = 0.999885(70) Standard Atomic Weight = [1.00784,1.00811] Notes = m -------- My goal is to extract the content into a dictionary that displays each unique triplet as indicated below {'H1': {'Z': 1,'A': 1,'m': 1.00782503223}, 'D2': {'Z': 1,'A': 2,'m': 2.01410177812} ...} etc My code that I have attempted is as follows: filename='ex.txt' afile=open(filename,'r') #opens the file content=afile.readlines() afile.close() isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for each case of 
atoms with its unique keys and values for line in content: data=line.strip().split() if len(data)<1: pass elif data[0]=="Atomic" and data[1]=="Number": atomic_number=data[3] elif data[0]=="Mass" and data[1]=="Number": mass_number=data[3] elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass": relative_atomic_mass=data[4] isotope_data['Z']=atomic_number isotope_data['A']=mass_number isotope_data['A']=relative_atomic_mass isotope_data the output from the programme is only {'Z': '118', 'A': '295', 'm': '295.21624(69#)'} I seem to be owerwriting each dictionary and ends up with the above result.Somehow i think I have to put the assigment of the key,value pairs elsewhere. I have tried directly below the elif statements also,but that did not work. Any hints or ideas Regards Bob From learn2program at gmail.com Tue Jul 26 20:33:39 2022 From: learn2program at gmail.com (Alan Gauld) Date: Wed, 27 Jul 2022 01:33:39 +0100 Subject: [Tutor] Building dictionary from large txt file In-Reply-To: References: Message-ID: <0aa4daf4-6c45-ab98-ed5f-f9d381ca6299@yahoo.co.uk> On 26/07/2022 21:58, bobx ander wrote: > Hi all, > I'm trying to build a dictionary from a rather large file of following > format after it has being read into a list(excerpt from start of list below) > -------- > > Atomic Number = 1 > Atomic Symbol = H > Mass Number = 1 > Relative Atomic Mass = 1.00782503223(9) > Isotopic Composition = 0.999885(70) > Standard Atomic Weight = [1.00784,1.00811] > Notes = m > -------- > > My goal is to extract the content into a dictionary that displays each > unique triplet as indicated below > {'H1': {'Z': 1,'A': 1,'m': 1.00782503223}, > 'D2': {'Z': 1,'A': 2,'m': 2.01410177812} > ...} etc Unfortunately to those of us unfamiliar with your data that is as clear as mud. You refer to a triplet but your sample file entry has 7 fields, some of which have multiple values. Where is the triplet among all that data? 
Then you show us a dictionary with keys that do not correspond to any of the fields in your data sample. How do the fields correspond - the only "obvious" one is the mass, which evidently corresponds with the key 'm'. But what are H1 and D2? Another file record or some derived value from the record shown above? Similarly for Z, A and m. How do they relate to the data?

You need to specify your requirement more explicitly for us to be sure we are giving valid advice.

> My code that I have attempted is as follows:
>
> filename='ex.txt'
>
> afile=open(filename,'r') #opens the file
> content=afile.readlines()
> afile.close()

You probably don't need to read the file into a list if you are going to process it line by line. Just read the lines from the file and process them as you go.

> isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
> each case of atoms with its unique keys and values
> for line in content:
>     data=line.strip().split()
>
>     if len(data)<1:
>         pass
>     elif data[0]=="Atomic" and data[1]=="Number":
>         atomic_number=data[3]
>
>
>     elif data[0]=="Mass" and data[1]=="Number":
>         mass_number=data[3]
>
>
>
>     elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>         relative_atomic_mass=data[4]
>

Rather than split the line then compare each field, it might be easier (and more readable) to compare the full strings using the startswith() method, then split the string:

for line in file:
    if line.startswith("Atomic Number"):
        atomic_number = line.strip().split()[3]
    etc...

> isotope_data['Z']=atomic_number
> isotope_data['A']=mass_number
> isotope_data['A']=relative_atomic_mass
> isotope_data
>
> the output from the programme is only
>
> {'Z': '118', 'A': '295', 'm': '295.21624(69#)'}
>
> I seem to be owerwriting each dictionary

Yes, you never detect the end of a record - you never explain how records are separated in the file either!

You need something like

master = {}   # empty dict.
for line in file:
    if line.startswith("Atomic Number"):
        create variable....
    if line.startswith(....):
        ....etc
    if ...:   # the end-of-record test; we don't know what this is...
        # save variables in a dictionary
        record = { key1:variable1, key2:variable2....}
        # insert dictionary to master dictionary
        master[key] = record

How you generate the keys is a mystery to me but presumably you know.

You could write the values directly into the master dictionary if you prefer.

Also note that you are currently storing strings. If you want the numeric data you will need to convert it with int() or float() as appropriate.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From PythonList at DancesWithMice.info Tue Jul 26 22:36:37 2022
From: PythonList at DancesWithMice.info (dn)
Date: Wed, 27 Jul 2022 14:36:37 +1200
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: 
References: 
Message-ID: 

On 27/07/2022 08.58, bobx ander wrote:
> Hi all,
> I'm trying to build a dictionary from a rather large file of following
> format after it has being read into a list(excerpt from start of list below)
> --------
>
> Atomic Number = 1
> Atomic Symbol = H
> Mass Number = 1
> Relative Atomic Mass = 1.00782503223(9)
> Isotopic Composition = 0.999885(70)
> Standard Atomic Weight = [1.00784,1.00811]
> Notes = m
> --------
>
> My goal is to extract the content into a dictionary that displays each
> unique triplet as indicated below
> {'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
> 'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
> ...} etc
> My code that I have attempted is as follows:
>
> filename='ex.txt'
>
> afile=open(filename,'r') #opens the file
> content=afile.readlines()
> afile.close()
> isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
> each case of atoms with its unique keys and values
> for line in
content: > data=line.strip().split() > > if len(data)<1: > pass > elif data[0]=="Atomic" and data[1]=="Number": > atomic_number=data[3] > > > elif data[0]=="Mass" and data[1]=="Number": > mass_number=data[3] > > > > elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass": > relative_atomic_mass=data[4] > > > isotope_data['Z']=atomic_number > isotope_data['A']=mass_number > isotope_data['A']=relative_atomic_mass > isotope_data +1 after @Alan: it is difficult to ascertain how the dictionary is transformed from the input file (not list!). Because things are "not work[ing]" the code is evidently 'too complex'. NB this is not an insult to your intelligence. It is however a reflection on your programming expertise/experience, and/or your Python expertise. The (recommended) answer is to break-down the total problem into smaller units which you-personally can 'see' and confidently manage. (which level of detail or size of "chunk", is different for each of us!) Is the file *guaranteed* to have all seven lines per isotope (or whatever we have to imagine it contains)? Alternately, are some 'isotopes' described with fewer than seven lines of data? In which case, each line must be read and 'understood' - plus any missing data must be handled, presumably with some default value or an indicator that such data is not available. The first option seems more likely. Note how (first line, above) the problem was expressed, perhaps 'backwards'! This is because it is easier to understand that way around - and possibly a source of the problem described. So, here is a suggested approach - with the larger-problem broken-down into smaller (and more-easily understood) units:- The first component to code is a Python-generator which opens the file (use a Context Manager if you want to be 'advanced'/an excuse to learn such), reads a line, 'cleans' the data, and "yield"s the data-value; 'rinse and repeat'. 
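A minimal sketch of the kind of generator described above, under some assumptions: every data line has the form "Field Name = value" as in the posted excerpt, and anything without an "=" (blank lines, the "--------" dividers) can be skipped. The name `data_lines` is hypothetical, and this version yields (field name, value) pairs rather than the bare value so the caller can see which field it received. The demo writes a throwaway file so the sketch runs without the real 'ex.txt'.

```python
import os
import tempfile


def data_lines(filename):
    """Yield (field_name, value) pairs from 'Name = value' lines, one at a time."""
    with open(filename, "r") as infile:   # context manager closes the file for us
        for raw in infile:                # the file is read one line at a time
            line = raw.strip()
            if "=" not in line:           # skips blank lines and '--------' dividers
                continue
            name, _, value = line.partition("=")
            yield name.strip(), value.strip()


# Demo data copied from the start of the excerpt in the original post.
sample = (
    "--------\n"
    "\n"
    "Atomic Number = 1\n"
    "    Atomic Symbol = H\n"
    "    Mass Number = 1\n"
    "    Relative Atomic Mass = 1.00782503223(9)\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
try:
    pairs = list(data_lines(tmp.name))
finally:
    os.remove(tmp.name)

for name, value in pairs:
    print(name, "->", value)
```

Because the generator hands back one cleaned item per request, the code that consumes it never needs the whole file in memory.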
Next, (up to) seven 'basic-functions', representing each of the dictionary entries/lines in the data file. These will be very similar to each-other, but each is solely-devoted to creating one dictionary entry from the data generated by the generator. If they are called in the correct sequence, each call will correspond to the next (expected) record being read-in from the data-file. I'm assuming (in probable ignorance) that some data-items are collated/collected together as nested dictionaries. In which case, another 'level' of subroutine may be required - an 'assembly-function'. This/these will call 'however many' of the above 'basic-functions' in order to assemble a dictionary-entry which contains a dictionary as its "value" (dicts are "key"-"value" pairs - in case you haven't met this terminology before). Those 'assembly-functions' will return that more complex dictionary entry. We can now 'see' that the one-to-one relationship between a dictionary sub-structure is more important than any one-to-one relationship with the input file! Thus, given that the objective is to build "a dictionary" of "unique triplet[s]", each function should return a sub-component of that 'isotope's' entry in the dictionary - some larger sub-components and others a single value or key-value pair! Finally then, the 'top level' is a loop-forever until the generator returns an 'end of file' exception. The loop calls each basic-function or assembly-function in-turn, and either gradually or 'at the bottom of each loop' assembles the dictionary-entry for that 'isotope' and adds it to the dictionary. Try a main-loop which looks something like: # init dict while "there's data": atomic_number = get_atomic_number() atomic_symbol = get_atomic_symbol() assemble_atomic_mass = get_atomic_mass() # etc assemble_dict_entry( atomic_number, atomic_symbol, ... 
) # probably only need a try...except around the first call # which will break out of the while-loop # dict is now fully assembled and ready for use... # sample 'assembly-function' def assemble_atomic_mass(): # init sub-dict mass_number = get_mass_number() relative_atomic_mass = get_relative_atomic_mass() #etc # assemble sub-dict entry with atomic mass data return sub-dict # repeat above with function for each sub-dict/sub-collection of data # which brings us to the individual data-items. These, it is implied, appear on separate lines of the data file, but in sets of seven data-lines (am ignoring lines of dashes, but if present, then eight-line sets). Accordingly: def get_atomic_number(): get_next_line() # whatever checks/processing return atomic_number # and repeat for each of the seven data-items # if necessary, add read-and-discard for line of dashes # all the input functionality has been devolved to: def get_next_line(): # a Python generator which # open the file # loop-forever # reads single line/record # (no need for more - indeed no point in reading the whole and then having to break that down!) # strip, split, etc # yield data-value # until eof and an exception will be 'returned' and ripple 'up' the hierarchy of functions to the 'top-level'. Here is another question: having assembled this dictionary, what will be done with it? Always start at that back-end - we used to mutter the mantra "input - process - output" and start 'backwards' (you've probably already noted that!) Another elegant feature is that each of the functions (starting from the lowest level) can be developed and tested individually (or tested and developed if you practice "TDD"). By testing that the generator returns the data-file's records appropriately, the complexity of writing and testing the next 'layer' of subroutine/function becomes easier - because you will know that at least half of it 'already works'! 
Each (working) small module can be built-upon and more-easily assembled into a working whole - and if/when something 'goes wrong', it will most likely be contained (only) within the newly-developed code! (of course, if a fault is found to be caused by 'lower level code' (draw conclusion here), then, provided the tests have been retained, the test for that lower-level can be expanded with the needed check, the tests re-run, and one's attention allowed to rise back 'up' through the layers...) "Divide and conquer"! -- Regards, =dn From wlfraed at ix.netcom.com Tue Jul 26 23:11:02 2022 From: wlfraed at ix.netcom.com (Dennis Lee Bieber) Date: Tue, 26 Jul 2022 23:11:02 -0400 Subject: [Tutor] Building dictionary from large txt file References: Message-ID: On Tue, 26 Jul 2022 22:58:06 +0200, bobx ander declaimed the following: > >Atomic Number = 1 > Atomic Symbol = H > Mass Number = 1 > Relative Atomic Mass = 1.00782503223(9) > Isotopic Composition = 0.999885(70) > Standard Atomic Weight = [1.00784,1.00811] > Notes = m >-------- > >My goal is to extract the content into a dictionary that displays each >unique triplet as indicated below >{'H1': {'Z': 1,'A': 1,'m': 1.00782503223}, > 'D2': {'Z': 1,'A': 2,'m': 2.01410177812} > ...} etc First thing I'd want to know is how each entry in your source data MAPS to each item in your desired dictionary. >My code that I have attempted is as follows: > >filename='ex.txt' > >afile=open(filename,'r') #opens the file >content=afile.readlines() >afile.close() I'd probably run a loop inside the open/close section, collecting the items for ONE entry. I presume "Atomic Number" starts each entry. Then, when the next "Atomic Number" line is reached you process the collected lines to make your dictionary entry. >isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for >each case of atoms with its unique keys and values Usually not needed as addressing a key to add a value doesn't need predefined keys or values. 
The only reason to initialize is if you expect to have blocks that DON'T define all key/value pairs.

>for line in content:
>    data=line.strip().split()
>

	Drop the .split() at this level... IF you don't mind some loss in processing speed to allow...

>    if len(data)<1:

    if not data:	#empty string
        pass

see:

>>> str1 = "Atomic Number = 1"
>>> str2 = "   "
>>> bool(str1)
True
>>> bool(str2)
True
>>> bool(str1.strip())
True
>>> bool(str2.strip())		<<<<
False
>>>

>        pass
>    elif data[0]=="Atomic" and data[1]=="Number":
>        atomic_number=data[3]
>

    elif data.startswith("Atomic Number"):
        atomic_number = data.split()[-1]

>
>    elif data[0]=="Mass" and data[1]=="Number":
>        mass_number=data[3]
>
>
>
>    elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>        relative_atomic_mass=data[4]
>

	Ditto for all those.

>
>isotope_data['Z']=atomic_number
>isotope_data['A']=mass_number
>isotope_data['A']=relative_atomic_mass

	This REPLACES any previous value of the key "A". To store multiple values for a single key you need to put the values into a list... Presuming you will always have both "mass_number" and "relative_atomic_mass"

    isotope_data["A"] = [mass_number, relative_atomic_mass]

	You don't show the outer dictionary in the example (the same list concern may apply); you may need to do something like

    dict["key"] = []
    if term_1:
        dict["key"].append(term_1_value)
    if term_2:
        dict["key"].append(term_2_value)

etc.

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/

From avi.e.gross at gmail.com Tue Jul 26 20:47:06 2022
From: avi.e.gross at gmail.com (avi.e.gross at gmail.com)
Date: Tue, 26 Jul 2022 20:47:06 -0400
Subject: [Tutor] Building dictionary from large txt file
In-Reply-To: 
References: 
Message-ID: <006801d8a152$65a37130$30ea5390$@gmail.com>

There is so much work to be done based on what you show and questions to answer.

The short answer is you made ONE dictionary and overwrote it.
You want an empty dictionary that you keep inserting this dictionary you made into. You need to recognize when a section of lines is complete. When you see a blank line now, you PASS. Your goal seems to be to read in a multi-line entry perhaps between dividers like this"--------" so your readlines may not be doing what you want as each lines has a single item and some may have none. Whatever you read in seems to be in content. Your code wrapped funny on MY screen so I did not see this line: isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for each case of atoms with its unique keys and values for line in content: data=line.strip().split() OK, without comment, the rest of your code seems to suggest there is perhaps a blank line between a series of lines containing info. It is of some concern that two entries now start with Atomic but you only look for one. And are you aware that the split leaves these as single entities: 1.00782503223(9), 0.999885(70), [1.00784,1.00811] Those are all TEXT and you seem to want to remove things in parentheses from your output. Do you really want to generally store text or various forms of numbers? You have lots of work to make the details work such as by concatenating what in your code was NOT READ containing an "H" with the isotopic number of "1" into H1. It is best to consider reading in a small sample of ONE and making the code work to extract what is needed and combine it into the form you want and make a single entry. Then add a loop. If your data is guaranteed to always have the same N lines, other methods may work as well or better. 
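Putting those points together, here is one possible sketch, resting on several assumptions that the thread has not confirmed: each record begins with an "Atomic Number" line, the dictionary key is the atomic symbol joined to the mass number (giving 'H1', 'D2'), and the parenthesised uncertainty is simply dropped from the mass. The function name `parse_isotopes` is hypothetical. The first sample record is taken from the post; the second is reconstructed from the desired output shown there, and its uncertainty digits are only illustrative.

```python
REQUIRED = ("Atomic Number", "Atomic Symbol", "Mass Number", "Relative Atomic Mass")


def parse_isotopes(lines):
    """Build {symbol+mass_number: {'Z': int, 'A': int, 'm': float}} from text lines."""
    isotopes = {}   # the outer dictionary that collects every record
    record = {}     # fields of the record currently being read

    def flush():
        # Store the collected record (if complete), then start a fresh one.
        if all(field in record for field in REQUIRED):
            key = record["Atomic Symbol"] + record["Mass Number"]
            isotopes[key] = {
                "Z": int(record["Atomic Number"]),
                "A": int(record["Mass Number"]),
                # drop the uncertainty: '1.00782503223(9)' -> 1.00782503223
                "m": float(record["Relative Atomic Mass"].split("(")[0]),
            }
        record.clear()

    for raw in lines:
        name, sep, value = raw.partition("=")
        if not sep:
            continue                      # blank line or '--------' divider
        name = name.strip()
        if name == "Atomic Number" and record:
            flush()                       # a new record starts: save the old one
        record[name] = value.strip()
    flush()                               # do not forget the last record
    return isotopes


sample = """\
--------

Atomic Number = 1
    Atomic Symbol = H
    Mass Number = 1
    Relative Atomic Mass = 1.00782503223(9)
    Isotopic Composition = 0.999885(70)
    Standard Atomic Weight = [1.00784,1.00811]
    Notes = m

Atomic Number = 1
    Atomic Symbol = D
    Mass Number = 2
    Relative Atomic Mass = 2.01410177812(12)
"""

isotopes = parse_isotopes(sample.splitlines())
print(isotopes)
```

Since `parse_isotopes` accepts any iterable of lines, an open file object can be passed to it directly, so the file never has to be read into a list first.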
-----Original Message----- From: Tutor On Behalf Of bobx ander Sent: Tuesday, July 26, 2022 4:58 PM To: tutor at python.org Subject: [Tutor] Building dictionary from large txt file Hi all, I'm trying to build a dictionary from a rather large file of following format after it has being read into a list(excerpt from start of list below) -------- Atomic Number = 1 Atomic Symbol = H Mass Number = 1 Relative Atomic Mass = 1.00782503223(9) Isotopic Composition = 0.999885(70) Standard Atomic Weight = [1.00784,1.00811] Notes = m -------- My goal is to extract the content into a dictionary that displays each unique triplet as indicated below {'H1': {'Z': 1,'A': 1,'m': 1.00782503223}, 'D2': {'Z': 1,'A': 2,'m': 2.01410177812} ...} etc My code that I have attempted is as follows: filename='ex.txt' afile=open(filename,'r') #opens the file content=afile.readlines() afile.close() isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for each case of atoms with its unique keys and values for line in content: data=line.strip().split() if len(data)<1: pass elif data[0]=="Atomic" and data[1]=="Number": atomic_number=data[3] elif data[0]=="Mass" and data[1]=="Number": mass_number=data[3] elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass": relative_atomic_mass=data[4] isotope_data['Z']=atomic_number isotope_data['A']=mass_number isotope_data['A']=relative_atomic_mass isotope_data the output from the programme is only {'Z': '118', 'A': '295', 'm': '295.21624(69#)'} I seem to be owerwriting each dictionary and ends up with the above result.Somehow i think I have to put the assigment of the key,value pairs elsewhere. I have tried directly below the elif statements also,but that did not work. 
Any hints or ideas Regards Bob _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From avi.e.gross at gmail.com Tue Jul 26 21:11:01 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Tue, 26 Jul 2022 21:11:01 -0400 Subject: [Tutor] Building dictionary from large txt file In-Reply-To: <0aa4daf4-6c45-ab98-ed5f-f9d381ca6299@yahoo.co.uk> References: <0aa4daf4-6c45-ab98-ed5f-f9d381ca6299@yahoo.co.uk> Message-ID: <006d01d8a155$bcdbc080$36934180$@gmail.com> Sorry, Alan, I found that part quite clear. Then again, one of my degrees is a B.S. in Chemistry. No idea why I ever did that as I have never had any use for it, well, except in Med School, which I also wonder why ... Bob may not have been clear, but what he is reading in is basically a table of atomic elements starting with the Atomic Number (number of Protons so Hydrogen is 1 and Helium is 2 and so on). Many elements come in a variety of isotopes meaning the number of neutrons varies so Hydrogen can be a single proton or have one neutron too for Deuterium or 2 for tritium. The mass number is loosely how many nucleons it has as they are almost the same mass. He wants the key to be the concatenation of Atomic number (which is always one or two letters like H or He with I believe the mass number, thus he should make H1 in his one and only example (albeit you can look up things like a periodical table to see others in whatever format.) That field clearly should be text and used as a unique key. He then want the values to be another dictionary where the many second-level dictionaries contain keys of 'Z', 'A', 'm' and whatever his etc. is. The Atomic mass is obvious but not really as it depends on the mix of isotopes in a sample. Hydrogen normally is predominantly the single proton version so the atomic weight is very close to one. 
But if you extracted out almost pure Deuterium, it would be about double. Whatever it is, he needs to extract out a value like this "1.00782503223(9)" and either toss the leaky digit, parens and all, or include it and then convert it into a FLOAT of some kind. Isotopic Composition is sort of clear(as mud) as I mentioned there are more than two isotopes, albeit tritium does not last long before breaking down, so here it means the H-1 version is 0.999885(70) or way over 99% of the sample, with a tiny bit of H-2 or deuterium. (Sorry, hard to write anything in text mode when there are subscripts and superscripts used on both the left and right of symbols). I am not sure how this scales up when many other elements have many isotopes including stable ones, but assume it means the primary is some percent of the total. Chlorine, for example, has over two dozen known isotopes and an Atomic Weight in the neighborhood of 35 1/2 as nothing quite dominates. And since samples vary in percentage composition, the Atomic Weight is shown as some kind of range in: Standard Atomic Weight = [1.00784,1.00811] He needs to extract the two numbers using whatever techniques he wants and either record both as a tuple or list (after converting perhaps to float) or take an average or whatever the assignment requires. I have no idea if Notes matters as he stopped explain what he wants his output to be BUT he should know it may be a pain to deal with the split text as it may show up as multiple items in his list of tokens. But as I wrote earlier, his main request was to ask why his badly formatted single dictionary gets overwritten and the answer is because he does that instead of adding it to an outer dictionary first and then starting over. So the rest of your comments do apply. Just satisfying your curiosity and if I am wrong, someone feel free to correct me. 
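[Editor's sketch] One possible way to do the two conversions described above: drop the parenthesised digits from a value such as "1.00782503223(9)", and turn "[1.00784,1.00811]" into a pair of floats. The function names are hypothetical, and it is an assumption that every value in the file looks like one of these two shapes.

```python
import re


def leading_number(text):
    """Return the number in front of any '(...)' suffix, as a float."""
    match = re.match(r"\s*([0-9]+(?:\.[0-9]+)?)", text)
    if match is None:
        raise ValueError(f"no leading number in {text!r}")
    return float(match.group(1))


def weight_range(text):
    """Turn '[low,high]' into a (low, high) tuple of floats."""
    low, high = text.strip().strip("[]").split(",")
    return float(low), float(high)


print(leading_number("1.00782503223(9)"))
print(leading_number("295.21624(69#)"))
print(weight_range("[1.00784,1.00811]"))
```

Whether to keep both ends of the range or average them is a decision for whoever owns the assignment; the sketch just keeps both.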
-----Original Message----- From: Tutor On Behalf Of Alan Gauld Sent: Tuesday, July 26, 2022 8:34 PM To: tutor at python.org Subject: Re: [Tutor] Building dictionary from large txt file On 26/07/2022 21:58, bobx ander wrote: > Hi all, > I'm trying to build a dictionary from a rather large file of following > format after it has being read into a list(excerpt from start of list > below) > -------- > > Atomic Number = 1 > Atomic Symbol = H > Mass Number = 1 > Relative Atomic Mass = 1.00782503223(9) > Isotopic Composition = 0.999885(70) > Standard Atomic Weight = [1.00784,1.00811] > Notes = m > -------- > > My goal is to extract the content into a dictionary that displays each > unique triplet as indicated below > {'H1': {'Z': 1,'A': 1,'m': 1.00782503223}, > 'D2': {'Z': 1,'A': 2,'m': 2.01410177812} > ...} etc Unfortunately to those of us unfamiliar with your data that is as clear as mud. You refer to a triplet but your sample file entry has 7 fields, some of which have multiple values. Where is the triplet among all that data? Then you show us a dictionary with keys that do not correspond to any of the fields in your data sample. How do the fields correspond - the only "obvious" one is the mass which evidently corresponds with the key 'm'. But what are H1 and D2? Another file record or some derived value from the record shown above? Similarly for Z, A and m. How do they relate to the data? You need to specify your requirement more explicitly for us to be sure we are giving valid advice. > My code that I have attempted is as follows: > > filename='ex.txt' > > afile=open(filename,'r') #opens the file > content=afile.readlines() > afile.close() You probably don't need to read the file into a list if you are going to process it line by line. Just read the lines from the file and process them as you go. 
> isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
> each case of atoms with its unique keys and values for line in
> content:
>     data=line.strip().split()
>
>     if len(data)<1:
>         pass
>     elif data[0]=="Atomic" and data[1]=="Number":
>         atomic_number=data[3]
>
>
>     elif data[0]=="Mass" and data[1]=="Number":
>         mass_number=data[3]
>
>
>
>     elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>         relative_atomic_mass=data[4]
>

Rather than split the line then compare each field it might be easier (and more readable) to compare the full strings using the startswith() method then split the string:

for line in file:
    if line.startswith("Atomic Number"):
        atomic_number = line.strip().split()[3]
    etc...

> isotope_data['Z']=atomic_number
> isotope_data['A']=mass_number
> isotope_data['A']=relative_atomic_mass
> isotope_data
>
> the output from the programme is only
>
> {'Z': '118', 'A': '295', 'm': '295.21624(69#)'}
>
> I seem to be owerwriting each dictionary

Yes, you never detect the end of a record - you never explain how records are separated in the file either!

You need something like

master = {}   # empty dict.
for line in file:
    if line.startswith("Atomic Number"):
        create variable....
    if line.startswith(....): ....etc
    if    # we don't know what this is...
        # save variables in a dictionary
        record = { key1:variable1, key2:variable2....}
        # insert dictionary to master dictionary
        master[key] = record

How you generate the keys is a mystery to me but presumably you know.

You could write the values directly into the master dictionary if you prefer.

Also note that you are currently storing strings. If you want the numeric data you will need to convert it with int() or float() as appropriate.
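[Editor's sketch] A runnable version of the outline above, under some loud assumptions: the fields arrive in the order shown in the posted excerpt, "Relative Atomic Mass" is the last field needed (so seeing it marks the record as complete), and the key is the atomic symbol followed by the mass number. build_master is a made-up name and the sample text is invented to resemble the excerpt.

```python
import io


def build_master(lines):
    """Build {symbol+mass_number: {'Z': ..., 'A': ..., 'm': ...}} from lines."""
    master = {}
    symbol = z = a = None
    for line in lines:
        line = line.strip()
        if line.startswith("Atomic Number"):
            z = int(line.split("=")[1])
        elif line.startswith("Atomic Symbol"):
            symbol = line.split("=")[1].strip()
        elif line.startswith("Mass Number"):
            a = int(line.split("=")[1])
        elif line.startswith("Relative Atomic Mass"):
            # Drop the "(9)" style suffix before converting to float.
            m = float(line.split("=")[1].split("(")[0])
            # Assumption: this is the last field we need, so the record is
            # complete and a NEW inner dictionary can be stored.
            master[f"{symbol}{a}"] = {"Z": z, "A": a, "m": m}
    return master


# io.StringIO stands in for the open file; iterating it yields lines.
sample = io.StringIO(
    "Atomic Number = 1\n"
    "Atomic Symbol = H\n"
    "Mass Number = 1\n"
    "Relative Atomic Mass = 1.00782503223(9)\n"
    "\n"
    "Atomic Number = 1\n"
    "Atomic Symbol = D\n"
    "Mass Number = 2\n"
    "Relative Atomic Mass = 2.01410177812(12)\n"
)
master = build_master(sample)
print(master)
```

With a real file the call would be something like build_master(afile) inside a with open(...) block, so the file never has to be read into a list first.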
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From roel at roelschroeven.net Wed Jul 27 03:39:04 2022 From: roel at roelschroeven.net (Roel Schroeven) Date: Wed, 27 Jul 2022 09:39:04 +0200 Subject: [Tutor] Building dictionary from large txt file In-Reply-To: References: Message-ID: Op 27/07/2022 om 5:11 schreef Dennis Lee Bieber: > [...] > I'd probably run a loop inside the open/close section, collecting the > items for ONE entry. I presume "Atomic Number" starts each entry. Then, > when the next "Atomic Number" line is reached you process the collected > lines to make your dictionary entry. I'd probably do something like that too. But don't forget to also process the collected lines at the end of the file! After the last entry there is no "Atomic Number" line anymore, so it's easy to inadvertently skip that last entry (speaking from experience here... ). -- "Most of us, when all is said and done, like what we like and make up reasons for it afterwards." -- Soren F. Petersen From avi.e.gross at gmail.com Fri Jul 29 12:22:48 2022 From: avi.e.gross at gmail.com (avi.e.gross at gmail.com) Date: Fri, 29 Jul 2022 12:22:48 -0400 Subject: [Tutor] POSIT and QUARTO Message-ID: <011e01d8a367$715933e0$540b9ba0$@gmail.com> This is not a question. Just a fairly short comment about changes that may impact some Python users. I have long used both R and python to do things but had to use different development environments. 
The company formerly called RSTUDIO has been increasingly supporting python as well as R and now has been renamed to POSIT, presumably by adding a P for Python and keeping some letter from STudIO:

https://www.r-bloggers.com/2022/07/posit-why-rstudio-is-changing-its-name/?utm_source=phpList&utm_medium=email&utm_campaign=R-bloggers-daily&utm_content=HTML

I mention this as I have already been doing some python work in RSTUDIO as well as anaconda and for small bits in IDLE and even in a CYGWIN environment and my machine is a tad confused at the multiple downloads of various versions of python.

Also, I have been using tools to make live documents that run code and interleave it with text and the new company also supports a sort of vastly improved and merged version of a product that will now also work with python and other languages called QUARTO that some might be interested in.

https://www.r-bloggers.com/2022/07/announcing-quarto-a-new-scientific-and-technical-publishing-system/?utm_source=phpList&utm_medium=email&utm_campaign=R-bloggers-daily&utm_content=HTML

I have no personal connection with the company except as a happy user for many years who has been interested much more broadly than their initial product and will happily use the abilities they provide that let me mix and match what I do in an assortment of languages. Time for me to revisit Julia and Javascript that are now supported.

I am NOT saying there is anything wrong with python, just a new option on how to work with python in a nice well-considered GUI that many already have found very useful.

In many fields of use, many programmers and projects often choose among various programming languages and environments so you often end up having to be, in a sense, multilingual and multicultural. So it can be nice to work toward an environment where many people can be comfortable and even work together while remaining somewhat unique.
The above is an example I have used to write documents that incorporate functionality as in use R to read in a file, convert it and save another file while producing some statistics in the text, then in the same document, have a snippet of python open that output file and do more and show it in the same document, as an example.

From sjeik_appie at hotmail.com Sun Jul 31 12:19:42 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sun, 31 Jul 2022 18:19:42 +0200
Subject: [Tutor] Unittest question
Message-ID:

Hi,

I am trying to run the same unittests against two different implementations of a module. Both use C functions in a dll, but one uses ctypes and the other that I just wrote uses CFFI. How can I do this nicely? I don't understand why the code below won't run both flavours of tests. It says "Ran 1 test", I expected two! Any ideas?

Thanks!
Albert-Jan

import unittest

class MyTest(unittest.TestCase):

    def __init__(self, *args, **kwargs):
        self.implementation = kwargs.pop("implementation", None)
        if self.implementation == "CFFI":
            import my_module_CFFI as xs
        else:
            import my_module as xs
        super().__init__(*args, **kwargs)

    def test_impl(self):
        print(self.implementation)
        self.assertEqual(self.implementation, "")

if __name__ == "__main__":
    suite = unittest.TestSuite()
    suite.addTests(MyTest(implementation="CFFI"))
    suite.addTests(MyTest(implementation="ctypes"))
    runner = unittest.TextTestRunner(verbosity=3)
    runner.run(suite)

###Output:
AssertionError: None != ''
-------------------- >> begin captured stdout << ---------------------
None

--------------------- >> end captured stdout << ----------------------
----------------------------------------------------------------------

From sjeik_appie at hotmail.com Sun Jul 31 16:53:33 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sun, 31 Jul 2022 22:53:33 +0200
Subject: [Tutor] Unittest question
In-Reply-To:
Message-ID:

On Jul 31, 2022 18:19, Albert-Jan Roskam wrote:

    suite.addTests(MyTest(implementation="CFFI"))
    suite.addTests(MyTest(implementation="ctypes"))

Hmm, maybe addTest and not addTests. I found this SO post with a very similar approach. Will try this tomorrow.
https://stackoverflow.com/questions/32899/how-do-you-generate-dynamic-parameterized-unit-tests-in-python

And unittest has "subtests", which may also work

From sjeik_appie at hotmail.com Sun Jul 31 17:02:20 2022
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sun, 31 Jul 2022 23:02:20 +0200
Subject: [Tutor] Unittest question
In-Reply-To:
Message-ID:

Me likey :-)
https://bugs.python.org/msg151444
Looks like a clean approach

On Jul 31, 2022 22:53, Albert-Jan Roskam wrote:

On Jul 31, 2022 18:19, Albert-Jan Roskam wrote:

    suite.addTests(MyTest(implementation="CFFI"))
    suite.addTests(MyTest(implementation="ctypes"))

Hmm, maybe addTest and not addTests. I found this SO post with a very similar approach. Will try this tomorrow.
https://stackoverflow.com/questions/32899/how-do-you-generate-dynamic-parameterized-unit-tests-in-python

And unittest has "subtests", which may also work
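[Editor's sketch] One common way to run the same tests against two implementations is to put the shared tests in a mixin class and create one small TestCase subclass per implementation. This is not necessarily what the linked posts propose. The two SimpleNamespace objects below are stand-ins for my_module and my_module_CFFI, and the add function is invented purely so the example can run.

```python
import operator
import types
import unittest

# Stand-ins for the two real modules (my_module / my_module_CFFI in the post).
impl_ctypes = types.SimpleNamespace(add=lambda a, b: a + b)
impl_cffi = types.SimpleNamespace(add=operator.add)


class ImplTestsMixin:
    """Tests shared by every implementation; subclasses set `xs`."""
    xs = None

    def test_add(self):
        self.assertEqual(self.xs.add(2, 3), 5)


class CtypesTests(ImplTestsMixin, unittest.TestCase):
    xs = impl_ctypes


class CffiTests(ImplTestsMixin, unittest.TestCase):
    xs = impl_cffi


def run_all():
    """Load both test classes into one suite and run it."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (CtypesTests, CffiTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=2).run(suite)


result = run_all()
print(result.testsRun)
```

Because the mixin itself does not inherit from unittest.TestCase, it is not collected as a test on its own; only the two subclasses are. The choice of implementation lives in a class attribute rather than in an __init__ keyword, which matters because loaders and runners construct TestCase instances themselves and pass only the method name.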