From __peter__ at web.de Tue Jun 1 01:50:23 2021
From: __peter__ at web.de (Peter Otten)
Date: Tue, 1 Jun 2021 07:50:23 +0200
Subject: [Tutor] product of all arguments passed to function call
In-Reply-To:
References:
Message-ID:

On 31/05/2021 12:19, Alan Gauld via Tutor wrote:

> def product(*x):
>     result = 1 if x else 0
>     for n in x:
>         result *= n
>     return result

The result of an empty product is usually the "multiplicative identity",
i. e. 1 in the majority of cases. If there were a result for the
no-arguments case I'd pick that.

Another problem with your implementation is that you are making
assumptions about the arguments, thus defeating Python's duck typing. I
admit the example is a bit farfetched, but anyway:

>>> class S(str):
        def __imul__(self, other):
            return S(f"{self}*{other.strip()}")

>>> def product(*x):
        result = 1 if x else 0
        for n in x:
            result *= n
        return result

>>> product(S("a"), "b ", " c ")
Traceback (most recent call last):
  File "", line 1, in
    product(S("a"), "b ", " c ")
  File "", line 4, in product
    result *= n
TypeError: can't multiply sequence by non-int of type 'str'

Whereas:

>>> def product(x, *y):
        for yy in y:
            x *= yy
        return x

>>> product(S("a"), "b ", " c ")
'a*b*c'

There is prior art (sum() and strings), I don't like that either ;)

From mats at wichmann.us Tue Jun 1 09:23:49 2021
From: mats at wichmann.us (Mats Wichmann)
Date: Tue, 1 Jun 2021 07:23:49 -0600
Subject: [Tutor] flatten a python list
In-Reply-To:
References:
Message-ID:

On 5/30/21 9:58 PM, Manprit Singh wrote:
> Dear sir,
>
> consider a problem to get a flat list for below given list :
> lst = [[2, 6], [5, 8], [6, 0]]
> the flatten list will be :
> ans = [2, 6, 5, 8, 6, 0]
>
> I have seen in several texts around the internet and even in the textbooks,
> the approach followed is to use nested for loop.
> ans = [] > for ele in lst: > for num in ele: > ans.append(num) > > instead of using this for loop if i write : > ans = [] > for ele in lst: > ans.extend(ele) > > Although the answer is correct, I need your comments on this approach. > > Just for the sake of healthy discussion, i am putting this question > Although i can do it in a very efficient manner with list comprehension as > given below: > ans = [num for ele in lst for num in ele] How about this for grins, should handle the "arbitrary layout" question Alan raised. def flatten(lst): for item in lst: if not isinstance(item, list): yield item else: yield from flatten(item) lst = [[2, 6], [5, 8], [6, 0]] ans = list(flatten(lst)) print(ans) [2, 6, 5, 8, 6, 0] # you can leave out the list() if an iterator is ok for your use This has some flaws - but the other approaches do, too. e.g. what if some item is a tuple, or a string is passed... In a lot of cases this probably isn't what you want: print(list(flatten("a string"))) ['a', ' ', 's', 't', 'r', 'i', 'n', 'g'] in other words, as a general purpose approach you'd want to do some data validation. From theoldny at gmail.com Tue Jun 1 08:21:38 2021 From: theoldny at gmail.com (Nick Yim) Date: Tue, 1 Jun 2021 19:21:38 +0700 Subject: [Tutor] Question Regarding Quiz Material Message-ID: Hello, My question is regarding the script below. The program executes fine and gives the desired result, so I guess my question might have more to do with the math in this problem. I am wondering why the numbers - n = 3, n = 36, n 102 - are not included in the list of divisors? As well, why is the number 1 included as a divisor of 3 and 36, but not as a divisor of 102? Please advise! Best, Nick def sum_divisors(n): x = 1 #trying all the numbers from 1, because you cannot divide by 0 sum = 0 #adding to 0 while n!=0 and x References: Message-ID: On 01/06/2021 13:21, Nick Yim wrote: > I am wondering why the numbers - n = 3, n = 36, n 102 - are not included in > the list of divisors? 
> > As well, why is the number 1 included as a divisor of 3 and 36, but not as > a divisor of 102? I don't see those issues in the code below? Please post in plain text, otherwise the mail system strips all indentation as below... > def sum_divisors(n): > x = 1 > #trying all the numbers from 1, because you cannot divide by 0 for x in range(1,n+1): > sum = 0 > #adding to 0 > while n!=0 and x #error because the answer does not include the number itself The for loop would handle that too. > if n%x == 0: > sum += x > x +=1 And the for loop does that too. > return sum Or you could do it all with a generator expression: def sum_divisors(n): return sum(x for x in range(1,n+1) if n%x==0) -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From marc.tompkins at gmail.com Tue Jun 1 10:46:41 2021 From: marc.tompkins at gmail.com (Marc Tompkins) Date: Tue, 1 Jun 2021 07:46:41 -0700 Subject: [Tutor] Question Regarding Quiz Material In-Reply-To: References: Message-ID: On Tue, Jun 1, 2021 at 7:21 AM Nick Yim wrote: > I am wondering why the numbers - n = 3, n = 36, n 102 - are not included in > the list of divisors? > > This line: while n!=0 and x As well, why is the number 1 included as a divisor of 3 and 36, but not as > a divisor of 102? > It _is_ included as a divisor of 102; I think your addition is off. 
A small modification to your script will make things a bit clearer:

def sum_divisors(n):
    x = 1
    #trying all the numbers from 1, because you cannot divide by 0
    sum = 0
    divisors = []
    #adding to 0
    while n!=0 and x
>

From PyTutor at DancesWithMice.info Tue Jun 1 21:35:36 2021
From: PyTutor at DancesWithMice.info (dn)
Date: Wed, 2 Jun 2021 13:35:36 +1200
Subject: [Tutor] Was: flatten a python list
In-Reply-To:
References:
Message-ID: <8b743908-3927-c9cc-f7bc-746bee924c5f@DancesWithMice.info>

On 02/06/2021 01.23, Mats Wichmann wrote:
> On 5/30/21 9:58 PM, Manprit Singh wrote:
>> consider a problem to get a flat list for below given list :
...
>
> How about this for grins, should handle the "arbitrary layout" question
> Alan raised.
>
> def flatten(lst):
>     for item in lst:
>         if not isinstance(item, list):
>             yield item
>         else:
>             yield from flatten(item)
>
> lst = [[2, 6], [5, 8], [6, 0]]
> ans = list(flatten(lst))
> print(ans)
> [2, 6, 5, 8, 6, 0]
> # you can leave out the list() if an iterator is ok for your use
>
> This has some flaws - but the other approaches do, too.  e.g. what if
> some item is a tuple, or a string is passed...  In a lot of cases this
> probably isn't what you want:
>
> print(list(flatten("a string")))
> ['a', ' ', 's', 't', 'r', 'i', 'n', 'g']
>
> in other words, as a general purpose approach you'd want to do some data
> validation.

What happens if we broaden the original spec from "list" to strings (as
above), and then continue into tuples, dictionary (keys or values),
classes, etc?

Is there a generic Python function which advises if a value is atomic or
some form of container?
(yes, could check for __contains__ method, but...)

Thus (from above code):

    if value is atomic then copy to output else dive into container...
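
One hedged way to sketch the "atomic or container" test asked about above:
treat str and bytes as atomic by fiat, and treat anything else that supports
iteration as a container. The `atomic` parameter and the use of
`collections.abc.Iterable` are assumptions of this sketch, not something
proposed in the thread; objects that are iterable only through
`__getitem__` are not recognised by this check.

```python
from collections.abc import Iterable


def flatten(value, atomic=(str, bytes)):
    """Yield the leaves of an arbitrarily nested iterable.

    Anything that is an instance of `atomic`, or that is not iterable
    at all, is treated as a leaf and yielded unchanged.
    """
    if isinstance(value, atomic) or not isinstance(value, Iterable):
        yield value
    else:
        for item in value:
            yield from flatten(item, atomic)


# Lists and tuples are entered, strings are kept whole,
# and a dict contributes its keys (that is what iterating a dict gives).
print(list(flatten([[2, 6], (5, [8]), "ab", {6: 0}])))
# [2, 6, 5, 8, 'ab', 6]
```

Whether a dict should contribute keys, values or items is a design decision
the caller has to make; this sketch simply takes whatever plain iteration
yields.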
--
Regards,
=dn

From cs at cskk.id.au Tue Jun 1 22:00:07 2021
From: cs at cskk.id.au (Cameron Simpson)
Date: Wed, 2 Jun 2021 12:00:07 +1000
Subject: [Tutor] Was: flatten a python list
In-Reply-To: <8b743908-3927-c9cc-f7bc-746bee924c5f@DancesWithMice.info>
References: <8b743908-3927-c9cc-f7bc-746bee924c5f@DancesWithMice.info>
Message-ID:

On 02Jun2021 13:35, DL Neil wrote:
>What happens if we broaden the original spec from "list" to strings (as
>above), and then continue into tuples, dictionary (keys or values),
>classes, etc?
>
>Is there a generic Python function which advises if a value is atomic or
>some form of container?
>(yes, could check for __contains__ method, but...)

I go with:

    should_iterate = False
    if not isinstance(obj, str):  # known special case :-(
        try:
            it = iter(obj)
        except TypeError:
            pass
        else:
            should_iterate = True

Usually you have better knowledge about what's going on, or the function
spec mandates being given a sensible iterable etc.

You can do the same for sequences by trying obj[None] or something.

Hopefully these tests have no significant side effects, which is why
ideally you know more (i.e. your callers can be expected to give you the
right kind of thing), or mandate more (i.e. your callers are buggy if
they give you the wrong thing).

Not so easy with generic stuff. Like list flatten in this example.

For an example where you are doing a genericish flatten but still expect
sensible inputs (but, still accepting str!), here's an example of mine:

https://hg.sr.ht/~cameron-simpson/css/browse/lib/python/cs/binary.py#L103

which flattens the return from a "transcribe" method common to a suite
of subclasses. For ease of implementation the transcribe methods can
return a variety of convenient to the implementor things, including
being generators yielding things.
You can see it does a getattr along the lines of your __contains__
check, but is otherwise rather opinionated: None, bytes, memoryview, str
are specially handled but otherwise we assume we've got an iterable. The
return from flatten() itself is an iterable - a flat iterable of byteses.

Cheers,
Cameron Simpson

From marc.tompkins at gmail.com Tue Jun 1 23:44:24 2021
From: marc.tompkins at gmail.com (Marc Tompkins)
Date: Tue, 1 Jun 2021 20:44:24 -0700
Subject: [Tutor] Question Regarding Quiz Material
In-Reply-To:
References:
Message-ID:

On Tue, Jun 1, 2021 at 7:19 PM Nick Yim wrote:

> Hey!
>
> Thanks a lot Marc, this clarifies
>

Glad to help. Sorry about the discrepancy, by the way - I called the
list "divvs" in my scratch code, decided that "divisors" was better for
the email, but only changed the first instance. My bad, but it looks
like you got past that.

From theoldny at gmail.com Tue Jun 1 22:19:05 2021
From: theoldny at gmail.com (Nick Yim)
Date: Wed, 2 Jun 2021 09:19:05 +0700
Subject: [Tutor] Question Regarding Quiz Material
In-Reply-To:
References:
Message-ID:

Hey!

Thanks a lot Marc, this clarifies.

Best,

Nick Y.

On Tue, Jun 1, 2021 at 9:46 PM Marc Tompkins wrote:

> On Tue, Jun 1, 2021 at 7:21 AM Nick Yim wrote:
>
>> I am wondering why the numbers - n = 3, n = 36, n 102 - are not included
>> in
>> the list of divisors?
>>
>> This line:
> while n!=0 and x specifies that x (the divisor you're checking) MUST be smaller than your
> original number. If you want to include the original number as its own
> divisor, change it to
> while n!=0 and x<=n:
>
>> As well, why is the number 1 included as a divisor of 3 and 36, but not as
>> a divisor of 102?
>>
>
> It _is_ included as a divisor of 102; I think your addition is off.
> > A small modification to your script will make things a bit clearer: > > def sum_divisors(n): > x = 1 > #trying all the numbers from 1, because you cannot divide by 0 > sum = 0 > divisors = [] > #adding to 0 > while n!=0 and x #conditions for n to have divisors (not be 0) and for x to be a > potential divisor > #error because the answer does not include the number itself > if n%x == 0: > sum += x > divvs.append(x) > x +=1 > return (n, divisors, sum) > > print(sum_divisors(0)) > # 0 > print(sum_divisors(3)) # Should sum of 1 > # 1 > print(sum_divisors(36)) # Should sum of 1+2+3+4+6+9+12+18 > # 55 > print(sum_divisors(102)) # Should be sum of 2+3+6+17+34+51 > # 114 > >> >> From uptown3000 at hotmail.com Wed Jun 2 03:33:37 2021 From: uptown3000 at hotmail.com (R M) Date: Wed, 2 Jun 2021 07:33:37 +0000 Subject: [Tutor] Help Message-ID: I have this code and want to change it to stop calculating percentage. Let it just display the value what do I need to do? @staticmethod def get_default_loan_processing_fee(): config, _ = SiteConfig.objects.get_or_create(conf_key=constants.SITE_CONFIG_KEY_LOAN_PROCESSING_FEES, defaults={"value": '0', 'type': SiteConfig.TYPE_INT}) return config @staticmethod def get_default_loan_processing_fee_value(): site_config = SiteConfig.get_default_loan_processing_fee() return decimal(site_config.value) @staticmethod def set_default_loan_processing_fee_value(value): config, _ = SiteConfig.objects.update_or_create(conf_key=constants.SITE_CONFIG_KEY_LOAN_PROCESSING_FEES, defaults={"value": '%s' % value}) return config Sent from Mail for Windows 10 From alan.gauld at yahoo.co.uk Wed Jun 2 05:25:48 2021 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Wed, 2 Jun 2021 10:25:48 +0100 Subject: [Tutor] Help In-Reply-To: References: Message-ID: On 02/06/2021 08:33, R M wrote: > I have this code and want to change it to stop calculating percentage. You need to give us more information and a more precise description. 
In the code below nothing is displaying anything, let alone a
percentage. Most of the work is being done by SiteConfig methods which
we can't see, unless these are the SiteConfig methods? It's not obvious.
And we certainly don't know anything about the objects attribute and its
get_or_create() method.

Which method do you think is returning a percentage?

You can usually convert a percentage to a value
by multiplying by 100.

> Let it just display the value what do I need to do?
>
> @staticmethod
> def get_default_loan_processing_fee():
>     config, _ = SiteConfig.objects.get_or_create(conf_key=constants.SITE_CONFIG_KEY_LOAN_PROCESSING_FEES,
>                                                  defaults={"value": '0', 'type': SiteConfig.TYPE_INT})
>
>     return config
>
> @staticmethod
> def get_default_loan_processing_fee_value():
>     site_config = SiteConfig.get_default_loan_processing_fee()
>     return decimal(site_config.value)
>
> @staticmethod
> def set_default_loan_processing_fee_value(value):
>     config, _ = SiteConfig.objects.update_or_create(conf_key=constants.SITE_CONFIG_KEY_LOAN_PROCESSING_FEES,
>                                                     defaults={"value": '%s' % value})
>     return config

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From alan.gauld at yahoo.co.uk Wed Jun 2 13:46:02 2021
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Wed, 2 Jun 2021 18:46:02 +0100
Subject: [Tutor] Help
In-Reply-To: <40dfbg1eee4fotlg90m57e5jc7h35p728e@4ax.com>
References: <40dfbg1eee4fotlg90m57e5jc7h35p728e@4ax.com>
Message-ID:

On 02/06/2021 17:42, Dennis Lee Bieber wrote:
> On Wed, 2 Jun 2021 10:25:48 +0100, Alan Gauld via Tutor
> declaimed the following:
>
>>
>> You can usually convert a percentage to a value
>> by multiplying by 100.
>>
>
> Pardon? Did you mean /dividing/?

Oops! Yes indeed.
:-) -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From manpritsinghece at gmail.com Sat Jun 5 09:09:19 2021 From: manpritsinghece at gmail.com (Manprit Singh) Date: Sat, 5 Jun 2021 18:39:19 +0530 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint Message-ID: Dear sir, Consider a list : ls = ["amba", "Joy", "Preet"] The elements of the list are mutually disjoint, as no two elements of the list have anything in common. I have written the code but it seems quite unreadable. First of all need to know if it is correct or not, if it is correct , I need to know if this can be done in a better way or not. Kindly have a look at below given code : ls = ["amba", "Joy", "Preet"] for idx, ele in enumerate(ls): if idx < len(ls)-1: if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]): print("Not mutually disjoint") break else: print("Mutually disjoint") Regards Manprit Singh From __peter__ at web.de Sat Jun 5 10:58:58 2021 From: __peter__ at web.de (Peter Otten) Date: Sat, 5 Jun 2021 16:58:58 +0200 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: On 05/06/2021 15:09, Manprit Singh wrote: > Dear sir, > Consider a list : > ls = ["amba", "Joy", "Preet"] > > The elements of the list are mutually disjoint, as no two elements of the > list have anything in common. > > First of all need > to know if it is correct or not, if it is correct , I'm sure you can find that out yourself, or get some confidence at least. First of all you have to tighten your definition; how does "having anything in common" translate into code? Then turn your algorithm into a function, preferably one with a result rather than print() calls. 
Then write a few tests (unittest or doctest) that call your function
with various arguments that cover what you expect your function to
handle.

Example:

def disjoint_words(words):
    """
    >>> disjoint_words(["amba", "Joy", "Preet"])
    True
    """
    return True

> I have written the code but it seems quite unreadable.

I'd say that my alternative is *very* readable -- but also wrong. Write
test cases to bring that to light, then fix the function.

Let's assume you do this by reusing the code you posted. The function
may now be uglier, but hopefully correct. The print() calls are gone.
You can be confident that the code only depends on the 'words' argument
and has no hidden interactions with the rest of your script.

That's progress ;)

Because you have tests in place you can now refactor without fear. If a
test continues to fail just go back to your previous working version of
the function. Hooray to version control ;)

What would be possible approaches to improve your code?

The low-hanging fruit: factor out the two nested for loops (I'd look
into itertools for that).

A bit more demanding: see what you can do about the algorithm. Is there
a chance to replace O(something) with O(something else)?

I need to know if this
> can be done in a better way or not.
Yes ;) Kindly have a look at below given code : > > ls = ["amba", "Joy", "Preet"] > for idx, ele in enumerate(ls): > if idx < len(ls)-1: > if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]): > print("Not mutually disjoint") > break > else: > print("Mutually disjoint") > > Regards > Manprit Singh > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From alan.gauld at yahoo.co.uk Sat Jun 5 14:06:07 2021 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Sat, 5 Jun 2021 19:06:07 +0100 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: On 05/06/2021 14:09, Manprit Singh wrote: > Dear sir, > Consider a list : > ls = ["amba", "Joy", "Preet"] > > The elements of the list are mutually disjoint, as no two elements of the > list have anything in common. Ok, That is a very specific definition. As such you should put it in a predicate function ( ie. a test that returns a boolean result) You can then compare two items for success/failure of your test. > I have written the code but it seems quite unreadable. First of all need > to know if it is correct or not, There are several meanings of correct in programming: 1) Does it perform the function it is supposed to perform? 2) Does it adhere to community/project coding standards 3) Is it code that could actually be used in production (error handling, edge cases etc) Perhaps better described as "finished". Assuming you mean (1). How will you determine that? You can approach it in two ways: a) prove the code is correct using formal methods b) prove the code is correct by comprehensive testing. I'd recommend the second approach unless you have done a course on formal code proving... So can you define a set of tests and test data that will cover all the possible variations(not all cases, that could be infinite!) 
but all types of scenario. Include bad data, and edge cases as well as
pass and fail scenarios. Put your code into a function and write the
tests to check the function. If all tests give the expected result you
have a high likelihood of it being correct.

> can be done in a better way or not.

When you move into solving very specific problems 'better' becomes a
relative term. There are unlikely to be definitive answers. Best will
need to take account of the purpose of the code (library, one-off test,
part of a bigger suite etc) as well as non-functional constraints such
as run-time, space requirements, maintainability, logging/debugging
needs etc.

> ls = ["amba", "Joy", "Preet"]
> for idx, ele in enumerate(ls):
>     if idx < len(ls)-1:
>         if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]):

This is where a predicate helper function would clarify things a lot:

for idx, word in enumerate(ls):
    if isDisjoint(word, ls[idx+1:]):

And the function looks something like (untested!)

def isDisjoint(aWord, aCollection):
    for other in aCollection:
        if not set(other).isdisjoint(set(aWord)):
            return False
    else:
        return True

That is completely untested but I think it should work and seems more
readable to me. It's probably not as fast, but you didn't specify any
speed constraints. You may not need all of the explicit set()
conversions.

Now you need to test it... I notice Peter has now responded with ideas
on how to do that.
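
A runnable sketch of the predicate-helper idea above, together with a few of
the pass, fail and edge-case checks the testing advice calls for. The
function names `is_disjoint_from_all` and `mutually_disjoint` are
illustrative choices for this sketch, not names used in the thread, and
"nothing in common" is assumed to mean "no shared character"
(case-sensitive).

```python
def is_disjoint_from_all(word, others):
    """Return True if `word` shares no character with any string in `others`."""
    letters = set(word)
    return all(letters.isdisjoint(other) for other in others)


def mutually_disjoint(words):
    """Return True if no two strings in the list `words` share a character."""
    return all(
        is_disjoint_from_all(word, words[idx + 1:])
        for idx, word in enumerate(words)
    )


# Pass, fail and edge cases.
assert mutually_disjoint(["amba", "Joy", "Preet"])
assert not mutually_disjoint(["amar", "sit", "xyz", "kla"])  # 'a' is shared
assert mutually_disjoint([])        # no pairs at all
assert mutually_disjoint(["solo"])  # a single word has nothing to clash with
assert not mutually_disjoint(["", "ab", "bc"])  # 'b' is shared
```

The helper compares each word only with the words after it, so every pair is
examined once; `set.isdisjoint()` accepts any iterable, which is why only
one side needs an explicit `set()`.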
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From roel at roelschroeven.net Sat Jun 5 13:28:03 2021
From: roel at roelschroeven.net (Roel Schroeven)
Date: Sat, 5 Jun 2021 19:28:03 +0200
Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint
In-Reply-To:
References:
Message-ID:

Manprit Singh wrote on 5/06/2021 at 15:09:
> Dear sir,
> Consider a list :
> ls = ["amba", "Joy", "Preet"]
>
> The elements of the list are mutually disjoint, as no two elements of the
> list have anything in common.
>
> I have written the code but it seems quite unreadable. First of all need
> to know if it is correct or not, if it is correct , I need to know if this
> can be done in a better way or not. Kindly have a look at below given code :
>
> ls = ["amba", "Joy", "Preet"]
> for idx, ele in enumerate(ls):
>     if idx < len(ls)-1:
>         if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]):
>             print("Not mutually disjoint")
>             break
> else:
>     print("Mutually disjoint")
>

Looks correct to me, at first sight. As Peter Otten mentioned, the best
way to find out is to construct a number of test cases and a program to
run those for you. Maybe have a look at a unit testing framework.

I'm a big fan of putting code like that in a function. Then you can
reason about what the function does independently from the rest of the
program.

I also think it's a good idea to split off the code to generate all the
pairs of words in a separate function. Luckily there already is a
function that does just that: itertools.combinations().

Lastly I think it's better to create the sets (that you eventually use
with isdisjoint()) only once, instead of every time again. That does
imply more memory usage since all sets are in memory at the same time,
so maybe there are situations where it's better not to do this.
This is what I came up with:

from itertools import combinations

ls = ["amba", "Joy", "Preet"]

def are_elements_disjoint(seq):
    sets = (set(elem) for elem in seq)
    pairs = combinations(sets, 2)
    return all(a.isdisjoint(b) for (a, b) in pairs)

if are_elements_disjoint(ls):
    print('Mutually disjoint')
else:
    print('Not mutually disjoint')

--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
        -- Franklin P. Jones

Roel Schroeven

From robertvstepp at gmail.com Sat Jun 5 21:35:50 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sat, 5 Jun 2021 20:35:50 -0500
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To:
References:
Message-ID:

As is often the case Mr. Singh's query got me to play around in the
interpreter.

On Sat, Jun 5, 2021 at 8:10 AM Manprit Singh wrote:

> ls = ["amba", "Joy", "Preet"]
> for idx, ele in enumerate(ls):
>     if idx < len(ls)-1:
>         if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]):
>             print("Not mutually disjoint")
>             break

[...]

I have not played around with set's methods and operators to date, so
while trying to understand this code I tried out different things in
the interpreter. Along the way I tried something and it surprised me:

Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ls = ["amba", "Joy", "Preet"]
>>> z = set(ls).add('a')
>>> z
>>> print(z)
None    # This surprised me. I was expecting {'amba', 'Joy', 'Preet', 'a'}.

The "normal" way of using the add() method works fine:

>>> z = set(ls)
>>> z
{'amba', 'Joy', 'Preet'}
>>> z.add('a')
>>> z
{'amba', 'Joy', 'Preet', 'a'}

But the following works which is in a similar chained format:

>>> zz = set(ls).union('a')
>>> zz
{'amba', 'Joy', 'Preet', 'a'}

# BTW, please forgive bad identifier naming!
So I am *not* understanding how "set(ls).add('a')" is evaluated to
result in "None".

TIA!
boB Stepp

From bouncingcats at gmail.com Sat Jun 5 21:55:53 2021
From: bouncingcats at gmail.com (David)
Date: Sun, 6 Jun 2021 11:55:53 +1000
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To:
References:
Message-ID:

On Sun, 6 Jun 2021 at 11:37, boB Stepp wrote:

> I have not played around with set's methods and operators to date, so
> while trying to understand this code I tried out different things in
> the interpreter. Along the way I tried something and it surprised me:
>
> Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928
> 64 bit (AMD64)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> ls = ["amba", "Joy", "Preet"]
> >>> z = set(ls).add('a')
> >>> z
> >>> print(z)
> None # This surprised me. I was expecting {'amba', 'Joy',
> 'Preet', 'a'}.

The set() object has an add() method that modifies
its object and returns None.

It is common style for methods that modify their objects
to return None. The intent of this is to remind users that such
methods act by side effect on an existing object, not by
creating a new object.

From robertvstepp at gmail.com Sat Jun 5 22:08:12 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sat, 5 Jun 2021 21:08:12 -0500
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To:
References:
Message-ID:

On Sat, Jun 5, 2021 at 8:57 PM David wrote:
>
> On Sun, 6 Jun 2021 at 11:37, boB Stepp wrote:
>
> > I have not played around with set's methods and operators to date, so
> > while trying to understand this code I tried out different things in
> > the interpreter.
Along the way I tried something and it surprised me: > > > > Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 > > 64 bit (AMD64)] on win32 > > Type "help", "copyright", "credits" or "license" for more information. > > >>> ls = ["amba", "Joy", "Preet"] > > >>> z = set(ls).add('a') > > >>> z > > >>> print(z) > > None # This surprised me. I was expecting {'amba', 'Joy', > > 'Preet', 'a'}. > > The set() object has an add() method that modifies > its object and returns None. I understand this. But for "set(ls).add('a')" I am thinking the following sequence of events occur: 1) "set(ls)" creates the set-type object "{'amba', 'Joy', 'Preet'}. 2) ".add('a')" method is called on this set object, resulting in the new set object "{'amba', 'Joy', 'Preet', 'a'} 3) "z" is now assigned to point to this resultant object. But instead "z" refers to "None". OTOH, the very similar sequence of events *does* appear to happen for: >>> zz = set(ls).union('a') >>> zz {'amba', 'Joy', 'Preet', 'a'} Why are these different in their results? boB Stepp From robertvstepp at gmail.com Sat Jun 5 22:19:18 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Sat, 5 Jun 2021 21:19:18 -0500 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: Message-ID: Ah, I finally see the light, dn! On Sat, Jun 5, 2021 at 9:08 PM boB Stepp wrote: > > On Sat, Jun 5, 2021 at 8:57 PM David wrote: > > > > On Sun, 6 Jun 2021 at 11:37, boB Stepp wrote: > > > > > I have not played around with set's methods and operators to date, so > > > while trying to understand this code I tried out different things in > > > the interpreter. Along the way I tried something and it surprised me: > > > > > > Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 > > > 64 bit (AMD64)] on win32 > > > Type "help", "copyright", "credits" or "license" for more information. 
> > > >>> ls = ["amba", "Joy", "Preet"] > > > >>> z = set(ls).add('a') > > > >>> z > > > >>> print(z) > > > None # This surprised me. I was expecting {'amba', 'Joy', > > > 'Preet', 'a'}. > > > > The set() object has an add() method that modifies > > its object and returns None. > > I understand this. But for "set(ls).add('a')" I am thinking the > following sequence of events occur: > > 1) "set(ls)" creates the set-type object "{'amba', 'Joy', 'Preet'}. > 2) ".add('a')" method is called on this set object, resulting in the > new set object "{'amba', 'Joy', 'Preet', 'a'} > 3) "z" is now assigned to point to this resultant object. (3) is *not* correct. "z" is assigned to what the add() method is returning, dn's point. > >>> zz = set(ls).union('a') > >>> zz > {'amba', 'Joy', 'Preet', 'a'} > > Why are these different in their results? Here according to the docs the union() method: "Return a new set with elements from the set and all others." Caught up in something elementary once again. Sigh! boB Stepp From robertvstepp at gmail.com Sat Jun 5 22:32:25 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Sat, 5 Jun 2021 21:32:25 -0500 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: Message-ID: I guess that I will conclude with a Python gripe from this eternal learner and forgetter... On Sat, Jun 5, 2021 at 9:19 PM boB Stepp wrote: > > Why are these different in their results? # Referring to difference in return values for set add() and union() methods. [snip] > Caught up in something elementary once again. Sigh! I find it really hard to remember which functions and methods return "None" and operate by side effect with those that return a new object. This is especially difficult for me as I so often have long breaks between using/studying Python. 
I wish that Python had some sort of clever syntax that made it *obvious* which of the two possibilities will occur. ~(:>)) boB Stepp From manpritsinghece at gmail.com Sun Jun 6 04:19:30 2021 From: manpritsinghece at gmail.com (Manprit Singh) Date: Sun, 6 Jun 2021 13:49:30 +0530 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: Dear sir , Thanks to Roel . Finally today I have understood the importance of standard library modules, I work with mostly numbers when programming , and feel that itertools is very very useful. The solution that i do agree is as follows and seems readable to me too: import itertools def is_mutually_disjoint(arr): comb = itertools.combinations(lst, 2) return all(set(ele1).isdisjoint(ele2) for ele1, ele2 in comb) lst = ["amar", "sit", "xyz", "kla"] ans = is_mutually_disjoint(lst) print(ans) - gives o/p as False that means the elements of list lst are not mutually disjoint. (amar and kla have "a" as common character) and lst = ["amar", "sit", "xyz", "klg"] ans = is_mutually_disjoint(lst) print(ans) - gives True as now in the list there are no two elements which have anything in common, so the elements of list are mutually disjoint On Sat, Jun 5, 2021 at 6:39 PM Manprit Singh wrote: > Dear sir, > Consider a list : > ls = ["amba", "Joy", "Preet"] > > The elements of the list are mutually disjoint, as no two elements of the > list have anything in common. > > I have written the code but it seems quite unreadable. First of all need > to know if it is correct or not, if it is correct , I need to know if this > can be done in a better way or not. 
Kindly have a look at below given code : > > ls = ["amba", "Joy", "Preet"] > for idx, ele in enumerate(ls): > if idx < len(ls)-1: > if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]): > print("Not mutually disjoint") > break > else: > print("Mutually disjoint") > > Regards > Manprit Singh > From manpritsinghece at gmail.com Sun Jun 6 04:48:35 2021 From: manpritsinghece at gmail.com (Manprit Singh) Date: Sun, 6 Jun 2021 14:18:35 +0530 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: Dear sir , But now i have one more question : If you can see in the last mails : The function posted by Roel is : def are_elements_disjoint(seq): sets = (set(elem) for elem in seq) pairs = combinations(sets, 2) return all(a.isdisjoint(b) for (a, b) in pairs) If you notice he is making an iterator "sets" of all elements of the seq in the starting Now coming to the function made by me : import itertools def is_mutually_disjoint(arr): comb = itertools.combinations(lst, 2) return all(set(ele1).isdisjoint(ele2) for ele1, ele2 in comb) in my function i am making set inside all() function. So what is the right & efficient approach ? Regards Manprit Singh On Sat, Jun 5, 2021 at 6:39 PM Manprit Singh wrote: > Dear sir, > Consider a list : > ls = ["amba", "Joy", "Preet"] > > The elements of the list are mutually disjoint, as no two elements of the > list have anything in common. > > I have written the code but it seems quite unreadable. First of all need > to know if it is correct or not, if it is correct , I need to know if this > can be done in a better way or not. 
Kindly have a look at below given code : > > ls = ["amba", "Joy", "Preet"] > for idx, ele in enumerate(ls): > if idx < len(ls)-1: > if not all(set(ele).isdisjoint(ch)for ch in ls[idx+1:]): > print("Not mutually disjoint") > break > else: > print("Mutually disjoint") > > Regards > Manprit Singh > From __peter__ at web.de Sun Jun 6 04:51:41 2021 From: __peter__ at web.de (Peter Otten) Date: Sun, 6 Jun 2021 10:51:41 +0200 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: On 06/06/2021 10:19, Manprit Singh wrote: > Dear sir , > Thanks to Roel . Finally today I have understood the importance of standard > library modules, I work with mostly numbers when programming , and feel > that itertools is very very useful. > The solution that i do agree is as follows and seems readable to me too: > > import itertools > def is_mutually_disjoint(arr): > comb = itertools.combinations(lst, 2) > return all(set(ele1).isdisjoint(ele2) for ele1, ele2 in comb) Note that there's a bug in the above function (try running it with a list literal) giving you a chance to learn why global variables should be avoided ;) From PyTutor at DancesWithMice.info Sun Jun 6 04:53:09 2021 From: PyTutor at DancesWithMice.info (dn) Date: Sun, 6 Jun 2021 20:53:09 +1200 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: Message-ID: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> On 06/06/2021 14.19, boB Stepp wrote: > Ah, I finally see the light, dn! Regret to advise that such wisdom came from @David, not this-David?self.David aka "dn". 
That said:- > On Sat, Jun 5, 2021 at 9:08 PM boB Stepp wrote: >> On Sat, Jun 5, 2021 at 8:57 PM David wrote: >>> On Sun, 6 Jun 2021 at 11:37, boB Stepp wrote: >>> >>>> I have not played around with set's methods and operators to date, so >>>> while trying to understand this code I tried out different things in >>>> the interpreter. Along the way I tried something and it surprised me: >>>> >>>> Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 >>>> 64 bit (AMD64)] on win32 >>>> Type "help", "copyright", "credits" or "license" for more information. >>>>>>> ls = ["amba", "Joy", "Preet"] >>>>>>> z = set(ls).add('a') >>>>>>> z >>>>>>> print(z) >>>> None # This surprised me. I was expecting {'amba', 'Joy', >>>> 'Preet', 'a'}. >>> >>> The set() object has an add() method that modifies >>> its object and returns None. >> >> I understand this. But for "set(ls).add('a')" I am thinking the >> following sequence of events occur: >> >> 1) "set(ls)" creates the set-type object "{'amba', 'Joy', 'Preet'}. >> 2) ".add('a')" method is called on this set object, resulting in the >> new set object "{'amba', 'Joy', 'Preet', 'a'} >> 3) "z" is now assigned to point to this resultant object. > > (3) is *not* correct. "z" is assigned to what the add() method is > returning, dn's point. > >>>>> zz = set(ls).union('a') >>>>> zz >> {'amba', 'Joy', 'Preet', 'a'} >> >> Why are these different in their results? > > Here according to the docs the union() method: > > "Return a new set with elements from the set and all others." > > Caught up in something elementary once again. Sigh! and later:- > > I find it really hard to remember which functions and methods return > "None" and operate by side effect with those that return a new object. > This is especially difficult for me as I so often have long breaks > between using/studying Python. I wish that Python had some sort of > clever syntax that made it *obvious* which of the two possibilities > will occur. 
> > ~(:>)) It's "pythonic"! (hey, don't shoot me, I'm only the piano-player) When we work with objects, we can access their attributes using "dotted notation". (remember though: a method (function) is also an attribute) Thus, we are able to access a value directly, without using "getters", and to change its value without "setters". Because we are changing an attribute's value, the modified-value is the result of such. Thus, a return of None. (see also "Functional Programming" and the concept of "side effects") We need to 'tune in' to the documentation to settle such confusion (thereafter clarifying any doubt by experimenting with the REPL). In this case: Python 3.9.5 (default, May 14 2021, 00:00:00) [GCC 10.3.1 20210422 (Red Hat 10.3.1-1)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> help( set ) Help on class set in module builtins: class set(object) | set() -> new empty set object | set(iterable) -> new set object | | Build an unordered collection of unique elements. | | Methods defined here: | | __and__(self, value, /) | Return self&value. | | __contains__(...) | x.__contains__(y) <==> y in x. | ... | add(...) | Add an element to a set. | | This has no effect if the element is already present. ... | __len__(self, /) | Return len(self). ... | update(...) | Update a set with the union of itself and others. ... - where we can see that the set method "and" will return a new set (logical given the result will be neither self nor the other set), whereas set.add() will modify the contents of this existing set (stated more categorically in the explanation for set.update() ): s = set() s.add( 1 ) #no need for an assignment, ie LHS and RHS construct Whereas functions, (built-in or otherwise) work with any provided arguments, and return a result (in lieu of an explicit return-value, this will be None). if 1 in s: ... 
l = len( s ) # which is a generic function implemented as l = s.__len__() Indeed: s = set() is a factory-function which creates (and outputs) a new set - and thus also enables "s" to subsequently 'do stuff' with its attributes. Functions normally run in their own "frame" and thus have their own namespace. So, if a numerical value is named within a function, that name will not be available after the return. (watch-out for the mutable container exception!) Conversely, objects form their own namespace, which lasts for the life of the object. Thus, any change made within that namespace remains accessible as the particular attribute. Just to add to your gripe, there is (at least) one function which provides a result AND a (by-design) internal (to the set) "side-effect": s = { 1, 2 } what = s.pop() ...from help( set )... | pop(...) | Remove and return an arbitrary set element. | Raises KeyError if the set is empty. ... Note the "and"! What will be the value of "what"? What will be the value of s? Thus, as we "tune-in" to the documentation's idiom - words like "remove", "and", and "return" become especially meaningful; in-turn helping us to understand which operations will return a new object, and which will alter some attribute of our current subject. (or both...!) -- Regards, =dn From manpritsinghece at gmail.com Sun Jun 6 04:56:55 2021 From: manpritsinghece at gmail.com (Manprit Singh) Date: Sun, 6 Jun 2021 14:26:55 +0530 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: Dear sir , Sorry i have written lst in place of arr: correct function is import itertools def is_mutually_disjoint(arr): comb = itertools.combinations(arr, 2) return all(set(ele1).isdisjoint(ele2) for ele1, ele2 in comb) Kindly comment now ..... On Sun, Jun 6, 2021 at 2:21 PM Peter Otten <__peter__ at web.de> wrote: > On 06/06/2021 10:19, Manprit Singh wrote: > > Dear sir , > > Thanks to Roel . 
Finally today I have understood the importance of > standard > > library modules, I work with mostly numbers when programming , and feel > > that itertools is very very useful. > > The solution that i do agree is as follows and seems readable to me too: > > > > import itertools > > def is_mutually_disjoint(arr): > > comb = itertools.combinations(lst, 2) > > return all(set(ele1).isdisjoint(ele2) for ele1, ele2 in comb) > > Note that there's a bug in the above function (try running it with a > list literal) giving you a chance to learn why global variables should > be avoided ;) > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From __peter__ at web.de Sun Jun 6 05:20:22 2021 From: __peter__ at web.de (Peter Otten) Date: Sun, 6 Jun 2021 11:20:22 +0200 Subject: [Tutor] A program that can check if all elements of the list are mutually disjoint In-Reply-To: References: Message-ID: On 06/06/2021 10:48, Manprit Singh wrote: > Dear sir , > > But now i have one more question : > If you can see in the last mails : > The function posted by Roel is : > def are_elements_disjoint(seq): > sets = (set(elem) for elem in seq) > pairs = combinations(sets, 2) > return all(a.isdisjoint(b) for (a, b) in pairs) > > If you notice he is making an iterator "sets" of all elements of the seq in > the starting > > Now coming to the function made by me : > > import itertools > def is_mutually_disjoint(arr): > comb = itertools.combinations(lst, 2) > return all(set(ele1).isdisjoint(ele2) for ele1, ele2 in comb) > > in my function i am making set inside all() function. So what is the right > & efficient approach ? set_a.isdisjoint(set_b) is probably a tad faster than set_a.isdisjoint(string_b), so it makes sense to convert the strings before doing the check, especially since the set is needed anyway. 
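A minimal sketch of the convert-first variant described above, assuming the elements are strings (or other iterables of hashable items). The name all_disjoint is made up for illustration and is not from the thread. itertools.combinations copies its input into a tuple internally, so each set is built only once whether a list or a generator is passed in:

```python
from itertools import combinations

def all_disjoint(seq):
    """Return True if no two elements of seq share a member."""
    # Build every set once, up front, instead of inside the pair loop.
    sets = [set(elem) for elem in seq]
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))

print(all_disjoint(["amar", "sit", "xyz", "kla"]))  # False: "amar" and "kla" share "a"
print(all_disjoint(["amar", "sit", "xyz", "klg"]))  # True
```

Because the function only uses its own parameter, it also works when called with a list literal.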
In the following pathological case the difference is three orders of
magnitude:

PS C:\> py -m timeit -s "s = set('a'); t = 'b'*10000" "s.isdisjoint(t)"
500 loops, best of 5: 719 usec per loop

PS C:\> py -m timeit -s "s = set('a'); t = set('b'*10000)" "s.isdisjoint(t)"
1000000 loops, best of 5: 305 nsec per loop

From learn2program at gmail.com  Sun Jun  6 07:06:37 2021
From: learn2program at gmail.com (Alan Gauld)
Date: Sun, 6 Jun 2021 12:06:37 +0100
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To:
References:
Message-ID: <90262af7-7941-4913-12fd-b9b35ee98543@yahoo.co.uk>

On 06/06/2021 03:32, boB Stepp wrote:
>
> I find it really hard to remember which functions and methods return
> "None" and operate by side effect with those that return a new object.

You are not alone. Coming from Smalltalk, where the default return value
of methods is self, I find Python's default of None to be quite
frustrating at times. As to when it happens, the general rule is that if
it is a mutable object that is being modified, the change happens
in-place and None is returned. If it is an immutable object (string,
tuple) then the "change" is returned as a new object (since the original
cannot be changed!)

But the Smalltalk style of returning self for mutables and the new
object for immutables leads to a much more consistent coding style
(especially important when dealing with heterogeneous collections of
objects!) and also allows chaining.

But every language has its foibles and None returns is one of Python's.
You just have to think about whether the object being changed is mutable
or not.
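The rule of thumb above can be tried out in a few lines. This is only an illustrative sketch using built-in types; the variable names are arbitrary:

```python
# Mutable object: the method changes it in place and returns None.
numbers = [1, 2]
result = numbers.append(3)
print(result)    # None
print(numbers)   # [1, 2, 3]

# Immutable object: the "change" comes back as a new object.
name = "tutor"
shouted = name.upper()
print(shouted)   # TUTOR
print(name)      # tutor

# Since add() returns None, bind the set to a name first, then modify it.
z = set(["amba", "Joy", "Preet"])
z.add("a")
print(sorted(z))  # ['Joy', 'Preet', 'a', 'amba']
```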
-- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From robertvstepp at gmail.com Sun Jun 6 13:22:12 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Sun, 6 Jun 2021 12:22:12 -0500 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> Message-ID: On Sun, Jun 6, 2021 at 3:53 AM dn via Tutor wrote: > > On 06/06/2021 14.19, boB Stepp wrote: > > Ah, I finally see the light, dn! > > Regret to advise that such wisdom came from @David, not > this-David?self.David aka "dn". My sincere apologies, David (*not* dn)! I cannot seem to keep the "bouncing cats" segregated from the " dancing (with) mice" in my head, not to mention the same first names! > That said:- > > > > On Sat, Jun 5, 2021 at 9:08 PM boB Stepp wrote: > >> On Sat, Jun 5, 2021 at 8:57 PM David wrote: > >>> On Sun, 6 Jun 2021 at 11:37, boB Stepp wrote: > >>> > >>>> I have not played around with set's methods and operators to date, so > >>>> while trying to understand this code I tried out different things in > >>>> the interpreter. Along the way I tried something and it surprised me: > >>>> > >>>> Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 > >>>> 64 bit (AMD64)] on win32 > >>>> Type "help", "copyright", "credits" or "license" for more information. > >>>>>>> ls = ["amba", "Joy", "Preet"] > >>>>>>> z = set(ls).add('a') > >>>>>>> z > >>>>>>> print(z) > >>>> None # This surprised me. I was expecting {'amba', 'Joy', > >>>> 'Preet', 'a'}. > >>> > >>> The set() object has an add() method that modifies > >>> its object and returns None. > >> > >> I understand this. 
But for "set(ls).add('a')" I am thinking the > >> following sequence of events occur: > >> > >> 1) "set(ls)" creates the set-type object "{'amba', 'Joy', 'Preet'}. > >> 2) ".add('a')" method is called on this set object, resulting in the > >> new set object "{'amba', 'Joy', 'Preet', 'a'} > >> 3) "z" is now assigned to point to this resultant object. > > > > (3) is *not* correct. "z" is assigned to what the add() method is > > returning, dn's point. > > > >>>>> zz = set(ls).union('a') > >>>>> zz > >> {'amba', 'Joy', 'Preet', 'a'} > >> > >> Why are these different in their results? > > > > Here according to the docs the union() method: > > > > "Return a new set with elements from the set and all others." > > > > Caught up in something elementary once again. Sigh! > > and later:- > > > > I find it really hard to remember which functions and methods return > > "None" and operate by side effect with those that return a new object. > > This is especially difficult for me as I so often have long breaks > > between using/studying Python. I wish that Python had some sort of > > clever syntax that made it *obvious* which of the two possibilities > > will occur. > > > > ~(:>)) > > > It's "pythonic"! > (hey, don't shoot me, I'm only the piano-player) > > > When we work with objects, we can access their attributes using "dotted > notation". (remember though: a method (function) is also an attribute) > Thus, we are able to access a value directly, without using "getters", > and to change its value without "setters". Because we are changing an > attribute's value, the modified-value is the result of such. Thus, a > return of None. > > (see also "Functional Programming" and the concept of "side effects") > > > We need to 'tune in' to the documentation to settle such confusion > (thereafter clarifying any doubt by experimenting with the REPL). 
In > this case: > > Python 3.9.5 (default, May 14 2021, 00:00:00) > [GCC 10.3.1 20210422 (Red Hat 10.3.1-1)] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> help( set ) > > Help on class set in module builtins: > > class set(object) > | set() -> new empty set object > | set(iterable) -> new set object > | > | Build an unordered collection of unique elements. > | > | Methods defined here: > | > | __and__(self, value, /) > | Return self&value. > | > | __contains__(...) > | x.__contains__(y) <==> y in x. > | > ... > | add(...) > | Add an element to a set. > | > | This has no effect if the element is already present. > ... > | __len__(self, /) > | Return len(self). > ... > | update(...) > | Update a set with the union of itself and others. > ... > > - where we can see that the set method "and" will return a new set > (logical given the result will be neither self nor the other set), > whereas set.add() will modify the contents of this existing set (stated > more categorically in the explanation for set.update() ): Ah, but here's the rub. The help(set) and online docs when they describe the add() method do *not* _explicitly_ state what the method returns. When reading this originally I naively thought it returned the updated set object. OTOH, the union() method explicitly states what is returned. I guess I am just dense and should have realized that if no *explicit* mention in the help and docs is made regarding a method's return value then it by default will return "None". This now is making me even more of a believer in using type annotations and making it *very explicit* for dense people like me what the return value is for each method and function. For me the docs and help() would read better if in every case for every function and method its return value is always explicitly stated. Fortunately David (bouncingcats) made me re-read the docs in light of his very helpful comment and (finally!) saw my mistake. 
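As a sketch of what an explicit return annotation buys you in your own code: the two helper functions below are hypothetical (they are not part of the set API), but their signatures say outright which one works by side effect and which one hands back a new object.

```python
def add_tag(tags: set, tag: str) -> None:
    """Add tag to tags in place; nothing is returned."""
    tags.add(tag)

def with_tag(tags: set, tag: str) -> set:
    """Return a new set containing tag; tags itself is left unchanged."""
    return tags | {tag}

tags = {"amba", "Joy"}
print(add_tag(tags, "a"))                # None
print(sorted(tags))                      # ['Joy', 'a', 'amba']
print(sorted(with_tag(tags, "Preet")))   # ['Joy', 'Preet', 'a', 'amba']
print(sorted(tags))                      # ['Joy', 'a', 'amba']
```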
> Just to add to your gripe, there is (at least) one function which
> provides a result AND a (by-design) internal (to the set) "side-effect":
>
> s = { 1, 2 }
> what = s.pop()
>
> ...from help( set )...
> |  pop(...)
> |      Remove and return an arbitrary set element.
> |      Raises KeyError if the set is empty.

Good point!  But here it is made explicit to the reader *both* the
return value and the side effect.

boB Stepp
From robertvstepp at gmail.com  Sun Jun  6 13:27:11 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sun, 6 Jun 2021 12:27:11 -0500
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To: <90262af7-7941-4913-12fd-b9b35ee98543@yahoo.co.uk>
References: <90262af7-7941-4913-12fd-b9b35ee98543@yahoo.co.uk>
Message-ID:

On Sun, Jun 6, 2021 at 6:07 AM Alan Gauld wrote:
>
>
> On 06/06/2021 03:32, boB Stepp wrote:
> >
> > I find it really hard to remember which functions and methods return
> > "None" and operate by side effect with those that return a new object.
>
> You are not alone. Coming from smalltalk where the default return value
> of methods is self I find Python's default of none to be quite
> frustrating at
> times. As to when it happens the general rule is that if it is a mutable
> object
> that is being modified the change happens in-place and None is returned.

Ah, but even this apparently has at least one exception.  See dn's
comment about the pop() method where it does return a value (not
normally None) *and* modifies the object in place.  But I take your
point that this is a very good rule of thumb.  And you have helped me
feel a little better!  Thanks!

> But every language has its foibles and None returns is one of Python's.

Probably a driver for the creation of new, improved programming languages?
boB Stepp
From alan.gauld at yahoo.co.uk  Sun Jun  6 15:00:39 2021
From: alan.gauld at yahoo.co.uk (Alan Gauld)
Date: Sun, 6 Jun 2021 20:00:39 +0100
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To:
References: <90262af7-7941-4913-12fd-b9b35ee98543@yahoo.co.uk>
Message-ID:

On 06/06/2021 18:27, boB Stepp wrote:
> On Sun, Jun 6, 2021 at 6:07 AM Alan Gauld wrote:

> Ah, but even this apparently has at least one exception.  See dn's
> comment about the pop() method where it does return a value (not
> normally None) *and* modifies the object in place.

Yes, but that's the definition of pop() as an operation.
It takes the top element off the stack and returns it to you.
You couldn't have a pop operation unless it returned the
removed value. It wouldn't make sense. And likewise it has
to modify the stack since otherwise, future pops would
return the same object!

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

From alexkleider at gmail.com  Sun Jun  6 15:42:10 2021
From: alexkleider at gmail.com (Alex Kleider)
Date: Sun, 6 Jun 2021 12:42:10 -0700
Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint]
In-Reply-To:
References: <90262af7-7941-4913-12fd-b9b35ee98543@yahoo.co.uk>
Message-ID:

This may have been mentioned in this thread but if so, I didn't notice it.
Is it not true that functions/methods the names of which are verbs modify
something that exists (and mostly return None.)
Nouns return that to which their name refers/implies.
'pop' does both but as Alan has pointed out, that's implicit in the data
structure.
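A short sketch of the naming heuristic, using only built-ins. It seems to hold as a rule of thumb rather than a guarantee: the last case shows a verb-named method that must return a new object because str is immutable.

```python
s = {1, 2}
what = s.pop()       # removes AND returns an arbitrary element
print(what in s)     # False: the popped element is gone
print(len(s))        # 1

words = ["pear", "apple"]
print(words.sort())               # None: "sort" is a verb, works in place
print(words)                      # ['apple', 'pear']
print(sorted(["pear", "apple"]))  # ['apple', 'pear']: a new list

# Counterexample: "replace" is a verb, but strings cannot be changed,
# so the result has to come back as a new string.
text = "spam"
print(text.replace("spam", "eggs"))  # eggs
print(text)                          # spam
```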
On Sun, Jun 6, 2021 at 10:28 AM boB Stepp wrote: > On Sun, Jun 6, 2021 at 6:07 AM Alan Gauld wrote: > > > > > > On 06/06/2021 03:32, boB Stepp wrote: > > > > > > I find it really hard to remember which functions and methods return > > > "None" and operate by side effect with those that return a new object. > > > > You are not alone. Coming from smalltalk where the default return value > > of methods is self I find Python's default of none to be quite > > frustrating at > > times. As to when it happens the general rule is that if it is a mutable > > object > > that is being modified the change happens in-place and None is returned. > > Ah, but even this apparently has at least one exception. See dn's > comment about the pop() method where it does return a value (not > normally None) *and* modifies the object in place. But I take your > point that this is a very good rule of thumb. And you have helped me > feel a little better! Thanks! > > > But every language has its foibles and None returns is one of Python's. > > Probably a driver for the creation of new, improved programming languages? > > boB Stepp > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From PyTutor at DancesWithMice.info Sun Jun 6 17:54:08 2021 From: PyTutor at DancesWithMice.info (dn) Date: Mon, 7 Jun 2021 09:54:08 +1200 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> Message-ID: On 07/06/2021 05.22, boB Stepp wrote: > On Sun, Jun 6, 2021 at 3:53 AM dn via Tutor wrote: >> On 06/06/2021 14.19, boB Stepp wrote: >>> Ah, I finally see the light, dn! >> >> Regret to advise that such wisdom came from @David, not >> this-David?self.David aka "dn". > > My sincere apologies, David (*not* dn)! 
I cannot seem to keep the > "bouncing cats" segregated from the " dancing (with) mice" in my head, > not to mention the same first names! So let me get this straight: David gets the kudos *and* an apology - whereas I'm told to pop-off... >>> >>> I find it really hard to remember which functions and methods return >>> "None" and operate by side effect with those that return a new object. >>> This is especially difficult for me as I so often have long breaks >>> between using/studying Python. I wish that Python had some sort of >>> clever syntax that made it *obvious* which of the two possibilities >>> will occur. >>> >>> ~(:>)) >> >> >> It's "pythonic"! >> (hey, don't shoot me, I'm only the piano-player) There are conventions at-play when help() and similar docs are composed. At the higher level there are considerations of conveying detail whilst also minimising the space required - the opposite applies (possibly) when reading the Python Docs! Secondly, a 'skill' for reading such (as mentioned) lies in aligning one's thinking with the help()-idiom. For example, "We hold these truths to be self-evident" that each method will be preceded either by "self." (within the class definition) or 'object-name.' after instantiation. Thus, "s.pop()". Perhaps if we were to retro-fit the Python built-ins and PSL modules to use "typing", that would make things more explicit? <<< It is recommended but not required that checked functions have annotations for all arguments and the return type. For a checked function, the default annotation for arguments and for the return type is Any. An exception is the first argument of instance and class methods. If it is not annotated, then it is assumed to have the type of the containing class for instance methods, and a type object type corresponding to the containing class object for class methods. For example, in class A the first argument of an instance method has the implicit type A. 
In a class method, the precise type of the first argument cannot be represented using the available type notation. >>> (PEP-484 and after, https://www.python.org/dev/peps/pep-0484/) Note that this PEP (and most) concentrates on automated features for its justification, eg static checking; rather than any documentation/readability factors! Thus (from help() ): | __and__(self, Set, /) -> Set | Return self&value. ... | add( Any ) [NB no -> return-value noted] | Add an element to a set. | | This has no effect if the element is already present. ... | __len__(self, /) -> int | Return len(self). ... | pop( index:Optional[ Int ] ) -> element:Any | Remove and return an arbitrary set element. Raises KeyError if the set is empty. ... | update( Set) [NB no -> return-value noted] | Update a set with the union of itself and others. ... >> - where we can see that the set method "and" will return a new set >> (logical given the result will be neither self nor the other set), >> whereas set.add() will modify the contents of this existing set (stated >> more categorically in the explanation for set.update() ): > Ah, but here's the rub. The help(set) and online docs when they > describe the add() method do *not* _explicitly_ state what the method > returns. When reading this originally I naively thought it returned > the updated set object. OTOH, the union() method explicitly states > what is returned. I guess I am just dense and should have realized > that if no *explicit* mention in the help and docs is made regarding a > method's return value then it by default will return "None". This now > is making me even more of a believer in using type annotations and > making it *very explicit* for dense people like me what the return > value is for each method and function. For me the docs and help() > would read better if in every case for every function and method its > return value is always explicitly stated. Whilst there may be some improvement - what do you think? 
It looks rather clumsy (to me), and one could argue that the lack of return-value (in some cases), still requires the reader to notice and appreciate the implication of its omission! The balance here may be that help() is more an easy-access reference/reminder tool. When it comes to 'learning' (or reacting to an experiment resulting in "surprise") might the Python Docs be a better (official) source? eg https://docs.python.org/3.9/library/stdtypes.html#set-types-set-frozenset Mea culpa: when trying something new, I often find that an aspect lacking clarity (in my mind) in the docs, may be resolved by playing with help() and the REPL - and equally, vice-versa! -- Regards, =dn From mats at wichmann.us Sun Jun 6 18:48:07 2021 From: mats at wichmann.us (Mats Wichmann) Date: Sun, 6 Jun 2021 16:48:07 -0600 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> Message-ID: <6f66055a-d75d-143d-e465-6306b1847437@wichmann.us> On 6/6/21 3:27 PM, Dennis Lee Bieber wrote: > Consider the difference between list.sort() and sorted(list). The first > sorts in place and returns None, the second returns a sorted copy of list. Probably flogging a horse that has already left the barn, or whatever, but... For those who don't like this behavior, maybe consider: if you ask to _modify_ an object, the result is either that the object was modified as you requested, or an exception is raised indicating why that couldn't happen. If you didn't get an exception, it succeeded, so there's not really anything else Python needs to tell you. >>> x = (1, 2) >>> x.append(3) Traceback (most recent call last): File "", line 1, in x.append(3) AttributeError: 'tuple' object has no attribute 'append' >>> y = [1, 2] >>> y.append(3) >>> # silence is golden? 
One could have picked something arbitrary to return here, like the new
size of the object, but why?  If you want that, it's easy to ask for.
This is nice and consistent.

I realize keeping track of "we are modifying an object in place" (no
return) vs. "we are asking for a new object to be produced" (return the
object) is a bit of a pain. And I do know this has led to errors - I've
made more than a few of this type myself.

When you write your own code, I'm becoming more and more fond of adding
one specific kind of type annotation: the return type. This has two
useful effects:

1. You find you're having a really hard time describing the return type,
and you reach for Any, or for a really complex Union, or... this
probably means you've written a function that isn't single purpose
enough, and you should stop and think how all the different kinds of
returns are likely to confuse users of your code.

2. You identify the cases where you return "something-useful" or None.
Which can be a real pain for users.  You tell them they'll get back a
certain object, so they perform an operation on the object, and get what
I think is currently one of the most frequent Python errors:

AttributeError: 'NoneType' object has no attribute 'index'

... because there was a path where you didn't actually return one of
those expected objects.  I know sentinels can be useful if all the
values of an object are valid values (e.g. you want to return a string,
and "" is also a valid value), but maybe in some circumstances an
exception might be a better thing to do for the case where there wasn't
a valid value to return?  In any case, whichever approach you think is
right, a checker will let you know you're returning something that's not
of an expected type if you:

    def foo(a, b) -> str:
        # buncha code
        if ok:
            return obj
        return None

so you can at least be reminded to put it in the docs and/or revise the
return-type annotation to show None is possible.
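A minimal sketch of the two options described above, with hypothetical names (find_email, find_email_strict and the users dict are invented for illustration). One version admits in its annotation that None can come back, the other raises instead:

```python
from typing import Optional

def find_email(users: dict, name: str) -> Optional[str]:
    """Return the email for name, or None if name is unknown."""
    return users.get(name)

def find_email_strict(users: dict, name: str) -> str:
    """Return the email for name; raise KeyError if name is unknown."""
    return users[name]

users = {"bob": "bob@example.com"}

email = find_email(users, "alice")
if email is not None:          # the annotation reminds callers to check
    print(email.index("@"))
else:
    print("no such user")      # prints: no such user

print(find_email_strict(users, "bob").index("@"))  # 3
```

With Optional[str] in the signature, a static checker can flag callers that use the result without testing for None first.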
From cs at cskk.id.au Sun Jun 6 19:05:56 2021 From: cs at cskk.id.au (Cameron Simpson) Date: Mon, 7 Jun 2021 09:05:56 +1000 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: Message-ID: On 05Jun2021 21:32, boB Stepp wrote: >On Sat, Jun 5, 2021 at 9:19 PM boB Stepp >wrote: >> > Why are these different in their results? # Referring to >> > difference in return values for set add() and union() methods. "add" is a verb. It modifies the set. "union" is a noun: it returns a thing computed from the set(s). >I find it really hard to remember which functions and methods return >"None" and operate by side effect with those that return a new object. Grammar helps, provided that people have put a little effort into naming the API calls. Also, consider what is sensible for the operation. .add() builds on an existing set. .union() returns a computation of sets. Cheers, Cameron Simpson From mats at wichmann.us Sun Jun 6 19:09:32 2021 From: mats at wichmann.us (Mats Wichmann) Date: Sun, 6 Jun 2021 17:09:32 -0600 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: Message-ID: <09727a22-a670-281b-5c81-0d126fd9497f@wichmann.us> On 6/6/21 5:05 PM, Cameron Simpson wrote: > "add" is a verb. It modifies the set. "union" is a noun: it returns a > thing computed from the set(s). ... > Grammar helps, provided that people have put a little effort into naming > the API calls. Yeah, that: API design is actually not that easy. Including naming :) From marc.tompkins at gmail.com Sun Jun 6 20:01:18 2021 From: marc.tompkins at gmail.com (Marc Tompkins) Date: Sun, 6 Jun 2021 17:01:18 -0700 Subject: [Tutor] How is "set(ls).add('a') evaluated? 
[Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: <09727a22-a670-281b-5c81-0d126fd9497f@wichmann.us> References: <09727a22-a670-281b-5c81-0d126fd9497f@wichmann.us> Message-ID: On Sun, Jun 6, 2021 at 4:11 PM Mats Wichmann wrote: > On 6/6/21 5:05 PM, Cameron Simpson wrote: > > > "add" is a verb. It modifies the set. "union" is a noun: it returns a > > thing computed from the set(s). > ... > > Grammar helps, provided that people have put a little effort into naming > > the API calls. > > Yeah, that: API design is actually not that easy. Including naming :) > This reminds me of a conversation I had, years ago, when I was puzzled over whether a particular (Los Angeles) street was a boulevard or an avenue. My associate was shocked - SHOCKED! - that I didn't know that avenues are always north/south and boulevards always east/west. So I pulled out the Thomas Guide and showed him multiple intersections in our neighborhood where two avenues, or two boulevards, crossed at right angles. It's all very well knowing the canonical rules, but if there are enough exceptions the rules become almost a hindrance to understanding. From Richard at Damon-Family.org Sun Jun 6 20:56:45 2021 From: Richard at Damon-Family.org (Richard Damon) Date: Sun, 6 Jun 2021 20:56:45 -0400 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: <09727a22-a670-281b-5c81-0d126fd9497f@wichmann.us> Message-ID: <9cc6a6b7-0d36-8e8c-0fa8-1a0fbf753a87@Damon-Family.org> On 6/6/21 8:01 PM, Marc Tompkins wrote: > On Sun, Jun 6, 2021 at 4:11 PM Mats Wichmann wrote: > >> On 6/6/21 5:05 PM, Cameron Simpson wrote: >> >>> "add" is a verb. It modifies the set. "union" is a noun: it returns a >>> thing computed from the set(s). >> ... >>> Grammar helps, provided that people have put a little effort into naming >>> the API calls. 
>> Yeah, that: API design is actually not that easy. Including naming :) >> > This reminds me of a conversation I had, years ago, when I was puzzled over > whether a particular (Los Angeles) street was a boulevard or an avenue. My > associate was shocked - SHOCKED! - that I didn't know that avenues are > always north/south and boulevards always east/west. So I pulled out the > Thomas Guide and showed him multiple intersections in our > neighborhood where two avenues, or two boulevards, crossed at right angles. > > It's all very well knowing the canonical rules, but if there are enough > exceptions the rules become almost a hindrance to understanding. It's quite possible that two avenues which are both 'north/south' cross at 90 degrees as the north-south rule is likely a 'general' case, while at a particular point they may deviate significantly. I have one stretch of road near me that is at the same time a North Bound Highway, an East Bound Highway and a South Bound Highway. -- Richard Damon From marc.tompkins at gmail.com Mon Jun 7 01:25:22 2021 From: marc.tompkins at gmail.com (Marc Tompkins) Date: Sun, 6 Jun 2021 22:25:22 -0700 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: <9cc6a6b7-0d36-8e8c-0fa8-1a0fbf753a87@Damon-Family.org> References: <09727a22-a670-281b-5c81-0d126fd9497f@wichmann.us> <9cc6a6b7-0d36-8e8c-0fa8-1a0fbf753a87@Damon-Family.org> Message-ID: On Sun, Jun 6, 2021 at 5:58 PM Richard Damon wrote: > It's quite possible that two avenues which are both 'north/south' cross > at 90 degrees as the north-south rule is likely a 'general' case, while > at a particular point they may deviate significantly. 
> > In the central San Fernando Valley (inside the city of LA), most boulevards do run east/west and almost all avenues run north/south - but several of the major north/south roads (Sepulveda, Balboa, Reseda, Topanga Canyon) break the rule and are called boulevards. A few miles east in the city of Burbank, the entire grid is rotated 45 degrees and there are multiple avenue/avenue and boulevard/boulevard intersections running NW/SE and NE/SW. The rule is useful as a guideline, but ONLY if the designers of the system followed the rule themselves - otherwise it just helps you get lost. (Hence my parallel to API naming schemes.) > I have one stretch of road near me that is at the same time a North > Bound Highway, an East Bound Highway and a South Bound Highway. > US 101 (or, in LA-speak, "the 101") is officially a north/south highway - one of the major north/south arteries of the state - but in the San Fernando Valley, it runs nearly due east/west. In fact, due to meandering, there are a few places where you can be northbound on the 101 but actually driving southwest. From PyTutor at DancesWithMice.info Mon Jun 7 17:40:51 2021 From: PyTutor at DancesWithMice.info (dn) Date: Tue, 8 Jun 2021 09:40:51 +1200 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> Message-ID: <2f6af174-b1ad-bd39-359f-53b63b9adb3d@DancesWithMice.info> On 07/06/2021 09.54, dn via Tutor wrote: > On 07/06/2021 05.22, boB Stepp wrote: >> On Sun, Jun 6, 2021 at 3:53 AM dn via Tutor wrote: >>> On 06/06/2021 14.19, boB Stepp wrote: ... >> My sincere apologies, David (*not* dn)! I cannot seem to keep the >> "bouncing cats" segregated from the " dancing (with) mice" in my head, >> not to mention the same first names! It's easy-enough: I'm the better-looking one! 
(more, and more pertinent comment, below) >>>> >>>> I find it really hard to remember which functions and methods return >>>> "None" and operate by side effect with those that return a new object. >>>> This is especially difficult for me as I so often have long breaks >>>> between using/studying Python. I wish that Python had some sort of >>>> clever syntax that made it *obvious* which of the two possibilities >>>> will occur. >>>> >>>> ~(:>)) >>> >>> >>> It's "pythonic"! >>> (hey, don't shoot me, I'm only the piano-player) > > There are conventions at-play when help() and similar docs are composed. > At the higher level there are considerations of conveying detail whilst > also minimising the space required - the opposite applies (possibly) > when reading the Python Docs! > > Secondly, a 'skill' for reading such (as mentioned) lies in aligning > one's thinking with the help()-idiom. For example, "We hold these truths > to be self-evident" that each method will be preceded either by "self." > (within the class definition) or 'object-name.' after instantiation. > Thus, "s.pop()". ... > (PEP-484 and after, https://www.python.org/dev/peps/pep-0484/) > > Note that this PEP (and most) concentrates on automated features for its > justification, eg static checking; rather than any > documentation/readability factors! > > > Thus (from help() ): > > | __and__(self, Set, /) -> Set > | Return self&value. > ... > | add( Any ) [NB no -> return-value noted] > | Add an element to a set. > | > | This has no effect if the element is already present. > ... > | __len__(self, /) -> int > | Return len(self). > ... > | pop( index:Optional[ Int ] ) -> element:Any > | Remove and return an arbitrary set element. > Raises KeyError if the set is empty. > ... > | update( Set) [NB no -> return-value noted] > | Update a set with the union of itself and others. ... >> Ah, but here's the rub. 
The help(set) and online docs when they >> describe the add() method do *not* _explicitly_ state what the method >> returns. When reading this originally I naively thought it returned >> the updated set object. OTOH, the union() method explicitly states >> what is returned. I guess I am just dense and should have realized >> that if no *explicit* mention in the help and docs is made regarding a >> method's return value then it by default will return "None". This now >> is making me even more of a believer in using type annotations and >> making it *very explicit* for dense people like me what the return >> value is for each method and function. For me the docs and help() >> would read better if in every case for every function and method its >> return value is always explicitly stated. > > Whilst there may be some improvement - what do you think? > > It looks rather clumsy (to me), and one could argue that the lack of > return-value (in some cases), still requires the reader to notice and > appreciate the implication of its omission! > > The balance here may be that help() is more an easy-access > reference/reminder tool. When it comes to 'learning' (or reacting to an > experiment resulting in "surprise") might the Python Docs be a better > (official) source? eg > https://docs.python.org/3.9/library/stdtypes.html#set-types-set-frozenset > > Mea culpa: when trying something new, I often find that an aspect > lacking clarity (in my mind) in the docs, may be resolved by playing > with help() and the REPL - and equally, vice-versa! After relating a personal view of 'the documentation', I recalled another attitude applicable to a wider, and perhaps non-traditional audience, voiced at the Python Language Summit 2020. For your reading pleasure: <<< Batchelder observed that "twenty-five years ago, our main audience seemed to be refugees from C," but most readers of the Python docs today are not career software developers at all; they need different docs. 
>>> which goes-on to be 'bang on the money' regarding @boB's frustration: <<< Updating the documentation to fix common misunderstandings would save time in the long run for users and the core team. >>> Python Software Foundation's Blog: https://pyfound.blogspot.com/2020/04/cpython-documentation-next-5-years.html NB perhaps I haven't paid sufficient attention, but I don't think that the proposed Python Steering Council workgroup called the "Documentation Editorial Board" has published much since. (but will be delighted to be corrected on this, directed to web.refs, etc) -- Regards, =dn From mats at wichmann.us Mon Jun 7 18:12:05 2021 From: mats at wichmann.us (Mats Wichmann) Date: Mon, 7 Jun 2021 16:12:05 -0600 Subject: [Tutor] How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: <2f6af174-b1ad-bd39-359f-53b63b9adb3d@DancesWithMice.info> References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> <2f6af174-b1ad-bd39-359f-53b63b9adb3d@DancesWithMice.info> Message-ID: On 6/7/21 3:40 PM, dn via Tutor wrote: > For your reading pleasure: > <<< > Batchelder observed that "twenty-five years ago, our main audience > seemed to be refugees from C," but most readers of the Python docs today > are not career software developers at all; they need different docs. >>>> > > which goes-on to be 'bang on the money' regarding @boB's frustration: > <<< > Updating the documentation to fix common misunderstandings would save > time in the long run for users and the core team. >>>> > Python Software Foundation's Blog: > https://pyfound.blogspot.com/2020/04/cpython-documentation-next-5-years.html > > > NB perhaps I haven't paid sufficient attention, but I don't think that > the proposed Python Steering Council workgroup called the "Documentation > Editorial Board" has published much since. 
> (but will be delighted to be corrected on this, directed to web.refs, etc) Eh, it's having some startup pains. I guess maybe too many people volunteered to help out? (I was one of them) and a few people were assigned onto the committee, and some other Python longtimers raised a stink because they felt they were getting left out when they had long-term experience with the docs. Except, as dn has detailed here, those docs are currently oriented in a way that's known not to be helpful (enough) to many... so right now, something is going on, and we masses have no insight into it at all. The note about cross-purposes is pretty valid anyway. I'm working on a Python project where I'm trying to improve the way the generated-from-docstrings documentation renders when Sphinx is pointed at it. That's not at all the same thing as how it renders when you call help(). Fortunately for _this_ case the code is such that you're not likely to open an interpreter and introspect into it that way, by calling help on things. But it kind of highlights that when there are several possible consumers of the documentation, both groups of humans, and various tools, it is *really* hard to figure out how to present things. Which I guess was the point of the quote included above... to think about how to do better. From bouncingcats at gmail.com Mon Jun 7 22:53:14 2021 From: bouncingcats at gmail.com (David) Date: Tue, 8 Jun 2021 12:53:14 +1000 Subject: [Tutor] [OT] Re: How is "set(ls).add('a') evaluated?
[Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: <2f6af174-b1ad-bd39-359f-53b63b9adb3d@DancesWithMice.info> References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> <2f6af174-b1ad-bd39-359f-53b63b9adb3d@DancesWithMice.info> Message-ID: On Tue, 8 Jun 2021 at 07:42, dn via Tutor wrote: > On 07/06/2021 09.54, dn via Tutor wrote: > > On 07/06/2021 05.22, boB Stepp wrote: > >> On Sun, Jun 6, 2021 at 3:53 AM dn via Tutor wrote: > >>> On 06/06/2021 14.19, boB Stepp wrote: > >> My sincere apologies, David (*not* dn)! I cannot seem to keep the > >> "bouncing cats" segregated from the " dancing (with) mice" in my head, > >> not to mention the same first names! > It's easy-enough: I'm the better-looking one! haha, that's two days in a row now that this thread has given me a good laugh :) The first time was the mental association of the dancing mice and the bouncing cats, very amusing. By the way, bouncingcats is nothing to do with felines. It's just a throwaway name that popped into my head many years ago when trying to come up with something that didn't clash with account names that were already in use. I'm not particularly fond of it, and I have many different email accounts, but somehow this one ended up being the one that I use purely for mailing lists. "bouncingcats" is a reference to vocal percussion. "the imitation or approximation of percussion instruments ... music with your mouth ..." [1] Some time later, it was even used as the title of a documentary film on the subject [2]. Because repeatedly vocalising the phrase bouncingcatsbouncingcatsbouncingcats.... creates a percussive sound similar to the rhythm tracks of contemporary dance music. When it gets boring, you can throw in "baboonsandpigs..." 
[1] https://en.wikipedia.org/wiki/Beatboxing [2] https://en.wikipedia.org/wiki/Bouncing_Cats From marc.tompkins at gmail.com Mon Jun 7 23:17:16 2021 From: marc.tompkins at gmail.com (Marc Tompkins) Date: Mon, 7 Jun 2021 20:17:16 -0700 Subject: [Tutor] [OT] Re: How is "set(ls).add('a') evaluated? [Was: Re: A program that can check if all elements of the list are mutually disjoint] In-Reply-To: References: <9c524c62-9916-997f-2435-8a45e01c82ca@DancesWithMice.info> <2f6af174-b1ad-bd39-359f-53b63b9adb3d@DancesWithMice.info> Message-ID: On Mon, Jun 7, 2021 at 7:54 PM David wrote: > "bouncingcats" is a reference to vocal percussion. > "the imitation or approximation of percussion instruments > ... music with your mouth ..." [1] > Some time later, it was even used as the title of a > documentary film on the subject [2]. > > Because repeatedly vocalising the phrase > bouncingcatsbouncingcatsbouncingcats.... > creates a percussive sound similar to > the rhythm tracks of contemporary dance music. > > When it gets boring, you can throw in "baboonsandpigs..." > The one that always comes to my mind is "Boots & Pants", which I first heard as a throwaway joke in a GEICO commercial but later learned is a real, fairly popular dance song. Who knew? Before that, I'd just been saying "UntzUntzUntzUntzUntz..." https://www.youtube.com/watch?v=eYXla8N-Qq4 From nikbb124 at gmail.com Wed Jun 9 19:03:03 2021 From: nikbb124 at gmail.com (Nick Becker) Date: Wed, 9 Jun 2021 18:03:03 -0500 Subject: [Tutor] Issue with Currency Columns Message-ID: Good Evening, I read an excel spreadsheet into python with columns similar to those on the attached, plus many others. I am trying to isolate the company name and sales columns in my data so I can work on them and do some calculations. However, after I read the CSV file, I keep getting error messages around the sales column. 
I want to make a new dataframe with just these two columns, but when I enter "df2 = original file['Company Name', 'Sales']", I get error messages. I have tried converting the column of the source data to a number, and am still getting an error. Any suggestions or pointers would be appreciated. I am just getting started with Python, so pardon my ignorance. Best, Nick From joel.goldstick at gmail.com Sat Jun 12 16:35:35 2021 From: joel.goldstick at gmail.com (Joel Goldstick) Date: Sat, 12 Jun 2021 16:35:35 -0400 Subject: [Tutor] Issue with Currency Columns In-Reply-To: References: Message-ID: On Sat, Jun 12, 2021 at 3:09 PM Nick Becker wrote: > > Good Evening, > > I read an excel spreadsheet into python with columns similar to those on > the attached, plus many others. I am trying to isolate the company name > and sales columns in my data so I can work on them and do some > calculations. However, after I read the CSV file, I keep getting error > messages around the sales column. You can't send attachments to this list. So, copy and paste the traceback error message, and also copy and paste the code you are using. Also, copy and paste the data file -- not all of it, but a few rows that would allow the problem to appear. Do that, and someone here will be able to help you. > > I want to make a new dataframe with just these two columns, but when I > enter "df2 = original file['Company Name', 'Sales']", I get error > messages. I have tried converting the column of the source data to a number, and > am still getting an error. Any suggestions or pointers would be > appreciated. Since you say 'dataframe', I'm guessing you are using pandas, which is a library very popular with certain Python users, but it's not part of the Python ecosystem itself. > > I am just getting started with Python, so pardon my ignorance.
> > Best, > Nick > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor -- Joel Goldstick From PyTutor at DancesWithMice.info Sun Jun 13 01:03:17 2021 From: PyTutor at DancesWithMice.info (dn) Date: Sun, 13 Jun 2021 17:03:17 +1200 Subject: [Tutor] Terminology: EU language skills, and Master to Main (or ...) Message-ID: <4e60afa3-4438-69c5-bbd6-849881199c63@DancesWithMice.info> [to folk subscribed to both the Python list and Python-Tutor: apologies for cross-posting] Regarding levels of skill or complexity in learning, the European Union has been working on "The Common European Framework of Reference for Languages: Learning, Teaching, Assessment". It also standardises terminology for spoken/national-language training courses. https://en.wikipedia.org/wiki/Common_European_Framework_of_Reference_for_Languages I'm not a fan of such abstract labelling of one's progress (or a tutorial's content) with codes or "reference levels" (A1, A2, B1, B2, C1, C2) but maybe it will become widely recognised... The web-page amuses (this small mind) noting non-PC stereotypes, that the ever-pragmatic Dutch have scaled language skills based upon how one wants to apply or use them; the Scandinavians go for numerical progression; which the Italians do similarly but with 'flair' (using words not digits). LanguageCert International have added the EU-codes to their practical terms: Preliminary, Access, Achiever, Communicator, Expert, Mastery. A group at the King Juan-Carlos University (Madrid, Spain) is collecting practitioners' opinions in a bid to categorise Python mastery according to the Framework. 
You may like to contribute by responding to their research surveys (one form takes five-minutes, the other fifteen): https://docs.google.com/forms/d/e/1FAIpQLSdlzWGpvZHLHXl6iEdHbLTB6QvYXknrD9-JKmzY7riYJkPmNw/viewform I like to label tutorials and conference-talks (and sometimes individual slides/sections) to indicate levels of complexity. However, have replaced abstract terms such as "Beginner" or "Junior", "Intermediate", and "Advanced" or "Senior" which all sound school-ish; with the three terms: "Apprentice", "Journeyman", and "Master" (see also https://leanpub.com/b/python-craftsman). Whilst, there have been folk unfamiliar with (UK) "Guild" terms, they generally respond readily to explanation and the professional context. NB I use the terms solely to indicate an expected audience, as distinct from assessing an individual's capability (or pay-rate)! There is a potential-problem in the rising sensitivity of the word "Master", eg the git CVS has replaced the idea of a Master-branch with "Main branch" (or user-chosen alternative name). Will referring to skilled professionals as 'masters (of their profession/craft)' transgress (international or at least US-instigated) 'Political Correctness'? What do you think a professionally-recognisable series of skill-levels for programmers? -- Regards, =dn -- https://mail.python.org/mailman/listinfo/python-list -- Regards, =dn
From rtm443x at googlemail.com Mon Jun 14 01:10:20 2021 From: rtm443x at googlemail.com (jan) Date: Mon, 14 Jun 2021 06:10:20 +0100 Subject: [Tutor] Terminology: EU language skills, and Master to Main (or ...) In-Reply-To: <94394a89-e123-eedb-ea7b-bf64bb5762dd@DancesWithMice.info> References: <94394a89-e123-eedb-ea7b-bf64bb5762dd@DancesWithMice.info> Message-ID: Hi, see below On 13/06/2021, dn via Python-list wrote: > [to folk subscribed to both the Python list and Python-Tutor: apologies > for cross-posting] > > > Regarding levels of skill or complexity in learning, the European Union > has been working on "The Common European Framework of Reference for > Languages: Learning, Teaching, Assessment". It also standardises > terminology for spoken/national-language training courses. > https://en.wikipedia.org/wiki/Common_European_Framework_of_Reference_for_Languages To re-state what you already said but I didn't pick up on, this is natural spoken languages. [snip] > > A group at the King Juan-Carlos University (Madrid, Spain) is collecting > practitioners' opinions in a bid to categorise Python mastery according > to the Framework. You may like to contribute by responding to their > research surveys (one form takes five-minutes, the other fifteen): > https://docs.google.com/forms/d/e/1FAIpQLSdlzWGpvZHLHXl6iEdHbLTB6QvYXknrD9-JKmzY7riYJkPmNw/viewform Also I'm not sure there's much to relate artificial (programming) languages with natural (spoken) ones.
'Mastery' of Python programming is almost meaningless because if you are a decent programmer you will be able to pick up new paradigms *reasonably* straightforwardly, and paradigms thus internalised (functional/OO/procedural/logic/etc) will then transfer fairly easily across languages. Also it's about problem solving which is an independent skill altogether. Also it includes transferrable prior experiences and knowledge/exposure ("There's a library for that" / "regexps are a trap here" / "just use a parser generator, don't write it by hand" / "The largest element every time? Let me introduce you to the Heap data structure" / "if you stick a bloom filter in front of that you can cut out 90% of database accesses here") If you're a good Scala programmer it will take only a few weeks to get up to speed with Python - I've done it. Most of that time went on learning the libraries (of Python, and Scala) anyway. > > > I like to label tutorials and conference-talks (and sometimes individual > slides/sections) to indicate levels of complexity. However, have > replaced abstract terms such as "Beginner" or "Junior", "Intermediate", > and "Advanced" or "Senior" which all sound school-ish; with the three > terms: "Apprentice", "Journeyman", and "Master" (see also > https://leanpub.com/b/python-craftsman). Just words. > [snip] > > There is a potential-problem in the rising sensitivity of the word > "Master", eg the git CVS has replaced the idea of a Master-branch with > "Main branch" (or user-chosen alternative name). Will referring to > skilled professionals as 'masters (of their profession/craft)' > transgress (international or at least US-instigated) 'Political > Correctness'? I've never seen any of this at my workplaces. When I occasionally read about this on the web and follow up to the source posts of those doing this, my impression is there are a few, vocal people who are just there to disrupt rather than do anything constructive.
That may be reporting bias though so my view may be of questionable reliability. Basically I've not seen much if any value in this PC stuff.

> What do you think a professionally-recognisable series of skill-levels
> for programmers?

Fine. If you can do it in any meaningful sense.

jan

> --
> Regards,
> =dn
> --
> https://mail.python.org/mailman/listinfo/python-list

From rh at saintbedes.net Thu Jun 17 07:50:26 2021
From: rh at saintbedes.net (D Rochester)
Date: Thu, 17 Jun 2021 12:50:26 +0100
Subject: [Tutor] Trying to select an input from another Python file & output that input in the main game program
Message-ID:

Good afternoon,

I have spent many hours trying to solve this problem. I have 4 Python files (arc_songdataload, game_menu, Noels_Music_Game & User_Signup) & 3 csv files for this game. It's a song guessing game. I am using IDLE Python version 3.7.4.

The first file (Noels_Music_Game) is for authenticating the users and it looks at a csv file called 'users'. I have added another player to this file so that it now spans 4 columns with the 1st & 3rd entries being the users and the 2nd & 4th entries being the passwords. This element of the code works fine and gives access to the game using a simple IF statement detailed below;

import csv
import random
import time
import sys
import game_menu # main game is stored here as a menu() function below.

# opens users data
userdata = open("users.csv", "r")# Will read the file
userdataRead = csv.reader(userdata)
userdataList = list(userdataRead)
# username is abc or aaa
# password is 123 or 2101

# call loop variable for login
loop = 1
print("You need to log in before you can proceed.")

# starts loop for logging in
while loop == 1:
    username = input("Enter your username;\n")
    print(username)
    if username == userdataList[0][0] or username == userdataList[0][2]:
        #print(username)
        print("Verified user. Please enter your password")
        while True:
            password = input("Enter your password.\n:")
            if password == userdataList[0][1] or password == userdataList[0][3]:
                print(username,"You have been logged in. Welcome to the game.")
                loop = 0
                break # breaks loop and starts program
            else:
                print("Wrong password.")
    else:
        print("No such user registered. Please enter a valid username.")

game_menu.menu()

After access is granted the game begins. I have an additional 2 csv files called 'scores' & 'songs'. After completing the game the output should detail the user that was playing but it doesn't, it always details the 1st user ('abc'). I know it's because of the line of code detailed in the 'game_menu'; scoredataAdd.writerow([globalpoints, userdataList[0][0]]). The latter element references the first point in the csv file but what I need it to do is to be able to reference the correct user from the opening file. I can make it do that by [0][2] but that's not how it should work, it should be automatically pulled into the output.

The following code is the actual game;

# importing files we'll need later
import csv
import random
import time
import sys

# variables here
gameloop1 = 1
globalpoints = 0
chances = 2
rngnum = 0

# random number generator for random songs (rng)
def rng():
    global rngnum
    rngnum = int(random.randint(0,5))

def menu():
    # declare variables global so we can use them in the function
    global rngnum
    global gameloop1
    global globalpoints
    global chances

    # intro text
    print("Welcome to the song guessing game.")
    print("You will be randomly presented with the first letter of each song title")
    print("and each artist of the song.")
    print("Each one you get correct will add 2 points onto your score, 1 if you take 2 guesses.")
    print("You get 2 chances to guess. Guessing incorrectly on the second guess")
    print("will end the game.")
    print("At the end, your high score will be displayed.")
    print("The game will now start.")

    # loads song data
    songdata = open("songs.csv", "r")
    songdataRead = csv.reader(songdata)
    songdataList = list(songdataRead)

    #loads user data
    userdata = open("users.csv", "r")
    userdataRead = csv.reader(userdata)
    userdataList = list(userdataRead)

    # appends score data
    scoredata = open("scores.csv", "a", newline="")
    scoredataAdd = csv.writer(scoredata)

    # actual game
    while gameloop1 == 1:
        rng()
        print("What song is it?")
        print(songdataList[rngnum][0][0] + " , by " + songdataList[rngnum][1])
        #print(rngnum)
        userinputchoice = input(": ")
        if userinputchoice == songdataList[rngnum][0]:
            if chances == 2:
                globalpoints += 3
                print("You have got " + str(globalpoints) + " points.\n")
                chances = 2
            #elif chances == 1:
                #globalpoints += 1
                #print("You have got " + str(globalpoints) + " points.\n")
                #chances = 2
        else:
            chances -= 1
            print("You have " + str(chances) + " chances left.\n")
            if chances == 0:
                gameloop1 = 0
            gameloop2 = 1
            while gameloop2 == 1:
                print("Guess again.")
                userinputchoice2 = input(": ")
                if userinputchoice2 == songdataList[rngnum][0]:
                    if chances == 1:
                        globalpoints += 1
                        print("You have got " + str(globalpoints) + " points.\n")
                        chances = 2
                        gameloop2 = 0
                else:
                    gameloop2 = 0
                    gameloop1 = 0
                    #print("You have " + str(chances) + " chances left.\n")

    print("The game has ended.")
    print("Well done, you have scored", str(globalpoints) + ".")

    #adds score to table
    scoredataAdd.writerow([globalpoints, userdataList[0][0]]) #userdataList[0][0]])# Remember to amend the userdataList here as you have added another user
    scoredata.close()

    print("Top 5 scores nationally:\n")
    # quickly opens up score data
    scoreRead = open("scores.csv", "r")
    scoreReadReader = csv.DictReader(scoreRead)
    # sorts the list of values
    newList = sorted(scoreReadReader, key=lambda row: int(row['Score']), reverse=True)
    # prints scores
    print("Rank | Score | Name\n")
    for i,
r in enumerate(newList[0:50]): print('{} | {} | {}'.format(str(int(i)+1), r['Score'], r['Name'])) I hope that everything that I have detailed here makes sense? I would hugely appreciate your help. Kindest regards, David From david at graniteweb.com Thu Jun 17 17:16:49 2021 From: david at graniteweb.com (David Rock) Date: Thu, 17 Jun 2021 16:16:49 -0500 Subject: [Tutor] Trying to select an input from another Python file & output that input in the main game program In-Reply-To: References: Message-ID: <20210617211649.GA17963@graniteweb.com> * D Rochester [2021-06-17 12:50]: > Good afternoon, > > The first file (Noels_Music_Game) is for authenticating the users and it > looks at a csv file called 'users'. I have added another player to this > file so that it now spans 4 columns with the 1st & 3rd entries being the > users and the 2nd & 4th entries being the passwords. This element of the > code works fine and gives access to the game using a simple IF statement > detailed below; Just a quick observation/question first: Why do you have the users.csv set up this way? It would simplify your code a lot if you had only two columns: user and password. The list of users is begging to be a list of rows, one pair of information for each user, rather than trying to add horizontally. What happens when you need to add a third user; a fourth? It's not maintainable. There's also a logic problem with doing it horizontally: the wrong password can match against the wrong user. In other words, if you enter user2's name and user1's password, you will still authenticate. I doubt that's your intention.
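A minimal sketch of the cross-matching problem described above. It assumes the four-column row layout from the original post; the sample names and passwords are taken from the comments in that code, their pairing is a guess, and the function name horizontal_check is hypothetical.

```python
# One row, four columns: user1, password1, user2, password2.
# Values come from the comments in the original post; pairing is assumed.
row = ["abc", "123", "aaa", "2101"]


def horizontal_check(username, password):
    """Reproduce the logic of the two original if-statements."""
    user_ok = username == row[0] or username == row[2]
    password_ok = password == row[1] or password == row[3]
    return user_ok and password_ok


print(horizontal_check("abc", "123"))  # the matching pair
print(horizontal_check("aaa", "123"))  # user2's name with user1's password
```

Both calls print True, because the name and the password are checked independently of each other.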
Try using the csv module and use the userids as keys to search against to compare the entered password. That might help with the rest of the work. -- David Rock david at graniteweb.com From manpritsinghece at gmail.com Fri Jun 18 02:13:22 2021 From: manpritsinghece at gmail.com (Manprit Singh) Date: Fri, 18 Jun 2021 11:43:22 +0530 Subject: [Tutor] stateless Vs operation with state Message-ID: Dear sir , Although this question may be a little out of scope of this mailing list, if you can answer it will be really helpful for me . An example of operation with state : Suppose I have to make a software solution that can maintain the bank balance of the customer . At a particular time if a customer has a balance of 2000 and he deposits 4000 . then his balance becomes 6000 . so this present balance is dependent on the previous balance . According to me this problem is a problem with state and for such kind of problems OOPS or classes are the solution . Example of stateless: Stateless problems are those problems, in which i can get the answer of the problem based on the present scenario only , and that answer is not going to alter or change the input in any way . Functions are best for these Am i right or wrong ? From __peter__ at web.de Fri Jun 18 02:36:59 2021 From: __peter__ at web.de (Peter Otten) Date: Fri, 18 Jun 2021 08:36:59 +0200 Subject: [Tutor] Trying to select an input from another Python file & output that input in the main game program In-Reply-To: <20210617211649.GA17963@graniteweb.com> References: <20210617211649.GA17963@graniteweb.com> Message-ID: On 17/06/2021 23:16, David Rock wrote: > * D Rochester [2021-06-17 12:50]: >> Good afternoon, >> >> The first file (Noels_Music_Game) is for authenticating the users and it >> looks at a csv file called 'users'. I have added another player to this >> file so that it now spans 4 columns with the 1st & 3rd entries being the >> users and the 2nd & 4th entries being the passwords. 
This element of the >> code works fine and gives access to the game using a simple IF statement >> detailed below; > > Just a quick observation/question first: > > Why do you have the users.csv set up this way? It would simplify your code a > lot if you had only two columns: user and password > > The list of users is begging to be a list of rows, one pair of information for > each user, rather than trying to add horizontally. What happens when you need > to add a third user; a fourth? It's not maintainable. > > There's also a logic problem with doing it horizontally: the wrong password can > match against the wrong user. In other words, if you enter user2's name and > user1's password, you will still authenticate. I doubt that's your intention. > > Try using the csv module and use the userids as keys to search against to > compare the entered password. That might help with the rest of the work. David's comment is spot-on. If you are ambitious read the csv into a dict with usernames as keys and passwords as values. Then you can check a username/password pair with lookup = {"jim": "secret", "jane": "wont-tell"} # example, actual dict # should be created at # runtime ... valid = lookup.get(username) == password As to the problem you describe I recommend that you move the code in your first module into a function, let's call it get_authenticated_user(), with the following structure (pseudocode): def get_authenticated_user(): read csv while True: username = input("user name: ") password = input("password: ") check password/username pair if password is correct: return username print("wrong user or password") Then change menu() to take a username argument def menu(username): ... # now you know the current user and call it menu(get_authenticated_user()) This approach can be generalized: avoid global variables, always try to pass state between functions explicitly as arguments.
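The pseudocode above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: users.csv is taken to have two columns (username, password), load_users() and the ask parameter are hypothetical additions so the example can run without a keyboard or a real file, and menu() is a stand-in for the actual game.

```python
import csv
import io


def load_users(csvfile):
    """Read rows of (username, password) into a dict keyed by username."""
    return {row[0]: row[1] for row in csv.reader(csvfile) if row}


def get_authenticated_user(lookup, ask=input):
    """Keep asking until a matching username/password pair is entered."""
    while True:
        username = ask("user name: ")
        password = ask("password: ")
        if lookup.get(username) == password:
            return username
        print("wrong user or password")


def menu(username):
    # Stand-in for the real game; it now knows who is playing.
    return f"score row would be written for {username}"


# Demo with an in-memory file and scripted answers instead of the keyboard.
users = load_users(io.StringIO("jim,secret\njane,wont-tell\n"))
answers = iter(["jane", "secret", "jane", "wont-tell"])
print(menu(get_authenticated_user(users, ask=lambda prompt: next(answers))))
```

In the real program the dict would come from something like load_users(open("users.csv", newline="")), and the score row would be written with the username that was passed in rather than userdataList[0][0].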
From mari1am.muha at gmail.com Thu Jun 17 17:37:29 2021 From: mari1am.muha at gmail.com (Mariam M.S) Date: Fri, 18 Jun 2021 00:37:29 +0300 Subject: [Tutor] Trying to get In-Reply-To: <20210617211649.GA17963@graniteweb.com> References: <20210617211649.GA17963@graniteweb.com> Message-ID: Good evening Trying to fix this problem (((1+2)×3)/4)^5 and I'm stuck on it for hours now I know this is simple one but I'm a beginner and could use your help. Btw google didn't help. Thank you From alan.gauld at yahoo.co.uk Fri Jun 18 05:14:44 2021 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Fri, 18 Jun 2021 10:14:44 +0100 Subject: [Tutor] stateless Vs operation with state In-Reply-To: References: Message-ID: On 18/06/2021 07:13, Manprit Singh wrote: > An example of operation with state : > balance of 2000 and he deposits 4000 . then his balance becomes 6000 . so > this present balance is dependent on the previous balance . According to me > this problem is a problem with state and for such kind of problems OOPS or > classes are the solution . They might be. But you could also use a global variable, or in a real world problem, a database. In which case functions will work just as well. In fact in Python, if it is just a single value you are tracking it could even be a generator function. That wouldn't apply with bank balances since there will be separate balances per customer, but in a control system it might be the case. However, for more complex systems, OOP has become the standard option for managing stateful entities. But database plus functions are still a valid option. > Example of stateless: > > Stateless problems are those problems, in which i can get the answer of the > problem based on the present scenario only , and that answer is not going > to alter or change the input in any way . Functions are best for these Here I agree, and in fact not just functions but the whole functional programming paradigm is directly suited to stateless applications.
You can use OOP in stateless apps but there is little advantage. However, there are very, very, few stateless applications in the real world! -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From alan.gauld at yahoo.co.uk Fri Jun 18 05:18:09 2021 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Fri, 18 Jun 2021 10:18:09 +0100 Subject: [Tutor] Trying to get In-Reply-To: References: <20210617211649.GA17963@graniteweb.com> Message-ID: On 17/06/2021 22:37, Mariam M.S wrote: > Good evening > > Trying to fix this problem (((1+2)×3)/4)^5 and I'm stuck on it for hours > now I know this is simple one but I'm a beginner and could use your help. > Btw google didn't help. Thank you You haven't told us what the problem is. You've posted an arithmetic expression which evaluates to 57.665 according to my calculator. What exactly are you trying to do that has you stuck? I assume you know how to use a calculator.... -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From chris_roysmith at internode.on.net Fri Jun 18 06:19:43 2021 From: chris_roysmith at internode.on.net (Chris Roy-Smith) Date: Fri, 18 Jun 2021 20:19:43 +1000 Subject: [Tutor] Trying to get In-Reply-To: References: <20210617211649.GA17963@graniteweb.com> Message-ID: <05f79aef-3c45-e1ff-8229-d98989b6ddc7@internode.on.net> On 18/6/21 7:37 am, Mariam M.S wrote: > Good evening > > Trying to fix this problem (((1+2)×3)/4)^5 try (((1+2)*3)/4)**5 ** is the exponentiation operator regards, Chris and I'm stuck on it for hours > now I know this is simple one but I'm a beginner and could use your help. > Btw google didn't help.
Thank you > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor > From alan.gauld at yahoo.co.uk Fri Jun 18 14:10:49 2021 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Fri, 18 Jun 2021 19:10:49 +0100 Subject: [Tutor] Trying to select an input from another Python file & output that input in the main game program In-Reply-To: References: Message-ID: On 17/06/2021 12:50, D Rochester wrote: > I have spent many hours trying to solve this problem. I suspect in part that's because you are writing very complex code using lots of global variables as flags to terminate loops etc. It reminds me of the kind of code we used to see in primitive BASIC programs. Others have addressed the issues around your use of a CSV file. But in your general code you should not need so many test variables, see below: > # starts loop for logging in > while loop == 1: > username = input("Enter your username;\n") > print(username) > if username == userdataList[0][0] or username == userdataList[0][2]: > #print(username) > print("Verified user. Please enter your password") > while True: > password = input("Enter your password.\n:") > if password == userdataList[0][1] or password == > userdataList[0][3]: > print(username,"You have been logged in. Welcome to the > game.") > loop = 0 > break # breaks loop and starts program > else: > print("Wrong password.") > else: > print("No such user registered. Please enter a valid username.") > game_menu.menu() You could simplify that by splitting it into two separate loops, one to get the user name the other for the password. No need for the nested structure which is harder to read and debug. while True: user = input.... if valid user break while True: pwd = input.... if valid pwd: break At this point you have a valid user with a valid pwd. And no need for a loop variable. > After access is granted the game begins. 
I have an additional 2 csv files > called 'scores' & 'songs'. After completing the game the output should > detail the user that was playing but it doesn't, it always details the 1st > user ('abc'). I know it's because of the line of code detailed in the > 'game_menu'; scoredataAdd.writerow([globalpoints, userdataList[0][0]]). The > latter element references the first point in the csv file but what I need > it to do is to be able to reference the correct user from the opening file. > I can make it do that by [0][2] but that's not how it should work, it > should be automatically pulled into the output. So use the user data that you captured. Put the game code in a function that takes a user name as an input argument. Call that function immediately after a successful login: start_game(user) > # variables here > gameloop1 = 1 > globalpoints = 0 > chances = 2 > rngnum = 0 > > # random number generator for random songs (rng) > def rng(): > global rngnum > rngnum = int(random.randint(0,5)) You don't need the int(), randint already returns an int. And you shouldn't need the global either, instead return the number and assign the function result: def rng(): return random.randint(0,5) rngnum = rng() I'll assume you intend to do something more sophisticated later, otherwise you might as well just do the assignment once: rngnum = random.randint(0,5) > def menu(): > # declare variables global so we can use them in the function > global rngnum > global gameloop1 > global globalpoints > global chances This function is not really a menu. It is the whole game - the start_game() that I described above. Also you only need to declare variables as global if you will be changing them. And you only need global variables if they will be shared between functions (and even then not always). Your gameloop should be a local variable since it's only used inside the function. I suspect the same is true of most/all of the others.
Global variables are considered bad practice, you should do all you can to remove them from your code. > # intro text > print("Welcome to the song guessing game.") > print("You will be randomly presented with the first letter of each > song title") > print("and each artist of the song.") > print("Each one you get correct will add 2 points onto your score, 1 if > you take 2 guesses.") > print("You get 2 chances to guess. Guessing incorrectly on the second > guess") > print("will end the game.") > print("At the end, your high score will be displayed.") > print("The game will now start.") You might consider using a triple quoted string for the welcome message. Easier to read, edit and very slightly more efficient.) > # loads song data > songdata = open("songs.csv", "r") > songdataRead = csv.reader(songdata) > songdataList = list(songdataRead) > > #loads user data > userdata = open("users.csv", "r") > userdataRead = csv.reader(userdata) > userdataList = list(userdataRead) You already have the user details when they logged in so just pass them from the login code to this function. Then you don't need to read it all over again. > # appends score data > scoredata = open("scores.csv", "a", newline="") > scoredataAdd = csv.writer(scoredata) Up to here you've done a lot of initialization but no menu has been presented, despite the name of the function.... > # actual game > while gameloop1 == 1: Just use while True and break statements as needed. > rng() > print("What song is it?") > print(songdataList[rngnum][0][0] + " , by " + > songdataList[rngnum][1]) > #print(rngnum) > userinputchoice = input(": ") > if userinputchoice == songdataList[rngnum][0]: > if chances == 2: > globalpoints += 3 > print("You have got " + str(globalpoints) + " points.\n") You don't need the str() conversion of points, print() does that for you. > chances = 2 This code only executes if chances already equals 2, this line is therefore pointless. 
> #elif chances == 1: > #globalpoints += 1 > #print("You have got " + str(globalpoints) + " points.\n") > #chances = 2 > else: > chances -= 1 > print("You have " + str(chances) + " chances left.\n") Again, no need for str() > if chances == 0: > gameloop1 = 0 > gameloop2 = 1 > while gameloop2 == 1: > print("Guess again.") > userinputchoice2 = input(": ") > if userinputchoice2 == songdataList[rngnum][0]: > if chances == 1: > globalpoints += 1 > print("You have got " + str(globalpoints) + " > points.\n") > chances = 2 > gameloop2 = 0 > else: > gameloop2 = 0 > gameloop1 = 0 > Again you don't need the gameloop variables, they just complicate things. > #print("You have " + str(chances) + " chances left.\n") > print("The game has ended.") > print("Well done, you have scored", str(globalpoints) + ".") > > #adds score to table > > scoredataAdd.writerow([globalpoints, userdataList[0][0]]) > #userdataList[0][0]])# Remember to amend the userdataList here as you have > added another user > scoredata.close() You should probably put the following code into a separate function. If you put the login code into a function too then your top level code would be: user = login() play_game(user) report_stats() > print("Top 5 scores nationally:\n") > > # quickly opens up score data > scoreRead = open("scores.csv", "r") > scoreReadReader = csv.DictReader(scoreRead) > > # sorts the list of values > newList = sorted(scoreReadReader, key=lambda row: int(row['Score']), > reverse=True) > > # prints scores > print("Rank | Score | Name\n") > for i, r in enumerate(newList[0:50]): > print('{} | {} | {}'.format(str(int(i)+1), r['Score'], r['Name'])) Keep the code simple and it will be easier to debug. Simple code comes from simple data. You can keep with 3 csv files, but simplify their structure. Alternatively learn SQLite and use a single database with 3 tables. If you need an intro to SQLite you could try my tutorial and the Database section.
But for now I'd suggest sticking with the 3 files, you have enough to do learning good Python idioms. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From breamoreboy at gmail.com Fri Jun 18 15:56:03 2021 From: breamoreboy at gmail.com (Mark Lawrence) Date: Fri, 18 Jun 2021 20:56:03 +0100 Subject: [Tutor] Trying to select an input from another Python file & output that input in the main game program In-Reply-To: References: Message-ID: On 18/06/2021 19:10, Alan Gauld via Tutor wrote: > > You could simplify that by splitting it into two separate loops, > one to get the user name the other for the password. No need for the > nested structure which is harder to read and debug. For anyone who's interested here's an excellent article on looping with python http://nedbatchelder.com/text/iter.html -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From pmazek at otwock.pl Fri Jun 18 05:12:52 2021 From: pmazek at otwock.pl (Piotr Mazek) Date: Fri, 18 Jun 2021 11:12:52 +0200 Subject: [Tutor] Trying to get In-Reply-To: References: <20210617211649.GA17963@graniteweb.com> Message-ID: <003801d76422$2c5b3700$8511a500$@otwock.pl> >> (((1+2)×3)/4)^5 Raising a number to a power N multiplies N copies of the number together. For example, 2 raised to the power of 3 is equal to 2 × 2 × 2 = 8. Use the power operator, **, to raise a number to a power. Place the power operator, ** (not ^, which is wrong here), between two numbers to raise the former number to the latter. two_cubed = 2.0 ** 3 print(two_cubed) Output: 8.0 In this example, 2.0 is a float. The power operator accepts any numeric data type, such as an int or a float.
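A short sketch of the difference between the two operators, applied to the expression from the original question. Nothing here goes beyond built-in Python operators; the variable name value is only for illustration.

```python
# ** is exponentiation; ^ is bitwise XOR and is only defined for integers.
value = (((1 + 2) * 3) / 4) ** 5
print(round(value, 3))

print(2 ** 3)  # exponentiation
print(2 ^ 3)   # bitwise XOR, not a power

try:
    (((1 + 2) * 3) / 4) ^ 5
except TypeError as exc:
    # A float on the left-hand side makes ^ fail outright.
    print("TypeError:", exc)
```

The first print shows 57.665, matching the calculator result quoted earlier in the thread, while 2 ** 3 gives 8 and 2 ^ 3 gives 1.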
Regards -----Original Message----- From: Tutor [mailto:tutor-bounces+pmazek=otwock.pl at python.org] On Behalf Of Mariam M.S Sent: Thursday, June 17, 2021 11:37 PM To: David Rock; tutor at python.org Subject: [Tutor] Trying to get Good evening Trying to fix this problem (((1+2)×3)/4)^5 and I'm stuck on it for hours now I know this is simple one but I'm a beginner and could use your help. Btw google didn't help. Thank you _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From rayrayrayng at yahoo.com.hk Fri Jun 18 05:40:08 2021 From: rayrayrayng at yahoo.com.hk (Ray Ng) Date: Fri, 18 Jun 2021 09:40:08 +0000 (UTC) Subject: [Tutor] Trying to get In-Reply-To: References: <20210617211649.GA17963@graniteweb.com> Message-ID: <1151037155.1195438.1624009208452@mail.yahoo.com> Hi Mariam, Replace '^' with '**' in the equation. On Friday, 18 June 2021 at 04:59:47 PM [GMT+8], Mariam M.S wrote: Good evening Trying to fix this problem (((1+2)×3)/4)^5 and I'm stuck on it for hours now I know this is simple one but I'm a beginner and could use your help. Btw google didn't help. Thank you _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From rayrayrayng at yahoo.com.hk Fri Jun 18 06:03:38 2021 From: rayrayrayng at yahoo.com.hk (Ray Ng) Date: Fri, 18 Jun 2021 10:03:38 +0000 (UTC) Subject: [Tutor] Trying to get In-Reply-To: <1151037155.1195438.1624009208452@mail.yahoo.com> References: <20210617211649.GA17963@graniteweb.com> <1151037155.1195438.1624009208452@mail.yahoo.com> Message-ID: <88562592.1200908.1624010618831@mail.yahoo.com> Hi Mariam, More detail: in Python the 'power of' sign is '**'. See 6. Expressions - Python 3.9.5 documentation, point 6.5, The power operator.
On Friday, 18 June 2021 at 05:40:08 PM [GMT+8], Ray Ng wrote: Hi Mariam, Replace '^' with '**' in the equation. On Friday, 18 June 2021 at 04:59:47 PM [GMT+8], Mariam M.S wrote: Good evening Trying to fix this problem (((1+2)×3)/4)^5 and I'm stuck on it for hours now I know this is simple one but I'm a beginner and could use your help. Btw google didn't help. Thank you _______________________________________________ Tutor maillist - Tutor at python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor From manpritsinghece at gmail.com Sat Jun 19 14:24:19 2021 From: manpritsinghece at gmail.com (Manprit Singh) Date: Sat, 19 Jun 2021 23:54:19 +0530 Subject: [Tutor] Trying to find all pairs of numbers from list whose sum is 10 Message-ID: Dear sir, Let us consider a list lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] in this list there are 3 pairs (2, 8), (5, 5), (7, 3) whose sum is 10. To finding these pairs i have written the code as below : def pair_ten(x): y, st = x[:], set() for i in x: y.remove(i) st.update((i, ele) for ele in y if i+ele == 10) return st lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] ans = pair_ten(lst) print(ans) {(2, 8), (5, 5), (7, 3)} But i feel the code is still more complex, due to two for loops, in what way this can be done more efficiently. kindly guide Regards Manprit singh From alan.gauld at yahoo.co.uk Sat Jun 19 14:47:43 2021 From: alan.gauld at yahoo.co.uk (Alan Gauld) Date: Sat, 19 Jun 2021 19:47:43 +0100 Subject: [Tutor] Trying to find all pairs of numbers from list whose sum is 10 In-Reply-To: References: Message-ID: On 19/06/2021 19:24, Manprit Singh wrote: > lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] > in this list there are 3 pairs (2, 8), (5, 5), (7, 3) whose sum is 10.
To > finding these pairs i have written the code as below : > > def pair_ten(x): > y, st = x[:], set() > for i in x: > y.remove(i) > st.update((i, ele) for ele in y if i+ele == 10) > return st > But i feel the code is still more complex, due to two for loops, in > what way this can be done more efficiently. More concisely, but not necessarily more efficiently (in terms of execution speed) import itertools as it lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] result = [pr for pr in set(it.combinations(lst,2)) if sum(pr) == 10] print(result) -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From oscar.j.benjamin at gmail.com Sat Jun 19 16:24:20 2021 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Sat, 19 Jun 2021 21:24:20 +0100 Subject: [Tutor] Trying to find all pairs of numbers from list whose sum is 10 In-Reply-To: References: Message-ID: On Sat, 19 Jun 2021 at 19:25, Manprit Singh wrote: > > Dear sir, > > Let us consider a list > lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] > in this list there are 3 pairs (2, 8), (5, 5), (7, 3) whose sum is 10. To > finding these pairs i have written the code as below : > > def pair_ten(x): > y, st = x[:], set() > for i in x: > y.remove(i) > st.update((i, ele) for ele in y if i+ele == 10) > return st > > lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] > ans = pair_ten(lst) > print(ans) > > {(2, 8), (5, 5), (7, 3)} > > But i feel the code is still more complex, due to two for loops, in > what way this can be done more efficiently. What does "more efficiently" mean? What is the scope of this problem? Are the numbers always positive? Do you only want this for summing to 10 or is that just an example?
The task as stated finding pairs that sum to 10 can be done very easily if all inputs are assumed to be positive: def pair_ten(numbers): numbers_set = set(numbers) pairs10 = [] for i in range(5+1): if i in numbers_set: i10 = 10 - i if i10 in numbers_set: pairs10.append((i, i10)) return pairs10 This only has one for loop. Your version has two as well as a hidden loop in the call to y.remove. Alan's version is nice and simple and works fine for small inputs but would be horribly inefficient if the size of the input list was large. You haven't said whether it could be large though (if it's always small then you are probably overthinking this). The version I showed above would be inefficient if numbers was a small list and you were looking for pairs that sum to say 1 million rather than 10. Also the implementation I showed above converts the numbers list to a set which is O(N) with no possibility of an early exit. If numbers was a huge list that contained all the possible pairs many times then it could be much quicker to loop through the first hundred or so items to see that all 5 possible pairs are there rather than convert the entire list to a set before doing anything else. More information is needed to say what is actually efficient. -- Oscar From __peter__ at web.de Sun Jun 20 02:50:28 2021 From: __peter__ at web.de (Peter Otten) Date: Sun, 20 Jun 2021 08:50:28 +0200 Subject: [Tutor] Trying to find all pairs of numbers from list whose sum is 10 In-Reply-To: References: Message-ID: On 19/06/2021 22:24, Oscar Benjamin wrote: > On Sat, 19 Jun 2021 at 19:25, Manprit Singh wrote: >> >> Dear sir, >> >> Let us consider a list >> lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] >> in this list there are 3 pairs (2, 8), (5, 5), (7, 3) whose sum is 10. 
To >> finding these pairs i have written the code as below : >> >> def pair_ten(x): >> y, st = x[:], set() >> for i in x: >> y.remove(i) >> st.update((i, ele) for ele in y if i+ele == 10) >> return st >> >> lst = [2, 4, 7, 5, 9, 0, 8, 5, 3, 8] >> ans = pair_ten(lst) >> print(ans) >> >> {(2, 8), (5, 5), (7, 3)} >> >> But i feel the code is still more complex, due to two for loops, in >> what way this ccan be dome more efficiently. > > What does "more efficiently" mean? What is the scope of this problem? > Are the numbers always positive? Do you only want this for summing to > 10 or is that just an example? > > The task as stated finding pairs that sum to 10 can be done very > easily if all inputs are assumed to be positive: > > def pair_ten(numbers): > numbers_set = set(numbers) > pairs10 = [] > for i in range(5+1): > if i in numbers_set: > i10 = 10 - i > if i10 in numbers_set: > pairs10.append((i, i10)) > return pairs10 lst=[6, 4, 6] manprit: {(6, 4), (4, 6)} alan: [(6, 4), (4, 6)] oscar: [(4, 6)] lst=[5] manprit: set() alan: [] oscar: [(5, 5)] Efficiency considerations aside: Are the tuples meant to be ordered? Can one number in the original list be reused for both entries of a tuple? > > This only has one for loop. Your version has two as well as a hidden > loop in the call to y.remove. > > Alan's version is nice and simple and works fine for small inputs but > would be horribly inefficient if the size of the input list was large. > You haven't said whether it could be large though (if it's always > small then you are probably overthinking this). The version I showed > above would be inefficient if numbers was a small list and you were > looking for pairs that sum to say 1 million rather than 10. Also the > implementation I showed above converts the numbers list to a set which > is O(N) with no possibility of an early exit. 
If numbers was a huge > list that contained all the possible pairs many times then it could be > much quicker to loop through the first hundred or so items to see that > all 5 possible pairs are there rather than convert the entire list to > a set before doing anything else. > > More information is needed to say what is actually efficient. From robertvstepp at gmail.com Mon Jun 21 20:54:20 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Mon, 21 Jun 2021 19:54:20 -0500 Subject: [Tutor] What has happened to the searchable Tutor Mail Archive? Message-ID: I am used to using the searchable Tutor Mail Archive at https://www.mail-archive.com/tutor at python.org/ However, when trying to use it tonight I note that there are no recent posts archived since sometime last year. Is this site broken? If it is is there another alternative? The ActiveState one seems to have its own issues. boB Stepp From bouncingcats at gmail.com Mon Jun 21 21:20:25 2021 From: bouncingcats at gmail.com (David) Date: Tue, 22 Jun 2021 11:20:25 +1000 Subject: [Tutor] What has happened to the searchable Tutor Mail Archive? In-Reply-To: References: Message-ID: On Tue, 22 Jun 2021 at 10:55, boB Stepp wrote: > I am used to using the searchable Tutor Mail Archive at > https://www.mail-archive.com/tutor at python.org/ However, when trying > to use it tonight I note that there are no recent posts archived since > sometime last year. Is this site broken? If it is is there another > alternative? The ActiveState one seems to have its own issues. Quoting https://www.mail-archive.com/: Looking for an easy way to turn your mailing list into a searchable archive? Just add The Mail Archive as a member to your mailing list as described in the how-to-guide. 
Given that, I suspect the reason that archive stopped receiving posts from this mailing list in August 2019 can be found in this thread: https://mail.python.org/pipermail/tutor/2019-August/115378.html From robertvstepp at gmail.com Mon Jun 21 23:55:55 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Mon, 21 Jun 2021 22:55:55 -0500 Subject: [Tutor] What has happened to the searchable Tutor Mail Archive? In-Reply-To: References: Message-ID: On Mon, Jun 21, 2021 at 8:21 PM David wrote: > > On Tue, 22 Jun 2021 at 10:55, boB Stepp wrote: > > > I am used to using the searchable Tutor Mail Archive at > > https://www.mail-archive.com/tutor at python.org/ However, when trying > > to use it tonight I note that there are no recent posts archived since > > sometime last year. Is this site broken? If it is is there another > > alternative? The ActiveState one seems to have its own issues. > > Quoting https://www.mail-archive.com/: > Looking for an easy way to turn your mailing list into a searchable archive? > Just add The Mail Archive as a member to your mailing list as > described in the how-to-guide. > > Given that, I suspect the reason that archive stopped receiving posts from > this mailing list in August 2019 can be found in this thread: > https://mail.python.org/pipermail/tutor/2019-August/115378.html David, that sounds right. Hmm. Alan, is this something you would/should do? boB Stepp From edwinconnell at gmail.com Mon Jun 21 15:01:33 2021 From: edwinconnell at gmail.com (Ed Connell) Date: Mon, 21 Jun 2021 14:01:33 -0500 Subject: [Tutor] Using a dictionary in a menu implementation. Message-ID: Greetings! For a long time my menus were implemented like this: Menu = '[D]elete [E]dit [Q]uit' task = getchar().upper() if task == 'D': delete( id ) elif task == 'E': edit( id, size, age ) elif task == 'Q': sys.exit() Then I saw examples which used a dictionary, but every example had no arguments or always used the same argument. 
Obviously that would not work for this example. So I played around and came up with: do = dict( D = [ delete, id ], E = [ edit, id, size, age ], Q = [ sys.exit ] ) task = getchar().upper() if task in do: n = do[task] n[0]( *n[1: ] ) Finally my question :) - how would you implement this menu using a dictionary? Is there a standard way to do this? Thanks for all you do, Ed Connell -- I have a right and a left brain, but there is nothing right in the left one and there is nothing left in the right one! From robertvstepp at gmail.com Tue Jun 22 20:05:32 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Tue, 22 Jun 2021 19:05:32 -0500 Subject: [Tutor] Using a dictionary in a menu implementation. In-Reply-To: References: Message-ID: On Tue, Jun 22, 2021 at 5:18 PM Ed Connell wrote: > For a long time my menus were implemented like this: > > Menu = '[D]elete [E]dit [Q]uit' > > task = getchar().upper() Is "getchar()" your own function? You don't specify what GUI environment you are operating in. I guess this is meant to be pseudocode? > if task == 'D': > delete( id ) > elif task == 'E': > edit( id, size, age ) > elif task == 'Q': > sys.exit() > > Then I saw examples which used a dictionary, but every example had no > arguments or always used the same argument. Obviously that would not work > for this example. So I played around and came up with: > > do = dict( D = [ delete, id ], E = [ edit, id, size, age ], Q = [ > sys.exit ] ) The problem with creating the dictionary this way is that you are essentially hard-coding the values of "id", "size", and "age" to whatever they happen to be at the time of the dictionary's creation. I doubt that is what you want, and probably why the examples you saw either had no arguments or used the same one since they didn't need it to change in the latter case.
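A minimal sketch of the difference, assuming hypothetical delete() and edit() stand-ins that merely record how they were called (record_id is used instead of id to avoid shadowing the built-in). Storing the values in the dictionary freezes them at creation time, while wrapping each call in a lambda defers the name lookup until the menu entry is actually chosen. This is one possible approach, not an established standard:

```python
# Hypothetical stand-ins for the real menu actions; they only record the call.
calls = []


def delete(record_id):
    calls.append(("delete", record_id))


def edit(record_id, size, age):
    calls.append(("edit", record_id, size, age))


record_id, size, age = 1, 10, 30

# Eager: the argument values are captured when the dict is built.
eager = {"D": [delete, record_id]}

# Deferred: each lambda looks the names up only when it is called.
deferred = {
    "D": lambda: delete(record_id),
    "E": lambda: edit(record_id, size, age),
}

record_id = 2  # the "current" record changes after the menus were built

func, *args = eager["D"]
func(*args)      # still uses the old value, 1
deferred["D"]()  # sees the new value, 2
deferred["E"]()

print(calls)
```

Note that functools.partial would freeze the arguments just like the list version does, so it only helps when the arguments really are fixed.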
But it is okay to just associate the keys with the desired functions to get this dictionary: D = {'D': delete, 'E': edit, 'Q': sys.exit} So, to take this approach you would have to come up with a way to add the appropriate arguments to get 1) D['D'](id) 2) D['E'](id, size, age) 3) D['Q']() I don't know enough to suggest what the "standard" way to do this would be, assuming such a standard way exists. HTH, boB Stepp From robertvstepp at gmail.com Wed Jun 23 18:18:51 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Wed, 23 Jun 2021 17:18:51 -0500 Subject: [Tutor] Clarification questions about how Python uses references. Message-ID: I continue to attempt to refine my understanding of how Python uses identifiers to reference objects. Below is an interpreter session that I have a couple of questions about: >>> a = 'banana' >>> b = 'banana' >>> a is b True >>> a = 1000000000 >>> b = 1000000000 >>> a is b False >>> a = 'This is a much longer string than previously. I wonder what the result will be?' >>> b = 'This is a much longer string than previously. I wonder what the result will be?' >>> a is b False The integer example is as expected. I know that Python caches "small" integers for reuse. The exact ranges were discussed in a previous thread. But the string example is new to me. It appears that Python caches smaller strings. Is this true? If yes, what might the caching limits be? On to lists. My current understanding is that lists don't actually contain the objects themselves, but, instead, references to those objects. Is this correct? How could I prove this to myself in the interpreter? Does this translate to tuples and sets? Even though tuples are immutable they can contain mutable objects. Playing around in the interpreter it looks like even if sets contain tuples, no mutable elements can be in the tuples. Is this in general correct? TIA!
boB Stepp From robertvstepp at gmail.com Wed Jun 23 19:58:14 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Wed, 23 Jun 2021 18:58:14 -0500 Subject: [Tutor] Which is better style for a function that modifies a list? Message-ID: I am tremendously enjoying "Practical Programming -- An Introduction to Computer Science Using Python 3.6, 3rd Edition" by Paul Gries, Jennifer Campbell and Jason Montojo, c. 2017. I highly recommend it to anyone who needs to learn fundamental Python and/or introductory computer science. In its chapter on lists the authors point out that a function like def remove_last_item(L: list) -> list: """Return list L with the last item removed.""" del L[-1] return L does not require the return statement since the original passed in list will be modified in place, so there is no need for the function to return anything. I know from experience reading the Tutor and main lists that this is a frequent "gotcha" that catches many fellow learners. So I wonder: 1) For such a function that mutates the passed-in list is it clearer/better style to keep or omit the return statement? Which reads better? 2) From a preventative of subtle bugs perspective, if one must alter a passed-in list, would it be better to do a deep copy of the list passed in, mutate that, and return *that* list, leaving the original passed-in list unaltered? Of course the function documentation would reflect this. TIA! boB Stepp From robertvstepp at gmail.com Wed Jun 23 20:31:41 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Wed, 23 Jun 2021 19:31:41 -0500 Subject: [Tutor] Which is better style for a function that modifies a list? In-Reply-To: References: Message-ID: On Wed, Jun 23, 2021 at 6:58 PM boB Stepp wrote: > > I am tremendously enjoying "Practical Programming -- An Introduction > to Computer Science Using Python 3.6, 3rd Edition" by Paul Gries, > Jennifer Campbell and Jason Montojo, c. 2017. 
I highly recommend it > to anyone who needs to learn fundamental Python and/or introductory > computer science. > > In its chapter on lists the authors point out that a function like > > def remove_last_item(L: list) -> list: > """Return list L with the last item removed.""" > > del L[-1] > return L > > does not require the return statement since the original passed in > list will be modified in place, so there is no need for the function > to return anything. I know from experience reading the Tutor and main > lists that this is a frequent "gotcha" that catches many fellow > learners. So I wonder: > > 1) For such a function that mutates the passed-in list is it > clearer/better style to keep or omit the return statement? Which > reads better? In light of an earlier thread I started plus now examining all of the list methods I conclude that the Pythonic way must be to omit the "return" statement, so that "None" is returned (Or return "None" explicitly.) whenever a list is modified in place without returning a new list. > TIA! > boB Stepp From robertvstepp at gmail.com Wed Jun 23 23:01:10 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Wed, 23 Jun 2021 22:01:10 -0500 Subject: [Tutor] OT: "Mercurial" CPUs miscalculating and corrupting data Message-ID: A year or two ago I asked some questions relating to how likely stored data is to become corrupted. I just read the following article, "FYI: Today's computer chips are so advanced, they are more 'mercurial' than precise - and here's the proof Rarely seen miscalculations now crop up frequently at cloud hyperscale" at https://www.theregister.com/2021/06/04/google_chip_flaws/ Apparently Google and Facebook are increasingly noticing aberrant CPU behavior. Fascinating -- at least for me -- read! Cheers!
boB Stepp From PyTutor at DancesWithMice.info Wed Jun 23 23:02:45 2021 From: PyTutor at DancesWithMice.info (dn) Date: Thu, 24 Jun 2021 15:02:45 +1200 Subject: [Tutor] Which is better style for a function that modifies a list? In-Reply-To: References: Message-ID: <053e14ee-05c7-b492-f4f6-023390e90e6b@DancesWithMice.info> On 24/06/2021 12.31, boB Stepp wrote: > On Wed, Jun 23, 2021 at 6:58 PM boB Stepp wrote: >> >> I am tremendously enjoying "Practical Programming -- An Introduction >> to Computer Science Using Python 3.6, 3rd Edition" by Paul Gries, >> Jennifer Campbell and Jason Montojo, c. 2017. I highly recommend it >> to anyone who needs to learn fundamental Python and/or introductory >> computer science. >> >> In its chapter on lists the authors point out that a function like >> >> def remove_last_item(L: list) -> list: >> """Return list L with the last item removed.""" >> >> del L[-1] >> return L >> >> does not require the return statement since the original passed in >> list will be modified in place, so there is no need for the function >> to return anything. I know from experience reading the Tutor and main >> lists that this is a frequent "gotcha" that catches many fellow >> learners. So I wonder: >> >> 1) For such a function that mutates the passed-in list is it >> clearer/better style to keep or omit the return statement? Which >> reads better? > > In light of an earlier thread I started plus now examining all of the > list methods I conclude that the Pythonic way must be to omit the > "return" statement, so that "None" is returned (Or return "None" > explicitly.) whenever a list is modified in place without returning a > new list. +1, why add an unnecessary LoC? Unless... the intent is to return an entirely new list. Except... Zen of Python says "explicit > implicit". ??? I'm concerned that the function offers little, um, functionality. 
In short, would it not be better to write: del original_list[ -1 ] rather than: remove_last_item( original_list ) Well... (to part-answer my own question), isn't the latter more readable? Is part of this prevarication and 'hedging my bets' due to the lack of functionality? In other words, if someone came-up with a more realistic example, might one 'side' of the argument fall-away? There is a branch of programming-thought called "Functional Programming" and one of the aspects of code it seeks to avoid is "side-effects". The 'gotcha' implicit within mutable collections has been discussed in an earlier thread. If we want to remove_/del is it really a side-effect? Perhaps if we were considering the wider objective(s), and thereafter why we are removing the list's (or string's, or ...) last element, ie what other functionality we will be asking of the 'list', our thinking may lead us to create a custom class/object, sub-classed from list (or ...)? Whilst this seems an OTT solution to the bare-scenario painted above, assuming other justification, a custom class would enable: particular_purpose_list = CustomClass( etc ) ... particular_purpose_list.remove_last_item() Because the internal "state" of our object-instance is maintained within "self", there is now little doubt of what is happening, and to what. There is no 'gotcha' (assuming good practice code). Plus, it elevates the suspicion of a side-effect to being a desired-impact. All of which results in the code/data-structure being more tightly testable because it implements a more bounded purpose. Your thoughts? -- Regards, =dn From cs at cskk.id.au Wed Jun 23 21:12:20 2021 From: cs at cskk.id.au (Cameron Simpson) Date: Thu, 24 Jun 2021 11:12:20 +1000 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On 23Jun2021 17:18, boB Stepp wrote: >I continue to attempt to refine my understanding of how Python uses >identifiers to reference objects. 
Below is an interpreter session >that I have a couple of questions about: > >>>> a = 'banana' >>>> b = 'banana' >>>> a is b >True >>>> a = 1000000000 >>>> b = 1000000000 >>>> a is b >False >>>> a = 'This is a much longer string than previously. I wonder what the result will be?' >>>> b = 'This is a much longer string than previously. I wonder what the result will be?' >>>> a is b >False > >The integer example is as expected. I know that Python caches "small" >integers for reuse. The exact ranges were discussed in a previous >thread. > >But the string example is new to me. It appears that Python caches >smaller strings. Is this true? If yes, what might the caching limits >be? Yes. Unsure. Maybe read the source. I don't think the language specifies that this must occur. Remember, _any_ immutable type can do this keep-just-one-copy approach. Just accept that it may happen. It almost never affects how you code. >On to lists. My current understanding is that lists don't actually >contain the objects themselves, but, instead, references to those >objects. Is this correct? Yes. Just like _any_ variable or container. >How could I prove this to myself in the >interpreter? >>> L1 = [1] >>> L2 = [2] >>> LL1 = [L1, L1] >>> LL1 [[1], [1]] >>> LL3 = [L1, L2] >>> LL3 [[1], [2]] >>> L1.append(2) >>> LL1 [[1, 2], [1, 2]] >>> LL3 [[1, 2], [2]] >>> LL1[0] is L1 True >Does this translate to tuples and sets? Of course. They just store references. >Even though >tuples are immutable they can contain mutable objects. Sure. Immutable means you can't change the references, not necessarily that the references are to other immutable things. >Playing around >in the interpreter it looks like even if sets contain tuples, no >mutable elements can be in the tuples. Is this in general correct? That's because set elements, like dict keys, need to be hashable and stable. 
Collisions in sets and dicts rely on the objects' hash functions staying the same, because the hash function governs which slot in the internal hash table contains the reference. If the hash of an object changes, the interpreter will look in a different hash slot. Badness ensues. The explicitly immutable types like str or tuple have a __hash__ method. The contract you obey if you implement a __hash__ method yourself is that it is stable regardless of what happens to the object. So most mutable types don't provide a __hash__, which is how you know you can't use them in a set: try to put one in a set and when the set tries to access the hash code it fails. Cheers, Cameron Simpson From robertvstepp at gmail.com Wed Jun 23 23:51:09 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Wed, 23 Jun 2021 22:51:09 -0500 Subject: [Tutor] Which is better style for a function that modifies a list? In-Reply-To: <053e14ee-05c7-b492-f4f6-023390e90e6b@DancesWithMice.info> References: <053e14ee-05c7-b492-f4f6-023390e90e6b@DancesWithMice.info> Message-ID: On Wed, Jun 23, 2021 at 10:09 PM dn via Tutor wrote: > > On 24/06/2021 12.31, boB Stepp wrote: > > On Wed, Jun 23, 2021 at 6:58 PM boB Stepp wrote: > >> > >> I am tremendously enjoying "Practical Programming -- An Introduction > >> to Computer Science Using Python 3.6, 3rd Edition" by Paul Gries, > >> Jennifer Campbell and Jason Montojo, c. 2017. I highly recommend it > >> to anyone who needs to learn fundamental Python and/or introductory > >> computer science. [OT: I finally encountered my first real quibble with the book's authors on page 155. Here they give an example: values = [4, 10, 3, 8, -6] for i in range(len(values)): print(i) where they wish to demonstrate sometimes you need to use the index of a list when looping. That's a good point to make, but of course we are more satisfied with using for i, value in enumerate(values): [...] I actually bothered tonight to file a "suggestion" on the book's website.
Another thing they did later on this page was to loop over a list's elements and modify them in place. I felt they should have issued a strong warning about the perils of doing such things.] > >> > >> In its chapter on lists the authors point out that a function like > >> > >> def remove_last_item(L: list) -> list: > >> """Return list L with the last item removed.""" > >> > >> del L[-1] > >> return L > >> > >> does not require the return statement since the original passed in > >> list will be modified in place, so there is no need for the function > >> to return anything. I know from experience reading the Tutor and main > >> lists that this is a frequent "gotcha" that catches many fellow > >> learners. So I wonder: > >> > >> 1) For such a function that mutates the passed-in list is it > >> clearer/better style to keep or omit the return statement? Which > >> reads better? > > > > In light of an earlier thread I started plus now examining all of the > > list methods I conclude that the Pythonic way must be to omit the > > "return" statement, so that "None" is returned (Or return "None" > > explicitly.) whenever a list is modified in place without returning a > > new list. > > > +1, why add an unnecessary LoC? > > Unless... the intent is to return an entirely new list. > > Except... Zen of Python says "explicit > implicit". > > ??? It seems to me that modern Python is just as likely to violate the Zen as follow it. I guess it was never meant to be taken particularly literally anyway. > > I'm concerned that the function offers little, um, functionality. In > short, would it not be better to write: > > del original_list[ -1 ] > > rather than: > > remove_last_item( original_list ) Ah, dn, you must give the authors a break here. They were merely trying to provide a bare-bones example to illustrate a point that they wanted the reader to understand. My question is merely extrapolating from their simple example to explore good Python style and practice. 
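To make the two styles concrete, here is a minimal sketch. The name without_last_item is a hypothetical counterpart invented for illustration; it is not from the book. The built-in precedent is list.sort(), which mutates and returns None, versus sorted(), which leaves its argument alone and returns a new list:

```python
def remove_last_item(items: list) -> None:
    """Remove the last item of items in place."""
    del items[-1]


def without_last_item(items: list) -> list:
    """Return a new list holding all but the last item; items is untouched."""
    return items[:-1]


values = [4, 10, 3, 8, -6]
result = remove_last_item(values)    # mutates values, returns None
shorter = without_last_item(values)  # values unchanged, new list returned

print(result, values, shorter)
```

Keeping the two behaviours in separately named functions means a caller never has to guess whether the argument was changed.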
> There is a branch of programming-thought called "Functional Programming" > and one of the aspects of code it seeks to avoid is "side-effects". The > 'gotcha' implicit within mutable collections has been discussed in an > earlier thread. If we want to remove_/del is it really a side-effect? As I gain more knowledge and do more consideration of intentional side effects in code -- or worse, unintentional! -- it seems a breeding ground for subtle bugs in large, complex code bases. If I ever get a decent handle on Python perhaps I will investigate one of these true functional programming languages. > > Perhaps if we were considering the wider objective(s), and thereafter > why we are removing the list's (or string's, or ...) last element, ie > what other functionality we will be asking of the 'list', our thinking > may lead us to create a custom class/object, sub-classed from list (or ...)? > > Whilst this seems an OTT solution to the bare-scenario painted above, > assuming other justification, a custom class would enable: > > particular_purpose_list = CustomClass( etc ) > ... > particular_purpose_list.remove_last_item() > > Because the internal "state" of our object-instance is maintained within > "self", there is now little doubt of what is happening, and to what. > There is no 'gotcha' (assuming good practice code). Plus, it elevates > the suspicion of a side-effect to being a desired-impact. All of which > results in the code/data-structure being more tightly testable because > it implements a more bounded purpose. > > > Your thoughts? I find no disagreement here *if* a worthwhile purpose for having such a method existed for a custom data type. And more than likely such a remove the last item from such a data type might not even be part of the public API but be a hidden implementation detail to the user of the class. Still, though, the tests better be good! Cheers! 
boB Stepp From robertvstepp at gmail.com Thu Jun 24 00:13:14 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Wed, 23 Jun 2021 23:13:14 -0500 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On Wed, Jun 23, 2021 at 10:50 PM Cameron Simpson wrote: > > On 23Jun2021 17:18, boB Stepp wrote: > >I continue to attempt to refine my understanding of how Python uses > >identifiers to reference objects. Below is an interpreter session > >that I have a couple of questions about: > > > >>>> a = 'banana' > >>>> b = 'banana' > >>>> a is b > >True > >>>> a = 1000000000 > >>>> b = 1000000000 > >>>> a is b > >False > >>>> a = 'This is a much longer string than previously. I wonder what the result will be?' > >>>> b = 'This is a much longer string than previously. I wonder what the result will be?' > >>>> a is b > >False > >But the string example is new to me. It appears that Python caches > >smaller strings. Is this true? If yes, what might the caching limits > >be? > > Yes. > > Unsure. Maybe read the source. I don't think the language specifies that > this must occur. Ugh. Peter's last pointing to the source and links to a couple of articles are still on my reading list. Parsing C code when I've never studied C is hard slogging for me. But being ever curious I will probably get around to looking at string cache limits, to no useful point I'm sure! > >On to lists. My current understanding is that lists don't actually > >contain the objects themselves, but, instead, references to those > >objects. Is this correct? > > Yes. Just like _any_ variable or container. > > >How could I prove this to myself in the > >interpreter?
> > >>> L1 = [1] > >>> L2 = [2] > >>> LL1 = [L1, L1] > >>> LL1 > [[1], [1]] > >>> LL3 = [L1, L2] > >>> LL3 > [[1], [2]] > >>> L1.append(2) > >>> LL1 > [[1, 2], [1, 2]] > >>> LL3 > [[1, 2], [2]] > >>> LL1[0] is L1 > True I actually did some similar experimenting prior to posting, but for some reason talked myself out of what I thought I was demonstrating. Sleepy I guess. > >Does this translate to tuples and sets? > > Of course. They just store references. > > >Even though > >tuples are immutable they can contain mutable objects. > > Sure. Immutable means you can't change the references, not necessarily > that the references are to other immutable things. Ah! This is a key point that has not clicked with me till now! It's the _references_ that can't be changed. Thanks!! > >Playing around > >in the interpreter it looks like even if sets contain tuples, no > >mutable elements can be in the tuples. Is this in general correct? > > That's because set elements, like dict keys, need to be hashable and > stable. Collisions in sets and dicts rely on the objects' hash functions > staying the same, because the hash function governs which slot in the > internal hash table contains the reference. > > If the hash of an object changes, the interpreter will look in a > different hash slot. badness ensures. > > The explicitly immutable types like str or tuple have a __hash__ method. > The contract you obey if you implement a __hash__ method yourself is > that it is stable regardless of what happens to the object. So most > mutable types don't provide a __hash__, which is how you know you can't > use them in a set: try to put one in a set and when the set tries to > access the hash code it fails. Hashes/hashing have not made it into my education yet. I am hoping that this intro CSc book I am reading will eventually address this. Plus it seems that there are other contexts/meanings as in things like checksums using MD5 hashing or some such, et al. Another thing to learn... Thanks! 
boB Stepp From PyTutor at DancesWithMice.info Thu Jun 24 02:04:28 2021 From: PyTutor at DancesWithMice.info (dn) Date: Thu, 24 Jun 2021 18:04:28 +1200 Subject: [Tutor] Which is better style for a function that modifies a list? In-Reply-To: References: <053e14ee-05c7-b492-f4f6-023390e90e6b@DancesWithMice.info> Message-ID: On 24/06/2021 15.51, boB Stepp wrote: > On Wed, Jun 23, 2021 at 10:09 PM dn via Tutor wrote: >> >> On 24/06/2021 12.31, boB Stepp wrote: >>> On Wed, Jun 23, 2021 at 6:58 PM boB Stepp wrote: >>>> >>>> I am tremendously enjoying "Practical Programming -- An Introduction >>>> to Computer Science Using Python 3.6, 3rd Edition" by Paul Gries, >>>> Jennifer Campbell and Jason Montojo, c. 2017. I highly recommend it >>>> to anyone who needs to learn fundamental Python and/or introductory >>>> computer science. > > [OT: I finally encountered my first real quibble with the book's > authors on page 155. Here they give an example: > > values = [4, 10, 3, 8, -6] > for i in range(len(values)): > print(i) > > where they wish to demonstrate sometimes you need to use the index of > a list when looping. That's a good point to make, but of course we > are more satisfied with using > > for i, value in enumerate(values): > [...] > > I actually bothered tonight to file a "suggestion" on the book's website. +1 (nevertheless still a book that is well-worth the reading time!) > Another thing they did later on this page was to loop over a list's > elements and modify them in place. I felt they should have issued a > strong warning about the perils of doing such things.] > >>>> >>>> In its chapter on lists the authors point out that a function like >>>> >>>> def remove_last_item(L: list) -> list: >>>> """Return list L with the last item removed.""" >>>> >>>> del L[-1] >>>> return L >>>> >>>> does not require the return statement since the original passed in >>>> list will be modified in place, so there is no need for the function >>>> to return anything. 
I know from experience reading the Tutor and main >>>> lists that this is a frequent "gotcha" that catches many fellow >>>> learners. So I wonder: >>>> >>>> 1) For such a function that mutates the passed-in list is it >>>> clearer/better style to keep or omit the return statement? Which >>>> reads better? >>> >>> In light of an earlier thread I started plus now examining all of the >>> list methods I conclude that the Pythonic way must be to omit the >>> "return" statement, so that "None" is returned (Or return "None" >>> explicitly.) whenever a list is modified in place without returning a >>> new list. >> >> >> +1, why add an unnecessary LoC? >> >> Unless... the intent is to return an entirely new list. >> >> Except... Zen of Python says "explicit > implicit". >> >> ??? > > It seems to me that modern Python is just as likely to violate the Zen > as follow it. I guess it was never meant to be taken particularly > literally anyway. It's amusing. Sadly, what was once a joke between friends is probably non-PC by today's standards, given that the 'dig' at GvR might now be taken by some as 'racist'. Sigh! Then there is the "one way" to do things, which, as you say, must surely have 'gone by the board' way back in Python 2.n days... >> I'm concerned that the function offers little, um, functionality. In >> short, would it not be better to write: >> >> del original_list[ -1 ] >> >> rather than: >> >> remove_last_item( original_list ) > > Ah, dn, you must give the authors a break here. They were merely > trying to provide a bare-bones example to illustrate a point that they > wanted the reader to understand. My question is merely extrapolating > from their simple example to explore good Python style and practice. Then the observation applies to the extrapolation, not the text! Yes, of all people I know that finding 'good' concise-examples is not easy... 
>> There is a branch of programming-thought called "Functional Programming" >> and one of the aspects of code it seeks to avoid is "side-effects". The >> 'gotcha' implicit within mutable collections has been discussed in an >> earlier thread. If we want to remove_/del is it really a side-effect? > > As I gain more knowledge and do more consideration of intentional side > effects in code -- or worse, unintentional! -- it seems a breeding > ground for subtle bugs in large, complex code bases. If I ever get a > decent handle on Python perhaps I will investigate one of these true > functional programming languages. No need. Some intro-reading will explain the concepts. They are easily taken on-board (as you've already demonstrated). Thereafter, there are plenty of articles which talk about FP in Python or using Python for FP. Per the above, the problem for 'purists' attempting to 'push' FP, is that it is simply not possible. As long as the code includes I/O, it (technically) violates FP-principles. However, there are indeed (once again, as you have already observed) some very valuable lessons to be learned from what the FP-crowd have to offer. Avoiding whole categories of error, eg mutating mutables within a sub-namespace, amongst them! Whilst on the subject (and I may have posed this question before) how should one document such, in order to prevent later readers/maintainers of the code from falling into the gaping-jaws of this 'gotcha'? Modifying the code above: >>>> def remove_last_item(): >>>> """Remove last item from the list.""" >>>> >>>> del L[-1] we have no parameters, and no return-value. Thus no 'typing' hints appear within the def. Accordingly, we need 'manual documentation'. Whereas the original version (with return-value list) is fully documented, as-is. How do people (or is that: do people) document that they are expecting (and messing with) a mutable within a function? 
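One possible answer, offered as a sketch rather than an established convention: pass the list in explicitly instead of reaching for a global, annotate the parameter as a MutableSequence, annotate the return as None, and say "in place" in the docstring. The names items, original and working_copy are illustrative assumptions:

```python
from collections.abc import MutableSequence


def remove_last_item(items: MutableSequence) -> None:
    """Remove the last element of items.

    Mutates items in place and returns None; pass a copy to keep the original.
    """
    del items[-1]


original = [1, 2, 3]
working_copy = list(original)  # shallow copy protects the caller's list
remove_last_item(working_copy)

print(original, working_copy)
```

The "-> None" return annotation is the part a reader or type checker sees first, so it carries most of the warning that the argument is being changed.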
PS while we're being snippy about the range()/enumerate() idiom (above), may I point-out that PEP-8 implies that "L" names a class, not a list. Cue comments about the intent of PEP-8... >> Perhaps if we were considering the wider objective(s), and thereafter >> why we are removing the list's (or string's, or ...) last element, ie >> what other functionality we will be asking of the 'list', our thinking >> may lead us to create a custom class/object, sub-classed from list (or ...)? >> >> Whilst this seems an OTT solution to the bare-scenario painted above, >> assuming other justification, a custom class would enable: >> >> particular_purpose_list = CustomClass( etc ) >> ... >> particular_purpose_list.remove_last_item() >> >> Because the internal "state" of our object-instance is maintained within >> "self", there is now little doubt of what is happening, and to what. >> There is no 'gotcha' (assuming good practice code). Plus, it elevates >> the suspicion of a side-effect to being a desired-impact. All of which >> results in the code/data-structure being more tightly testable because >> it implements a more bounded purpose. >> >> >> Your thoughts? > > I find no disagreement here *if* a worthwhile purpose for having such > a method existed for a custom data type. And more than likely such a > remove the last item from such a data type might not even be part of > the public API but be a hidden implementation detail to the user of > the class. Still, though, the tests better be good! Which brings us right back to the concerns about using mutables - and the problem(s) of adequately testing callables that employ such. How could one write tests which competently-test the straw-man version (no parameters, no return-value) outlined above. The function cannot (really) be unit-tested in-isolation. Thus, its use is highly-dependent upon the calling environment (and namespace), which opens the topic of cohesion/coupling... 
You've raised valid concerns, but have you developed a head-ache yet? -- Regards, =dn From __peter__ at web.de Thu Jun 24 03:37:22 2021 From: __peter__ at web.de (Peter Otten) Date: Thu, 24 Jun 2021 09:37:22 +0200 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On 24/06/2021 00:18, boB Stepp wrote: > But the string example is new to me. It appears that Python caches > smaller strings. Is this true? If yes, what might the caching limits > be? https://docs.python.org/3/library/sys.html?highlight=intern#sys.intern I remembered that potential variable names are interned, >>> a = "abc" >>> b = "abc" >>> a is b True >>> a = "a b" >>> b = "a b" >>> a is b False but at least sequences of digits are also interned: >>> a = "123" >>> b = "123" >>> a is b True >>> a = "-123" >>> b = "-123" >>> a is b False There also seems to be a length limit: >>> b = "b"*4096 >>> a = "b"*4096 >>> a is b True >>> a = "b"*4097 >>> b = "b"*4097 >>> a is b False From cs at cskk.id.au Thu Jun 24 04:09:30 2021 From: cs at cskk.id.au (Cameron Simpson) Date: Thu, 24 Jun 2021 18:09:30 +1000 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On 23Jun2021 23:13, boB Stepp wrote: >Hashes/hashing have not made it into my education yet. I am hoping >that this intro CSc book I am reading will eventually address this. >Plus it seems that there are other contexts/meanings as in things like >checksums using MD5 hashing or some such, et al. They're related terms. I give a poor introduction below: A hash function takes a source value (eg a string such as you might use to index a dict) or, really, anything else, and computes a value, normally an integer. It is always the same for a given source value. The usual objective is that the various source values are evenly distributed over some range. In the case of a "hash table", this is what gives dicts and sets O(1) lookup times. 
When you store things in a hash table, the table has an array of "slots" where things get stored according to their key. You compute a hash value which dictates the slot a particular item is stored in. Then to see if an item is present, you only need to look at the items stored in that particular slot, not at every item in the table as a whole.

For maximum speed the number of slots is around the number of items stored, so that the number of items in a slot is around 1. Obviously that depends a bit on the source values, and this is why hash functions are chosen to produce evenly distributed values - to get a fairly even spread of slot usage regardless of the source values.

Imagine you're storing strings. You might compute a hash function by summing the ordinals of the characters in the string. Then take that modulo the number of slots in the hash table. Then your hash function computes an index into the table.

This function is actually not a great one, because string values are often short and often have very skewed character frequencies. But it illustrates the idea: you're computing a number from the source value (the string) - it is always the same for a given source value.

Suppose "abcdef" results in 6. Then if you stored something keyed by "abcdef" it would be kept in slot 6, and to find the thing keyed by "abcdef" you only need to look in slot 6.

Cheers,
Cameron Simpson

From robertvstepp at gmail.com Thu Jun 24 13:27:03 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Thu, 24 Jun 2021 12:27:03 -0500
Subject: [Tutor] Clarification questions about how Python uses references.
In-Reply-To:
References:
Message-ID:

On Thu, Jun 24, 2021 at 2:38 AM Peter Otten <__peter__ at web.de> wrote:
>
> On 24/06/2021 00:18, boB Stepp wrote:
>
> > But the string example is new to me. It appears that Python caches
> > smaller strings. Is this true? If yes, what might the caching limits
> > be?
>
> https://docs.python.org/3/library/sys.html?highlight=intern#sys.intern

Aha!
You cleared up something else I had meant to look up, but forgot about. On the main list Chris Angelico had used "interning" in a thread and I did not know exactly what he was talking about. I had never heard this term used in a programming context before. > I remembered that potential variable names are interned, > > >>> a = "abc" > >>> b = "abc" > >>> a is b > True > >>> a = "a b" > >>> b = "a b" > >>> a is b > False > > but at least sequences of digits are also interned: > > >>> a = "123" > >>> b = "123" > >>> a is b > True > >>> a = "-123" > >>> b = "-123" > >>> a is b > False > > There also seems to be a length limit: > > >>> b = "b"*4096 > >>> a = "b"*4096 > >>> a is b > True > >>> a = "b"*4097 > >>> b = "b"*4097 > >>> a is b > False That clears up a lot. Thanks, Peter! boB Stepp From robertvstepp at gmail.com Thu Jun 24 13:36:33 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Thu, 24 Jun 2021 12:36:33 -0500 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On Thu, Jun 24, 2021 at 3:10 AM Cameron Simpson wrote: > > On 23Jun2021 23:13, boB Stepp wrote: > >Hashes/hashing have not made it into my education yet. I am hoping > >that this intro CSc book I am reading will eventually address this. > >Plus it seems that there are other contexts/meanings as in things like > >checksums using MD5 hashing or some such, et al. > > They're related terms. > A hash function takes a source value (eg a string such as you might use > to index a dict) or, really, anything else, and computes a value, > normally an integer. It is always the same for a given source value. Then I suppose "MD5" and similar terms refer to the algorithm being employed to come up with this mapping of values. > The usual objective is that the various source values are evenly > distributed over some range. > > In the case of a "hash table", this is what gives dicts and sets O(1) > lookup times. 
When you store things in a hash table, the table has an > array of "slots" where things get stored according to their key. > You compute a hash value which dicates the slot a particular item is > stored in. Then to see if an item is present, you only need to to look > at the items stores in that particular slot, not at every item in the > table as a whole. > > For maximum speed the number of slots is around the number of items > stored, so that the number of items in a slot is around 1. Obviously > that depends a bit on the source values, and this is why hash functions > are chosen to produce evenly distributes values - to get a fairly even > spread of slot usage regardless of the source values. So if it should happen that more than one value can get assigned to a slot, this is what "hash collision" is referring to? I vaguely recall reading concerns over potential hashing collision issues. I don't recall what hashing algorithm that reading was talking about, but it seems it was a potential serious security concern. > Imagine you're storing strings. You might compute a hash function by > summing the ordinals of the characters in the string. Then take that > modulo the number of slots in the hash table. Then your hash function > computes an index into the table. > > This function is actually not a great one, because string values are > often short and often have very skewed character frequencies. But it > illustrates the idea: you're computing a number from the source value > (the string) - it is always the same for a given source value. > > Suppose "abcdef" results in 6. The if you stored something keyed by > "abcdef" it would be kept in slot 6, and to find thing thing keyed by > "abcdef" you only need to look in slot 6. This makes a whole lot more sense now. Thanks for taking the time for this brief intro! 
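The toy hash described above (sum the character ordinals, then take the result modulo the number of slots) can be sketched like this. The slot count of 8 and the sample words are arbitrary choices for illustration, not anything Python itself uses.

```python
def toy_hash(key: str) -> int:
    """Sum of the character ordinals: always the same for the same string."""
    return sum(ord(ch) for ch in key)


def slot_for(key: str, n_slots: int) -> int:
    """Index of the slot where ``key`` would be stored."""
    return toy_hash(key) % n_slots


n_slots = 8
slots = [[] for _ in range(n_slots)]       # one short list per slot
for word in ["abcdef", "fedcba", "spam", "eggs"]:
    slots[slot_for(word, n_slots)].append(word)

# "abcdef" and "fedcba" contain the same characters, so their sums are equal
# and they always share a slot: one reason this is a weak hash function.
print(slot_for("abcdef", n_slots), slot_for("fedcba", n_slots))
```

To look up "abcdef" afterwards, only the one list at `slots[slot_for("abcdef", n_slots)]` has to be searched.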
boB Stepp

From robertvstepp at gmail.com Thu Jun 24 17:20:49 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Thu, 24 Jun 2021 16:20:49 -0500
Subject: [Tutor] Fwd: Which is better style for a function that modifies a list?
In-Reply-To: <579C80E1-2D25-4420-AC5E-1DC60E9C4882@yahoo.co.uk>
References: <579C80E1-2D25-4420-AC5E-1DC60E9C4882@yahoo.co.uk>
Message-ID:

Alan accidentally sent this to just me. See below.

---------- Forwarded message ---------
From: Alan Gauld
Date: Thu, Jun 24, 2021 at 3:45 AM
Subject: Re: [Tutor] Which is better style for a function that modifies a list?
To: boB Stepp

Sending from a phone so not a full-size response, but I had to chip in my tuppence worth.

> On 24 Jun 2021, at 00:59, boB Stepp wrote:
>
> def remove_last_item(L: list) -> list:
>     """Return list L with the last item removed."""
>
>     del L[-1]
>     return L
>
> does not require the return statement since the original passed in
> list will be modified in place, so there is no need for the function
> to return anything.

> 1) For such a function that mutates the passed-in list is it
> clearer/better style to keep or omit the return statement? Which
> reads better?
>

Functions should return something unless they "do" nothing - eg print(). But if they modify something they should return it - that's what functions do, it's in their definition. So even if they, as a side-effect, modify an argument it is still good practice to return the modified item as well. It also allows more consistent code, with all function usage following the form of

Result = function(args)

Unfortunately many Python methods do not follow this form, but most of the built in functions do - eg sorted(), etc.
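The difference between the two conventions can be seen with the built-ins themselves; this is only an illustration of standard behaviour, with made-up variable names. `sorted()` returns a new list and fits the `Result = function(args)` form, while the `list.sort()` method mutates in place and returns `None`.

```python
values = [3, 1, 2]

result = sorted(values)     # built-in function: returns a NEW sorted list
print(result, values)       # values still holds the original order

outcome = values.sort()     # method: sorts values in place ...
print(outcome, values)      # ... and returns None


def remove_last_item(L: list) -> list:
    """Mutate L and also return it, the style recommended above."""
    del L[-1]
    return L


same = remove_last_item(values)
print(same is values)       # the returned object IS the caller's list
```

One thing to keep in mind with the return-the-argument style: the result is an alias of the original list, not an independent copy.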
The argument for methods not returning values is that the modification is to the object which is being messaged so it doesn't need to return a value (although other OOP languages recommend returning self to enable chaining).

> 2) From a preventative of subtle bugs perspective, if one must alter
> a passed-in list, would it be better to do a deep copy of the list
> passed in, mutate that, and return *that* list, leaving the original
> passed-in list unaltered?

That's what the pure functional programming school says (and what Python does with string methods) but pragmatically it can be a huge consumer of memory and potentially very slow so it's better to just return the modified item IMHO.

Alan G
(Still on my hols)

From cs at cskk.id.au Thu Jun 24 19:12:56 2021
From: cs at cskk.id.au (Cameron Simpson)
Date: Fri, 25 Jun 2021 09:12:56 +1000
Subject: [Tutor] Clarification questions about how Python uses references.
In-Reply-To:
References:
Message-ID:

On 24Jun2021 12:36, boB Stepp wrote:
>On Thu, Jun 24, 2021 at 3:10 AM Cameron Simpson wrote:
>> They're related terms.
>
>> A hash function takes a source value (eg a string such as you might use
>> to index a dict) or, really, anything else, and computes a value,
>> normally an integer. It is always the same for a given source value.
>
>Then I suppose "MD5" and similar terms refer to the algorithm being
>employed to come up with this mapping of values.

Yes. Hash functions are one way (see your collision stuff below) - because multiple source values can have the same hash value the hash value does not reverse to the source value - notionally it would reverse to many potential source values.

Cryptographic hashes (MD, SHA in their flavours) are very strongly of this form - not reversible, and a single bit change in the source value would usually flip about half the bits in the hash value (which is the same as _randomly_ flipping or not flipping all the bits, which means the values are very widely spread over their range).
I say 'randomly" - obviously not, but to an observer of _only_ the hash values there's not supposed to be a pattern from which to make deductions. >> The usual objective is that the various source values are evenly >> distributed over some range. >> >> In the case of a "hash table", this is what gives dicts and sets O(1) >> lookup times. When you store things in a hash table, the table has an >> array of "slots" where things get stored according to their key. >> You compute a hash value which dicates the slot a particular item is >> stored in. Then to see if an item is present, you only need to to look >> at the items stores in that particular slot, not at every item in the >> table as a whole. >> >> For maximum speed the number of slots is around the number of items >> stored, so that the number of items in a slot is around 1. Obviously >> that depends a bit on the source values, and this is why hash functions >> are chosen to produce evenly distributes values - to get a fairly even >> spread of slot usage regardless of the source values. > >So if it should happen that more than one value can get assigned to a >slot, this is what "hash collision" is referring to? I vaguely recall >reading concerns over potential hashing collision issues. I don't >recall what hashing algorithm that reading was talking about, but it >seems it was a potential serious security concern. It is, in the cryptographic world. Let's talk hash tables first. You've got a table of slots indexed by the hash function (or the hash function modulo the width of the table, to size it). Obviously multiple source values can map to the same slot. For good lookup performance (a) you size the hash table to be wide enough for the number of values stored so that usually there are very few entries per slot, given even distribution and (b) you want a hash function which does a nice even distribution from typical source values. 
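A minimal sketch of such a table, assuming each slot holds a short Python list of (key, value) pairs and that the table doubles in width once it holds more items than slots. The class name `ChainedTable` and the growth rule are invented for illustration; CPython's own dict is implemented differently (it uses open addressing rather than a list per slot).

```python
class ChainedTable:
    """Toy hash table: a list of slots, each a short list of (key, value) pairs."""

    def __init__(self, width=8):
        self.slots = [[] for _ in range(width)]
        self.count = 0

    def _slot(self, key):
        # The built-in hash() modulo the table width picks the slot.
        return self.slots[hash(key) % len(self.slots)]

    def put(self, key, value):
        slot = self._slot(key)
        for i, (k, _) in enumerate(slot):
            if k == key:                  # key already present: overwrite
                slot[i] = (key, value)
                return
        slot.append((key, value))
        self.count += 1
        if self.count > len(self.slots):  # keep roughly one item per slot
            self._resize(2 * len(self.slots))

    def get(self, key):
        for k, v in self._slot(key):      # search only this one slot
            if k == key:
                return v
        raise KeyError(key)

    def _resize(self, width):
        pairs = [pair for slot in self.slots for pair in slot]
        self.slots = [[] for _ in range(width)]
        for k, v in pairs:                # re-place every key at the new width
            self._slot(k).append((k, v))


table = ChainedTable(width=4)
for i in range(10):
    table.put(f"key{i}", i)
print(table.get("key7"), len(table.slots))
```

A lookup hashes the key once and then scans a single short list, which is where the roughly constant lookup time comes from as long as the slots stay shallow.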
Note that for a dict or set, this implies that as it grows the hash table gets resized every so often to keep the number of slot entries low. This amounts to making a new table, rerunning the hash for each key modulo the new width, and putting the keys in new places. That also means the collisions move around - if the main hash function gave 47 and 33 and they collided in a 7 slot table (47 % 7 == 5, 33 % 7 == 5), if the table grew and we made 17 slots they wouldn't collide. But other hash values would.

The collisions are why you want a hash function to have a nice even spread of values - that spreads the source values across slots evenly producing evenly shallow slots.

Obviously you could invent pathological sets of source keys that all collide for a particular hash function and slot count. The objective is that typical use doesn't do that.

There is a thing called a "perfect hash function", which distributes every source value to distinct slots. You need to know all the source values to create one of them. But it has the property that (a) you have no collisions or unused slots and (b) you can compare the hash values instead of comparing the keys.

Because the slots are expected to have few members, you can put a very simple variable sized data structure in each slot, eg a linked list, because you expect it to be very short. Collisions make that bad, so we try to avoid them.

Back to cryptography and why collisions are very important there.

"Cryptographic" hash functions have a heap of desirable features:

- very resistant to reversal - you should not be able to predict the source value (or any potential source value) from the hash value, meaning you can pass them around without fear of revealing source values
- constant time and power (various crypto functions can be attacked when their run time varies with their input data)
- very even distribution (see the bit flip feature above)
- a very wide result domain - even in heavy use you should not expect to find two source values with the same hash value during the lifetime of the universe

This last feature has a couple of implications:

We often use the hash as a proxy for the source key - you're already guaranteed that different hash values imply different source values; all hash functions have that property. But if collisions are astronomically improbable, then if the hash values are the same you can infer very reliably that the source values are the same (doesn't tell you the source value, note).

That is also the basis for cryptographic signatures; if you hash a document and get the expected signature, you are very sure that the document has not been tampered with.

And this is the second implication: if you _can_ find a pair of source values with the same cryptographic hash value, the function is probably broken for crypto purposes. For example, such a collision would allow you to make a different document with the same signature, allowing forgery. Etc.

MD4, MD5 and SHA1 are all now too weak. Some because of a weakness in the algorithm (eg you can derive collisions, or at least make a targeted search) or because of compute - you can do a brute force search with enough resources.

The (statistical) isomorphic property of a cryptographic hash function is also used for content based storage.
If you want to deduplicate input data to store only one copy of data blocks with the same content (eg in a domain where that is common, eg a bunch of virtual machine images with OSes stored in them, etc etc) you can make an index of blocks keyed by their hash value. If the hash value's in the index (a) you know you've got this block already and (b) you know where you put it. So you don't need another copy. Deduplication achieved. Cheers, Cameron Simpson From cs at cskk.id.au Thu Jun 24 21:05:31 2021 From: cs at cskk.id.au (Cameron Simpson) Date: Fri, 25 Jun 2021 11:05:31 +1000 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On 24Jun2021 19:46, Dennis Lee Bieber wrote: >On Thu, 24 Jun 2021 12:36:33 -0500, boB Stepp >declaimed the following: >>Then I suppose "MD5" and similar terms refer to the algorithm being >>employed to come up with this mapping of values. >> > MD5 is more used to create a value that can be used to verify a file >has not been corrupted (by rerunning the MD5 algorithm on the file and >comparing the message-digest value to that provided [independently] by the >supplier of said file). It isn't used to "index" any structure. Not in Python internals I guess. But I've seen them used as keys in data structures and as, effectively, job names for user requests (hash the request, which is unique eg JSON with user id and request number etc) which becomes an index of sorts. And of course in memory if manipulating things associated with same. > The final assignment in my data structures course required > implementing >a hashed-head multiple linked-list system. (Seems Wikipedia refers to them >as "hashing with chaining"). The only place I've seen such a structure used >in an actual application is the Amiga file-system. > > Each directory block held room for something like 64 entries. Each >entry had a pointer to a list of [file header | (sub)directory header]. 
>Each header block held the name of the file/directory (there is the >comparison source). File headers held pointers to 64 data blocks; if the >file was larger than that, a link to an overflow block was provided, for >the next 64 data blocks, etc. Finding a file/directory required hashing the >current path component, then following the linked list comparing the name >of the path component to the name stored in the linked header blocks. Pretty sure most modern filesystems use hashed directories in order to support efficient lookup of large/wide directories (having many entries). I'd expect an analogous structure there. Cheers, Cameron Simpson From robertvstepp at gmail.com Fri Jun 25 15:48:00 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Fri, 25 Jun 2021 14:48:00 -0500 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On Thu, Jun 24, 2021 at 6:46 PM Dennis Lee Bieber wrote: > > On Thu, 24 Jun 2021 12:36:33 -0500, boB Stepp > declaimed the following: > > >Then I suppose "MD5" and similar terms refer to the algorithm being > >employed to come up with this mapping of values. > > > MD5 is more used to create a value that can be used to verify a file > has not been corrupted (by rerunning the MD5 algorithm on the file and > comparing the message-digest value to that provided [independently] by the > supplier of said file). It isn't used to "index" any structure. The Wikipedia entry on MD5 states, "The MD5 message-digest algorithm is a widely used *hash function* [my emphasis] producing a 128-bit hash value." Looking up "hash function", the Wikipedia article on it states, "A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table." 
This seems to fit in perfectly well with what Cameron stated and my usage above seems to be correct. The "index" in this instance would be for the entire file that the MD5 value was computed for. This may be (ignorant?) quibbling on my part, but it seems that we spend much of our time on these mailing lists trying to be uber-precise in our language usage. I guess I am either falling into this trap or am engaging in a good thing? ~(:>)) boB Stepp From robertvstepp at gmail.com Fri Jun 25 20:01:22 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Fri, 25 Jun 2021 19:01:22 -0500 Subject: [Tutor] What is the easiest way to ensure the current working directory is the same as the directory where the main program is saved? Message-ID: "Practical Programming -- An Introduction to Computer Science Using Python 3.6, 3rd ed." states on p. 177: "...When you run a Python program, the current working directory is the directory where that program is saved..." I have established that this is not true. The current working directory of the program depends on from what location the program is launched. Examples: Program: testing.py: -------------------------- import os print("Testing!") print("Current working directory: ", os.getcwd()) print("Source code file: ", __file__) Running program from different locations: ------------------------------------------------------- PS C:\Users\boB> Practical_Programming\testing.py Testing! Current working directory: C:\Users\boB Source code file: C:\Users\boB\Practical_Programming\testing.py PS C:\Users\boB> cd c:\ PS C:\> Users\boB\Practical_Programming\testing.py Testing! Current working directory: C:\ Source code file: C:\Users\boB\Practical_Programming\testing.py So this demonstrates that where you are when you launch a Python program determines its initial working directory. A handy thing for working in the interpreter when you want to import one of your own programs. 
Is there an _easy_ way to have a program start with its current working directory in the same directory where its source code resides no matter from where the program is launched? Or must I always use the __file__ attribute to determine where the source code resides and then change directories from there? How do people deal with their Python applications that in theory may be installed anywhere in a user's file system? There must be some standard way of dealing with this that I am too dense to uncover. TIA! boB Stepp From robertvstepp at gmail.com Fri Jun 25 20:20:49 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Fri, 25 Jun 2021 19:20:49 -0500 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On Fri, Jun 25, 2021 at 6:21 PM Dennis Lee Bieber wrote: > > On Fri, 25 Jun 2021 14:48:00 -0500, boB Stepp > declaimed the following: > > >The Wikipedia entry on MD5 states, "The MD5 message-digest algorithm > >is a widely used *hash function* [my emphasis] producing a 128-bit > >hash value." Looking up "hash function", the Wikipedia article on it > >states, "A hash function is any function that can be used to map data > >of arbitrary size to fixed-size values. The values returned by a hash > >function are called hash values, hash codes, digests, or simply > >hashes. The values are usually used to index a fixed-size table called > >a hash table." This seems to fit in perfectly well with what Cameron > >stated and my usage above seems to be correct. The "index" in this > >instance would be for the entire file that the MD5 value was computed > >for. This may be (ignorant?) quibbling on my part, but it seems that > >we spend much of our time on these mailing lists trying to be > >uber-precise in our language usage. I guess I am either falling into > >this trap or am engaging in a good thing? ~(:>)) > > > > But there is no /table/ being indexed by the MD5 hash! 
So how do you > locate the original file if given the MD5 hash? File systems that use > hashes use the file name, and don't hash the file contents (any edit of the > contents would invalidate the MD5 hash, and require regenerating the hash > value). The file name stays the same regardless of the edits to the file > itself, so its hash also stays the same.. I see your point better now; however, the hash function definition above does say, "...The values are *usually* [my emphasis] used to index..." I guess my (ignorant?) quibbling point is that MD5 is still by definition a hash function. But this nit is not worth picking. Everything you say sounds eminently practical and sane! Cheers! boB Stepp From cs at cskk.id.au Fri Jun 25 20:50:06 2021 From: cs at cskk.id.au (Cameron Simpson) Date: Sat, 26 Jun 2021 10:50:06 +1000 Subject: [Tutor] Clarification questions about how Python uses references. In-Reply-To: References: Message-ID: On 25Jun2021 19:20, Dennis Lee Bieber wrote: > But there is no /table/ being indexed by the MD5 hash! So how do you >locate the original file if given the MD5 hash? File systems that use >hashes use the file name, and don't hash the file contents (any edit of the >contents would invalidate the MD5 hash, and require regenerating the hash >value). The file name stays the same regardless of the edits to the file >itself, so its hash also stays the same.. The example of a file indexed by its MD5 hash I had in mind was a real world example of a client report request, and the corresponding output files. Receive the request, save it to disc for processing under a filename based on the MD5 checksum, save that filename in a work queue. The worker picks up the filename and makes the reports, likewise saving them for the client to collect. The MD5 here is used as an easy way to pick _unique_ filenames for the work request and its results. The request never changes content once received. Not the usage you had in mind, but valid nonetheless. 
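The filename-from-checksum idea can be sketched with the standard `hashlib` module. This is an illustration only: the function name `request_filename`, the `.json` suffix and the sample request bytes are assumptions made here, not details of the system described.

```python
import hashlib


def request_filename(request_bytes: bytes) -> str:
    """Derive a filename from the MD5 digest of the request's content."""
    return hashlib.md5(request_bytes).hexdigest() + ".json"


first = request_filename(b'{"user": 1, "request": 7}')
again = request_filename(b'{"user": 1, "request": 7}')
other = request_filename(b'{"user": 1, "request": 8}')

print(first == again)   # same content always gives the same name
print(first == other)   # a different request gets a different name
```

This relies on the request content never changing after it is received; an edited request would hash to a different name.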
A filename is effectively an index into a filesystem. In this case the filename is made directly from the MD5 of its contents, as a free unique name in a shared work area. This is a similar use case to that for GUIDs in other contexts.

Cheers,
Cameron Simpson

From Richard at Damon-Family.org Fri Jun 25 20:59:44 2021
From: Richard at Damon-Family.org (Richard Damon)
Date: Fri, 25 Jun 2021 20:59:44 -0400
Subject: [Tutor] Clarification questions about how Python uses references.
In-Reply-To:
References:
Message-ID:

On 6/25/21 8:20 PM, boB Stepp wrote:
> On Fri, Jun 25, 2021 at 6:21 PM Dennis Lee Bieber wrote:
>> On Fri, 25 Jun 2021 14:48:00 -0500, boB Stepp
>> declaimed the following:
>>
>>> The Wikipedia entry on MD5 states, "The MD5 message-digest algorithm
>>> is a widely used *hash function* [my emphasis] producing a 128-bit
>>> hash value." Looking up "hash function", the Wikipedia article on it
>>> states, "A hash function is any function that can be used to map data
>>> of arbitrary size to fixed-size values. The values returned by a hash
>>> function are called hash values, hash codes, digests, or simply
>>> hashes. The values are usually used to index a fixed-size table called
>>> a hash table." This seems to fit in perfectly well with what Cameron
>>> stated and my usage above seems to be correct. The "index" in this
>>> instance would be for the entire file that the MD5 value was computed
>>> for. This may be (ignorant?) quibbling on my part, but it seems that
>>> we spend much of our time on these mailing lists trying to be
>>> uber-precise in our language usage. I guess I am either falling into
>>> this trap or am engaging in a good thing? ~(:>))
>>>
>> But there is no /table/ being indexed by the MD5 hash! So how do you
>> locate the original file if given the MD5 hash? File systems that use
>> hashes use the file name, and don't hash the file contents (any edit of the
>> contents would invalidate the MD5 hash, and require regenerating the hash
>> value).
The file name stays the same regardless of the edits to the file >> itself, so its hash also stays the same.. > I see your point better now; however, the hash function definition > above does say, "...The values are *usually* [my emphasis] used to > index..." I guess my (ignorant?) quibbling point is that MD5 is still > by definition a hash function. But this nit is not worth picking. > Everything you say sounds eminently practical and sane! > > Cheers! > boB Stepp Note, there are TWO major distinct uses for Hash Functions. One technique uses a hash function to ultimately get a fairly small number, to make lookup of values O(1) in a container. The Second uses cryptographic secure hashes to verify a file. (This is where MD5 is used), This use has hashes that generate BIG numbers. Python uses that first type of hash for sets and dictionaries, and for that use you want a hash that is quick to compute, It doesn't need to make sure that two distinct object will always have different hash values, but you do want this to be the normally expected case, and maybe you want to make it hard to intentionally generate values that intentionally collide to avoid Denial of Service Attacks. The second type needs very different requirements, you don't want it to be practical for someone given a hash value to create a different file that gives that value, and you generally don't mind it taking a degree of effort to compute that hash. One generic tool, two very different specific version for different applications. -- Richard Damon From robertvstepp at gmail.com Fri Jun 25 23:10:22 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Fri, 25 Jun 2021 22:10:22 -0500 Subject: [Tutor] How to read the first so many Unicode characters from a file? Message-ID: Say I have a text file with a mixture of ASCII and non-ASCII characters (assumed to be UTF-8) and I wanted to read the first N characters from the file. 
The first thought that comes to mind is:

with open(filename) as f:
    N_characters = f.read(N)

But reading the docs says that this will read N bytes. My understanding is that some UTF-8 characters can be more than one byte. So it seems to me that this won't work in general. Does Python provide a way to accomplish this easily? Or is my understanding flawed?

TIA!
boB Stepp

From cs at cskk.id.au Fri Jun 25 23:10:24 2021
From: cs at cskk.id.au (Cameron Simpson)
Date: Sat, 26 Jun 2021 13:10:24 +1000
Subject: [Tutor] What is the easiest way to ensure the current working directory is the same as the directory where the main program is saved?
In-Reply-To:
References:
Message-ID:

On 25Jun2021 19:01, boB Stepp wrote:
>"Practical Programming -- An Introduction to Computer Science Using
>Python 3.6, 3rd ed." states on p. 177: "...When you run a Python
>program, the current working directory is the directory where that
>program is saved..." I have established that this is not true. The
>current working directory of the program depends on from what location
>the program is launched.

Correct.

[...]
>Is there an _easy_ way to have a program start with its current
>working directory in the same directory where its source code resides
>no matter from where the program is launched? Or must I always use
>the __file__ attribute to determine where the source code resides and
>then change directories from there?

The latter.

But I recommend _not_ changing your working directory. It is a process global state which affects everything. When someone invokes your programme and supplies a filename on the command line, they usually expect that name to be resolved from the working directory where they invoked the programme.

If you change directory you need to know where you were if you need to work with any caller supplied relative filenames.

>How do people deal with their
>Python applications that in theory may be installed anywhere in a
>user's file system?
>There must be some standard way of dealing with
>this that I am too dense to uncover.

Well, broadly you don't care where the code is. You care about the data of the person using your programme, which they'll tell you. (Directly with filenames or the like, or also by environment variables.)

When you do care there are two basic approaches that come to mind:

- use __file__, get its dirname, and access some resources you know are installed beside the source; this is generally awkward - you need control during install, and it is possible to ship python source as a zip file and the __file__ paradigm doesn't work there

- have a config file specifying the location of resources

Maybe you could enumerate some circumstances giving you trouble. See what can be done.

Cheers,
Cameron Simpson

From robertvstepp at gmail.com Fri Jun 25 23:50:52 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Fri, 25 Jun 2021 22:50:52 -0500
Subject: [Tutor] What is the easiest way to ensure the current working directory is the same as the directory where the main program is saved?
In-Reply-To:
References:
Message-ID:

I have asked questions on this topic off and on in the past. So far I have been able to make my programs do what I intend, but I have a feeling I am still not getting something fundamental.

On Fri, Jun 25, 2021 at 10:21 PM Cameron Simpson wrote:
> [...]
> >Is there an _easy_ way to have a program start with its current
> >working directory in the same directory where its source code resides
> >no matter from where the program is launched? Or must I always use
> >the __file__ attribute to determine where the source code resides and
> >then change directories from there?
>
> The latter.
>
> But I recommend _not_ changing your working directory. It is a process
> global state which affects everything.
When someone invokes your > programme and supplies a filename on the command line, they usually > expect that name to be resolved from the working directory where they > invoked the programme. I never considered this possibility! Yes, must avoid doing that!! > If the change directory you need to know where you were if you need to > work with any caller suppplied relative filenames. > > >How do people deal with their > >Python applications that in theory may be installed anywhere in a > >user's file system? There must be some standard way of dealing with > >this that I am too dense to uncover. > > Well, broadly you don't care where the code is. You care about the data > of the person using your programme, which they'll tell you. (Directly > with filenames or the like, or also by envionment variables.) But I *do* care about where my source code winds up, don't I? How else do I load data from its data folder if I don't know where I am in the user's file system? > When you do care there are two basic approaches that come to mind: > > - use __file__, get its dirname, and access some resources you know are > installed beside the source; this is generally awkward - you need > control during install, and it is possible to ship python source as a > zip file and the __file__ paradigm doesn't work there > > - have a config file specifying the location of resources Maybe this is where my misunderstandings are occurring. To date, I have had no installation process. Any programs that I need to be elsewhere, I copy the program to its new living facilities, whether elsewhere on my PC, somewhere else on a network, or to another PC. Does an actual installation process make these issues go away? As a user of other people's programs that come with an installer one of the first choices the user usually must make is accept the default installation suggestion or choose a different location. I suppose this is where the needed information is fed into the program and stored for later use? 
I thought I could put off figuring out how to install Python packages the "proper" way (if there is one) ...

> Maybe you could enumerate some circumstances giving you trouble. See
> what can be done.

A typical example from my earlier playing-around-with-this session -- needing to open and read in another file:

Play around code:
------------------------
import os

print("Testing!")
print("Current working directory: ", os.getcwd())
print("Source code file: ", __file__)
with open('test_text.txt') as f:
    print(f.read())
------------------------

Now if I run this from the same folder where the source code and data file are, all is fine. But if I start the program elsewhere then a problem ensues:

PS C:\> Users\boB\Practical_Programming\testing.py # NOT the program directory.
Testing!
Current working directory:  C:\
Source code file:  C:\Users\boB\Practical_Programming\testing.py
Traceback (most recent call last):
  File "C:\Users\boB\Practical_Programming\testing.py", line 6, in <module>
    with open('test_text.txt') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test_text.txt'
PS C:\> cd .\Users\boB\Practical_Programming\
PS C:\Users\boB\Practical_Programming> .\testing.py # Now in the program's folder.
Testing!
Current working directory:  C:\Users\boB\Practical_Programming
Source code file:  C:\Users\boB\Practical_Programming\testing.py
This is just a plain text file of no import.
This is the second line of a boring file.

Though a toy example, this is the basic problem I have been hacking around. My usual approach is when I am lazy I hard-code the "home" location for wherever I place the program. When I am more energetic I make use of the __file__ attribute to automatically decide the "home" location. How should I be handling these situations?

Cheers!
boB Stepp

From eryksun at gmail.com Sat Jun 26 00:25:26 2021
From: eryksun at gmail.com (Eryk Sun)
Date: Fri, 25 Jun 2021 23:25:26 -0500
Subject: [Tutor] How to read the first so many Unicode characters from a file?
In-Reply-To: References: Message-ID: On 6/25/21, boB Stepp wrote: > Say I have a text file with a mixture of ASCII and non-ASCII > characters (assumed to be UTF-8) and I wanted to read the first N > characters from the file. The first thought that comes to mind is: > > with open(filename) as f: > N_characters = f.read(N) Assuming Python 3, you're opening the file in text mode, which reads characters, not bytes. That said, you're using the default encoding that's based on the platform and locale. In Windows this will be the process ANSI code page, unless UTF-8 mode is enabled (e.g. `python -X utf8`). You can explicitly decode the file as UTF-8 via open(filename, encoding='utf-8'). From robertvstepp at gmail.com Sat Jun 26 00:39:47 2021 From: robertvstepp at gmail.com (boB Stepp) Date: Fri, 25 Jun 2021 23:39:47 -0500 Subject: [Tutor] How to read the first so many Unicode characters from a file? In-Reply-To: References: Message-ID: On Fri, Jun 25, 2021 at 11:25 PM Eryk Sun wrote: > > On 6/25/21, boB Stepp wrote: > > Say I have a text file with a mixture of ASCII and non-ASCII > > characters (assumed to be UTF-8) and I wanted to read the first N > > characters from the file. The first thought that comes to mind is: > > > > with open(filename) as f: > > N_characters = f.read(N) > > Assuming Python 3, you're opening the file in text mode, which reads > characters, not bytes. That said, you're using the default encoding > that's based on the platform and locale. In Windows this will be the > process ANSI code page, unless UTF-8 mode is enabled (e.g. `python -X > utf8`). You can explicitly decode the file as UTF-8 via open(filename, > encoding='utf-8'). Ah, foolish me. I thought I was reading about text streams in the docs, but I was actually in a bytes section. This combined with where I am at in the book I'm reading misled me. If I specify the encoding at the top of the program file, will that suffice for overcoming Windows code page issues -- being ASCII not UTF-8? 
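[The text-mode versus binary-mode distinction in the quoted reply can be checked with a minimal sketch, assuming Python 3. The sample text, the temporary file, and N = 5 are illustrative assumptions and are not taken from the thread.]

```python
import os
import tempfile

# Sample text with multi-byte UTF-8 characters (illustrative assumption).
text = "na\u00efve caf\u00e9"

# Write the sample as UTF-8 to a throwaway file.
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write(text)
    filename = tmp.name

N = 5

# Text mode with an explicit encoding: read(N) returns at most N characters.
with open(filename, encoding="utf-8") as f:
    first_chars = f.read(N)

# Binary mode: read(N) returns at most N bytes, so fewer characters may fit.
with open(filename, "rb") as f:
    first_bytes = f.read(N)

os.remove(filename)

print(repr(first_chars))   # five characters
print(first_bytes)         # five bytes, covering only four characters here
```

In this sample the third character takes two bytes in UTF-8, so five bytes hold only four characters, while the text-mode read returns all five.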
Thanks!
boB Stepp

From robertvstepp at gmail.com Sat Jun 26 01:03:39 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sat, 26 Jun 2021 00:03:39 -0500
Subject: [Tutor] How to read the first so many Unicode characters from a file?
In-Reply-To:
References:
Message-ID:

On Fri, Jun 25, 2021 at 11:39 PM boB Stepp wrote:
>
> On Fri, Jun 25, 2021 at 11:25 PM Eryk Sun wrote:
> >
> > On 6/25/21, boB Stepp wrote:
> > > Say I have a text file with a mixture of ASCII and non-ASCII
> > > characters (assumed to be UTF-8) and I wanted to read the first N
> > > characters from the file. The first thought that comes to mind is:
> > >
> > > with open(filename) as f:
> > >     N_characters = f.read(N)
> >
> > Assuming Python 3, you're opening the file in text mode, which reads
> > characters, not bytes. That said, you're using the default encoding
> > that's based on the platform and locale. In Windows this will be the
> > process ANSI code page, unless UTF-8 mode is enabled (e.g. `python -X
> > utf8`). You can explicitly decode the file as UTF-8 via open(filename,
> > encoding='utf-8').
>
> Ah, foolish me. I thought I was reading about text streams in the
> docs, but I was actually in a bytes section. This combined with where
> I am at in the book I'm reading misled me.
>
> If I specify the encoding at the top of the program file, will that
> suffice for overcoming Windows code page issues -- being ASCII not
> UTF-8?

Apparently this won't help as it only affects how Python reads the source code (its default already) not how it executes it. So sayeth a Stack Overflow post I just found:
https://stackoverflow.com/questions/14083111/should-i-use-encoding-declaration-in-python-3

boB Stepp

From eryksun at gmail.com Sat Jun 26 01:24:09 2021
From: eryksun at gmail.com (Eryk Sun)
Date: Sat, 26 Jun 2021 00:24:09 -0500
Subject: [Tutor] How to read the first so many Unicode characters from a file?
In-Reply-To:
References:
Message-ID:

On 6/25/21, boB Stepp wrote:
>
> If I specify the encoding at the top of the program file, will that
> suffice for overcoming Windows code page issues -- being ASCII not
> UTF-8?

One rarely needs to specify an encoding for a Python 3 source file, since the default is UTF-8. Even if you do, it has nothing to do with how the code executes once the file is compiled to bytecode.

To force UTF-8 as the default encoding for open(), enable UTF-8 mode in Python 3.7+. You can enable it permanently by defining the environment variable PYTHONUTF8=1, e.g. run `setx.exe PYTHONUTF8 1` at the command line or the Win+R run dialog.

https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUTF8

FYI, ANSI code pages in Windows are single-byte or double-byte encodings. They often extend 7-bit ASCII, but no Windows locale uses just ASCII. In Western European and American locales, the ANSI code page is 1252, which is a single-byte encoding that extends Latin-1, which extends ASCII. Five byte values in code page 1252 are not mapped to any character: 0x81, 0x8D, 0x8F, 0x90, 0x9D. These values can occur in UTF-8 sequences, in which case decoding as 1252 will fail instead of just returning mojibake nonsense. Here's an example of mojibake:

    >>> '\ufeff'.encode('utf-8').decode('1252')
    'ï»¿'

From robertvstepp at gmail.com Sat Jun 26 15:14:23 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sat, 26 Jun 2021 14:14:23 -0500
Subject: [Tutor] What is the easiest way to ensure the current working directory is the same as the directory where the main program is saved?
In-Reply-To:
References:
Message-ID:

Please forgive the top posting in this instance. I want to ensure Dennis' response gets saved intact into the Tutor archives so I can easily find it later as his postings otherwise never show up unless someone responds to them.
In summary of what you write below, Dennis, you are basically saying to me to store my data files in standard locations expected by each operating system that the program is installed into? Then this should be a no-brainer for the actual program to access its data and configuration files. But what about during a program's development phase? It is handy to have data files in the same project directory space as everything else. I have been doing some online searching and it appears that during an installation process that installers can copy data and configuration files to those standard locations that you are advocating. But again I know nothing about installing Python programs and the software that gets it done. I merely know how to use pip to install applications others have developed, not how to get my code ready for such a procedure. Anyway, thanks Dennis for your detailed response! boB Stepp On Sat, Jun 26, 2021 at 12:10 PM Dennis Lee Bieber wrote: > > On Fri, 25 Jun 2021 22:50:52 -0500, boB Stepp > declaimed the following: > > > > But I *do* care about where my source code winds up, > >don't I? How else do I load data from its data folder if I don't know > >where I am in the user's file system? > > Most applications do not store live "data" in their installation > directory. On Windows, that is what each user's AppData directory is used > for (though many applications coming from Linux/UNIX background create > "dot" files in %userprofile% (in Linux, "dot" files are normally hidden) > > C:\Users\Wulfraed>echo %userprofile% > C:\Users\Wulfraed > > C:\Users\Wulfraed>dir %userprofile% > Volume in drive C is OS > Volume Serial Number is 4ACC-3CB4 > > Directory of C:\Users\Wulfraed > > 06/22/2021 03:05 PM . > 06/22/2021 03:05 PM .. 
> 12/05/2020 04:57 PM 6,514 .bash_history > 06/24/2021 06:28 PM 866 .flexprop.config > 07/29/2019 01:48 PM 166 .gitconfig > 09/13/2020 02:39 PM 2,277 .kdiff3rc > 05/31/2021 07:11 PM 1,367 .python_history > 05/07/2021 04:25 PM 2,596 .RData > 05/07/2021 04:25 PM 166 .Rhistory > > If the program requires default configuration to exist, it uses, > perhaps, __file__ to find the install directory, uses various os.path > methods to parse off the filename itself leaving the directory, and then > combines the directory with the known subpath/config-data -- and copies > that to the user specific location > > C:\Users\Wulfraed>echo %appdata% > C:\Users\Wulfraed\AppData\Roaming > > C:\Users\Wulfraed>dir %appdata% > Volume in drive C is OS > Volume Serial Number is 4ACC-3CB4 > > Directory of C:\Users\Wulfraed\AppData\Roaming > > 06/02/2021 02:51 PM . > 06/02/2021 02:51 PM .. > 12/03/2016 06:28 PM .kde > 12/03/2016 06:28 PM .mplab_ide > 12/03/2016 06:28 PM .oit > 01/03/2017 01:54 AM 5KPlayer > 12/03/2016 06:28 PM Acronis > 10/10/2017 12:28 PM Adobe > > > > >Maybe this is where my misunderstandings are occurring. To date, I > >have had no installation process. Any programs that I need to be > >elsewhere, I copy the program to its new living facilities, whether > >elsewhere on my PC, somewhere else on a network, or to another PC. > >Does an actual installation process make these issues go away? As a > >user of other people's programs that come with an installer one of the > >first choices the user usually must make is accept the default > >installation suggestion or choose a different location. I suppose > >this is where the needed information is fed into the program and > >stored for later use? > > The install location might be an entry in the Windows registry > (especially if the application has an uninstall feature). 
About the only > things done in an install directory (for Python) is to compile .py files > (modules) into .pyc; that often needs to use the local Python, and reduces > the distribution file size. > > An installer program may handle copying the default config files to the > user's directory, where the application expects to find them. > > > > >I thought I could put off figuring out how to install Python packages > >the "proper" way (if there is one) ... > > > > Are you talking packages you are writing? If so, the main feature is to > have them in a directory that Python searches during import. > > https://docs.python.org/3/tutorial/modules.html > """ > 6.1.2. The Module Search Path > > When a module named spam is imported, the interpreter first searches for a > built-in module with that name. If not found, it then searches for a file > named spam.py in a list of directories given by the variable sys.path. > sys.path is initialized from these locations: > > * The directory containing the input script (or the current directory > when no file is specified). > > * PYTHONPATH (a list of directory names, with the same syntax as the > shell variable PATH). > > * The installation-dependent default. > """ > > >------------------------ > >import os > > > >print("Testing!") > >print("Current working directory: ", os.getcwd()) > >print("Source code file: ", __file__) > >with open('test_text.txt') as f: > > print(f.read()) > >------------------------ > > > > with open(os.path.join(os.path.dirname(__file__), "test_text.txt")) as f: > > > > >How should I be handling these situations? > > > > Personally... by not putting /data/ in the application space. Arrange > for it to be in user-specific space... > > Windows: %userprofile%/.myapp/config or %appdata%/myapp/config > Linux: ~/.myapp/config > > Also consider the first bullet in the module search path quote... 
If > you arrange for your "data" to be a .py file, you should be able to just > > import config > > (may need to ensure you don't have conflicting names) > > > -- > Wulfraed Dennis Lee Bieber AF6VN > wlfraed at ix.netcom.com http://wlfraed.microdiversity.freeddns.org/ > > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor From PyTutor at DancesWithMice.info Sat Jun 26 19:56:38 2021 From: PyTutor at DancesWithMice.info (dn) Date: Sun, 27 Jun 2021 11:56:38 +1200 Subject: [Tutor] What is the easiest way to ensure the current working directory is the same as the directory where the main program is saved? In-Reply-To: References: Message-ID: <56c529d3-ec2d-bd54-584c-75ff3bd73de6@DancesWithMice.info> On 26/06/2021 15.50, boB Stepp wrote: > I have asked questions on this topic off and on in the past. So far I > have been able to make my programs do what I intend, but I have a > feeling I am still not getting something fundamental. Perhaps these feelings are related to progressing from writing program(me)s which are relatively one-off or single-user, to a more corporate 'delivery' model? > Maybe this is where my misunderstandings are occurring. To date, I > have had no installation process. Any programs that I need to be > elsewhere, I copy the program to its new living facilities, whether > elsewhere on my PC, somewhere else on a network, or to another PC. > Does an actual installation process make these issues go away? As a > user of other people's programs that come with an installer one of the > first choices the user usually must make is accept the default > installation suggestion or choose a different location. I suppose > this is where the needed information is fed into the program and > stored for later use? 
Most toy- and academic-examples which involve I/O will place data-files in the same directory as the script, or (like modules) in a sub-directory thereof. Sadly, this 'paints a picture'. Another important principle, often missing from the more mathematical application areas, is that data-design is (also) important - and this doesn't merely refer to laying-out fields in a DB table or file-record! (I'm trying to rein-in the scope of the question. Be aware that data- and app-security contexts also apply - and perhaps help to answer some of the 'why?' questions that may arise) >> Well, broadly you don't care where the code is. You care about the data >> of the person using your programme, which they'll tell you. (Directly >> with filenames or the like, or also by envionment variables.) > > But I *do* care about where my source code winds up, > don't I? How else do I load data from its data folder if I don't know > where I am in the user's file system? If we are both conducting experiments, should the app consolidate our results because we are working 'together', or should they keep them separate because my source-sample is quite different from yours? If we are talking about an instant-messaging system, should all the messages be stored in a central location, or kept separate by user/on user devices (as above)? Similarly, we should consider code-as-data: will your 'next great thing' be implemented 'in the cloud' where you control (and can thus easily update) one copy of the code, or will you make the app available for others to copy/clone and implement themselves? As soon as we start thinking like this, the question of 'where' becomes question*s*, many questions...! The answers are multi-dimensional. A multi-client or "multi-tenant" architecture, generally leads to each client having his/her/their own database. Within that, the schema of tables and relationships will be identical. As long as each client has an unique "name", the databases are separate and dedicated. 
A multi-user machine enables you/me separation by using sub-directories beneath the respective home-directories. The different paths create separation, even if (again) the data-files have the same name and internal-structure/purpose. When you wish to have a shared data-store, life becomes a little more 'interesting' - particularly if there is a need to identify users/access. Even more-so, if company "A" using the application wants their data kept quite separate from company "B". Now we have to find some shared location, as well as implementing data-separation. The above sets the scene (and probably sets your mind racing). Be aware that there is no one data-structure 'to rule them all'! Also, that this question has only slight relevance to, or solution within, "packaging". Nevertheless you can find such in Python's own approach. If an application running on my machine import-s a library, various system directories under '/usr/lib64/' are searched, plus '/usr/local/lib64/python3.9/site-packages' which likely include content not present on your PC, then there are libraries which have been designated for my use only (no other user on this PC can 'see' them), eg '/home/dn/.local/lib/python3.9/site-packages'. Thereafter, if we look at enclosing a project within a (Python) virtual-environment, we have yet another level of access/control/separation. (bearing in-mind that we're talking about finding code-modules here, cf data-files) Various ones have mentioned Command-Line variables, Environment Variables, and Configuration files. The first relies upon the user. ('nuff said?). The second 'feels' awkward to set-up (but perhaps that says more about my lack of knowledge in 'packaging'). Accordingly, I habitually implement the latter (in fact, some find my preference a bit OTT). Remember the sage advice to put all of a program(me)'s 'constants' up-at-the-top of the code? The philosophy behind this, anticipates that change may occur over-time. 
Thus, seeking to obviate the need to hunt through the code-base to (be sure we) find each reference. Hang on for a moment! "Change":"Constant"??? Accordingly, I tend to put application 'constants' within a config- or what we might think of as a 'preferences' file. Now, if I have one application which will be delivered to multiple clients, each can be instructed to edit that one file (don't mess with MY CODE!) and change a single line to suit their own requirement(s), eg to mention their name, experiment, grant, ... in the printed/displayed headings: header = "boB and dn's great adventure", or header = "University of Pythonic Hard Knocks" In this manner we can throw-in all manner of considerations, eg number of relevant decimal-places, how accurately we wish to represent pi, how close is close-enough for asymptotes... NB there are whole articles debating if such files should be Python code, .ini, .conf, JSON, YAML, or 'put your favorite format here'. IIRC this also discussed as a "Friday Finking", way-back. Perhaps I can 'get away with this' in the case of reasonably 'aware' and 'logical' users - who appreciate the agency and/or freedom for experimentation. At the same time, there are many (many, many,...) cases where such an approach should never be attempted. YMMV! Righty-ho. As soon as we start talking about config files, we must explore those different 'dimensions'. Are the 'constants' relevant only at the 'app level', do others vary only by organisation, still more by user? Then we can revert to the question(s) of where these files should be located: related to the application code, identified by organisation, or sitting in user-space (mentioned elsewhere in this thread)? Indeed, it is quite possible that the above produces multiple answers, according to the type of data, leading to the idea of having multiple config-files, to suit variations which may occur within those different 'dimensions'. 
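[The idea of several config files, one per 'dimension', can be sketched with the standard-library configparser module, assuming INI-format files. The file names, the "report" section, and the "decimals" key are hypothetical; only the "header" setting echoes the example above.]

```python
import configparser
import tempfile
from pathlib import Path


def load_settings(paths):
    """Read INI files in order; values in later files override earlier ones."""
    parser = configparser.ConfigParser()
    # ConfigParser.read() skips files that do not exist.
    parser.read([str(p) for p in paths], encoding="utf-8")
    return parser


# Demonstration with two throwaway files standing in for an application-level
# file and a per-user file (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    app_cfg = Path(tmp) / "app.ini"
    user_cfg = Path(tmp) / "user.ini"
    app_cfg.write_text(
        "[report]\nheader = Default heading\ndecimals = 2\n", encoding="utf-8")
    user_cfg.write_text(
        "[report]\nheader = University of Pythonic Hard Knocks\n",
        encoding="utf-8")

    settings = load_settings([app_cfg, user_cfg, Path(tmp) / "missing.ini"])
    header = settings["report"]["header"]               # taken from user.ini
    decimals = settings["report"].getint("decimals")    # falls back to app.ini

print(header, decimals)
```

Listing the most general file first and the most specific last means a user only needs to state the settings that differ from the defaults.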
As this complexity builds, you can see the appeal of Command-Line Variables! I've recently built a WordPress (blogging) instance and noted that they enable customisation with a front-end installation routine and/or editing a config-file (the former editing the latter). Thus coping with both 'simple folk' and 'advanced users'. One 'nice thing' about tailoring config-files is that file and directory references can be required to be absolute-paths. At which point, the user's cwd/pwd, and the code-directory become irrelevant (we can assume that the user executed the application using a correct path/script_name (or directory_name, or zip_archive), else the config file wouldn't have been found/opened!). BTW: >>> How do people deal with their >>> Python applications that in theory may be installed anywhere in a >>> user's file system? There must be some standard way of dealing with >>> this that I am too dense to uncover. The technical/Python answer to the narrowest part of the question is that most solutions front-load the search-path for modules (sys.path). -- Regards, =dn From cs at cskk.id.au Sat Jun 26 20:44:18 2021 From: cs at cskk.id.au (Cameron Simpson) Date: Sun, 27 Jun 2021 10:44:18 +1000 Subject: [Tutor] What is the easiest way to ensure the current working directory is the same as the directory where the main program is saved? In-Reply-To: References: Message-ID: On 26Jun2021 14:14, boB Stepp wrote: >Please forgive the top posting in this instance. I want to ensure >Dennis' response gets saved intact into the Tutor archives so I can >easily find it later as his postings otherwise never show up unless >someone responds to them. Aye. I was going to reply to you last night, but Dennis' post is far far better. >But what about during a program's development phase? It is handy to >have data files in the same project directory space as everything >else. Usually using an environment variable to override the default location. 
For example, I've a little MP3 parser whose associated tests script has this line:

    TESTFILE = os.environ.get('TEST_CS_MP3_TESTFILE', 'TEST.mp3')

near the start. That sets the TESTFILE variable to 'TEST.mp3' unless an environment variable overrides it; I can set $TEST_CS_MP3_TESTFILE to the path of a different test file if I want.

Plenty of programmes use this approach. The envvar above is weirdly named because it is so special purpose, but something like $APPNAME_DATADIR replacing APPNAME with the name of your own command would be very normal. Then you can go:

    datadir_path = os.environ.get('MYAPPNAME_DATADIR')
    if not datadir_path:
        datadir_path = os.path.expanduser('~/.myappname/data')

to look for an override, and fall back to the "normal" path without one.

When I'm testing in dev, I have a little script which sets the "development environment": a little set of environment variables I want in place _when in dev_. That way standing in the dev area does not change behaviour (I use the "installed" tools), but to run something "in dev" I go:

    dev some command ...

where "dev" is my little script to set up the environment _just for that command_. For example:

    dev python3 -m my.module.name ...

That would typically:

- set $PYTHONPATH to access the modules in the dev environment, eg a local virtualenv and of course my own code there

- set $PATH to access the bin directories of the development environment, such as the local "bin" and the "bin" from a local virtualenv

- any special variables like $TEST_CS_MP3_TESTFILE or $MYAPPNAME_DATADIR imagined above

I'm in the UNIX world, but I'd expect it should be fairly easy to write yourself a little batch file to do this in Windows and achieve exactly the same thing.

>I have been doing some online searching and it appears that
>during an installation process that installers can copy data and
>configuration files to those standard locations that you are
>advocating.
>But again I know nothing about installing Python programs
>and the software that gets it done. I merely know how to use pip to
>install applications others have developed, not how to get my code
>ready for such a procedure.

You don't need to use pip for your own code. I've got quite a lot of personal modules, and only a subset is published to PyPI for use by pip.

The typical approach is (again, UNIX nomenclature):

- make a virtualenv for development, typically in a "venv" subdirectory of your development tree

- set $PATH and $PYTHONPATH to use it

In UNIX that would go:

    python3 -m venv venv

using the python "venv" module to make a virtualenv install in "venv". Do that once. Install third party modules for your dev env like this:

    ./venv/bin/pip3 install module_names....

Keep your own modules sitting around _outside_ the venv. I use the "lib/python" subdirectory myself ("cs" prefixes most of my modules, so make that too):

    mkdir -p lib/python/cs

The "dev" environment setup is then (UNIX shell syntax, again):

    PYTHONPATH=$(pwd)/lib/python
    PATH=$(pwd)/venv/bin:$PATH
    export PYTHONPATH PATH

That puts the venv ahead of other stuff via $PATH, and makes python look for your modules in the lib/python subdirectory.

My personal "dev" batch script essentially then looks like the above:

    #!/bin/sh
    PYTHONPATH=$(pwd)/lib/python
    PATH=$(pwd)/venv/bin:$PATH
    export PYTHONPATH PATH
    set -x
    exec "$@"

which tweaks the environment, then runs whatever was passed on the command line, so that:

    dev python3 -m my_module blah...

runs my in-development module "my_module" with the right environment settings.

There's no "install" step for your stuff - you're just tweaking $PYTHONPATH to look for it in the right place.

"lib/python" is a personal foible; whatever works for you will do.
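[For readers on Windows, the "dev" wrapper described above might be approximated in Python itself rather than a batch file. This is only a sketch under assumptions: the function names are hypothetical, and the layout (a "venv" directory and a "lib/python" directory under the development tree) simply mirrors the shell version.]

```python
import os
import subprocess
from pathlib import Path


def dev_environment(base_dir, environ=None):
    """Return a copy of the environment adjusted for the tree at base_dir."""
    env = dict(os.environ if environ is None else environ)
    base = Path(base_dir)
    # venv puts executables in "Scripts" on Windows and in "bin" elsewhere.
    venv_bin = base / "venv" / ("Scripts" if os.name == "nt" else "bin")
    # Look for personal modules in lib/python, as in the shell script.
    env["PYTHONPATH"] = str(base / "lib" / "python")
    # Put the venv's executables ahead of everything else on the search path.
    env["PATH"] = str(venv_bin) + os.pathsep + env.get("PATH", "")
    return env


def run_in_dev(argv, base_dir=None):
    """Run argv with the dev environment; our own environment is unchanged."""
    env = dev_environment(Path.cwd() if base_dir is None else base_dir)
    return subprocess.run(argv, env=env).returncode
```

A call such as run_in_dev([sys.executable, "-m", "my_module"]) would then play the role of "dev python3 -m my_module"; passing an explicit interpreter path avoids depending on how each platform searches PATH for the executable.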
Cheers,
Cameron Simpson

From robertvstepp at gmail.com Sun Jun 27 17:41:17 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sun, 27 Jun 2021 16:41:17 -0500
Subject: [Tutor] Readability: To use certain Python features or not?
Message-ID:

Questions inspired by an example from "Practical Programming, 3rd ed." by Gries, Campbell and Montojo.

p. 221 example. Compare:

[...]
bird_to_observations = {}
for line in observations_file:
    bird = line.strip()
    if bird in bird_to_observations:
        bird_to_observations[bird] = bird_to_observations[bird] + 1
    else:
        bird_to_observations[bird] = 1
[...]

to

[...]
bird_to_observations = {}
for line in observations_file:
    bird = line.strip()
    bird_to_observations[bird] = bird_to_observations.get(bird, 0) + 1
[...]

The authors comment: "Using the get method makes the program shorter, but some programmers find it harder to understand at a glance..."

To my mind *if* the reader is fully familiar with Python the second example is both shorter and more expressive. For the most maintainable code which of these is better? What assumptions about our code's future audience should we hold to best help our future readers of our code?

Note that I am asking a different set of questions from another common topic: When to use Python constructs like list comprehensions, lambdas, etc., which can greatly shorten LOC, but easily become a sea of letters when one makes them too complex.

TIA!
boB Stepp

From alan.gauld at yahoo.co.uk Sun Jun 27 17:49:28 2021
From: alan.gauld at yahoo.co.uk (alan.gauld at yahoo.co.uk)
Date: Sun, 27 Jun 2021 22:49:28 +0100
Subject: [Tutor] Readability: To use certain Python features or not?
In-Reply-To:
References: <2816c760-9bff-42e6-96bd-c8c1d51f7255.ref@email.android.com>
Message-ID: <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>

Personally I think the second is more reliable and maintainable so prefer it. If a reader is so new to python they don't know about get() then they need to look it up and learn.
But OTOH a default dict might be better still!
There is a difference between writing clever code that is unreadable and
using standard language or library features that might be less well
known. One is easy to look up, the other is just downright hard work and
therefore fragile.

From robertvstepp at gmail.com  Sun Jun 27 18:21:14 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Sun, 27 Jun 2021 17:21:14 -0500
Subject: [Tutor] Readability: To use certain Python features or not?
In-Reply-To: <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>
References: <2816c760-9bff-42e6-96bd-c8c1d51f7255.ref@email.android.com>
 <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>
Message-ID:

On Sun, Jun 27, 2021 at 4:49 PM wrote:
>
> Personally I think the second is more reliable and maintainable so prefer it. If a reader is so new to python they don't know about get() then they need to look it up and learn. But OTOH a default dict might be better still!
> There is a difference between writing clever code that is unreadable and using standard language or library features that might be less well known. One is easy to look up, the other is just downright hard work and therefore fragile.
>
> On 27 Jun 2021 22:41, boB Stepp wrote:
>
> Questions inspired by an example from "Practical Programming, 3rd ed."
> by Gries, Campbell and Montojo.
>
> p. 221 example. Compare:
>
> [...]
> bird_to_observations = {}
> for line in observations_file:
>     bird = line.strip()
>     if bird in bird_to_observations:
>         bird_to_observations[bird] = bird_to_observations[bird] + 1
>     else:
>         bird_to_observations[bird] = 1
> [...]
>
> to
>
> [...]
> bird_to_observations = {}
> for line in observations_file:
>     bird = line.strip()
>     bird_to_observations[bird] = bird_to_observations.get(bird, 0) + 1
> [...]

Hmm.  So, Alan, I guess you are suggesting the following for this
concrete instance:

from collections import defaultdict
[...]
bird_to_observations = defaultdict(int)
for line in observations_file:
    bird = line.strip()
    bird_to_observations[bird] += 1
[...]

That does look to my eye clearer and more expressive.  The cognitive
load on the reader is to know how to use default dictionaries and know
that int() called with no arguments returns 0.  But as you point out the
reader can always look up defaultdict and the collections module is very
popular and well-used AFAIK.
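
For anyone who wants to check the two things a reader has to know here,
the following is a minimal sketch. The function name count_lines and the
sample bird names are made up, and a plain list stands in for the
observations file.

```python
from collections import defaultdict

# int() called with no arguments returns 0, which is why int works
# as the "factory" that supplies the value for a missing key.
assert int() == 0


def count_lines(lines):
    """Count stripped lines; a stand-in for the observations-file loop."""
    counts = defaultdict(int)
    for line in lines:
        counts[line.strip()] += 1
    return counts


counts = count_lines(["owl\n", "wren\n", "owl\n"])
print(dict(counts))        # {'owl': 2, 'wren': 1}

# Caveat: merely looking up a missing key inserts it with the default.
print("heron" in counts)   # False
counts["heron"]
print("heron" in counts)   # True
```

The last three lines show the one surprise worth knowing about: unlike
dict.get(), indexing a defaultdict with a missing key adds that key.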
So if I am understanding your answer to the more general questions,
you believe that even using less well-known Python standard features
is desirable if it simplifies the code presentation and is more
expressive of intent?

Cheers!
boB Stepp

From __peter__ at web.de  Mon Jun 28 04:28:22 2021
From: __peter__ at web.de (Peter Otten)
Date: Mon, 28 Jun 2021 10:28:22 +0200
Subject: [Tutor] Readability: To use certain Python features or not?
In-Reply-To:
References: <2816c760-9bff-42e6-96bd-c8c1d51f7255.ref@email.android.com>
 <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>
Message-ID:

On 28/06/2021 00:21, boB Stepp wrote:
> On Sun, Jun 27, 2021 at 4:49 PM wrote:
>>
>> Personally I think the second is more reliable and maintainable so prefer it. If a reader is so new to python they don't know about get() then they need to look it up and learn. But OTOH a default dict might be better still!
>> There is a difference between writing clever code that is unreadable and using standard language or library features that might be less well known. One is easy to look up, the other is just downright hard work and therefore fragile.
>>
>> On 27 Jun 2021 22:41, boB Stepp wrote:
>>
>> Questions inspired by an example from "Practical Programming, 3rd ed."
>> by Gries, Campbell and Montojo.
>>
>> p. 221 example. Compare:
>>
>> [...]
>> bird_to_observations = {}
>> for line in observations_file:
>>     bird = line.strip()
>>     if bird in bird_to_observations:
>>         bird_to_observations[bird] = bird_to_observations[bird] + 1
>>     else:
>>         bird_to_observations[bird] = 1
>> [...]
>>
>> to
>>
>> [...]
>> bird_to_observations = {}
>> for line in observations_file:
>>     bird = line.strip()
>>     bird_to_observations[bird] = bird_to_observations.get(bird, 0) + 1
>> [...]
>
> Hmm.  So, Alan, I guess you are suggesting the following for this
> concrete instance:
>
> from collections import defaultdict
> [...]
> bird_to_observations = defaultdict(int)
> for line in observations_file:
>     bird = line.strip()
>     bird_to_observations[bird] += 1
> [...]
>
> That does look to my eye clearer and more expressive.  The cognitive
> load on the reader is to know how to use default dictionaries and know
> that int() always returns 0.  But as you point out the reader can
> always look up defaultdict and the collections module is very popular
> and well-used AFAIK.
>
> So if I am understanding your answer to the more general questions,
> you believe that even using less well-known Python standard features
> is desirable if it simplifies the code presentation and is more
> expressive of intent?

You didn't ask me, but I always try to use the "best fit" that I know of
rather than the "best known fit" -- at least when that best fit is
provided by the stdlib.

In this case that would be

import collections

birds = (line.strip() for line in observations_file)
bird_to_observations = collections.Counter(birds)

If that isn't self-explanatory wrap it in a function with proper
documentation

def count_birds(birds):
    """Count birds by species.
    >>> count_birds(
    ...     ["norwegian blue", "norwegian blue", "unladen swallow"])
    Counter({'norwegian blue': 2, 'unladen swallow': 1})
    """
    return collections.Counter(birds)

This has the advantage that the function could contain any
implementation that satisfies the doctest. As a consequence you can
start with something that barely works and refine it as you learn more,
without affecting the rest of the script.

From manpritsinghece at gmail.com  Mon Jun 28 12:08:01 2021
From: manpritsinghece at gmail.com (Manprit Singh)
Date: Mon, 28 Jun 2021 21:38:01 +0530
Subject: [Tutor] Count number of female names in dict
Message-ID:

Dear sir,

Consider a dict as given below:

dic = {"Miss Preeti":30, "Master Paras": 31, "Master Jay": 30, "Miss Pooja": 32}

The keys are names of students and values are marks obtained by them.
Now if I have to count the number of female students (names starting
with Miss are female names), can I do it in the way given below?

sum(x.startswith("Miss") for x in dic.keys())

The generator expression inside sum() yields True or False for each key
(True for females), so passing this generator expression to sum() gives
the count of True values, which is 2, and that is the correct answer
(there are only 2 female names). Is my way correct?

Regards
Manprit Singh

From lists at mostrom.pp.se  Mon Jun 28 13:27:25 2021
From: lists at mostrom.pp.se (Jan Erik Moström)
Date: Mon, 28 Jun 2021 19:27:25 +0200
Subject: [Tutor] Count number of female names in dict
In-Reply-To:
References:
Message-ID: <29F7D2BC-1D1C-456A-822B-EAC4E6BB56A2@mostrom.pp.se>

On 28 Jun 2021, at 18:08, Manprit Singh wrote:

> Is my way correct ?

Since it does what you want, it is correct.

= jem

From breamoreboy at gmail.com  Sun Jun 27 18:42:59 2021
From: breamoreboy at gmail.com (Mark Lawrence)
Date: Sun, 27 Jun 2021 23:42:59 +0100
Subject: [Tutor] Readability: To use certain Python features or not?
In-Reply-To: <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>
References: <2816c760-9bff-42e6-96bd-c8c1d51f7255.ref@email.android.com>
 <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>
Message-ID: <13c1b53f-a46b-025b-817e-2ff6587e69cd@gmail.com>

On 27/06/2021 22:49, Alan G via Tutor wrote:
> Personally I think the second is more reliable and maintainable so prefer
> it. If a reader is so new to python they don't know about get() then they
> need to look it up and learn. But OTOH a default dict might be better
> still!

Perhaps a counter?
https://docs.python.org/3/library/collections.html#collections.Counter

> There is a difference between writing clever code that is unreadable and
> using standard language or library features that might be less well known.
> One is easy to look up, the other is just downright hard work and
> therefore fragile.
> On 27 Jun 2021 22:41, boB Stepp wrote:
>
> Questions inspired by an example from "Practical Programming, 3rd ed."
> by Gries, Campbell and Montojo.
>
> p. 221 example. Compare:
>
> [...]
> bird_to_observations = {}
> for line in observations_file:
>     bird = line.strip()
>     if bird in bird_to_observations:
>         bird_to_observations[bird] = bird_to_observations[bird] + 1
>     else:
>         bird_to_observations[bird] = 1
> [...]
>
> to
>
> [...]
> bird_to_observations = {}
> for line in observations_file:
>     bird = line.strip()
>     bird_to_observations[bird] = bird_to_observations.get(bird, 0) + 1
> [...]
>
> The authors comment: "Using the get method makes the program shorter,
> but some programmers find it harder to understand at a glance..."
>
> To my mind *if* the reader is fully familiar with Python the second
> example is both shorter and more expressive. For the most
> maintainable code which of these is better? What assumptions about
> our code's future audience should we hold to best help our future
> readers of our code?
>
> Note that I am asking a different set of questions from another common
> topic: When to use Python constructs like list comprehensions,
> lambdas, etc., which can greatly shorten LOC, but easily become a sea
> of letters when one makes them too complex.
>
> TIA!
> boB Stepp

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

From vern_16 at yahoo.com  Fri Jun 25 17:34:17 2021
From: vern_16 at yahoo.com (Chris Verhalen)
Date: Fri, 25 Jun 2021 21:34:17 +0000 (UTC)
Subject: [Tutor] Having issues with a simple port scanner using scapy
References: <1484817393.489166.1624656857483.ref@mail.yahoo.com>
Message-ID: <1484817393.489166.1624656857483@mail.yahoo.com>

I'm new to python and I have no idea where I went wrong to get this port
scanner using Scapy to work properly. Any help is appreciated.
import loggingfrom scapy.layers.inet import TCP, ICMP, IPlogging.getLogger("scapy.runtime").setLevel(logging.ERROR) # Disable the annoying No Route found warning !from scapy.all import *ip = "10.0.0.3"closed_ports = 0open_ports = []def is_up(ip): #""" Tests if host is up """ icmp = IP(dst=ip)/ICMP() resp = sr1(icmp, timeout=10) if resp == None: return False else: return Trueif __name__ == '__main__': conf.verb = 0 # Disable verbose in sr(), sr1() methods start_time = time.time() ports = range(1, 1024) if is_up(ip): print("Host %s is up, start scanning" % ip) for port in ports: src_port = RandShort() # Getting a random port as source port p = IP(dst=ip)/TCP(sport=src_port, dport=port, flags='S') # Forging SYN packet resp = sr1(p, timeout=2) # Sending packet if str(type(resp)) == "": closed += 1 elif resp.haslayer(TCP): if resp.getlayer(TCP).flags == 0x12: send_rst = sr(IP(dst=ip)/TCP(sport=src_port, dport=port, flags='AR'), timeout=1) openp.append(port) elif resp.getlayer(TCP).flags == 0x14: closed += 1 duration = time.time()-start_time print("%s Scan Completed in %fs" % (ip, duration)) if len(openp) != 0: for opp in openp: print("%d open" % pop) print("%d closed ports in %d total port scanned" % (closed, len(ports)) else: print("Host %s is Down" % ip)
line 47   else:   ^SyntaxError: invalid syntax
Process finished with exit code 1

From roel at roelschroeven.net  Fri Jun 25 06:56:03 2021
From: roel at roelschroeven.net (Roel Schroeven)
Date: Fri, 25 Jun 2021 12:56:03 +0200
Subject: [Tutor] Clarification questions about how Python uses references.
In-Reply-To:
References:
Message-ID:

boB Stepp wrote on 24/06/2021 at 0:18:
> I continue to attempt to refine my understanding of how Python uses
> identifiers to reference objects.
> ...
> On to lists. My current understanding is that lists don't actually
> contain the objects themselves, but, instead, references to those
> objects. Is this correct? How could I prove this to myself in the
> interpreter?
> Does this translate to tuples and sets? Even though
> tuples are immutable they can contain mutable objects. Playing around
> in the interpreter it looks like even if sets contain tuples, no
> mutable elements can be in the tuples. Is this in general correct?

Have you seen Ned Batchelder's presentation "Python Names and Values"? I
think it offers a very good way of thinking about how Python works. You
can find it here: https://nedbatchelder.com/text/names1.html

Something to keep in mind while reading that (or while watching the
video), since you're asking about lists and other containers: somewhere
halfway down the page Ned writes:

"All of the examples I've been using so far used names as references to
values, but other things can be references. Python has a number of
compound data structures each of which hold references to values: list
elements, dictionary keys and values, object attributes, and so on. Each
of those can be used on the left-hand side of an assignment, and all the
details I've been talking about apply to them."

He doesn't mention that in the beginning to keep things simple, but it's
an important concept to know about.

--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
        -- Franklin P. Jones

Roel Schroeven

From roel at roelschroeven.net  Sun Jun 27 05:36:46 2021
From: roel at roelschroeven.net (Roel Schroeven)
Date: Sun, 27 Jun 2021 11:36:46 +0200
Subject: [Tutor] Clarification questions about how Python uses references.
In-Reply-To:
References:
Message-ID:

Richard Damon wrote on 26/06/2021 at 2:59:
> One technique uses a hash function to ultimately get a fairly small
> number, to make lookup of values O(1) in a container.
>
> The Second uses cryptographic secure hashes to verify a file. (This is
> where MD5 is used), This use has hashes that generate BIG numbers.

Indeed. It's a bit unfortunate that the same name is used for two
distinct (but related) classes of functions.
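
To make the two classes concrete, here is a small sketch contrasting the
built-in hash() that dict and set rely on with a cryptographic digest
from hashlib. The helper names container_hash and content_digest are
made up for illustration.

```python
import hashlib


def container_hash(value):
    """Built-in hash(): the kind dict and set use for O(1) lookup."""
    return hash(value)


def content_digest(data: bytes) -> str:
    """Cryptographic hash (SHA-256) of some bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


# hash() of a str is salted per process unless PYTHONHASHSEED is set,
# so this number normally changes from one run to the next.
print(container_hash("norwegian blue"))

# The SHA-256 digest depends only on the bytes, so it is the same on
# every run and every machine; it is always 64 hex digits (256 bits).
digest = content_digest(b"norwegian blue")
print(len(digest), digest[:16] + "...")
```

The first value is only meaningful inside one running process, while the
second is stable enough to name or verify a file.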
Note that MD5 "should be considered cryptographically broken and
unsuitable for further use." (https://www.kb.cert.org/vuls/id/836068)
SHA-1 is compromised too. For cryptographic purposes, use SHA-256 at
least.

--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
        -- Franklin P. Jones

Roel Schroeven

From roel at roelschroeven.net  Sun Jun 27 05:39:18 2021
From: roel at roelschroeven.net (Roel Schroeven)
Date: Sun, 27 Jun 2021 11:39:18 +0200
Subject: [Tutor] Clarification questions about how Python uses references.
In-Reply-To:
References:
Message-ID:

Cameron Simpson wrote on 26/06/2021 at 2:50:
> The example of a file indexed by its MD5 hash I had in mind was a real
> world example of a client report request, and the corresponding output
> files. Receive the request, save it to disc for processing under a
> filename based on the MD5 checksum, save that filename in a work queue.
> The worker picks up the filename and makes the reports, likewise saving
> them for the client to collect.
>
> The MD5 here is used as an easy way to pick _unique_ filenames for the
> work request and its results. The request never changes content once
> received.

This is in essence the basis of content-addressable storage systems as
used in e.g. git.

--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
        -- Franklin P. Jones

Roel Schroeven

From tariqkhasiri at gmail.com  Fri Jun 25 08:43:50 2021
From: tariqkhasiri at gmail.com (Tariq Khasiri)
Date: Fri, 25 Jun 2021 07:43:50 -0500
Subject: [Tutor] PatsyError: categorical data cannot be >1-dimensional
Message-ID:

The following command raises an error saying that my categorical data
cannot be >1-dimensional. My categorical variables here are: law_lag
(effective year of a law in a few states of the USA), year and statefip
(20 years, from 1990-2010, across the 50 states [statefip] of the
United States).
Can anyone tell me if there are any basic issues I'm missing here? As a
beginner, there's a high chance I've missed something basic.

Here is the info for my variables, from running the info command on my
DataFrame:

year        93585 non-null object
statefip    93585 non-null category
law_lag     92924 non-null float32

## My regression command, which returns the error mentioned in the
## subject line

PH = PH[~pd.isnull(PH.ln_incwage)]
PH_male = PH[PH.male==1]

formula = (
    "ln_incwage ~ C(law_lag)*C(year) + C(statefip)"
    " + hispanic + asian + ismarried + lths + hsdegree + somecollege + age2"
)

reg = (
    smf
    .wls(formula, data=PH_male, weights=PH.black.values)
    .fit(
        cov_type='cluster',
        cov_kwds={'groups': PH_male.fip.values},
        method='pinv')
)

reg.summary()

From learn2program at gmail.com  Mon Jun 28 15:51:35 2021
From: learn2program at gmail.com (Alan Gauld)
Date: Mon, 28 Jun 2021 20:51:35 +0100
Subject: [Tutor] Having issues with a simple port scanner using scapy
In-Reply-To: <1484817393.489166.1624656857483@mail.yahoo.com>
References: <1484817393.489166.1624656857483.ref@mail.yahoo.com>
 <1484817393.489166.1624656857483@mail.yahoo.com>
Message-ID: <219a7d5b-7a55-18a6-dacc-970d57fa3cdc@yahoo.co.uk>

On 25/06/2021 22:34, Chris Verhalen via Tutor wrote:
> I'm new to python and I have no idea where I went wrong

First, my apologies for the late posting. As a new poster your message
got held in the moderation queue but the moderator (me!) was on vacation
and I just got back.

Also, if posting code please post using plain text. Rich text or HTML
gets mangled in the mail system and loses all the formatting that is so
important in Python. See below...

If you can repost in plain text we might be able to help (assuming you
still need it!)

> to get this port scanner using Scapy to work properly. Any help is appreciated.
> > import loggingfrom scapy.layers.inet import TCP, ICMP, IPlogging.getLogger("scapy.runtime").setLevel(logging.ERROR) # Disable the annoying No Route found warning !from scapy.all import *ip = "10.0.0.3"closed_ports = 0open_ports = []def is_up(ip): #""" Tests if host is up """ icmp = IP(dst=ip)/ICMP() resp = sr1(icmp, timeout=10) if resp == None: return False else: return Trueif __name__ == '__main__': conf.verb = 0 # Disable verbose in sr(), sr1() methods start_time = time.time() ports = range(1, 1024) if is_up(ip): print("Host %s is up, start scanning" % ip) for port in ports: src_port = RandShort() # Getting a random port as source port p = IP(dst=ip)/TCP(sport=src_port, dport=port, flags='S') # Forging SYN packet resp = sr1(p, timeout=2) # Sending packet if str(type(resp)) == "": closed += 1 elif resp.haslayer(TCP): if resp.getlayer(TCP).flags == 0x12: send_rst = sr(IP(dst=ip)/TCP(sport=src_port, dport=port, flags='AR'), timeout=1) openp.append(port) elif resp.getlayer(TCP).flags == 0x14: closed += 1 duration = time.time()-start_time print("%s Scan Completed in %fs" % (ip, duration)) if len(openp) != 0: for opp in openp: print("%d open" % pop) print("%d closed ports in %d total port scanned" % (closed, len(ports)) else: print("Host %s is Down" % ip) > line 47? ? else:? ? ^SyntaxError: invalid syntax > Process finished with exit code 1 > _______________________________________________ > Tutor maillist - Tutor at python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos From Richard at damon-family.org Mon Jun 28 15:51:54 2021 From: Richard at damon-family.org (Richard Damon) Date: Mon, 28 Jun 2021 15:51:54 -0400 Subject: [Tutor] Clarification questions about how Python uses references. 
In-Reply-To:
References:
Message-ID: <1D60FA8D-52B8-42DC-80F7-201091F5F456@damon-family.org>

> On Jun 28, 2021, at 3:41 PM, Roel Schroeven wrote:
>
> Richard Damon wrote on 26/06/2021 at 2:59:
>> One technique uses a hash function to ultimately get a fairly small
>> number, to make lookup of values O(1) in a container.
>> The Second uses cryptographic secure hashes to verify a file. (This is
>> where MD5 is used), This use has hashes that generate BIG numbers.
>
> Indeed. It's a bit unfortunate that the same name is used for two distinct (but related) classes of functions.
>
> Note that MD5 "should be considered cryptographically broken and unsuitable for further use." (https://www.kb.cert.org/vuls/id/836068) SHA-1 is compromised too. For cryptographic purposes, use SHA-256 at least.
>

They are not that distinct, they all take in an "object" (or maybe a
string or bytes) and generate a number that corresponds to it.

The second group DO in fact have a distinct name, they are cryptographic
hashes, indicating they are (or at least claim to be or have been)
suitable for cryptographic purposes.

Thinking more, there is a third middle ground where the hash is used for
error detection (like a CRC or ECC), where you want a very high
probability of detecting a change, but of a "random" nature, so you
don't need the ultimate security of a cryptographic hash.

From robertvstepp at gmail.com  Mon Jun 28 18:31:39 2021
From: robertvstepp at gmail.com (boB Stepp)
Date: Mon, 28 Jun 2021 17:31:39 -0500
Subject: [Tutor] Readability: To use certain Python features or not?
In-Reply-To:
References: <2816c760-9bff-42e6-96bd-c8c1d51f7255.ref@email.android.com>
 <2816c760-9bff-42e6-96bd-c8c1d51f7255@email.android.com>
Message-ID:

On Mon, Jun 28, 2021 at 3:29 AM Peter Otten <__peter__ at web.de> wrote:

> > So if I am understanding your answer to the more general questions,
> > you believe that even using less well-known Python standard features
> > is desirable if it simplifies the code presentation and is more
> > expressive of intent?
>
> You didn't ask me, but I always try to use the "best fit" that I know of
> rather than the "best known fit" -- at least when that best fit is
> provided by the stdlib.
>
> In this case that would be
>
> birds = (line.strip() for line in observations_file)
> bird_to_observations = collections.Counter(birds)
>
> If that isn't self-explanatory wrap it in a function with proper
> documentation

No, that is rather self-explanatory!

> def count_birds(birds):
>     """Count birds by species.
>     >>> count_birds(
>     ...     ["norwegian blue", "norwegian blue", "unladen swallow"])
>     Counter({'norwegian blue': 2, 'unladen swallow': 1})
>     """
>     return collections.Counter(birds)
>
> This has the advantage that the function could contain any
> implementation that satisfies the doctest. As a consequence you can
> start with something that barely works and refine it as you learn more,
> without affecting the rest of the script.

One of the problems I encounter with being actively engaged with a
book is that I am immersed in the authors' perspective and focused on
the points they are trying to make.  This leads to a certain tunnel
vision on my part.  As this section of the book was explaining
dictionaries and their use, that is what I was focused on as well.
Alan's suggestion fit in with that current focus.  OTOH, if someone
had given me this programming problem free of a textbook's context, I
think the Counter would have occurred to me.  Your solution, Peter,
definitely wins the "no prize"!
(That's an old reference to Marvel Comics)

But the main intent of my questions was not the actual problem, but
whether one should use the best standard Python tool for the job even if
it is not well-known by a substantial number of programmers or to cater
to their possible lack of knowledge.  I think you and Alan have settled
that point in my mind now -- choose the best, most expressive tool
Python provides.  The future maintainer can look it up if they are
lacking knowledge.  And Mark in his post I think would agree from his
seconding of a Counter solution.

As always, thanks!
boB Stepp

From Francesco.Pugliese at liverpool.ac.uk  Wed Jun 30 11:00:37 2021
From: Francesco.Pugliese at liverpool.ac.uk (Pugliese, Francesco)
Date: Wed, 30 Jun 2021 15:00:37 +0000
Subject: [Tutor] Multiprocessing
Message-ID: <31ab6d9a51b343c08a9e8c4dbaa9fa32@liverpool.ac.uk>

Dear all,

I am Francesco, currently a PhD candidate at the University of
Liverpool. I am trying to use the multiprocessing library in Python to
perform some structural analyses using Opensees (please see the
attached file for reference). I am coding like this:

if __name__=="__main__":
    p = Pool(8)
    p.map(fun, range(1,10001), chunksize = 1)
    p.terminate()
    p.join()

After completing all analyses, the system does not close the pool but
remains stuck (frozen) without doing anything. Could you please help me
understand where I am making mistakes?

Thanks a lot.

Kind Regards,
Francesco

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Francesco, PUGLIESE, BEng, Meng, MRes, S.M.ASCE, MICE
PhD Candidate in Risk Analysis and Earthquake Engineering
Seismic Team
School of Engineering
Brodie Tower, 6th Floor, PostGraduate Research Office
University of Liverpool, Liverpool, UK