dangerous class neighborhood

Wed Jan 9 22:22:23 EST 2019

This message is a delayed reply to what Chris wrote late last year and I initially chose not to reply to. My fault, as I was having a few bad days and did not see the content as constructive. But as things are fine now and I have seen more of what Chris posts, I will reply, but not in-line as I want to make a few focused replies. And I note Chris has a sense of humor that oddly may align with mine so I regret the misunderstanding. No unending back and forth messages are needed. I think we may understand each other. Those not interested, feel free to escape.

FOREPLAY: Like some of my posts, I wrote a fairly long message asking questions about choices people might want to make and offering perhaps uncommon solutions. I have no actual current interest in the specific problem for any actual programming I am doing. Just a discussion.

As such, some of my questions in the original message should NOT have a definite answer as much as a range of opinions depending on the circumstance. The overall scenario, to remind some readers,  was some unusual behavior when someone was doing calculations within the body of a class definition that involved evaluating  constants into an assortment of class variables with some massaging.

I asked this:

> Question 2: Do you want the variables available at the class level or  at the instance level?

That again is a question to which I deny there is a specific answer. I was asking if the USER at that moment preferred or needed one or the other. I mean python allows you to do it quite a few ways. If you want a counter of how many objects have been instantiated that are descended from that class, for instance, then the counter makes perfect sense as being a variable stored  in the single shared object representing the class. Constants also probably do not need to be defined and created in every single instance of a class. But there may be times you do want it there such as if you want to modify them or perhaps use them up with each instance starting with the same ones.

CHRIS: For constants, definitely put them on the class. They'll be available on instances as well ("for free", if you like). For mutables, obviously you need to decide on a case-by-case basis.

In that light, Chris answered well enough except I think I reacted to the word "definitely" as a suggestion that the question of what the user might want as being silly. A closer look now indicates that the second part allows different choices for something like mutables. So I have no disagreement. As always, I can think of other reasons why the lunch is not free as it takes extra work to search for variables higher up in the class chain and it gets ridiculous with multiple inheritance.

My next question was focused on how to work within the rules to avoid this anomaly. If you need a reminder, it had something to do with the scope within a class definition and whether it was visible to functions deeper within the scope visually but not by the rules python currently  has.

> Question 3: Which python variations on syntactic sugar, such as list 
> comprehensions, get expanded invisibly in ways that make the problem 
> happen by asking for variables to be found when no longer in the visible range?

CHRIS: The oddities with comprehensions were tackled partly during the discussion of PEP 572. If you want to know exactly why this isn't changing, go read a few hundred emails on the subject. A lot of the main points are summarized in the PEP itself:

Now on the one hand, the reply was full of PEP but did not directly seem to address my point, at first. My first take was GO READ IT. HUNDREDS of emails? Reasonable but I had too much else to work on so a bit frustrating. What I had hoped for was a list of specific python varieties of code such as a list comprehension and so on. In later discussions with Chris I realize he was probably frustrated as he saw the ultimate cause of the original problem as rather artificial and not quite due to the scope rules "effect" I seemed to be asking about. In particular, I was looking (elsewhere in the post) for ways to find a more hospitable and reliable place where you could use ANY python functionality you wished and get a valid result and simply EXPORT the results to whatever place (class, instance, or something external) to be used when needed. Chris may have wondered what problem I was solving as there wasn't one.

I won't copy all the rest, as it can be seen below, I presented some ways this could be done. One was to do it in an initializer and save the results in the instance, or if needed overwrite it in the class) or do it all in a single function (all local scope) and return multiple outputs that can be instantiated in the class. The comments Chris made were not focused in the direction I was trying to do as he did not necessarily see it as a problem to solve. Fair enough. I may be a bit poisoned in that my reading has shown that deeper aspects of python are riddled with anomalies and places where good people disagree about the wisdom of adding a feature like super() and assumed this was a bigger anomaly. Chris does not and I think is right.

CHRIS: If you write simple and Pythonic code, these will almost always work perfectly. The recent thread citing an oddity worked just fine until it was written to iterate over range(len(x)) instead of iterating directly.

That remark is reasonable to a point. Indeed, there was no reason to iterate over a range. However, it was VALID code that would work elsewhere and would work (presumably) in the safe havens I was describing.

At no point did I suggest anything I was looking at was particularly efficient. It is perfectly fair to say that my methods are silly because of that but the attitude I thought I saw seemed sarcastic because I was presenting a solution to a different problem and efficiency was not really a concern. I have often seen code that I looked at as ridiculously inefficient as it passed over the data over and over to do calculations that could easily be combined. It might for example describe a data set by telling how many entries (n) there are, and the minimum and maximum and mean and median and variance and standard deviation and skew and kurtosis and more. Each item is calculated by calling a function that returns just that. But you could easily write a simple loop or two that would pass over the data maybe twice and do all that. But sometimes simple works too, and is easier to understand as it does one thing well.

CHRIS: Lovely. Now you have to define your variables once inside the function, then name them a second time in that function's return statement, and finally name them all a *third* time in the class statement (at least, I presume "def Foo():" is meant to be "class Foo:"). A mismatch will create bizarre and hard-to-debug problems.
What do you actually gain? Can you show me real-world code that would truly benefit from this?

Well, clearly we spoke past each other. He is right I meant class, not def. Yes, it is not intended to be at all efficient. Now, after some discussion, it may be more easy to understand. Yes, all the copying of variables was indeed done.  We can complain about a compiler that loads variables from memory into registers A and B, does a calculation then copies some back to memory as being wasteful but if the architecture works only in registers, it is not. Every time you do something simple by using a function you incur additional overhead and yet it often is seen as a better way for other reasons.

So, yes, What I wrote was KNOWN to be less efficient. It was not THE point. I won't give lots of other examples when the code would benefit from this. The benefit was in doing the calculation in a place I knew would not show anomalies at a time when I thought the class context was the problem. I now know that there was not much of a need as it is generally a safe enough place and thus no gimmicks are normally needed.

The remaining comments are now seen in this light and need no further replies from me. Many seem to be knocking down a strawman I did not think I had set up.

CHRIS: Unified? No more so than the class statement itself. Safe? Definitely not, because of the mandatory duplication of names.

CHRIS: Uhh..... nope, that's nothing but FUD. There is no reason to believe that some language features would be "unsafe".

Another example of criticism follows when I made a general comment about how list comprehensions are really made into a while loop and he corrected my generalism as if I was saying this was all of it. My style can go from abstract to concrete and back and admittedly may be hard to follow. Still, I did say ' some parts may expand to calls to a "range" statement ' as an example but meant that if the list comprehension had a call within it to a range statement, that would then show up in the fully developed loop. I was not intending to say a list comprehension normally created a range statement. 

And I also mentioned it could include an 'if' which obviously it can but only if included in the comprehension in the first place. I saw the reply as nitpicking. I suspect that long before this, Chris developed an impression of me based on a misunderstanding or two and at this point was looking for anything to snap at and I FELT IT and reacted accordingly. I make no claims about relative abilities and have only recently focused on python but I doubt I am the boob these comments make it sound like. I am a bit more of a generalist who learns more and more until he knows nothing about everything 😊

This forum was supposed to be about reasonable and serious debates, not about whatever this reply started to look like. My views when viewed properly are rather close to what Chris says, except that I generally did not say what he is replying to here. In particular, I would NOT write the code the way I discuss unless everything else I tried failed.

So, I am leaving it like that. I differentiate between what is valid use of the language and what is a 'best practice' and what works quickly when you hit a roadblock and need to get around it NOW using some kind of patch that may even be outside the box.

Main point: I haver no beef with Chris, or anyone here. The neighborhood is safe again.

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On Behalf Of Chris Angelico
Sent: Thursday, December 27, 2018 5:11 PM
To: Python <python-list at python.org>
Subject: Re: dangerous class neighborhood

On Fri, Dec 28, 2018 at 8:47 AM Avi Gross <avigross at verizon.net> wrote:
> Question 2: Do you want the variables available at the class level or 
> at the instance level?

For constants, definitely put them on the class. They'll be available on instances as well ("for free", if you like). For mutables, obviously you need to decide on a case-by-case basis.

> Question 3: Which python variations on syntactic sugar, such as list 
> comprehensions, get expanded invisibly in ways that make the problem 
> happen by asking for variables to be found when no longer in the visible range?

The oddities with comprehensions were tackled partly during the discussion of PEP 572. If you want to know exactly why this isn't changing, go read a few hundred emails on the subject. A lot of the main points are summarized in the PEP itself:

https://www.python.org/dev/peps/pep-0572/

> There may be matters of efficiency some would consider but some of the 
> examples seen recently seemed almost silly and easy to compute. The 
> people asking about this issue wanted to define a bunch of CONSTANTS, 
> or things that might as well be constants, like this:
>
>
>
> def Foo():
>
>                 A = ("male", "female", "other")
>
>                 B = [ kind[0] for kind in A ]            # First letters
> only
>
>                 # And so on making more constants like a dictionary 
> mapping each string to a number or vice versa.
>
>
>
> All the above can be evaluated at the time the class is defined but 
> unintuitive scope rules make some operations fail as variables defined 
> in the scope become unavailable to other things that SEEM to be 
> embedded in the same scope.

If you write simple and Pythonic code, these will almost always work perfectly. The recent thread citing an oddity worked just fine until it was written to iterate over range(len(x)) instead of iterating directly.

> If they are ONLY to be used within an instance of Foo or invoked from 
> within there, there may be a fairly simple suggestion. If you already 
> have a __init__ method, then instantiate the variables there carefully 
> using the self object to reference those needed.

But why? __init__ should initialize an instance, not class-level constants. A Python class is not restricted to just methods, and there's no reason to avoid class attributes.

> Create a function either outside the class or defined within. Have it 
> do any internal calculations you need in which all internal variables 
> can play nicely with each other. Then let it return all the variables 
> in a tuple like
> this:
>
> def make_sexual_constants():
>
>                 A = .
>
>                 B = .
>
>                 C = f(A,B)
>
>                 D = .
>
> def Foo():
>
>                 (A, B, C, D) = make_sexual_constants():

Lovely. Now you have to define your variables once inside the function, then name them a second time in that function's return statement, and finally name them all a *third* time in the class statement (at least, I presume "def Foo():" is meant to be "class Foo:"). A mismatch will create bizarre and hard-to-debug problems.
What do you actually gain? Can you show me real-world code that would truly benefit from this?

> Can we agree that the class Foo now has those 4 variables defined and 
> available at either the class level or sub-class or instance levels? 
> But the values are created, again, in a unified safe environment?

Unified? No more so than the class statement itself. Safe? Definitely not, because of the mandatory duplication of names.

> As noted in section 3, it would be good to know what python features 
> may be unsafe in this kind of context. I had an unrelated recent 
> discussion where it was mentioned that some proposed feature changes 
> might not be thread safe. Valid consideration when that may lead to hard-to-explain anomalies.

Uhh..... nope, that's nothing but FUD. There is no reason to believe that some language features would be "unsafe".

> We now hear that because a list comprehension can be unwound 
> internally into a "while" loop and an "if" statement and that some 
> parts may expand to calls to a "range" statement, perhaps some 
> variables are now in more deeply embedded contexts that have no access to any class variables.

No idea what you're looking at. A comprehension can be unwound in a fairly straight-forward way, although there are some subtleties to them.

B = [ kind[0] for kind in A ]
# equivalent to, approximately:
def listcomp(iter):
    result = []
    for kind in iter:
        result.append(kind[0])
    return result
B = listcomp(A)

For casual usage, you can describe a list comp very simply and neatly:

B = [ kind[0] for kind in A ]
# equivalent to, more approximately:
B = []
for kind in A:
    B.append(kind[0])

Nothing here expands to a call to range(), nothing has a while loop.
The only way you'll get an "if" is if you had one in the comprehension itself.

> I think that
> is quite reasonable; hence my suggestion we need to know which ones to 
> avoid, or use a workaround like expanding it out ourselves and perhaps 
> carefully import variables into other contexts such as by passing the 
> variable into the function that otherwise cannot access it from a 
> point it can still be seen.

Sure. If the comprehension doesn't work for you, just put a for loop inside your class statement. This is not a problem.

> Least, but at least last, I ask if the need really exists for these 
> variables as constants versus functions. If creating C this way runs 
> into problems, but A and B are fine, consider making a method with 
> some name like
> Foo.get_C() that can see A and B and do the calculation and yet return 
> the value of C needed. Less efficient but .

Definitely not. That would imply that the value of C might change, or might have significant cost, or in some other way actually merits a getter function. Python isn't built to encourage that.

Class scope has edge cases, to be sure, but they're much more notable in carefully-crafted exploratory code than in actual real-world code.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list