convenience

Avi Gross avigross at verizon.net
Tue Mar 22 14:00:08 EDT 2022


An earlier post described a technique its author used for "convenience" without,
apparently, fully understanding it, and many of us explained it, hopefully to good effect.

That made me wonder about the impact on our code when we use various forms
of convenience. Is it convenient for us as programmers, for other potential readers,
or for a compiler or interpreter?

The example used was something like:

varname = objectname.varname

The above clearly requires an existing object. But there is no reason the new
name has to match the attribute name. Consider the somewhat related idiom seen
in almost all code that uses numpy:

import numpy as np

Both cases bind a new name to an existing object, a sort of pointer variable, and are a tad shorter to write.
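
To make the "pointer variable" idea concrete, here is a minimal sketch. The
standard-library math module stands in for numpy so the snippet runs without
extra installs, and the Holder class and its list value are made up for
illustration.

```python
import math
import math as m  # same idiom as "import numpy as np"


class Holder:
    """Hypothetical stand-in for whatever objectname refers to."""


objectname = Holder()
objectname.varname = [1, 2, 3]

# Bind a shorter name to the object the attribute currently refers to.
varname = objectname.varname

same_module = m is math                      # the alias is the module itself
same_object = varname is objectname.varname  # no copy was made

# Mutating through either name is visible through the other.
varname.append(4)
seen_via_object = objectname.varname
```

Nothing is copied in either case; the short name and the long one simply refer
to the same object.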

But what impact does it have on interpreters and compilers?

I assume objectname.varname makes the interpreter look up "objectname" in
some stack of namespaces and find it. Then it looks within the namespace
inside the object and finds varname. Finally, it does whatever you asked for such
as getting or setting a value or something more complex. 
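
The attribute step described above can be made visible, at least roughly. This
is only a sketch: the Counting class is hypothetical and overrides
__getattribute__ purely to count instance attribute lookups. It says nothing
about how much time each lookup costs.

```python
class Counting:
    """Hypothetical class that counts attribute lookups on its instances."""

    lookups = 0

    def __init__(self):
        self.varname = 42

    def __getattribute__(self, name):
        Counting.lookups += 1
        return object.__getattribute__(self, name)


objectname = Counting()

# Three dotted accesses: the attribute machinery runs each time.
for _ in range(3):
    objectname.varname
after_dotted = Counting.lookups

# One lookup to create the short name, then three uses of the short name.
varname = objectname.varname
for _ in range(3):
    varname
after_alias = Counting.lookups
```

After the dotted loop the counter is 3; creating the short name adds one more,
and using the short name adds none.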

So what happens if you simply call "varname" after the above line of code has
set it to be a partial synonym for the longer name? Clearly it no longer does any
work on evaluating "objectname" and that may be a significant saving as the
name may be deep in the stack of namespaces. But are there costs or even errors
if you approach an inner part of an object directly? Can dunder methods that the
standard approach would invoke go uninvoked? What kind of inadvertent
errors can creep in?
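
One possible answer to these questions, as a sketch rather than a full
catalogue: if the attribute is computed by a property (or by __getattr__), the
short name holds a snapshot, and later uses of it bypass that code. The
Thermometer class below is invented for illustration.

```python
class Thermometer:
    """Hypothetical object whose attribute is computed on each access."""

    def __init__(self):
        self._celsius = 20.0
        self.reads = 0

    @property
    def celsius(self):
        self.reads += 1  # work that only happens on a real attribute access
        return self._celsius


t = Thermometer()
celsius = t.celsius   # the property runs once, here
t._celsius = 25.0     # the object changes afterwards

stale = celsius       # still the old value; the property is not consulted
fresh = t.celsius     # the full lookup runs the property again

celsius = 99.0        # rebinds the short name only; t is untouched
```

Here stale is 20.0 while fresh is 25.0, and assigning to the short name never
reaches the object, so any setter or __setattr__ logic is skipped as well.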

I have seen lots of errors, in many languages, caused by functions designed to
make some things superficially easier. Sometimes the technique adds a namespace
holding all the column names of a DataFrame to the stack of namespaces, to
avoid extra typing, and this can make fairly complex and long pieces of code
way shorter. But it also pushes other existing names deeper into the stack,
where they are temporarily shadowed. And if a similar technique is used a bit
later in the code while the first is still in effect, as in opening another
object of the same kind, the second can shadow the first.
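
The shadowing described above can be modelled in Python with
collections.ChainMap, which searches a list of mappings from front to back.
This is only a model of a namespace stack, not how any DataFrame library does
it, and the table contents are made up.

```python
from collections import ChainMap

# Names that already exist in the surrounding code.
scope = ChainMap({"count": 10, "name": "outer"})

# Push the column names of a first table onto the front of the stack.
first_table = {"count": [1, 2, 3], "price": [5.0, 6.0, 7.0]}
scope = scope.new_child(first_table)
count_now = scope["count"]   # the column shadows the outer count
name_now = scope["name"]     # unshadowed names are still reachable

# Push a second table of the same kind while the first is still in effect.
second_table = {"price": [9.0]}
scope = scope.new_child(second_table)
price_now = scope["price"]   # the second table shadows the first

# Popping both layers restores the original meanings.
scope = scope.parents.parents
count_restored = scope["count"]
```

The same name, "count" or "price", means different things depending on how
many layers happen to be in effect at that point in the code.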

Sometimes there is a price for convenience. And even if you are fully
aware of the pitfalls, and expect to gain from the convenience to you
or even the interpreter, your work may be copied or expanded on by
others who do not take care. In the example above, in a language like R,
you can add the namespace to the environment and use it in read-only mode,
but writing to a resulting variable like "col1" makes a new local variable and does
not make any change to the underlying field in the data.frame.

I mean (again this is not Python code) that:

mydata$result <- mydata$col1 + mydata$col2

will create or update a column called "result" within the data.frame called mydata,

BUT

with(mydata, result <- col1 + col2)

That does not do the same thing and leaves the "result" column alone or non-existent. Instead, a local variable
called result is created and then rapidly abandoned, as shown by code that displays it:

with(mydata, {result <- col1 + col2; print(result)})

The above prints (and returns) 3 when col1 and col2 in mydata hold 1 and 2.
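
A rough Python analogue of the same pitfall, using a plain dict in place of a
data.frame. The function name and the dict layout are assumptions made for the
sketch, not a translation of R's with().

```python
def add_columns_locally(data):
    # Pull the columns into short local names for convenience.
    col1, col2 = data["col1"], data["col2"]
    result = col1 + col2  # a local name, discarded when the function returns
    return result


mydata = {"col1": 1, "col2": 2}

returned = add_columns_locally(mydata)
result_missing = "result" not in mydata  # the container was never updated

# Spelling it out in full does update the container.
mydata["result"] = mydata["col1"] + mydata["col2"]
```

The convenient version computes 3 and then loses it unless the caller keeps the
return value; only the fully spelled-out assignment stores it in mydata.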

Just FYI, there are other packages that handle such short names properly and well, but they 
are done carefully. Regular users may easily make mistakes for the sake of convenience.

So I wonder if some pythonic methods fit into convenience modes or can
veer into dangerous modes. Spelling things out in full detail may be annoying
but is generally safer.

