Newline (NuBe Question)

Sun Nov 26 22:15:35 EST 2023

Dave,

Back on a hopefully more serious note, I want to make a bit of an analogy
with what happens when you save data in a format like a .CSV file.

Often you have a choice of including a header line giving names to the
resulting columns, or not.

If you read in the data to some structure, often to some variation I would
loosely call a data.frame or perhaps something like a matrix, then without
headers you have to specify what you want positionally or create your own
names for columns to use. If names are already there, your program can
manipulate things by using the names and if they are well chosen, with no
studs among them, the resulting code can be quite readable. More
importantly, if the data being read changes and includes additional columns
or in a different order, your original program may run fine as long as the
names of the columns you care about remain the same. 

Positional programs can be positioned to fail in quite subtle ways if the
positions no longer apply.

As I see it, many situations where some aspects are variable are not ideal
for naming. A dictionary is an example that is useful when you have no idea
how many items with unknown keys may be present. You can iterate over the
names that are there, or use techniques that detect and deal with keys from
your list that are not present. Not using names/keys here might involve a
longer list with lots of empty slots to designate missing items, This
clearly is not great when the data present is sparse or when the number of
items is not known in advance or cannot be maintained in the right order. 

There are many other situations with assorted tradeoffs and to insist on
using lists/tuples exclusively would be silly but at the same time, if you
are using a list to hold the real and imaginary parts of a complex number,
or the X/Y[/Z] coordinates of a point where the order is almost universally
accepted, then maybe it is not worth using a data structure more complex or
derived as the use may be obvious.

I do recall odd methods sometimes used way back when I programmed in C/C++
or similar languages when some method was used to declare small constants
like:

#define FIRSTNAME 1
#define LASTNAME 2

Or concepts like "const GPA = 3"

And so on, so code asking for student_record[LASTNAME] would be a tad more
readable and if the order of entries somehow were different, just redefine
the constant.

In some sense, some of the data structures we are discussing, under the
hood, actually may do something very similar as they remap the name to a
small integer offset. Others may do much more or be slower but often add
value in other ways. A full-blown class may not just encapsulate the names
of components of an object but verify the validity of the contents or do
logging or any number of other things. Using a list or tuple does nothing
else.

So if you need nothing else, they are often suitable and sometimes even
preferable. 

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
Behalf Of DL Neil via Python-list
Sent: Sunday, November 26, 2023 5:19 PM
To: python-list at python.org
Subject: Re: Newline (NuBe Question)

On 11/27/2023 10:04 AM, Peter J. Holzer via Python-list wrote:
> On 2023-11-25 08:32:24 -0600, Michael F. Stemper via Python-list wrote:
>> On 24/11/2023 21.45, avi.e.gross at gmail.com wrote:
>>> Of course, for serious work, some might suggest avoiding constructs like
a
>>> list of lists and switch to using modules and data structures [...]
>>
>> Those who would recommend that approach do not appear to include Mr.
>> Rossum, who said:
>>    Avoid overengineering data structures.
>            ^^^^^^^^^^^^^^^
> 
> The key point here is *over*engineering. Don't make things more
> complicated than they need to be. But also don't make them simpler than
> necessary.
> 
>>    Tuples are better than objects (try namedtuple too though).
> 
> If Guido thought that tuples would always be better than objects, then
> Python wouldn't have objects. Why would he add such a complicated
> feature to the language if he thought it was useless?
> 
> The (unspoken?) context here is "if tuples are sufficient, then ..."

At recent PUG-meetings I've listened to a colleague asking questions and 
conducting research on Python data-structures*, eg lists-of-lists cf 
lists-of-tuples, etc, etc. The "etc, etc" goes on for some time! 
Respecting the effort, even as it becomes boringly-detailed, am 
encouraging him to publish his findings.

* sadly, he is resistant to OOP and included only a cursory look at 
custom-objects, and early in the process. His 'new thinking' has been to 
look at in-core databases and the speed-ups SQL (or other) might offer...

However, his motivation came from a particular application, and to 
create a naming-system so that he could distinguish a list-of-lists 
structure from some other tabular abstraction. The latter enables the 
code to change data-format to speed the next process, without the coder 
losing-track of the data-type/format.

The trouble is, whereas the research reveals which is faster 
(in-isolation, and (only) on his 'platform'), my suspicion is that he 
loses all gains by reformatting the data between 'the most efficient' 
structure for each step. A problem of only looking at the 'micro', 
whilst ignoring wider/macro concerns.

Accordingly, as to the word "engineering" (above), a reminder that we 
work in two domains: code and data. The short 'toy examples' in training 
courses discourage us from a design-stage for the former - until we 
enter 'the real world' and meet a problem/solution too large to fit in a 
single human-brain. Sadly, too many of us are pre-disposed to be 
math/algorithmically-oriented, and thus data-design is rarely-considered 
(in the macro!). Yet, here we are...

--
Regards =dn
-- 
https://mail.python.org/mailman/listinfo/python-list