Explicit is better than Implicit

dn PythonList at DancesWithMice.info
Thu Aug 6 17:40:15 EDT 2020


On 07/08/2020 05:33, Skip Montanaro wrote:
> Hmmm... Rename genes, fix Excel, or dump Excel in favor of Python? I know
> what my choice would have been. :-)
> 
> https://www.theverge.com/2020/8/6/21355674/human-genes-rename-microsoft-excel-misreading-dates


At the risk of screaming off-topic...

The article does point out that MS-Excel is attempting to be helpful by
identifying data, and thus formatting it appropriately. The human error
is exposed: "opens the same spreadsheet in Excel without thinking,
errors will be introduced". So, should the mistake be laid at the feet
of the tool?
(No matter that I, and many of us here, will agree with your
preference/predilection!)

The reason that a Python solution would not have this problem has less
to do with Python, or even gene nomenclature. It is because when we code
a solution (in professional projects), we proceed through design stages.
We think about the data to be transformed, as well as the process of
transformation itself.

Of course, if we develop-by-prototype, adding a chunk of code 'here' and
another chunk 'there' with no top-down view, the very same sort of
problem could so easily occur - despite and/or because of Python's
fast-and-loose dynamic typing, for example.

I postulate that the issue really stems from MSFT's Training Approach.
They start from the level of 'here is a column of numbers, let's total
them', and then run through every command on the menus/ribbon. Their
material rarely talks about 'design' - and few individuals have the
patience, or are afforded the budget, for the 'advanced courses' that
do! NB the same applies to MS-Word, etc.

MS-Excel (or better: LibreOffice Calc, etc, from the F/LOSS stable) is a 
powerful tool with the additional virtue that it is easy to use. Thus, 
people are able to concentrate on the demands of their own speciality, 
and use of the tool becomes 'automatic' or 'muscle memory'. A mark of 
"success" if ever there was one!

Unfortunately, this also forms the mind-set of folk creating a worksheet
in an organic (prototype-as-product/design-less) fashion, and certainly
of those picking up someone else's spreadsheet (per quote, above).

However, the article continues to describe the tool: “It’s a widespread
tool and if you are a bit computationally illiterate you will use it”.
Using any tool without over-view thought - particularly when also using
someone else's data - is a bit like the old prank of asking some
'innocent' to "format c:", and ultimately just as fatal.

If we started an MS-product solution from 'design', then we would 
commence with templates and styles - that column of the worksheet would 
be formatted as a string, eg "MARCH3", and not left to MS-Excel's 
'intelligence'/tender mercies.
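
For comparison, a minimal Python sketch of the same 'design first' idea:
the column types are declared up-front rather than guessed. The sample
rows, the SCHEMA dict, and the guess_type() helper are all invented for
illustration; guess_type() merely imitates the sort of date-guessing the
article describes, and is not a model of how MS-Excel actually works.

```python
import csv
import io
from datetime import datetime

# Invented sample data: a gene-symbol column and a numeric column.
RAW = "gene,expression\nMARCH3,0.82\nDEC1,1.47\nTP53,2.05\n"


def guess_type(text):
    """Implicit approach: guess what a cell 'really' is (illustration only)."""
    try:
        return float(text)
    except ValueError:
        pass
    for fmt in ("%B%d %Y", "%b%d %Y"):
        try:
            # A year is appended so that the date parse is unambiguous.
            return datetime.strptime(text + " 2020", fmt).date()
        except ValueError:
            pass
    return text


# Explicit approach: the design states the type of every column.
SCHEMA = {"gene": str, "expression": float}


def read_guessing(raw):
    """Let the helper decide the type of each cell."""
    return [{key: guess_type(value) for key, value in row.items()}
            for row in csv.DictReader(io.StringIO(raw))]


def read_declared(raw, schema):
    """Convert each cell with the type declared for its column."""
    return [{key: schema[key](value) for key, value in row.items()}
            for row in csv.DictReader(io.StringIO(raw))]


guessed = read_guessing(RAW)
declared = read_declared(RAW, SCHEMA)

print([row["gene"] for row in guessed])
print([row["gene"] for row in declared])
```

The guessing reader turns MARCH3 and DEC1 into dates and leaves TP53
alone; the declared reader keeps all three as strings, because the
design said so.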

So, is it an Excel-problem? Is it a human-laziness problem? Is it plain 
ignorance? Is it a training/learning issue?

We expect people driving a car to know how to drive - without expecting 
them to be professional drivers (racers or truckies). Why don't we 
expect people manipulating statistics and other forms of information to 
be appropriately-able?


That they would alter the jargon and thinking of an entire discipline to 
suit the sub-standard, overly-bossy, commonly-used tool is surely 
'putting the cart before the horse'...

That said, names do matter. How often do you search the web for some 
detail of/in Python and find an insinuation of snakes nestled amongst 
the results - or someone thinks that it is time for a joke about 
swallows or parrots? I don't have time to imagine how the folks who use 
C or R manage!


PS programming languages also include 'danger zones'. Early in my career
I found a similar embarrassment of 'infallible belief in the tool', with
the same consequence: research papers being published containing
erroneous numbers/bases/conclusions. A suite of programs declared
storage 'arrays' and populated them with (knowingly) incomplete data
(reasonably complete, not exactly "sparse") - but its authors forgot
that this technology required the data-arrays to be zeroed first! So,
random data from previous use of the same storage area, in random
formats, threw all manner of 'spanners in the works'. When you take such
news to your boss and colleagues, do NOT even try to convince yourself
that they will not "shoot the messenger"!
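
A rough Python imitation of that trap, for anyone who has not met it.
Python will not normally hand you uninitialised storage, so the
'previous use' is simulated here with a list of leftovers; the names
(STALE, READINGS, total_readings) and all the numbers are invented.

```python
# Invented leftovers from a previous use of the same storage area.
STALE = [7.5, 3.25, 9.0, 1.5, 4.0]

# Knowingly incomplete data: readings exist for slots 0, 2 and 4 only.
READINGS = {0: 1.0, 2: 2.0, 4: 3.0}


def total_readings(readings, zero_first):
    """Load the readings into a re-used 'array' and total it."""
    storage = list(STALE)                # the 'array' arrives holding old contents
    if zero_first:
        storage = [0.0] * len(storage)   # the step the programs forgot
    for slot, value in readings.items():
        storage[slot] = value            # only slots with data are overwritten
    return sum(storage)


print(total_readings(READINGS, zero_first=False))
print(total_readings(READINGS, zero_first=True))
```

Without the zeroing step, the stale values in slots 1 and 3 are silently
added into the total, and nothing complains.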
-- 
Regards =dn

