Indentation and optional delimiters

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Thu Feb 28 01:46:14 EST 2008


By the way bearophile... the readability of your posts will increase a 
LOT if you break them up into paragraphs, rather than using one or two 
giant run-on paragraphs.

My comments follow.



On Tue, 26 Feb 2008 15:22:16 -0800, bearophileHUGS wrote:

> Steven D'Aprano:
>> Usability for beginners is a good thing, but not at the expense of
>> teaching them the right way to do things. Insisting on explicit
>> requests before copying data is a *good* thing. If it's a gotcha for
>> newbies, that's just a sign that newbies don't know the Right Way from
>> the Wrong Way yet. The solution is to teach them, not to compromise on
>> the Wrong Way. I don't want to write code where the following is
>> possible: ...
>> ... suddenly my code hits an unexpected performance drop ... as
>> gigabytes of data get duplicated
> 
> I understand your point of view, and I tend to agree. But let me express
> another point of view. Computer languages are a way to ask a machine to
> do a job. As time passes, computers become faster, 

But never fast enough, because as they get faster, we demand more from 
them.


> and people find it becomes possible to create higher-level languages,
> often more distant from how the CPU actually performs the job, allowing
> the human to express the job in a way closer to how less-trained humans
> talk to each other and do their jobs. 

Yes, but in practice, there is always a gap between what we say and what 
we mean. The discipline of having to write down precisely what we mean is 
not something that will ever go away -- all we can do is use "bigger" 
concepts, and thus change the places where we have to be precise. 

e.g. the difference between writing

index = 0
while index < len(seq):
    do_something_with(seq[index])
    index += 1

and 

for x in seq:
    do_something_with(x)


is that iterating over an object is, in some sense, a "bigger" concept 
than merely indexing into an array. If seq happens to be an appropriately-
written tree structure, the same for-loop will work, while the while loop 
probably won't.
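
To make that concrete, here's a sketch (the Tree class and the
do_something_with stub are mine, purely for illustration): a container
only needs to define __iter__ to satisfy the for-loop, even though it
supports neither len() nor integer indexing.

def do_something_with(x):
    print(x)

class Tree:
    # A minimal binary tree: iterable, but with no len() and no indexing.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right
    def __iter__(self):
        # In-order traversal: left subtree, this node, right subtree.
        if self.left is not None:
            for x in self.left:
                yield x
        yield self.value
        if self.right is not None:
            for x in self.right:
                yield x

seq = Tree(2, Tree(1), Tree(3))
for x in seq:                  # the for-loop works: prints 1, 2, 3
    do_something_with(x)
# but len(seq) and seq[0] both raise TypeError, so the while-loop breaks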


> Probably many years ago a language like Python was too costly in terms
> of CPU, making it of little use for most non-toy purposes. But there's a
> need for higher-level computer languages. Today Ruby is a bit
> higher-level than Python (despite being rather close). So my alternative
> answers to your problem are: 1) The code goes slow if you try to perform
> that operation? It means the JIT is "broken", and we have to find a
> smarter JIT (and the user will look for a better language).
[...]

Of course I expect that languages will continue to get smarter, but there 
will always be a gap between "Do What I Say" and "Do What I Mean".

It may also turn out that, in the future, I won't care about Python4000 
copying ten gigabytes of data unexpectedly, because copying 10GB will be 
a trivial operation. But I will care about it copying 100 petabytes of 
data unexpectedly, and complain that Python4000 is slower than G.

The thing is, make-another-copy and make-another-reference are 
semantically different things: they mean something different. Expecting 
the compiler to tell whether I want "x = y" to make a copy or to make 
another reference is never going to work, not without running "import 
telepathy" first. All you can do is shift the Gotcha! moment around.

You should read this article:

http://www.joelonsoftware.com/articles/fog0000000319.html


It specifically talks about C, but it's relevant to Python and to all 
hypothetical future languages. Think about string concatenation in Python.
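
For instance (a sketch; the list and its size are arbitrary): building
a string by repeated concatenation can copy everything built so far at
every step, the same quadratic copying pattern the article warns about,
while ''.join makes one final pass.

pieces = ['x'] * 100000

# Quadratic in the worst case: each concatenation may copy the whole
# string built so far. (CPython sometimes optimizes += in place, but
# that's an implementation detail you can't rely on.)
s = ''
for piece in pieces:
    s = s + piece

# Linear: the final size is computed first, each piece is copied once.
s = ''.join(pieces)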




> A higher-level language means that the user is more free to ignore
> what's under the hood: the user just cares that the machine performs
> the job, regardless of how. The user focuses on what job to do; the
> low-level details of how to do it are left to the machine.

More free, yes. Completely free, no.



> Despite that, I think today lots of people with a 3GHz CPU may accept a
> language 5 times slower than Python, one that for example uses base-10
> floating point numbers (they are different from Python's Decimal
> numbers). Almost every day on the Python newsgroup a newbie asks if
> round() is broken after seeing this:
>>>> round(1/3.0, 2)
> 0.33000000000000002
> A higher-level language (like Mathematica) must be designed to give more
> numerically correct answers, even if that requires more CPU. But such a
> language isn't just for newbies: if I write a 10-line program that has
> to print 100 lines of numbers, I want it to reduce my coding time,
> sparing me from thinking about base-2 floating point numbers.

Sure. But all you're doing is moving the Gotcha around. Now newbies will 
start asking why (2**0.5)**2 doesn't give 2 exactly when (2*0.5)*2 does. 
And if you fix that by creating a surd data type, at more performance 
cost, you'll create a different Gotcha somewhere else.
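
Here's that Gotcha at the prompt:

>>> (2*0.5)*2       # 1.0 and 2 are exactly representable in base 2
2.0
>>> (2**0.5)**2     # sqrt(2) is irrational; the float is merely close
2.0000000000000004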


> If the
> language use a higher-level numbers by default I can ignore that
> problem, 

But you can't. The problem only occurs somewhere else: Decimal is base 
10, and there are base 10 numbers that can't be expressed exactly no 
matter how many bits you use. They're different from the numbers you 
can't express exactly in base 2 numbers, and different from the numbers 
you can't express exactly as rationals, but they're there, waiting to 
trip you up:

>>> from decimal import Decimal as d
>>> x = d(1)/d(3)  # one third
>>> x
Decimal("0.3333333333333333333333333333")
>>> assert x*3 == d(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
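
Switching to rationals relocates it yet again (a sketch using the
fractions module, added to the standard library in Python 2.6): one
third becomes exact, but irrational values still force an approximation.

>>> from fractions import Fraction
>>> Fraction(1, 3) * 3 == 1
True
>>> Fraction(2) ** Fraction(1, 2)   # falls back to an approximate float
1.4142135623730951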




-- 
Steven


