Are the critiques in "All the things I hate about Python" valid?

Chris Angelico rosuav at gmail.com
Sat Feb 17 16:02:10 EST 2018


On Sun, Feb 18, 2018 at 5:05 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Sat, 17 Feb 2018 15:25:15 +1100, Chris Angelico wrote:
>
>> 1) Type safety.
>>
>> This is often touted as a necessity for industrial-grade software. It
>> isn't. There are many things that a type system, no matter how
>> sophisticated, cannot catch;
>
> The usual response to that is to make ever-finer-grained types, until the
> type-system can prove the code is correct.
>
> integers
> positive integers
> positive integers greater than 10
> positive integers greater than 10 but less than 15003
> positive odd integers greater than 10 but less than 15003
> positive odd integers greater than 10 but less than 15003 divisible by 17
>
> Of course, this has a few minor (ha!) difficulties... starting with the
> hardest problem in computer science, naming things.

Naming things isn't a problem if we're working with a type inference
system. On the flip side, if your last example is purely type
inference, it's not really a type checking system - it's a holistic
static analysis. You can't say "TypeError: spamminess must be less
than 15003" without also saying "oh but that might be a bug in this
function, since it's meant to be able to take numbers >= 15003".

Some of the type names CAN be generated algebraically. For instance,
Pike lets you declare that something is an "int", or "array(int)" (an
array of integers), or "int(1..)" (integer, minimum of 1, no maximum -
in other words, positive integer), or "array(int(1..))" (yup, array of
positive integers). You could probably devise a system like
"int(11..15002|1%2|0%17)" to mean "must be between 11 and 15002, and
must equal 1, modulo 2, and must equal 0, modulo 17". I'm not sure how
often it would be of value, though, and it's pretty ugly.
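
A rough way to picture that constraint in Python is a runtime check
rather than a static type. This is only a sketch: the helper name
make_constrained_int and its parameters are invented for illustration,
and it validates values when called, which is not what Pike's type
system does.

```python
def make_constrained_int(low, high, residues):
    """Build a validator for ints in low..high (inclusive).

    Hypothetical helper: residues is a list of (remainder, modulus)
    pairs, approximating "int(11..15002|1%2|0%17)" at run time.
    """
    def validate(value):
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("expected an int, got %r" % (value,))
        if not low <= value <= high:
            raise ValueError("%d is outside %d..%d" % (value, low, high))
        for remainder, modulus in residues:
            if value % modulus != remainder:
                raise ValueError(
                    "%d %% %d != %d" % (value, modulus, remainder))
        return value
    return validate

# Between 11 and 15002, odd, and divisible by 17.
spamminess = make_constrained_int(11, 15002, [(1, 2), (0, 17)])
```

Calling spamminess(17) hands the value back, while spamminess(34)
raises ValueError because 34 is even. Nothing here is checked before
the program runs.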

> Even if you can come up with unique, concise names for these types that
> won't overwhelm the reader, it isn't clear that the type system will
> always be capable of representing such fine distinctions. How would you
> specify two string types, one for personal names and one for family
> names, so that the compiler can detect any attempt to assign a family
> name to a personal name, or vice versa?

That's where the type system breaks down and the variable naming
system shines. My favourite example here is of collections (which
should be named in the plural) and their elements (which usually won't
be). For instance:

for msg in msgs:
for person in people:
for character in disney_princesses:
for item in recipe.ingredients:

No type system can reliably figure out that "msg" is singular and
"msgs" is plural. And the concrete data types might even be identical
("msgs" could be a dict mapping message IDs to their actual messages,
and then "msg" could be a dict mapping headers to their values - "for
msg in msgs.values():" would thus iterate through one dict, yielding
other dicts), even though *to the programmer* they are completely
different, so type inference would need a lot of help.
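
A minimal sketch of that dict-of-dicts case, with made-up message IDs
and headers (all the data below is hypothetical):

```python
# Message IDs map to messages; each message maps header names to
# values. Both levels are plain dicts.
msgs = {
    "id-1": {"From": "alice@example.com", "Subject": "Spam"},
    "id-2": {"From": "bob@example.com", "Subject": "Eggs"},
}

subjects = []
for msg in msgs.values():
    # msg is one message (a dict of headers), not the collection.
    subjects.append(msg["Subject"])
```

Both names are bound to dict objects here, so only the singular and
plural names tell the reader which level is which.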

>> for some reason, though, we don't hear
>> people saying "C is useless for industrial-grade software because it
>> doesn't have function contracts".
>
> You obviously don't speak to Eiffel programmers then :-)

True, I don't, but I'm not surprised there are people who think that
way. But how many people write blog posts like the one that sparked
this thread, clickbaitingly describing C as useless for serious work?

>> Anyway, if you want some system of type checking, you can use static
>> analysis (eg tools like MyPy) to go over your code the same way a
>> compiler might.
>
> Indeed. Despite our criticisms of the *attitude* that static typing is a
> panacea, it must be recognised that it is useful, and the bigger the
> project, the more useful it is. And some type checkers are *very*
> impressive. Google for "compiler found my infinite loop" for a classic
> example of a compiler detecting at compile-time that a while loop would
> never terminate.

Exactly. Though there's a bit of a blurring now between "type
checking" and "holistic static analysis". I've seen some incredible
discoveries by Coverity; extremely narrow situational bugs where, if
this happens and that fails and thingy was exactly 47, then the
response message might use one byte more space than its buffer. That's
pretty useful and seriously impressive, but I'm terrified of any sort
of "type system" that could actually give a NAME to the data type that
shows up this bug.

> I can understand people saying that for sufficiently large projects, they
> consider it indispensable to have the assistance of a type checker. That
> in and of itself is no worse than saying that, as a writer, I find a
> spell checker to be indispensable.

Hmm, I do think there are a lot of people who take lessons learned on
gigantic projects with huge contributor teams, and then say "EVERY
program needs these tools". When you write a 200-page book, you might
find an automated table of contents to be, not simply a useful tool,
but an absolute necessity. Great! But when you write a one-screen
README, that TOC creator is useless, along with the boilerplate in
your document to tell it what to do.

(Can you imagine adding a type checker to bash scripting?)

>> "The first glaring issue is that I have no guarantee that is_valid()
>> returns a bool type." -- huh? It's being used in a boolean context, and
>> it has a name that starts "is_". How much guarantee are you looking for?
>> *ANY* object can be used in an 'if', so it doesn't even matter. This is
>> a stupidly contrived criticism.
>
> I don't think so -- I think a lot of people really have difficulty coming
> to terms with Python's duck-typing of bools. They just don't like, or
> possibly even grok, the idea of truthy and falsey values, and want the
> comfort of knowing that the value "really is" a True or False.

Okay, but even if you don't grok the truthy/falsey concept, it says
"is_valid". Unless you're expecting outright MALICIOUS code, you
should be able to assume that "is_valid" returns a boolean.

Oh wait. We're talking about programmers here. Malicious has nothing
on the rampant stupidity...

https://thedailywtf.com/articles/What_Is_Truth_0x3f_

Still, it's not a fault of *this* function if it expects the normal
case. It's no worse to expect is_valid to return a boolean (or
something usable in a boolean context) than to expect math.sqrt to
return a non-negative number. Sure, sqrt might have a bug in it so it
returns a negative... but that's not your problem.

> We can come up with some contrived justifications for this... what if
> is_valid() contains a bug:
>
> def is_valid(arg):
>     if some condition:
>         return arg  # oops I meant True
>     return False
>
> then static analysis would detect this. With truthiness, you can't tell:
> what if *nearly* all the input args just happen to be truthy? Then the
> code will nearly always work, and the errors will be perplexing.

Right, and that's either a bug, or a design flaw (maybe the returning
of 'arg' was intentional, because the author thought that ALL these
args were truthy). That's fine. The name implies that it's returning a
boolean, so you can look at this function *on its own* and pinpoint
the (potential) problem.

> But I consider that a fairly contrived scenario, and one with at least
> two alternate solutions: code review, and unit tests.

Exactly. Also, this sort of thing DOES happen, so if ever you add a
__bool__ method to an object, consider the implications.
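
Here is one way that can bite, as a sketch. The Order class and this
check_order function are invented for illustration; the function
repeats the "return arg" slip from the quoted example.

```python
class Order:
    """Hypothetical class: an order with no items is falsey."""

    def __init__(self, items):
        self.items = list(items)

    def __bool__(self):
        return bool(self.items)


def check_order(arg):
    if arg is not None:
        return arg  # oops, meant True
    return False


full = Order(["spam"])
empty = Order([])
```

With these definitions, "if check_order(full):" takes the true branch,
but "if check_order(empty):" does not, even though the condition inside
the function held both times.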

> But still, I do see the point that a static analyser could have picked up
> that error even if you didn't have code review, even if the person
> writing the unit tests never imagined this failure mode.

Perhaps. On the flip side, unless you rigidly demand that "if"
statements must ONLY operate on the two values True and False, you
still won't detect these problems at the calling site. You might
detect them inside is_valid (if you declare that it'll return bool,
and then return something that's not a bool, poof, error), but in the
original complaint, type checking can't help without straitjacketing
conditionals.
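
To make that concrete, here is a sketch with annotations. The function
bodies and the length condition are made up; the point is where a
checker such as MyPy would be expected to complain.

```python
def is_valid(arg: str) -> bool:
    if len(arg) < 10:
        # Declared to return bool but returns a str: a static checker
        # can flag this line. Python itself runs it without complaint.
        return arg
    return False


def describe(arg: str) -> str:
    # The calling site: any object is acceptable in an 'if', so there
    # is nothing here for a checker to object to.
    if is_valid(arg):
        return "valid"
    return "invalid"
```

At run time, describe("spam") gives "valid", while describe("") gives
"invalid", because the empty string is falsey even though it passed the
length test. The error is visible inside is_valid, not in describe.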

>> Totally not true. The GIL does not stop other threads from running.
>> Also, Python has been available on multi-CPU systems pretty much since its
>> inception, I believe. (Summoning the D'Aprano for history lesson?)
>
> If you're talking about common desktop computers, I think you're
> forgetting how recent multicore machines actually are. I'm having
> difficulty finding when multicore machines first hit the market, but it
> seems to have been well into the 21st century -- perhaps as late as 2006
> with the AMD Athlon 64 X2:

No, I'm talking about big iron. Was Python running on multi-CPU
supercomputers earlier than that?

> By the way, multiple CPU machines are different from CPUs with multiple
> cores:
>
> http://smallbusiness.chron.com/multiple-cpu-vs-multicore-33195.html

Yeah, it was always "multiple CPUs", not "multiple cores" when I was
growing up. And it was only ever in reference to the expensive
hardware that I could never even dream of working with. I was always
on the single-CPU home-grade systems.

> Certainly though there have been versions of Python without a GIL for a
> long time:
>
> Jython started as JPython, in 1997; IronPython was started around 2003 or
> so, and reached the 1.0 milestone in 2006.
>
> Fun fact: (then) Microsoft engineer Jim Hugunin created both JPython and
> IronPython!

Yes. I'm not sure how Jython handles concurrency; I'm totally in the
dark about IronPython. I suspect both of them let the underlying
system handle it, but that doesn't help me as I don't know how those
work either. How well do they handle the "two threads spinning,
incrementing the same global" stress test? I doubt they'll improve the
efficiency.
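
For reference, the stress test I have in mind looks roughly like this.
It's a sketch using the standard threading module; the function names
and iteration counts are arbitrary, and I've put a lock around the
increment so the final total is predictable. Without the lock the
increment is a race and updates may be lost.

```python
import threading

counter = 0
lock = threading.Lock()


def spin(iterations):
    """Increment the shared global, holding the lock for each update."""
    global counter
    for _ in range(iterations):
        with lock:
            counter += 1


def run(iterations=100000, nthreads=2):
    """Start nthreads spinning threads and return the final count."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=spin, args=(iterations,))
        for _ in range(nthreads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Timing run() on each implementation would show how much the contention
costs; I have no numbers for Jython or IronPython.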

ChrisA


