Expression can be simplified on list

Wed Sep 14 04:28:09 EDT 2016

On Wednesday 14 September 2016 17:16, Rustom Mody wrote:

> On Wednesday, September 14, 2016 at 10:23:12 AM UTC+5:30, Steven D'Aprano
> wrote:

>> And if somebody designed an iterator that behaved badly or surprisingly,
>> would you conclude that the entire concept of iteration is therefore broken?
>> 
>> The midnight gotcha was (it has now been fixed) a design bug of the class.
>> As simple as that.
> 
> Quite the contrary
> I showed if you remember that for regular expressions, dfas, graphs
> the questions of when is one of these falsey is highly non-trivial.

No, you *claimed* to have done so, but you actually didn't.

I'm not sure if it makes sense to talk about an "empty regular expression", or 
if that is the same as re.compile(''). Presumably if such a thing makes sense, 
it would match nothing, from any input at all. If so, that could be falsey and 
all non-empty regexes would be truthy.

Or, since regexes aren't containers *or* numbers, you might prefer that all 
regexes are truthy. I don't mind which you pick; neither view is fundamentally 
better than the other, and both are compatible with Python semantics. Decide 
according to what you think is most useful. If you can't decide, toss a coin. 
Likewise for DFAs. In neither case is it "highly non-trivial".

Graphs, on the other hand, are trivial to decide: the empty or null graph, with 
no nodes at all, is falsey. All non-empty graphs are truthy. As graphs are 
containers, no other model makes sense in the context of Python.
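
As a rough sketch of what I mean (the Graph class and its method names are 
invented for this example, not taken from any library), it is enough to give 
the container a __len__ and let Python's usual rule do the rest:

```python
class Graph:
    """Toy undirected graph; a hypothetical class, for illustration only."""

    def __init__(self):
        self._adjacent = {}  # node -> set of neighbouring nodes

    def add_node(self, node):
        self._adjacent.setdefault(node, set())

    def add_edge(self, a, b):
        self.add_node(a)
        self.add_node(b)
        self._adjacent[a].add(b)
        self._adjacent[b].add(a)

    def __len__(self):
        # Number of nodes. With no __bool__ defined, Python falls back
        # to __len__, so a graph with zero nodes is falsey.
        return len(self._adjacent)


g = Graph()
empty_is_truthy = bool(g)      # the null graph
g.add_edge("a", "b")
nonempty_is_truthy = bool(g)   # any graph with at least one node
```

Whether __len__ should count nodes or edges is a design choice; counting nodes 
is what makes "no nodes at all" the single falsey value.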

The problem of deciding whether a value should be considered truthy or falsey 
can be as simple or complex as the language designer likes. Python has a 
particularly simple rule, which makes the decision trivial in most 
circumstances. Other languages, perhaps not so much.

On the other hand, there is at least one scenario where forcing a boolean 
notion of truthiness *is* fundamentally difficult: many-valued logic. Once you 
introduce more than two logical values ("True", "False", "Maybe" for example), 
it is hard to decide how some of the values should be treated.

For example, True clearly maps to boolean True, and False to boolean False; but 
what does Maybe map to? To paraphrase Billy Crystal's character in "The 
Princess Bride":

    Mostly false is a little bit true

so to force the logical value "Maybe" into a True/False dichotomy will always 
require an element of arbitrary choice.

https://en.wikipedia.org/wiki/Many-valued_logic
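
To illustrate where the arbitrary choice ends up, here is a minimal sketch of 
a three-valued type. The names Tri and to_bool, and the maybe_as parameter, are 
made up for this example; the point is only that somebody has to pick a value 
for Maybe:

```python
import enum


class Tri(enum.Enum):
    """Hypothetical three-valued logic type."""
    FALSE = 0
    MAYBE = 1
    TRUE = 2


def to_bool(value, maybe_as=False):
    """Collapse a Tri into a bool.

    TRUE and FALSE map the obvious way. MAYBE has no natural answer,
    so the caller (or this default) has to choose one arbitrarily.
    """
    if value is Tri.MAYBE:
        return maybe_as
    return value is Tri.TRUE


pessimistic = to_bool(Tri.MAYBE)                 # treat Maybe as false
optimistic = to_bool(Tri.MAYBE, maybe_as=True)   # treat Maybe as true
```

Both policies are defensible, and neither follows from the logic itself.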

> In more general terms:
> For complex data-types, the exact specific nature of which is ‘trivial’ may
> be a highly non-trivial question.

Hypothetically? Sure. But in practice? No.

The decision is simple: if your data type is a kind of container, then it 
should follow the rule that emptiness ≡ falsey. If it is a number, then zero is 
falsey. If it is a string of some sort, then the empty string is falsey. All 
else is truthy.

If your data type has a clear and obvious dichotomy analogous to "something 
versus nothing", "non-empty versus empty", or equivalent, then that's your 
dividing line.

If there is no such clear dichotomy, then perhaps all your values should be 
considered truthy.
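
A quick sketch of those rules using only built-ins and the standard re module 
(the Opaque class is a made-up stand-in for "a type with no clear dichotomy"):

```python
import re

# Empty containers, empty strings, and numeric zeros are falsey (as is None).
falsey = [[], (), {}, set(), "", b"", 0, 0.0, 0j, None]

# Anything non-empty or non-zero is truthy, whatever it contains.
truthy = [[0], (None,), {"": 0}, {0}, " ", b"\x00", 1, -0.5, 1j]


class Opaque:
    """Defines neither __bool__ nor __len__, so every instance is truthy."""


all_falsey = not any(falsey)
all_truthy = all(truthy)
opaque_truthy = bool(Opaque())

# Compiled patterns define neither method either, so even the pattern
# compiled from the empty string is truthy.
pattern_truthy = bool(re.compile(""))
```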

In practice, it should be rare that this is a hard decision to make. In theory, 
one can state that it is hard to make this decision, but actually coming up 
with an example where it is difficult is not simple.

Perhaps the best example (apart from many-valued logic, discussed above) is in 
fact the "midnight" example. Although we use numbers to *represent* times, 
times are not themselves numbers. Unlike durations, you cannot add times: 3am 
plus 5pm is meaningless. So midnight, represented by the number zero, doesn't 
actually have the properties of the number zero. It isn't the additive identity, 
nor the multiplicative nullity.

To put it another way, unlike the integer 0 or float 0.0 or complex 0+0j, 
midnight is just another time, no different from all the other times. It is an 
artefact of the implementation that it was ever falsey.
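
The difference between times and durations is easy to check with the datetime 
module. This sketch assumes Python 3.5 or later, where midnight is no longer 
falsey:

```python
import datetime

midnight = datetime.time(0, 0)
zero_duration = datetime.timedelta(0)

# On Python 3.5+ every time of day is truthy, midnight included.
midnight_truthy = bool(midnight)

# A zero-length duration really is a "nothing" value, and is falsey.
zero_duration_truthy = bool(zero_duration)

# Durations add; zero is their additive identity.
total = datetime.timedelta(hours=3) + datetime.timedelta(hours=5)
identity_holds = (total + zero_duration) == total

# Times of day do not add: 3am plus 5pm raises TypeError.
try:
    datetime.time(3) + datetime.time(17)
    times_add = True
except TypeError:
    times_add = False
```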

> However to extrapolate from here and believe that ALL TYPES can have a falsey
> value meaningfully, especially in some obvious fashion, is mathematically
> nonsense.

Mathematically nonsense and practically sensible. It is easy to claim that 
there are types where it is hard to decide which values should be truthy and 
which falsey, but in practice apart from many-valued logics, it is *usually* 
easy to decide.

Might there be exceptions? Of course. If you think that I've ever argued that 
there are never any exceptions, then you're arguing against a strawman. It's 
just that those exceptions are unusual: exceptional, difficult cases are 
*rare*.

(In the Python std lib, with dozens of data types, there is only *one* where 
the choice of truthiness was controversial: abstract times of the day.)

-- 
Steven
git gets easier once you get the basic idea that branches are homeomorphic 
endofunctors mapping submanifolds of a Hilbert space.