Functional programming

Steven D'Aprano steve at pearwood.info
Tue Mar 4 03:56:33 EST 2014


On Tue, 04 Mar 2014 17:04:55 +1100, Chris Angelico wrote:

> On Tue, Mar 4, 2014 at 4:35 PM, Steven D'Aprano <steve at pearwood.info>
> wrote:
>> On Tue, 04 Mar 2014 05:37:27 +1100, Chris Angelico wrote:
>>> x = 23       # Compiler goes: Okay, x takes ints.
>>> x += 5       # Compiler: No prob, int += int --> int
>>> x = str(x)   # Compiler: NO WAY! str(int) --> str, not allowed!
>>>
>>> It's fine and correct to infer that x is an int, x is an int, x is a
>>> str. It's *not* okay to make the third line a SyntaxError because you
>>> just put a str into an int variable.
>>
>> It won't be a Syntax Error, it will be a compile-time Type Error. And,
>> yes, it is fine. That's the point of static typing! The tradeoff of
>> being able to detect a whole lot of errors *at compile time* is that
>> you give up the ability to re-use the same variable for different types
>> in a single scope. (You can have an x which is a string elsewhere, just
>> not in this scope where it is an int.)
> 
> Okay, a compile-time type error, same difference. What I'm saying is
> that the auto-detection can't know what else you plan to do. 

Obviously it can't see the code you haven't written yet, but it can see 
what you *do* do.


> If you
> explicitly say that this is an int, then yes, that should be disallowed;

It's that "explicitly" part that doesn't follow. Having to manage types 
is the most tedious, boring, annoying, *unproductive* part of languages 
like Java, C and Pascal. Almost always, you're telling the compiler stuff 
that it can work out for itself.

In the same way that managing jumps for GOTO has been automated with for 
loops, while, etc., and managing memory has been automated, there's no 
good reason not to allow the compiler to manage types. Dynamically typed 
languages like Python do so at runtime. Type inference simply allows 
statically typed languages to do the same, only at compile time.
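
To make that concrete, a type-inferring checker such as mypy (to pick 
one example; any inference-based checker behaves much the same) can run 
over plain Python with no annotations at all and still work the types 
out for itself:

# No type declarations anywhere; the checker infers everything itself.
x = 23          # checker infers: x is int
x += 5          # fine: int + int --> int
y = x * 2       # inferred: y is int
x = str(x)      # flagged: str assigned to a variable inferred as int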


[...]
>> That Haskell has homogeneous lists is not a property of the type
>> system, but a design choice. I'm sure Haskell will also have a tuple or
>> record type that allows fields of different types.
> 
> If it's not the list type, pick some other. It's not uncommon to want to
> have a record that has different types (C does this with struct, C++ has
> a few more ways to do it); what I'm finding odd is that whatever goes
> into it first is specifying for everything else.

That's because in Haskell the design was made that lists *must* be used 
for homogeneous data. If you read Python documentation from back in the 
1.5 and early 2.x days, there was a *very* strong recommendation that 
lists be used for homogeneous data only and tuples for heterogeneous 
data. This recommendation goes all the way up to Guido.

# Yes
[1, 3, 4, 2, 5, 9]
(1, "hello", None, 3.5)

# No
[1, "hello", None, 3.5]


That is, lists are for collections of data of arbitrary length, tuples 
are for records or structs with dedicated fields.

That convention is a bit weaker these days than it used to be. Tuples now 
have list-like methods, and we have namedtuple for record/struct-like 
objects with named fields. But still, it is normal to use lists with 
homogeneous data, where there is an arbitrary number of "things" with 
different values, but all the same kind of thing.
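
For instance, compare a list used for homogeneous data with a 
namedtuple used as a lightweight record (the field names below are 
just made up for illustration):

from collections import namedtuple

# A list: arbitrarily many values, all the same kind of thing.
scores = [87, 92, 78, 95, 61]

# A namedtuple: a fixed set of named fields, each with its own meaning
# (and potentially its own type).
Person = namedtuple("Person", "name age height")
fred = Person(name="Fred", age=42, height=1.75)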

In the case of Haskell, that's more than a mere convention, it's a rule, 
but that's not terribly different from (say) Pascal where you can have an 
array of integer but not an array of integer-or-real.

The thing is though, how often do you really have a situation where you 
have a bunch of arbitrary data, of unknown length, where you don't know 
what type of data it is? Sure, in the interactive interpreter it is 
useful to be able to write

[1, "spam", None, [], {}, 0.1, set()]

and I write unit tests with that sort of thing all the time:

for obj in list_of_arbitrary_objects:
    self.assertRaises(TypeError, func, obj)

kind of thing. But that doesn't have to be a *list*. It just needs to 
have convenient syntax.
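
Spelled out in full, that sort of test looks something like this 
(func and the particular objects are placeholders I've invented for 
the example):

import unittest

def func(x):
    # Stand-in for whatever is actually under test; assume it only
    # accepts numbers.
    return x + 1

class TestFuncRejectsJunk(unittest.TestCase):
    def test_rejects_non_numbers(self):
        for obj in ["spam", None, [], {}, set()]:
            self.assertRaises(TypeError, func, obj)

if __name__ == "__main__":
    unittest.main()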


>> I have not used Haskell enough to tell you whether you can specify
>> subtypes. I know that, at least for numeric (integer) types, venerable
>> old Pascal allows you to define subtypes based on integer ranges, so
>> I'd be surprised if you couldn't do the same thing in Haskell.
>>
>> The flexibility of the type system -- its ability to create subtypes
>> and union types -- is independent of whether it is explicitly declared
>> or uses type inference.
> 
> I'm not sure how you could have type inference with subtypes. How does
> the compiler figure out what subtype of integers is acceptable, such
> that it can reject some?

You seem to be under the impression that type inference means "guess what 
the programmer wants", or even "read the programmer's mind". Take this 
example:

> x = 5
> x = 7
> x = 11
> x = 17
> x = 27
> 
> Should the last one be rejected because it's not prime? How can it know
> that I actually wanted that to be int(3..20)? 

It can't, of course, any more than I could, or anyone other than you. But 
if you asked a hundred people what all those values of x had in common, 
93 of them would say "they're all integers", 6 would say "they're all 
positive integers", and 1 would say "they're all positive odd integers".

[Disclaimer: percentages plucked out of thin air.]

No type system can, or should, try to guess whatever bizarre subtype you 
*might* want. ("Only numbers spelled with the letter V.") If you want 
something stupid^W weird^W unusual, it's your responsibility to work 
within the constraints of the programming language to get that, whether 
you are using Python, Pascal or Haskell.
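
In Python, "working within the constraints of the language" for 
something like a range-restricted integer usually means a runtime check 
rather than a compile-time type; a rough sketch (the bounds and the 
function name are invented for the example):

def check_small_int(n, low=3, high=20):
    # Runtime stand-in for a Pascal-style subrange type 3..20.
    if not isinstance(n, int) or not (low <= n <= high):
        raise ValueError("expected an int in %d..%d, got %r"
                         % (low, high, n))
    return n

x = check_small_int(17)   # fine
x = check_small_int(27)   # raises ValueError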


> That's why I see them as
> connected. All sorts of flexibilities are possible when the programmer
> explicitly tells the compiler what the rules are.

And you can still do that, *when you need to*. Assuming the type system 
has a way of specifying "integers that include the letter V", you can 
specify it when you want it. 
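
Python's type hints can't express "contains the letter V" either, but 
the general pattern with an inference-based checker is the same: you 
only spell out a type when the inferred one isn't what you want 
(a mypy-style sketch):

x: float = 5     # declared float, even though 5 alone would infer int
x = 2.5          # fine, because we said so explicitly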


> Static and dynamic typing both have their uses. But when I use static
> typing, I want to specify the types myself. I'm aware that's a matter of
> opinion, but I don't like the idea of the compiler rejecting code based
> on inferred types.

Well, so long as you admit it's an irrational preference :-)

The bottom line is, if the compiler rejects code, it's because that code 
has a bug. There's *no difference* between the compiler telling you that 
you can't add a string and an int when you've explicitly declared the 
types, and the compiler telling you that you can't add a string and an 
int when it has worked out for itself that those are the types, because 
that's all they can be.




-- 
Steven


