Encapsulation, inheritance and polymorphism

Wed Jul 18 08:01:06 EDT 2012

On 18.07.2012 11:06, Lipska the Kat wrote:
> On 18/07/12 01:46, Andrew Cooper wrote:
>> Take for example a Linux system call handler.  The general form looks a
>> little like (substituting C for python style pseudocode)
>>
>> if not (you are permitted to do this):
>>      return -EPERM
>> if not (you've given me some valid data):
>>      return -EFAULT
>> if not (you've given me some sensible data):
>>      return -EINVAL
>> return actually_try_to_do_something_with(data)
>>
>> How would you program this sort of logic with a single return statement?
>>   This is very common logic for all routines for which there is even the
>> remotest possibility that some data has come from an untrusted source.
>
> Eeek! well if you insist (type bound)
>
> someType -EPERM
> someType -EFAULT
> someType -EINVAL
> someType -EDOSOMETHING
>
> //method
> someType checkSomething(data){
>
> someType result = -EINVAL //or your most likely or 'safest' result
>
> if not (you are permitted to do this):
>        result = -EPERM
> if not (you've given me some valid data):
>        result = -EFAULT
> if not (you've given me some sensible data):
>        result = -EINVAL
> else
>        result = -EDOSOMETHING
>
>        return result
> }
> //cohesive, encapsulated, reusable and easy to read

This is a classic discussion topic: whether single exit (SE) functions
should be used or not. I see two problems with them:
1. In the presence of exceptions, every function has at least two
possible ways out: one returns a value (or None, in Python), the other
throws an exception. For that reason, trying to achieve SE is a
dangerous illusion. The syscall handler above is C, which, unlike Java,
C++ or Python, has no exceptions, so it doesn't have that second path.
2. The biggest problem with SE functions is that you often need to skip
over lots of code before you finally find out that a fault detected at
the very beginning means nothing else happens inside the function
before it returns to the caller. A typical symptom is deeply nested
if-else structures. Another symptom is result variables that are checked
multiple times to effectively skip over the rest of the function, which
"unrolls" the nested if-else structures. Yet another symptom is a very
fine granularity of microscopic functions, which is effectively a
distributed nest of if-else structures.
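
To make the first two symptoms concrete, here is a rough sketch of the
same handler written three ways. The function names, the `allowed` flag,
the "valid means str, sensible means non-empty" rules and `process()`
are all made up for illustration; only the errno-style return codes
follow the handler quoted above.

```python
import errno


def process(data):
    """Hypothetical stand-in for the real work."""
    return len(data)


def handler_nested(allowed, data):
    # Symptom: deeply nested if-else, needed to reach one single return.
    if allowed:
        if isinstance(data, str):
            if data:
                result = process(data)
            else:
                result = -errno.EINVAL
        else:
            result = -errno.EFAULT
    else:
        result = -errno.EPERM
    return result


def handler_flagged(allowed, data):
    # Symptom: the result variable is re-checked at every step in order
    # to skip the rest of the function ("unrolled" nesting).
    result = 0
    if not allowed:
        result = -errno.EPERM
    if result == 0 and not isinstance(data, str):
        result = -errno.EFAULT
    if result == 0 and not data:
        result = -errno.EINVAL
    if result == 0:
        result = process(data)
    return result


def handler_early(allowed, data):
    # Early returns: a failed check ends the function right there.
    if not allowed:
        return -errno.EPERM
    if not isinstance(data, str):
        return -errno.EFAULT
    if not data:
        return -errno.EINVAL
    return process(data)
```

All three compute the same result for the same input; they differ only
in how far you have to read to see what happens after a failed check.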

Coming back to Python, this would look like this:

    if not /you are permitted to do this/:
        raise NotPermitted("go awai!")
    if not /you've given me valid data/:
        raise TypeError("unexpected input")
    if not /you've given me sensible data/:
        raise ValueError("invalid input")
    # do stuff here...
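
For anyone who wants to run this, here is a minimal sketch with the
placeholders filled in. `NotPermitted` is not a Python builtin, so it is
defined here as a plain Exception subclass; the function name
`double_positive`, the `permitted` flag and the concrete checks (an int
that is not negative) are assumptions chosen only to have something to
execute.

```python
class NotPermitted(Exception):
    """Hypothetical exception type; not a Python builtin."""


def double_positive(permitted, data):
    # Guard clauses: each failed check leaves the function at once.
    if not permitted:
        raise NotPermitted("go awai!")
    if not isinstance(data, int):
        raise TypeError("unexpected input")
    if data < 0:
        raise ValueError("invalid input")
    # From here on, all preconditions are known to hold.
    return data * 2
```

Everything below the last check can be read on its own, without keeping
the error cases in mind.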

If you shoehorn this into an SE function (which you can't do if 
something in between might throw), then it probably looks like this:

    error = None
    if not /you are permitted to do this/:
        error = NotPermitted("go awai!")
    elif not /you've given me valid data/:
        error = TypeError("unexpected input")
    elif not /you've given me sensible data/:
        error = ValueError("invalid input")
    else:
        result = ...  # do stuff here...
    if error:
        raise error
    else:
        return result
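
The remark about something in between throwing can be shown with a tiny
sketch. `parse_port` and `try_parse_port` are made-up names; the only
library behaviour relied on is that the builtin int() raises ValueError
for text that is not a number.

```python
def parse_port(text):
    # Looks like a single-exit function: one return, at the very end.
    result = int(text)  # int() raises ValueError on non-numeric text
    return result


def try_parse_port(text):
    # Report which of the two exits parse_port() actually took.
    try:
        return ("returned", parse_port(text))
    except ValueError:
        return ("raised", None)
```

Calling try_parse_port("8080") takes the visible exit, while
try_parse_port("http") leaves parse_port() through the exception, even
though that function contains exactly one return statement.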

> //later
>
> if(checkSomething(data) == EDOSOMETHING){
>
>      actually_try_to_do_something_with(data)
> }
> else{
>      //who knows
> }

Interestingly, you suggest dividing the original function into one that
verifies some conditions and one that does the actual work. To me, an
early return is like drawing a big red line inside a function, along
which it can be split into two sections. IMHO this makes it equally
clear, even clearer, since I don't have to locate and read that other
function. My bioware parses this so that if the first part succeeds, the
second part can be read independently of it, which reduces the amount
of info to keep in mind at a time.

Also, when changing code, I don't have to locate other places where the
utility function (checkSomething) is called. Since the code is inline, I
know that only this one function is affected. (Python also allows local
functions, which can be very(!!) useful if you do want a separate
helper that nobody else can call.) Yes, this is in direct contrast to
the reusability you mentioned. Neither ease of change nor reusability is
a goal in and of itself though, so this is not a black-or-white question
and a compromise can be good enough. It's a question of taste,
experience, phase of the moon, caffeination levels etc.
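
As a sketch of the local-function variant: the checks live in a helper
that is defined inside the function using it, so nothing outside can
call it and a change stays local. The names `store` and `check`, and the
rule that a record is a dict with an "id" key, are invented for this
example; PermissionError is a builtin in Python 3.3 and later.

```python
def store(record, allowed):
    def check():
        # Local helper: only visible inside store(), and it can read
        # store()'s arguments directly.
        if not allowed:
            raise PermissionError("not permitted")
        if not isinstance(record, dict):
            raise TypeError("expected a dict")
        if "id" not in record:
            raise ValueError("record has no id")

    check()
    # The actual work starts here; the checks above have all passed.
    return record["id"]
```

The checks are still grouped under one name, but there is no second
caller to worry about when they change.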

:)

Uli