[Tutor] recursive problem

Steven D'Aprano steve at pearwood.info
Sat Sep 11 19:18:19 CEST 2010


On Fri, 10 Sep 2010 09:30:23 pm Ewald Horn wrote:

> While EAFP is great, it's not
> always the best way to proceed. 

That, at least, is correct. But what you say next is not:

> Sometimes you want programs to be 
> very robust and secure, this is where LBYL comes in - it is quite
> often used in online transaction processing and other areas where
> absolute certainty is more important than any other consideration.

If what you are saying is correct, and I doubt seriously that it is, 
then chances are good that they're not succeeding in their aim.

> EAFP tends to use less code and is faster to use, while LBYL
> principles makes your program more bulletproof.

That's almost 100% backwards.

Let's take a simple example: you want to open a file, and deal with the 
case of the file being missing:

filename = 'myfile.txt'  # whatever...
fp = open(filename)
do_something_with(fp)

Let's add some error checking. With EAFP, you get this:

try:
    fp = open(filename)
except IOError:
    handle_error()


With LBYL, you get:

import os

if os.path.exists(filename):
    fp = open(filename)
else:
    handle_error()


The amount of code is about the same, but the try...except block 
automatically handles a whole slew of errors -- missing files, 
permission denied, bad file names, corrupt disks, all sorts of things 
that would be difficult or tedious to Look Before You Leap. Some of 
these things -- like disk corruption -- you simply can't check ahead of 
time. There is no way of knowing if a file is corrupt without actually 
opening and/or reading from it.
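
To make that concrete, here is a minimal sketch of how one try block can 
tell several failures apart. It assumes Python 3.3 or later, where 
IOError is an alias of OSError and has subclasses such as 
FileNotFoundError and PermissionError. The name try_open and the 
reason strings are made up for illustration.

```python
import errno
import os
import tempfile

def try_open(filename):
    """Return (file_object, None) on success, or (None, reason) on failure."""
    try:
        return open(filename), None
    except FileNotFoundError:
        return None, "missing"
    except PermissionError:
        return None, "permission denied"
    except OSError as err:
        # Anything else the OS reports: bad file names, I/O errors, and so on.
        return None, errno.errorcode.get(err.errno, "unknown OS error")

# Try a file that cannot exist, inside a freshly created empty directory.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")
fp, reason = try_open(missing)
```

There is no LBYL equivalent for the last clause: you would have to guess 
every possible failure ahead of time and write a separate test for each.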

It gets worse. Your computer is a multitasking system. Virtually all 
computers are these days; yes, even the iPhone. Even if os.path.exists 
returns True, there is no guarantee that the file will still be there a 
millisecond later when you try to open it. Perhaps the operating 
system, or some other process, has deleted the file or renamed it. 
That's a bug waiting to happen -- a "race condition".
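
You can simulate the race in a single script. This is only a sketch: the 
os.remove call stands in for whatever other process deletes the file 
between the check and the open, and the names looked_ok and leap_failed 
are made up for illustration.

```python
import os
import tempfile

# Create a real file so that the existence check succeeds.
fd, filename = tempfile.mkstemp()
os.close(fd)

looked_ok = os.path.exists(filename)   # Look: the file is there.
os.remove(filename)                    # Stand-in for another process deleting it.

try:
    fp = open(filename)                # Leap: too late, the file is gone.
except FileNotFoundError:
    leap_failed = True
else:
    fp.close()
    leap_failed = False
```

The check passed and the open still failed, so the check bought you 
nothing.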

So if you're silly, you write this:

if os.path.exists(filename):
    try:
        fp = open(filename)
    except IOError:
        handle_error()
else:
    handle_error()

If you're sensible, you realise that for reliable, secure code, checking 
for existence *before* opening the file is a waste of time and energy. 
It's barely acceptable for quick and dirty scripts, certainly not for 
high reliability applications.

This is not the only sort of race condition. Imagine you're writing one 
of these high reliability online transactions you talked about, and you 
want to transfer money from one account to another:

amount = 1000.00
if balance >= amount:
    transfer(old_account, new_account, amount)
else:
    insufficient_balance()


Wait a second... that looks almost exactly like the LBYL code above, and 
it is vulnerable to the same sort of race condition if multiple 
processes can connect to the account at the same time. Does your bank 
allow you to log in twice? Does it have automatic transfers? If so, 
then one process can be transferring money while the other is checking 
the balance, and you have a bug waiting to happen.
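
Here is the same bug played out by hand, as a sketch. Two imaginary 
processes, A and B, each run the check above against the same account, 
and the interleaving is written out explicitly so the result is 
repeatable. The variable names are made up for illustration.

```python
balance = 1000.00
amount = 1000.00

# Both processes look before either one leaps.
a_may_proceed = balance >= amount
b_may_proceed = balance >= amount

# Now both leap, each trusting a check that is already out of date.
if a_may_proceed:
    balance -= amount
if b_may_proceed:
    balance -= amount
```

Both checks passed, both transfers went through, and the account ends up 
1000.00 overdrawn.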

In practice, the banks allow accounts to become temporarily overdrawn, 
and often charge you for the privilege. And they write complicated code 
that looks like this:


import time

lock_id = lock_account()  # Stop anything else from transferring funds.
while lock_id == 0:
    # Lock failed, wait a second and try again.
    time.sleep(1)
    lock_id = lock_account()
    if number_of_attempts() > 10:
        handle_error("internal error, please try again")
# Now it's safe to check the balance.
if balance >= amount:
    transfer(old_account, new_account, amount, lock_id)
else:
    insufficient_balance()
# Don't forget to unlock the account, or there will be trouble later!
errcode = unlock_account(lock_id)
if errcode != 0:
    # This should never happen. If it does, it might mean the lock ID 
    # is incorrect (how?), but probably means the database is corrupt.
    log_serious_error(errcode, lock_id)


It's ugly and error-prone, but it's also a kind of EAFP: instead of 
checking whether a lock is available, and then taking it, you just try 
to acquire a lock, and deal with the consequences of not receiving one 
if it fails. The only difference is that you're manually checking an 
error code rather than catching an exception.
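
For comparison, here is a rough sketch of the same pattern using the 
standard library's threading.Lock. It only protects threads inside one 
process, not a real bank's database. The names account_lock and 
withdraw, and the retry settings, are assumptions for illustration.

```python
import threading
import time

account_lock = threading.Lock()   # hypothetical lock guarding one account
balance = 1000.00

def withdraw(amount, attempts=10, delay=0.01):
    """Return True if the money was taken, False if funds were short."""
    global balance
    for _ in range(attempts):
        # Just try to take the lock; don't ask first whether it is free.
        if account_lock.acquire(blocking=False):
            break
        time.sleep(delay)
    else:
        raise RuntimeError("internal error, please try again")
    try:
        # Now it's safe to check the balance.
        if balance >= amount:
            balance -= amount
            return True
        return False
    finally:
        # Always unlock, even if something above raised.
        account_lock.release()

first = withdraw(1000.00)
second = withdraw(1000.00)
```

The try...finally takes care of the "don't forget to unlock" step, which 
is one less thing to get wrong.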

Whatever mechanism is used for EAFP, it is most often shorter, simpler, 
more reliable and safer than LBYL.

So why would anyone ever use LBYL? Well, sometimes it is more convenient 
for quick and dirty scripts, such as using os.path.exists. But more 
importantly, sometimes you need a transaction to apply in full, or not 
at all. You can't do this:

try:
    do_this()
    do_that()
    do_something_else()
except Exception:
    do_error()


because if do_this() succeeds and do_that() fails, you might leave your 
data in a seriously inconsistent or broken state. You could do this:


failed = False
save_state()
try:
    do_this()
    try:
        do_that()
        try:
            do_something_else()
        except Exception:
            rollback()
            failed = True
    except Exception:
        rollback()
        failed = True
except Exception:
    rollback()
    failed = True
if failed:
    do_error()
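
If the state lives in an ordinary Python object, the save and rollback 
steps can be sketched with a deep copy. This is only an illustration 
under that assumption: all_or_nothing, the step functions and the state 
dict are made up here, and real data on disk or in a database needs a 
lot more care.

```python
import copy

def all_or_nothing(state, steps):
    """Apply every step to state, or leave state exactly as it was."""
    saved = copy.deepcopy(state)      # save_state()
    try:
        for step in steps:
            step(state)
    except Exception:
        state.clear()                 # rollback()
        state.update(saved)
        return False
    return True

def do_this(state):
    state["this"] = "done"

def do_that(state):
    raise ValueError("do_that failed")

state = {"balance": 1000}
ok = all_or_nothing(state, [do_this, do_that])
```

After the failed run, state is back to what it was before do_this() 
touched it.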


Or you could do this:

if do_this_will_succeed() and do_that_will_succeed() \
and do_something_else_will_succeed():
    do_this()
    do_that()
    do_something_else()
else:
    do_error()

But that hasn't done anything to prevent race conditions. So the real 
reason people use LBYL is that they're too lazy to write hideously 
ugly, but reliable, code, and they're just hoping that they will never 
expose the race condition. (Often this is a pretty safe hope, but not 
always.)

And now you know why ACID-compliant databases are so complex.
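
To see what a database buys you, here is a small sketch using the 
standard library's sqlite3 module. The accounts table, the CHECK 
constraint and the transfer function are assumptions made up for this 
example. The credit is deliberately done before the debit, so that a 
failed debit shows the credit being rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst in full or not at all."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
    except sqlite3.IntegrityError:
        # The CHECK constraint refused to let the balance go negative.
        return False
    return True

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES ('old', 1000), ('new', 0)")
conn.commit()

first = transfer(conn, "old", "new", 600)
second = transfer(conn, "old", "new", 600)   # would overdraw 'old'
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

Notice there is no "if balance >= amount" anywhere. The code just tries 
the transfer and lets the database say no.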



-- 
Steven D'Aprano

