Are the critiques in "All the things I hate about Python" valid?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Feb 26 22:25:35 EST 2018


On Mon, 26 Feb 2018 17:18:38 -0800, Rick Johnson wrote:

[...]
> So, for instance: if your birthday is January 25th 1969, the last second
> of the last day of your _first_ year is January 24th 1970 @ 11:59:59PM.
> And the last second of the last day of your _second_ year is January
> 24th 1971 @ 11:59:59PM. And so forth...
> 
> Does this make sense?

Indeed it does, and frankly, the Racehorse scheme is better.

At least with the Racehorse scheme, you only need to update the database 
once a year, at midnight on the new year, which hopefully is the quietest 
time of the year for you. You can lock access to the database, run the 
update, and hopefully be up and running again before anyone notices 
anything other than a minor outage.

With your scheme, well, I can think of a few ways to do it, none of which 
are good. A database expert might be able to think of some better ideas, 
but you might:

Plan 1: run a separate scheduled job for each record, which does nothing but 
advance the age by one at a certain time, then sleep for a year. If you 
have ten million records, you need ten million scheduled jobs; I doubt 
many scheduling systems can cope with that many jobs. (But I welcome 
correction.)

Also, few scheduling systems guarantee that jobs will execute at 
*precisely* the time you expect. If the system is down at the time the 
job was scheduled to run, it may never run at all. So there is likely 
to be a lag between when you want the records updated, and when they 
actually are updated.

No, using scheduled jobs is fragile, and expensive.


Plan 2: have a single job that does nothing but scan the database, 
continuously in a loop, and if a record's birthdate is more than a year 
in the past, and hasn't been updated in the last year, update the age by 
one.

Actually, I think this sucks worse than the ten-million-scheduled-jobs 
idea. Hopefully it will be obvious why this idea is so awful.


Plan 3: have a trigger that runs whenever a record is queried or 
accessed. If the birthdate is more than a year in the past, and it has 
been more than a year since the last access, then update the age.

This at least doesn't *entirely* suck. But if you're going to go to the 
trouble of doing this on *every* access to the record, isn't it simpler 
to just make the age a computed field that calculates the age when needed?
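
To make the computed-field idea concrete, here is a minimal sketch in 
Python. The Person class and its age() method are hypothetical names made 
up for illustration, not part of any real schema; the point is only that 
the record stores the birthdate and nothing else, and the age is derived 
on demand.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    # Hypothetical record type: only the birthdate is stored, never the age.
    name: str
    birthdate: date

    def age(self, today: Optional[date] = None) -> int:
        """Return the age in whole years as of `today` (default: the current date)."""
        if today is None:
            today = date.today()
        years = today.year - self.birthdate.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.birthdate.month, self.birthdate.day):
            years -= 1
        return years
```

With Rick's example birthdate, Person("Rick", date(1969, 1, 25)) reports 
an age of 0 on 24 January 1970 and 1 on 25 January 1970, with no update 
job and no trigger involved.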

Computing the age is not that expensive, especially if you 
store the birthdate in seconds. It's just a subtraction, maybe followed 
by a division if you want the age in years. It hardly seems worthwhile 
storing the age as a pre-computed integer if you then need a cunning 
scheme to possibly update that integer on every access to the record.
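
As a rough sketch of that arithmetic, assuming the birthdate is stored as 
seconds since the Unix epoch: the function name and the SECONDS_PER_YEAR 
constant below are my own, and using a mean year length is an 
approximation that can be off by a day or so close to a birthday.

```python
import time
from typing import Optional

# Mean Gregorian year in seconds. This is an assumption made for the sketch;
# it ignores exactly where the leap days fall.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60


def age_in_years(birth_ts: float, now_ts: Optional[float] = None) -> int:
    """Approximate age in whole years from a birth timestamp in seconds."""
    if now_ts is None:
        now_ts = time.time()
    # One subtraction, then one division to convert seconds to years.
    return int((now_ts - birth_ts) // SECONDS_PER_YEAR)
```

If exact calendar birthdays matter, compare the month and day rather than 
dividing by a fixed year length; either way the cost per lookup is a 
handful of arithmetic operations.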

I think that Rick's "optimization" here is a perfect example of 
pessimisation (making a program slower in the mistaken belief that you're 
making it faster). To quote W.A. Wulf:

"More computing sins are committed in the name of efficiency (without 
necessarily achieving it) than for any other single reason — including 
blind stupidity."



-- 
Steve



