A modest indentation proposal

Mon Dec 3 00:20:12 EST 2001

On Sat, 1 Dec 2001 16:28:26 -0800 (PST), brueckd at tbye.com <brueckd at tbye.com>
wrote:
>On 1 Dec 2001, Kragen Sitaker wrote:
>
>> I'm interested to hear what you find to be common causes of problems
>> when you're programming in Python.
>
>Here's some long-winded ones off the top of my head:
>- I've yet to find a really good way to organize modules and functions
>I've written and know I'll want to use again later. If there's something
>annoying or missing in Python then usually a small function or module
>will fix that, but then I end up littering my home and work computers with
>various copies of different versions of that same thing, where some are
>accessed by being in the same directory as whatever program I'm working
>on, others by making sure the Python path includes some dump-all
>directory, and others by modifying the Python path at runtime. Maybe the
>solution is sort of a private CPAN-like thing, but I've never gotten
>around to working on it.

I have a 'local' package, with various sub-modules.  Yes, versioning is a
problem.  I edit a local copy of the package, and have a script that runs the
unit tests, increments a 'version' file, and copies my local copy to the
system library directory.  The unit tests hopefully catch any changes that
could break existing code.
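
A rough sketch of that kind of release script, in present-day Python 3.
Every name here (bump_version, release, the 'version' file, the run_tests
callable) is made up for illustration and is not the actual script; plug in
whatever runs your unit tests.

```python
import shutil
from pathlib import Path


def bump_version(version_file):
    """Increment the integer stored in version_file (starting from 0)."""
    path = Path(version_file)
    current = int(path.read_text().strip()) if path.exists() else 0
    path.write_text(f"{current + 1}\n")
    return current + 1


def release(package_dir, install_dir, run_tests):
    """Copy package_dir into install_dir, but only if run_tests() is true.

    run_tests is any zero-argument callable that returns True on success,
    for example a wrapper around unittest.  Returns the new version number,
    or None if the tests failed and nothing was copied.
    """
    package_dir = Path(package_dir)
    if not run_tests():
        return None
    version = bump_version(package_dir / "version")
    target = Path(install_dir) / package_dir.name
    # dirs_exist_ok lets a new release overwrite the previously installed copy.
    shutil.copytree(package_dir, target, dirs_exist_ok=True)
    return version
```

The point of the ordering is that a failing test run leaves both the version
file and the installed copy untouched.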

The question of where to put them arises for dynamic libraries in any
language.  If they're truly global, site-packages works; otherwise I hard-code
sys.path.insert(0, '/home/dir/lib/py').  Playing games with symlinks might be
wise if you don't trust your home dir to stay in the same place, but I can't
think of any better solutions under unix.
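
For what it's worth, a slightly less brittle version of the hard-coded
insert, as a sketch only.  LOCAL_LIB and add_local_lib are names invented
here, and ~/lib/py is just an assumed layout:

```python
import os
import sys

# Assumed location of the private library directory; change to suit.
LOCAL_LIB = os.path.join(os.path.expanduser("~"), "lib", "py")


def add_local_lib(path=LOCAL_LIB):
    """Put path at the front of sys.path unless it is already listed."""
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


add_local_lib()
```

Expanding "~" at runtime at least survives the home directory moving, which
a literal '/home/dir/lib/py' does not.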

Under plan9 the problem is nicely solved by unioning $home/lib/python onto
/sys/lib/python but that doesn't help the unix people.

>- Too often I fail to use neat Python idioms, instead relying on using C,
>C++, Java, etc. techniques in Python. I'm anxious to read anyone's books
>on Python "patterns" or programming "gems" as they represent potentially
>untapped power.

Hmm, can't think of whether I do this or not, but I'm no expert.  I do know
there are a lot of clever dynamic things you can do in python, but with the
exception of a few getattr()s and setattr()s I've almost always regretted
using them.

For example, I'm often tempted to use __getattr__ for delegation or lazy
evaluation or plain old getters/setters.  But (IMO) language flaws make
__getattr__ and __setattr__ very hard to use, because of their interaction
with all the __magic__ attributes.  And then you throw in built-in types that
don't have the magic attributes in the first place even though they support
the operations (len() for instance).  You can seriously mess up your day by
having __del__ and __getattr__ within 15 feet of each other.
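
To make the delegation case concrete, here is a small sketch written against
present-day Python 3, so the details differ from the 2.1-era classic classes
discussed above (there, __getattr__ was consulted for the magic names too).
Proxy is a made-up class for illustration.

```python
class Proxy:
    """Forward any attribute the proxy lacks to a wrapped object."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal lookup fails.  If _target itself is
        # missing (say __init__ never ran), give up instead of recursing.
        if name == "_target":
            raise AttributeError(name)
        return getattr(self._target, name)


p = Proxy([1, 2, 3])
p.append(4)               # ordinary methods are delegated
explicit = p.__len__()    # an explicit attribute lookup is delegated too

try:
    len(p)                # but the builtin looks on the type, not the instance
    implicit_ok = True
except TypeError:
    implicit_ok = False
```

So the proxy quietly supports p.__len__() while len(p) fails, which is the
sort of mismatch that makes these hooks hard to get right.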

__del__ is understandably dangerous magic, but the other two are so useful I'm
continually tempted to use them.  (A minor complaint about __del__: it still
gets called if __init__ raises, so be careful about unlocking resources that
may never have been locked; there's no way to tell whether they were.)
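
A sketch of the __init__/__del__ trap and one possible workaround that is
not part of the original complaint: record a flag before anything can fail,
and have __del__ check it.  Guarded and its 'released' log are invented for
the example, and the timing of __del__ shown here relies on CPython's
reference counting.

```python
import gc


class Guarded:
    released = []  # class-level log so the example can be inspected

    def __init__(self, fail=False):
        self.locked = False          # set the flag before anything can raise
        if fail:
            raise ValueError("could not acquire the resource")
        self.locked = True

    def __del__(self):
        # __del__ runs even for a half-built object, so consult the flag
        # (with a default, in case __init__ died before setting it).
        if getattr(self, "locked", False):
            Guarded.released.append("unlock")
        else:
            Guarded.released.append("nothing to unlock")


try:
    Guarded(fail=True)
except ValueError:
    pass

ok = Guarded()
del ok
gc.collect()
```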

The latest changes in 2.2 do a lot to address this, I think, but at the cost
of two kinds of classes, two sets of method lookup rules, and a semi-Smalltalk
style single-rooted hierarchy sitting alongside the old-style len(x) and
str(y), which only exist because there was no common ancestor.

We already have more than enough magic method names, yet there are more every
day and I can barely keep up with them:  __all__, __dynamic__, __eq__ vs.
__cmp__, __help__ __I'm__ __losing__ __my__ __mind__.

Since new magic attributes seem to pop up every day, what about existing
__getattr__-using code that had all the holes patched up, but is now going to
explode because suddenly python wants to look up an __eq__ method where it
never did before?
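
One defensive habit, sketched here in present-day Python 3 and offered as an
assumption rather than a cure: have __getattr__ refuse every underscore name
outright, so a magic name it has never heard of fails cleanly instead of
being "handled".  LazyRecord and its loader argument are hypothetical.

```python
class LazyRecord:
    """Compute missing attributes on demand with a caller-supplied loader."""

    def __init__(self, loader):
        self._loader = loader

    def __getattr__(self, name):
        # Refuse private and magic names: this covers hooks that did not
        # exist when the class was written, and avoids recursing on
        # _loader if __init__ never ran.
        if name.startswith("_"):
            raise AttributeError(name)
        value = self._loader(name)
        setattr(self, name, value)   # cache it; later lookups skip __getattr__
        return value


calls = []


def loader(name):
    calls.append(name)
    return name.upper()


r = LazyRecord(loader)
first = r.title
second = r.title
```

The loader is only ever asked about plain names, and only once per name.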

>- I sometimes abuse dynamic types by letting function parameters take on
>too large a set of meanings. One IMO beautiful example of where this works
>well and is ok is in asynchat, where you tell the module to read data from
>a socket until some terminator is reached. If the terminator is None, it
>reads all data, if the terminator is a number, it means read that many
>bytes, and if it's a string, it means read until that token is found in
>the data. Unfortunately, in my own code there's rarely that level of
>cohesion among the different meanings of a parameter.

Hmm, try this rule of thumb: avoid type().  Distinguishing between None and a
value is very handy and valid in a reference oriented language.  But if you
use type() only when you really mean it, you shouldn't have trouble with
too-clever guess-what-you-mean functions.
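
A sketch of the asynchat-style terminator convention described in the quoted
text, to show where the line falls.  read_chunk is a made-up helper working
on a plain string, not asynchat's actual code: the None test is a plain
identity check, and the one real type test is there because a count and a
token genuinely are different things.

```python
def read_chunk(data, terminator=None):
    """Split data into (chunk, rest) according to terminator.

    None  -> take everything
    int   -> take that many characters
    str   -> take up to (not including) the first occurrence of the token
    """
    if terminator is None:
        return data, ""
    if isinstance(terminator, int):
        return data[:terminator], data[terminator:]
    head, sep, tail = data.partition(terminator)
    if not sep:
        # Token not seen yet: nothing is complete, keep it all for later.
        return "", data
    return head, tail
```

For instance, read_chunk("ab\r\ncd", "\r\n") gives ("ab", "cd"), and
read_chunk("abcdef", 2) gives ("ab", "cdef").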