[XML-SIG] Ugh! Why are DOM access methods spelled with a leading '_'?

Uche Ogbuji uogbuji@fourthought.com
Tue, 27 Jun 2000 08:23:48 -0600


> Jim Fulton wrote:
> 
> > > In 4DOM, we are actually moving away from __getattr__ (for speed).
> > 
> > IMO, this is strong evidence that the Python DOM should
> > *not* use attributes for implementing the DOM/IDL attributes.
> 
> Direct attribute access is MUCH, MUCH faster than a method call, whether
> through getattr or not. Under the current implementation we have the
> option of caching attributes for performance. An all method design would
> take that away. Minidom uses direct attributes for 95% of what it does.
> I think I used one getter method to lazily evaluate attributes.


These are the optimizations we've been using, and 4DOM has gained tremendously 
in speed.  Funnily enough, the biggest remaining speed barrier is the fact 
that we also have to support method access.  This makes it very attractive to 
simply use attribute access, simplify our code and then get a speed boost.

There are other structural optimizations we're looking at (and one
deserialization optimization Lars suggested), but none of them has anything
to do with attribute access.
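
The attribute caching mentioned in the quoted message can be sketched roughly
as follows. This is only an illustration, not 4DOM or minidom code: the class
name LazyNode, the _raw_children field and the compute_count counter are all
made up for the example. The idea is that __getattr__ runs only when normal
lookup fails, so once the computed value is stored on the instance, later
reads are plain attribute accesses.

```python
class LazyNode:
    """Hypothetical node that builds 'childNodes' on first access and caches it."""

    def __init__(self, raw_children):
        self._raw_children = raw_children
        self.compute_count = 0  # counts how often the slow path runs

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # i.e. before the value has been cached on the instance.
        if name == "childNodes":
            self.compute_count += 1
            value = list(self._raw_children)
            # Store the result in the instance dict so that later reads
            # are direct attribute accesses and never reach this method.
            self.__dict__["childNodes"] = value
            return value
        raise AttributeError(name)


node = LazyNode(["a", "b"])
first = node.childNodes   # slow path: computed and cached
second = node.childNodes  # fast path: found directly on the instance
```

An all-method interface has no equivalent of the second, direct read: every
access stays a method call.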


> > I'd like it to be as easy as possible for various objects to implement
> > the DOM. (See for example StructuredTextNG.) I'd hate to make implementers
> > go through the pain and performance hit of getattr or dictate an implementation
> > (like caching attributes or otherwise directly storing them, creating
> > memory leaks).
> 
> I guess the question is who do we cater to? Heretofore it has been DOM
> users first, DOM implementors second. I don't think that we should turn
> that around based on the argument that all Python objects will have a
> DOM interface soon.


That was my feeling.  Yes, it's easier to code as methods, but we should be 
considering making things easier for users, even if implementors have to jump 
through hoops.


> To me, this looks like Python:
> 
> a=b.childNodes[0].attributes["abc"]
> 
> and this looks like Java:
> 
> a=b.getChildNodes()[0].getAttributes()["abc"]
> 
> The second grates on me as having interface enforced because of
> implementation limitations. (which is what all of this griping about
> getattr being slow boils down to...isn't it better to fix that problem
> once, for everyone than to work around it a hundred times?) It also
> drives me crazy that the latter always invokes a method call even when
> it is stored underneath as a simple attribute.
> 
> Surely there is some imaginative way to make life easier for your
> implementors using base classes. For instance, wouldn't it be nice for
> you to automatically set up the attributes list based on Python
> attributes? Something like:
> def __getattr__( self, name ):
> 	if name=="attributes":
> 		keys=self.__dict__.keys()
> 		values=map( str, self.__dict__.values() )
> 		return JimsAttributeList( keys, values )
> 	raise AttributeError( name )
> 


We've done exactly this by making Node a sort of "attribute manager".  It does
make implementation much easier, and has led to a good deal of flexibility.
For instance, when we released a ZDOM version, we had to change little more
than Node.

This works for us even through the long inheritance line from, say, Node ->
HTMLTableElement.
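
A minimal sketch of what such an "attribute manager" base class might look
like, assuming accessor methods named with a _get_ prefix (an assumption taken
from the underscore-prefixed methods in the subject line). This is not the
actual 4DOM source, and the Element and TableElement classes here are
simplified stand-ins. The base class turns attribute reads into accessor
calls, so subclasses only add _get_ methods and never touch __getattr__.

```python
class Node:
    """Hypothetical base class that maps attribute reads to _get_<name> methods."""

    def __init__(self):
        self._children = []

    def __getattr__(self, name):
        # Reached only when normal lookup fails. Look for an accessor on the
        # class (including inherited ones) and call it for this instance.
        getter = getattr(type(self), "_get_" + name, None)
        if getter is None:
            raise AttributeError(name)
        return getter(self)

    def _get_childNodes(self):
        return self._children

    def _get_firstChild(self):
        return self._children[0] if self._children else None


class Element(Node):
    def __init__(self, tag_name):
        Node.__init__(self)
        self._tag_name = tag_name

    def _get_tagName(self):
        return self._tag_name


class TableElement(Element):
    """Stand-in for a deeply derived class such as HTMLTableElement."""

    def __init__(self):
        Element.__init__(self, "TABLE")
        self._caption = None

    def _get_caption(self):
        return self._caption


table = TableElement()
tag = table.tagName        # attribute access, resolved by Node.__getattr__
same = table._get_tagName()  # method access remains available
kids = table.childNodes
```

Because the dispatch lives in one place, swapping in a different storage
strategy would mean changing the base class only.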

We ourselves also use attribute access almost exclusively on 4DOM because
it's cleaner and faster.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +01 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python