[4suite] Re: [XML-SIG] problem with the error " ImportError: cannot import name boolean"

28 Jul 2002 00:10:59 -0600

On Sat, 2002-07-27 at 23:56, Uche Ogbuji wrote:
> On Sat, 2002-07-27 at 17:42, Mike Olson wrote:
> > On Fri, 2002-07-26 at 23:52, Uche Ogbuji wrote:
> > > On Fri, 2002-07-26 at 13:49, Mike Olson wrote:
> > > > On Fri, 2002-07-26 at 12:00, Uche Ogbuji wrote:
> > > > > 
> > > > > Can't we fix this with an import hook inside Ft/Lib/__init__.py?
> > > > > 
> > > > > I'm sure that's not clear, so I'll do a bit of experimentation and
> > > > > report back.
> > > > 
> > > > 
> > > > Nope, unless we move DistExt.
> > > > 
> > > > The problem is that in setup.py we do "from Ft.Lib import DistExt".  If,
> > > > in Ft/Lib/__init__.py you mess with boolean, which does not exist at
> > > > install time, then we won't be able to install.
> > > 
> > > Was this meant to be a challenge?  No better way to get me to fix the
> > > problem  :-)
> > > 
> > 
> > Oh yah, XSLT and RDF are slow and I don't think these are fixable :)
> 
> I disagree.  They are quite "fixable", but I'm personally not interested
> in any more non-trivial optimizations to this generation of XSLT and
> RDF.  I'm looking forward to 4Suite 1.0 so that we can refactor the lot.

Just to point out what I know about optimizing XSLT and RDF, in case
someone else is feeling extraordinarily ambitious:

When I was putting a lot of work into optimizing Versa at a client's
behest, and seeing substantial results, I would occasionally notice
parallel optimizations that could be made in XPath.  We do very little
intelligent truncation of result sets in XPath (stopping evaluation
once enough nodes have been found), and very little caching of partial
results.  Implementing such optimizations would make a dramatic
difference.
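
To make the two ideas concrete, here is a minimal sketch in Python.  It
is not 4Suite code: the toy TREE table and the function names are
made up for illustration, and it uses generators and functools, which
may not match how the XPath engine is actually structured.

```python
from functools import lru_cache
from itertools import islice

# Hypothetical toy tree: node id -> tuple of child ids, in document order.
TREE = {0: (1, 2), 1: (3, 4), 2: (5,), 3: (), 4: (), 5: ()}

# Records every node the walker touches, so the saving is observable.
visits = []


def descendants(node):
    """Lazily yield the descendants of a node in document order."""
    for child in TREE[node]:
        visits.append(child)
        yield child
        yield from descendants(child)


def first_n(node, n):
    """Truncation: stop walking as soon as n results have been produced."""
    return list(islice(descendants(node), n))


@lru_cache(maxsize=None)
def cached_descendants(node):
    """Caching: compute the descendant list for a node once and reuse it."""
    return tuple(descendants(node))
```

With this, something like descendant::*[1] only touches one node
instead of the whole subtree, and a repeated descendant step from the
same context node is served from the cache.  A real cache would have
to be invalidated whenever the document is mutated.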

As for RDF: there are several matters.  First of all, our parsing is
very slow and inefficient because it is based on DOM.  If we rewrote
the parser to use SAX (according to my analysis RDF is actually quite
amenable to SAX parsing, and this is what Redfoot does, after all), we
would gain a lot.  We could also go straight to the low-level expat
interface, of course.
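
As a rough illustration of the SAX approach, here is a minimal sketch
using the standard xml.sax module.  It is not the 4RDF parser: the
TripleHandler class and parse_rdf function are hypothetical names, and
it only handles flat rdf:Description elements whose properties are
either plain literals or rdf:resource references.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class TripleHandler(ContentHandler):
    """Collect (subject, predicate, object) tuples without building a DOM."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None    # URI of the enclosing rdf:Description
        self.predicate = None  # set while inside a literal-valued property
        self.text = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        if name == (RDF_NS, "Description"):
            self.subject = attrs.get((RDF_NS, "about"))
        elif self.subject is not None:
            predicate = (uri or "") + local
            resource = attrs.get((RDF_NS, "resource"))
            if resource is not None:
                # Resource-valued property: the triple is complete already.
                self.triples.append((self.subject, predicate, resource))
            else:
                # Literal-valued property: gather text until the end tag.
                self.predicate = predicate
                self.text = []

    def characters(self, content):
        if self.predicate is not None:
            self.text.append(content)

    def endElementNS(self, name, qname):
        if name == (RDF_NS, "Description"):
            self.subject = None
        elif self.predicate is not None:
            literal = "".join(self.text)
            self.triples.append((self.subject, self.predicate, literal))
            self.predicate = None


def parse_rdf(xml_bytes):
    """Parse RDF/XML bytes in one streaming pass and return the triples."""
    handler = TripleHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.triples
```

The point is that each statement is emitted as soon as its property
element closes, so memory use stays flat however large the document
is.  Nested descriptions, rdf:ID, containers and typed nodes would need
an explicit state stack on top of this.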

We could also implement statements and objects as C extensions, which
would help with both speed and memory use.

Also, we make a lot of round trips to the DBMS in some cases.  By using
PsycoPG's ability to aggregate queries, we could glean significant
speed-ups.  This could be combined with writing stored procedures for
certain aggregated queries.
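
To show what I mean by cutting round trips, here is a minimal sketch.
Note the assumptions: it uses the standard sqlite3 module as a
stand-in for Postgres so that it runs anywhere, and the statement
table and function names are invented for the example rather than
taken from the 4RDF schema or the PsycoPG API.

```python
import sqlite3


def make_store(rows):
    """Create a throwaway in-memory statement table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statement (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO statement VALUES (?, ?, ?)", rows)
    return conn


def objects_one_by_one(conn, subjects):
    """One query per subject: N round trips to the database."""
    result = {}
    for subject in subjects:
        rows = conn.execute(
            "SELECT object FROM statement WHERE subject = ? ORDER BY object",
            (subject,),
        ).fetchall()
        result[subject] = [row[0] for row in rows]
    return result


def objects_batched(conn, subjects):
    """One query for all subjects: a single round trip."""
    subjects = list(subjects)
    if not subjects:
        return {}
    placeholders = ", ".join("?" for _ in subjects)
    sql = (
        "SELECT subject, object FROM statement "
        f"WHERE subject IN ({placeholders}) ORDER BY subject, object"
    )
    result = {subject: [] for subject in subjects}
    for subject, obj in conn.execute(sql, subjects):
        result[subject].append(obj)
    return result
```

Both functions return the same mapping; the batched one just does it
in a single trip.  Against a networked DBMS the per-query latency is
usually what dominates, so this is where the gain would come from.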

Finally, I think that, building on the Postgres C API, we could write
an RDF-specialized indexer with a moderate amount of effort that would
yield dramatic results.  I already had to shut off our index on objects
because it's really easy to blow PG's maximum text index block
limitations, and in any case, PG does not use indexes if the tables have
entries beyond a certain length.  The index was just slowing down our
inserts and updates with no gain in query speed.  Starting from the
full-text Intermedia-like-thingie that comes in PG's contrib bundle, I
think we could write something very spiffy for RDF.

But as I said, I think all such effort should be spent on the next
generation.  My personal biggest interest for 4Suite now is completeness
and stability.

-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 -
http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark -
http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF -
http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC