[XML-SIG] Replacing a Java tool-chain with 4Suite?

Mike C. Fletcher mcfletch@rogers.com
Thu, 16 Jan 2003 16:47:19 -0500



Well, I think I've got a functional Python-coded transform (attached for 
those following along at home).  It doesn't yet do the setting of the 
"PyOpenGL.version" attribute because I don't see where the xsl version 
is getting that from.  As I worked on it, however, I noticed a fairly 
strange effect which would seem relevant to the multi-hour running time 
of the 4xslt version.

My original solution, based loosely on the original xsl, was to do 
something along these lines:

    find all "refentry" nodes in source, add to a mapping from refentry.id -> node
    for each refentry in dest (the template):
        search for nodes within refentry matching //*[@condition='replace'] using an XPath query
        search for nodes within the corresponding source refentry with the same nodeType and id attribute value using another XPath query
        replace the first with the second
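
The steps above can be sketched roughly as follows. This is not the 4Suite code from the original attempt: it uses the standard library's xml.etree.ElementTree as a stand-in, walks each refentry with iter() instead of XPath, and matches on tag name where the outline says nodeType. The element names and ids in SOURCE and TEMPLATE are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature documents; the real ones are the DocBook references.
SOURCE = """<reference>
  <refentry id="glBegin">
    <refsect1 id="glBegin-desc"><para>Original description.</para></refsect1>
  </refentry>
</reference>"""

TEMPLATE = """<reference>
  <refentry id="glBegin">
    <refsect1 id="glBegin-desc" condition="replace"/>
    <refsect1 id="glBegin-python"><para>Python notes.</para></refsect1>
  </refentry>
</reference>"""

def merge_per_refentry(source_root, dest_root):
    """Replace condition='replace' nodes, one refentry at a time."""
    # Step 1: mapping from refentry id -> source refentry node.
    by_id = {e.get("id"): e for e in source_root.iter("refentry")}
    replaced = 0
    # Step 2: visit each refentry in the template (snapshot, since we mutate).
    for dest_entry in list(dest_root.iter("refentry")):
        src_entry = by_id.get(dest_entry.get("id"))
        if src_entry is None:
            continue
        # Walk only this refentry's subtree, looking at each parent's children.
        for parent in list(dest_entry.iter()):
            for index, child in enumerate(list(parent)):
                if child.get("condition") != "replace":
                    continue
                # Search only the matching source refentry for same tag and id.
                match = next(
                    (n for n in src_entry.iter(child.tag)
                     if n.get("id") == child.get("id")),
                    None,
                )
                if match is not None:
                    parent[index] = match  # replace the first with the second
                    replaced += 1
    return replaced

source_root = ET.fromstring(SOURCE)
template_root = ET.fromstring(TEMPLATE)
print(merge_per_refentry(source_root, template_root))
```

Because both searches stay inside a single refentry, the work per entry is proportional to the size of that entry and not to the size of the whole document.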

That worked (for the first entry at least), but it was human-time slow 
(I'd guess about 5-10 seconds).  From what I could tell, creating a 
context with a sub-node (not the document root) wasn't restricting the 
search to the sub-node, that is:

    Compile( "//*[@condition='replace']" ).evaluate( Context.Context( base, GetAllNs(base)))

where base is a sub-node of the document, wasn't restricting the search
to base and its children, but was instead searching the whole document
(or, for some reason, was unbelievably slow in searching the sub-set).
If this is a general "feature", I can imagine the original xsl, which
does at least 1 selection query per refentry (there are 325 of those),
was bogging down in that.  Don't really know (shrug).  Here's the line
from the xsl that makes me think it's doing that query:

    <xsl:variable name="orig" select="$original//refmeta[@id=current()/@id]"/>

My "solution" was to exploit a characteristic of the particular 
documents in that the replacement IDs are actually globally unique, so I 
can just do a straight mapping from id:originalnode instead of touching 
the refentry nodes at all.
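
A minimal sketch of that id -> node mapping, again using xml.etree.ElementTree as a stand-in for the 4Suite DOM (the attached script is the real thing). The function name merge_by_global_id is made up for illustration, and the sketch assumes, as above, that ids are unique across all source documents.

```python
import xml.etree.ElementTree as ET

def merge_by_global_id(source_roots, dest_root):
    """Swap every condition='replace' node for the source node with the same id.

    Returns the ids for which no source node was found.
    """
    # One pass over the sources builds the global id -> node mapping.
    originals = {}
    for root in source_roots:
        for node in root.iter():
            node_id = node.get("id")
            if node_id is not None:
                originals[node_id] = node

    missing = []
    # One pass over the template; each replacement is a dictionary lookup.
    for parent in list(dest_root.iter()):
        for index, child in enumerate(list(parent)):
            if child.get("condition") != "replace":
                continue
            original = originals.get(child.get("id"))
            if original is None:
                missing.append(child.get("id"))
                continue
            parent[index] = original
    return missing
```

Each document is traversed once, so the cost no longer grows with the number of refentry elements times the size of the document.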

I _think_ that using ".//*[@condition='replace']" as the XPath might
restrict the search to the sub-node, but I haven't found anything to back
up the idea other than the original xsl source.
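
For what it's worth, XPath 1.0 defines a leading "//" as short for "/descendant-or-self::node()/", and a path starting with "/" is evaluated from the root of the document that contains the context node. So "//*[...]" covers the whole document whatever the context node is, while ".//*[...]" starts at the context node. The sketch below shows the relative form with xml.etree.ElementTree (which only accepts the relative form on an element); it does not test 4Suite itself, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one marked node in entry "a", two in entry "b".
DOC = ET.fromstring("""<reference>
  <refentry id="a"><para condition="replace"/></refentry>
  <refentry id="b"><para condition="replace"/><note condition="replace"/></refentry>
</reference>""")

# The sub-node that plays the role of "base" in the message above.
base = DOC.find("refentry[@id='b']")

# Relative path evaluated at base: only descendants of base are considered.
in_subtree = base.findall(".//*[@condition='replace']")

# The same relative path evaluated at the root element: the whole tree.
in_document = DOC.findall(".//*[@condition='replace']")

print(len(in_subtree), len(in_document))  # prints "2 3"
```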

BTW, Uche, your tutorials were of great help to me in getting this 
working. Thanks.

Still haven't tried to convert to HTML yet, that's the next project.
Enjoy,
Mike


Uche Ogbuji wrote:
...

>>Given the simplicity of the transformation in this case (just a merge by 
>>section name!) I may write the darned thing in Python to save time (I 
>>started out around 22 hours ago saying to myself "oh, guess I should 
>>regenerate the manual before I start working on the web-site" :) ).
>
>Can you post merge.xsl?  I'm guessing it simply operates on Docbook 
>section/title elements and thus could work with any Docbook source file?  Do 
>you use recursive templates in merge.xsl?  It's really easy to get into an 
>infinite loop with XSLT recursive templates.
>
>Based on the simplicity of the transform you're doing, it certainly looks like 
>a really easy job using the sorts of generator/iterator tools I outline in
>
>http://www.xml.com/pub/a/2003/01/08/py-xml.html
>
>I like XSLT better than most, but it's not for every task.

-- 
_______________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://members.rogers.com/mcfletch/



[Attachment: testdom2.py]

"""FourSuite-specific XML-documentation processing script

There's something wrong with the xsl merge mechanism, so
this module does the merge using direct Python manipulation
of the files via the FourSuite Python XML tools.
"""
import sys, os
from Ft.Xml import InputSource
from Ft.Xml.XPath import Compile, Context
from Ft.Xml.Domlette import GetAllNs, PrettyPrint, NonvalidatingReader

try:
	import logging
	log = logging.getLogger( 'xmlmerge' )
	logging.basicConfig()
	log.setLevel( logging.INFO )
except ImportError:
	log = None

def load( source ):
	"""Load a document from source as a DOM"""
	uri = 'file:'+ os.path.abspath(source).replace( "\\", "/" )
	if log:
		log.info( "Loading source document %r", uri )
	result = NonvalidatingReader.parseUri(uri)
	if log:
		log.debug( "Finished loading document %r", uri )
	return result
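def save( doc, destination ):
	"""Pretty-print the DOM doc to the file at destination"""
	if log:
		log.info( "Saving document to %r", destination )
	output = open(destination,'w')
	try:
		PrettyPrint(doc, output)
	finally:
		# close explicitly so the output is flushed to disk
		output.close()
	if log:
		log.debug( "Finished saving document %r", destination )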
	

def finder( pattern ):
	"""Create an xpath searcher for the given pattern"""
	return Compile( pattern )
	
def find( specifier, base ):
	"""Find subnodes of base with given XPath specifier"""
	return finder(specifier).evaluate( Context.Context( base, GetAllNs(base)) )

REPLACEFINDER = finder( "//*[@condition='replace']")

def main( rootFile, originalDirectory, destination ):
	"""Load rootFile, merge with the docs in originalDirectory and write to destination"""
	prefixedDocs = []
	# mapping from (globally unique) id -> node in the original documents
	originals = {}
	for prefix in ['glut','glu','gle','gl']:
		filename = os.path.join(originalDirectory, prefix.upper(), 'reference.xml')
		doc = load(filename)
		for node in find( "//*[@id]", doc ):
			originals[ node.getAttributeNS(None,'id')] = node
		prefixedDocs.append( (prefix, doc))
	doc = load( rootFile )
	for entry in find( "//*[@condition='replace']", doc ):
		# each node marked condition='replace' is swapped for the
		# original node carrying the same id
		entryID = entry.getAttributeNS(None,'id')
		if log:
			log.debug( "substitution for %r", entryID )
		original = originals.get( entryID )
		if original is None:
			if log:
				log.warning( "Unable to find substitution source for %r", entryID )
			continue # next entry
		entry.parentNode.replaceChild( original, entry )
	save( doc, destination )

if __name__ == "__main__":
	main(
		rootFile = os.path.abspath(sys.argv[1]),
		originalDirectory = os.path.abspath(os.path.join( os.path.dirname(sys.argv[1]), '..', 'original')),
		destination = os.path.abspath(sys.argv[2]),
	)

