How to use mxTextTools

Mike Fletcher mfletch at tpresence.com
Thu Dec 14 12:01:03 EST 2000


Something you might find useful would be to look at the mcf.vrml.parser
module, which uses simpleparse (which just spits out mxTextTools tuples) to
process a file into an in-memory node graph.

See http://members.home.com/mcfletch/programming/mcf_vrml.htm for the
mcf.vrml distribution.  Here's some code from there...

	def readNext( self):
		'''Read the next root-level construct'''
		success, tags, next = TextTools.tag( self.data, ROOTITEMPARSER, self.position )
##		print 'readnext', success
		if self.position >= self.datalength:
			print 'reached file end'
			return None
		if success:
			#print '  successful parse'
			self.position = next
			if self.parseOnly:
				return success
			map (self.rootItem_Item, tags )
			return success
		else:
			return None
	def rootItem (self, (type, start, stop, (item,))):
		''' Process a single root item '''
		self.rootItem_Item( item )
	def rootItem_Item( self, item ):
		result = self._dispatch(item)
		if result is not None:
##			print "non-null result"
##			print id( self.sceneGraphStack[-1] ), id(self.result)
			self.sceneGraphStack[-1].children.append( result )
	def _getString (self, (tag, start, stop, sublist)):
		''' Return the raw string for a given interval in the data '''
		return self.data [start: stop]
	def _dispatch (self, (tag, left, right, sublist)):
		''' Dispatch to the appropriate processing function based on tag value '''
##		print "dispatch", tag
		try:
			function = getattr (self, tag)
		except AttributeError:
			raise AttributeError( '''Unknown parse tag "%s" found! Check the parser definition!'''%(tag))
		return function( (tag, left, right, sublist) )
	def Proto(self, (tag, start, stop, sublist)):
		''' Create a new prototype in the current sceneGraph '''
		# first entry is always GI
		GI = self._getString ( sublist [0])
##		print "PROTO",GI
		newNode =Prototype (GI)
##		print "\t",newNode
		setattr ( self.sceneGraphStack [-1].protoTypes, GI, newNode)
		self.prototypeStack.append( newNode )
		# process the rest of the entries with the given stack
		map ( self._dispatch, sublist [1:] )
		self.prototypeStack.pop( )
	def fieldDecl(self,(tag, left, right, (exposure, datatype, name, field))):
		''' Create a new field declaration for the current prototype'''
		# get the definition in recognizable format
		exposure = self._getString (exposure) == "exposedField"
		datatype = self._getString (datatype)
		name = self._getString (name)
		# get the vrml value for the field
		self.fieldTypeStack.append( datatype )
		field = self._dispatch (field)
		self.fieldTypeStack.pop( )
		self.prototypeStack[-1].addField ((name, datatype, exposure), field)
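
For anyone who wants to try the dispatch idea without installing anything, here is a minimal, self-contained sketch of the same pattern. It is not part of mcf.vrml: the Walker class, the DATA string, the hand-written TAGLIST and its tag names are all made up for illustration, and it assumes taglist entries shaped like the (tag, start, stop, sublist) tuples the methods above unpack. It is written for Python 3, which no longer allows tuple parameters in a def, so each method unpacks the node in its body instead.

```python
# Hypothetical input and a hand-written taglist standing in for parser output.
# Each entry is (tag, start, stop, sublist); start/stop are slice indices.
DATA = "PROTO Box [ field SFFloat size 1.0 ]"

TAGLIST = [
    ("Proto", 0, 36, [
        ("name", 6, 9, None),
        ("fieldDecl", 12, 34, [
            ("exposure", 12, 17, None),
            ("datatype", 18, 25, None),
            ("name", 26, 30, None),
            ("value", 31, 34, None),
        ]),
    ]),
]


class Walker:
    """Turn a taglist into a dict of prototypes by dispatching on tag name."""

    def __init__(self, data):
        self.data = data
        self.protos = {}   # prototype name -> {field name: (type, exposed, value)}
        self.stack = []    # field dicts of the prototypes currently being built

    def text(self, node):
        """Return the slice of the source text that a node covers."""
        tag, start, stop, sublist = node
        return self.data[start:stop]

    def dispatch(self, node):
        """Call the method named after the node's tag."""
        tag = node[0]
        try:
            handler = getattr(self, tag)
        except AttributeError:
            raise AttributeError('Unknown parse tag "%s"' % tag)
        return handler(node)

    def Proto(self, node):
        tag, start, stop, sublist = node
        name = self.text(sublist[0])          # first child is the name
        fields = self.protos[name] = {}
        self.stack.append(fields)
        for child in sublist[1:]:             # remaining children are dispatched
            self.dispatch(child)
        self.stack.pop()

    def fieldDecl(self, node):
        tag, start, stop, (exposure, datatype, name, value) = node
        self.stack[-1][self.text(name)] = (
            self.text(datatype),
            self.text(exposure) == "exposedField",
            float(self.text(value)),
        )


walker = Walker(DATA)
for item in TAGLIST:
    walker.dispatch(item)
print(walker.protos)
```

The point of the getattr lookup is that adding support for a new production in the grammar only means adding a method with the same name as its tag.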

HTH,
Mike

-----Original Message-----
From: Paul Moore [mailto:paul.moore at uk.origin-it.com]
Sent: Thursday, December 14, 2000 8:36 AM
To: python-list at python.org
Subject: How to use mxTextTools


Hi,
I'm looking at mxTextTools to see if it would be suitable for some
types of text parsing work I am interested in (nothing concrete yet,
so I can't give specifics...)

The example in the documentation of tagging HTML looks fine - I
understand what's going on there, and as I understand it, this will
give me back a taglist, which is (effectively) the text stream with
portions tagged as I ask.

What I don't see (yet), and I can't find any good examples for, is
what to do with the resulting taglist. There seem to be no functions
for working with taglists, and the lists themselves seem like
relatively complex data structures, so is it right that I should be
manipulating them "by hand"?
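
As a rough illustration of what handling a taglist "by hand" can look like, the sketch below walks one recursively and yields the text each tag covers. It assumes entries of the form (tag, left, right, subtags), with subtags being either None or a nested list; the walk function, the sample text and the sample taglist are hypothetical and written by hand rather than produced by mxTextTools. It is Python 3.

```python
def walk(text, taglist, depth=0):
    """Yield (depth, tag, matched_text) for every entry, parents before children."""
    for tag, left, right, subtags in taglist:
        yield depth, tag, text[left:right]
        if subtags:
            yield from walk(text, subtags, depth + 1)


# Hand-written sample data standing in for real tagging output.
text = "<b>hi</b>"
taglist = [
    ("tag", 0, 3, [("name", 1, 2, None)]),
    ("text", 3, 5, None),
    ("tag", 5, 9, [("name", 7, 8, None)]),
]

for depth, tag, matched in walk(text, taglist):
    print("  " * depth + tag, repr(matched))
```

Because every entry carries its own slice indices, nothing beyond plain tuple unpacking and string slicing is needed to get from the taglist back to the text.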

More information, or better still, some complete examples, would be
very helpful. (All the examples in the distribution just use
print_tags() to display the tags, and don't do anything with them...)

Paul
