How to use mxTextTools

Andrew Dalke dalke at acm.org
Fri Dec 15 00:30:49 EST 2000


Paul Moore wrote:
>I'm looking at mxTextTools to see if it would be suitable for some
>types of text parsing work I am interested in (nothing concrete yet,
>so I can't give specifics...)

I am using it as the basis of a parsing project for biopython.org
called Martel (http://www.biopython.org/~dalke/Martel/ ).  It
turned out to be very useful, but its learning curve was quite
high.  I think it took me about 4 days of playing around with
it to feel okay using it, and another few weeks of using it
to feel proficient.

>What I don't see (yet), and I can't find any good examples for, is
>what to do with the resulting taglist. There seem to be no functions
>for working with taglists, and the lists themselves seem like
>relatively complex data structures, so is it right that I should be
>manipulating them "by hand"?

There aren't any that I've come across.  One of the downsides of
Martel is that it produces pretty unoptimized tagtables.  I know
of a few ways to get better performance out of them, but they
require modifications which are too cumbersome to implement
directly by hand.  So if you are interested in working on such
things, get ahold of me :)
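
To give a feel for handling a taglist "by hand", here is a minimal
sketch in plain Python.  It assumes the documented taglist layout, where
each entry is a tuple (tagobj, left, right, subtags) and subtags is
either None or a nested taglist.  The walk() helper and the sample data
are made up for illustration; they are not part of mxTextTools or Martel.

```python
# Each taglist entry is assumed to be (tagobj, left, right, subtags):
# text[left:right] is the matched slice, and subtags is either None
# or another taglist describing matches nested inside that slice.

def walk(taglist, text, depth=0):
    """Yield (depth, tagobj, matched_text) for every entry, depth first."""
    for tagobj, left, right, subtags in taglist:
        yield depth, tagobj, text[left:right]
        if subtags:
            # Recurse into the nested taglist, one level deeper.
            yield from walk(subtags, text, depth + 1)

# Hand-built sample data (hypothetical, not produced by a real tagtable).
text = "name=Andrew"
taglist = [
    ("pair", 0, 11, [
        ("key", 0, 4, None),
        ("value", 5, 11, None),
    ]),
]

for depth, tagobj, matched in walk(taglist, text):
    print("  " * depth + f"{tagobj}: {matched!r}")
```

A flat generator like this keeps the recursion in one place, so code
that only cares about, say, the "value" entries can filter the stream
instead of unpacking nested tuples itself.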

>More information, or better still, some complete examples, would be
>very helpful.

You can look at Martel.  It's a parser generator which uses a
(possibly very large) regular expression as the format description.
It creates a corresponding tagtable which is used by the parser
to parse those files.

So if you know regular expressions you'll be able to see how
to map some of those constructs to mxTextTools.  However,
mxTextTools is more powerful than regular expressions, so the
mapping doesn't work the other way.

                    Andrew
                    dalke at acm.org





