Regular Expressions: large amount of or's

Daniel Yoo dyoo at hkn.eecs.berkeley.edu
Mon Mar 14 17:26:31 EST 2005


Scott David Daniels <Scott.Daniels at acm.org> wrote:

: I have a (very high speed) modified Aho-Corasick machine that I sell.
: The calling model that I found works well is:

:      def chases(self, sourcestream, ...):
:           '''A generator taking a generator of source blocks,
:           yielding (matches, position) pairs where position is an
:           offset within the "current" block.
:           '''

: You might consider taking a look at providing that form.


Hi Scott,

No problem, I'll be happy to do this.

I need some clarification on the calling model though.  Would this be
an accurate test case?

######
    def testChasesInterface(self):
        self.tree.add("python")
        self.tree.add("is")
        self.tree.make()
        sourceBlocks = ("python programming is fun",
                        "how much is that python in the window")
        sourceStream = iter(sourceBlocks)
        self.assertEqual([
                           (sourceBlocks[0], (0, 6)),
                           (sourceBlocks[0], (19, 21)),
                           (sourceBlocks[1], (9, 11)),
                           (sourceBlocks[1], (17, 23)),
                         ],
                         list(self.tree.chases(sourceStream)))
######

Here, I'm assuming that chases() takes in a 'sourceStream', which is
an iterator of text blocks, and that the return value is itself an
iterator.
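
To make that assumption concrete, here is a rough sketch of the behavior
the test case expects. It is only a naive stand-in built on str.find(),
not the Aho-Corasick matcher, and the free function chases() with its
'keywords' argument is a hypothetical shape chosen for illustration. It
also assumes that matches never span a block boundary, which may not be
what the real calling model intends.

```python
def chases(keywords, source_stream):
    """Yield (block, (start, end)) pairs for each keyword occurrence.

    Naive illustration only: scans every block once per keyword with
    str.find() instead of running an Aho-Corasick automaton.
    """
    for block in source_stream:
        hits = []
        for word in keywords:
            start = block.find(word)
            while start != -1:
                # end is exclusive, so block[start:end] == word
                hits.append((start, start + len(word)))
                start = block.find(word, start + 1)
        # Report matches in left-to-right order within the block.
        for span in sorted(hits):
            yield block, span


source_blocks = ("python programming is fun",
                 "how much is that python in the window")
results = list(chases(["python", "is"], iter(source_blocks)))
```

Because chases() is a generator, nothing is scanned until the caller
starts iterating, so the source blocks can be read lazily as well.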


Best of wishes!


