Please help... with re

Tim Hochberg tim.hochberg at ieee.org
Wed Jul 26 20:14:04 EDT 2000


Here's an improved version of the string splitting routine I posted
earlier. This is twice as fast as it was, courtesy of no longer using
the re module. It's also roughly 10X faster than the stream-based
approach that was also posted to this thread (although it does assume
that there are no embedded nulls in the input).

It would be interesting to see what the original 68 line monster
looked like and to compare its performance to the posted solutions
<hint>.

import string
def minimonster(text):    
    # Replace " with \0 (ASSUMES NO EMBEDDED NULLS)
    newtext = string.replace(text, '"', '\0')
    # Now replace backslash + \0 (formerly \", an escaped quote) with "
    newtext = string.replace(newtext, '\\\0', '"')
    # now split on \0 (formerly ")
    textlist = string.split(newtext, '\0')
    # Now split the even sections on whitespace (odd sections were quoted
    # and are kept whole).
    result = []
    isEven = 1
    for item in textlist:
        if isEven:
            result.extend(string.split(item))
        else:
            result.append(item)
        isEven = not isEven
    return result
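
For anyone reading this on Python 3, where the string module no longer
provides replace() and split() as functions, the same technique can be
sketched with str methods. This is only a sketch: the name
minimonster_py3 and the explicit check for embedded nulls are
assumptions added for illustration and are not part of the routine above.

```python
def minimonster_py3(text):
    # Hypothetical Python 3 variant of minimonster (name is an assumption).
    # The technique relies on \0 as a placeholder, so refuse input that
    # already contains one (this check is an addition, not in the original).
    if '\0' in text:
        raise ValueError("embedded nulls are not supported")
    # Replace " with \0, then turn backslash + \0 (formerly \") back into ".
    marked = text.replace('"', '\0').replace('\\\0', '"')
    result = []
    # Even-numbered pieces lie outside quotes and are split on whitespace;
    # odd-numbered pieces were quoted and are kept whole.
    for index, piece in enumerate(marked.split('\0')):
        if index % 2 == 0:
            result.extend(piece.split())
        else:
            result.append(piece)
    return result


print(minimonster_py3('a "b c" d'))  # prints ['a', 'b c', 'd']
```

Using enumerate() in place of the isEven flag is purely a style choice;
the pieces are handled in the same order either way.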

-tim


