need help of regular expression genius

Paul McGuire ptmcg at austin.rr._bogus_.com
Wed Aug 2 22:58:56 EDT 2006


"GHUM" <haraldarminmassa at gmail.com> wrote in message
news:1154532421.096157.289570 at i42g2000cwa.googlegroups.com...
> I need to split a text at every ; (semicolon), but not at semicolons
> which are "escaped" within a pair of $$ or $_$ signs.
>

The pyparsing version of this looks very similar to the SE solution,
except that pyparsing expressions take the place of the regexps:

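text = """ ... input source text ... """

from pyparsing import SkipTo, Literal, replaceWith

# Spans delimited by $$ or $_$ are tried first, so semicolons inside
# them are never seen by the semi expression.
ign1 = "$$" + SkipTo("$$") + "$$"
ign2 = "$_$" + SkipTo("$_$") + "$_$"
# Any other semicolon is replaced with a marker string.
semi = Literal(";").setParseAction(replaceWith("; <***>"))
print((ign1 | ign2 | semi).transformString(text))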

In concept, this works just like the SE program: as the scanner/parser scans
through the input text, it looks for the ignorable expressions first and, if
it finds one, simply skips over it.  If it finds the semicolon expression
instead, it runs that expression's parse action, which replaces the ';' with
"; <***>", or whatever you choose.
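
For comparison, the same "try the ignorable spans first, then the semicolon"
idea can be sketched with only the standard re module. This is not the
pyparsing code above, just a rough illustration of the concept; the names
PATTERN, mark_semicolons and split_statements and the sample text are made up
for the example, and it assumes the $$ and $_$ spans are never nested.

```python
import re

# Alternatives are tried left to right at each position, so a $$...$$ or
# $_$...$_$ span is consumed whole before a bare ';' can match inside it.
PATTERN = re.compile(r"\$\$.*?\$\$|\$_\$.*?\$_\$|;", re.DOTALL)


def mark_semicolons(text, marker="; <***>"):
    """Replace every semicolon outside a quoted span with marker."""
    def repl(match):
        token = match.group(0)
        # Quoted spans are returned unchanged; only a bare ';' is replaced.
        return marker if token == ";" else token
    return PATTERN.sub(repl, text)


def split_statements(text):
    """Split text at semicolons that lie outside the quoted spans."""
    parts, start = [], 0
    for match in PATTERN.finditer(text):
        if match.group(0) == ";":
            parts.append(text[start:match.start()])
            start = match.end()
    parts.append(text[start:])
    # Drop empty pieces, such as the one after a trailing semicolon.
    return [part.strip() for part in parts if part.strip()]


# Hypothetical sample input, not taken from the original question.
sample = "select 1; create function f() as $$ begin; end; $$; select $_$a;b$_$;"
print(mark_semicolons(sample))
print(split_statements(sample))
```

Here split_statements does the actual split directly, whereas
mark_semicolons mirrors the marker-replacement step shown above.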

The pyparsing wiki is at pyparsing.wikispaces.com.

-- Paul




