Breaking up Strings correctly:

Michael Yanowitz m.yanowitz at kearfott.com
Tue Apr 10 07:12:53 EDT 2007



-----Original Message-----
From: python-list-bounces+m.yanowitz=kearfott.com at python.org
[mailto:python-list-bounces+m.yanowitz=kearfott.com at python.org]On Behalf
Of Adam Atlas
Sent: Monday, April 09, 2007 11:28 PM
To: python-list at python.org
Subject: Re: Breaking up Strings correctly:


On Apr 9, 8:19 am, "Michael Yanowitz" <m.yanow... at kearfott.com> wrote:
> Hello:
>
>    I have been searching for an easy solution, and hopefully one
> has already been written, so I don't want to reinvent the wheel:

Pyparsing is indeed a fine package, but if Paul gets to plug his
module, then so do I! :)

I have a package called ZestyParser... a lot of it is inspired by
Pyparsing, actually, but I'm going in a different direction in many
areas. (One major goal is to be crazily dynamic and flexible on the
inside. And it hasn't failed me thus far; I've used it to easily parse
grammars that would make lex and yacc scream in horror.)

Here's how I'd do it...

from ZestyParser import *
from ZestyParser.Helpers import *

varName = Token(r'\$(\w+)', group=1)
varVal = QuoteHelper() | Int
sp = Skip(Token(r'\s*'))
comparison = sp.pad(varName + CompositeToken([RawToken(sym) for sym in
('=','<','>','>=','<=','!=')]) + varVal)
#Maybe I should "borrow" PyParsing's OneOf idea :)

expr = ExpressionHelper((
    comparison,
    (RawToken('(') + Only(_top_) + RawToken(')')),
    oper('NOT', ops=UNARY),
    oper('AND'),
    oper('OR'),
))

Now you can scan for `expr` and get a return value like
[[['IP', '=', '127.1.2.3'], ['AX', '<', 15]], [['IP', '=', '127.1.2.4'], ['AY', '!=', 0]]]
(for the example you gave).

Note that this example uses several features that won't be available
until the next release, but it's coming soon. So Michael, though you'd
still be able to parse this with the current version, the code
wouldn't look as nice as this or the Pyparsing version. Maybe just add
it to your watchlist. :)

- Adam

--


  Thanks for your, Gerard's, and Gabriel's responses.
I guess what I was looking for was something simpler than parsing.
I may actually use some of what you posted. But I am hoping that,
given a string such as:

'((($IP = "127.1.2.3") AND ($AX < 15)) OR (($IP = "127.1.2.4") AND ($AY != 0)))'

there is something like split() that I can pass a list such as
[' AND ', ' OR ', ' XOR '] and that will split the string on AND, OR, or XOR,
BUT only at the outermost level of parentheses, so that the nesting
is preserved.
  So that the above string becomes:

['OR', '(($IP = "127.1.2.3") AND ($AX < 15))', '(($IP = "127.1.2.4") AND ($AY != 0))']
  It need not do this recursively. If I wish, I can repeat the process
on each string in the list and get:

['OR', ['AND', '($IP = "127.1.2.3")', '($AX < 15)'], ['AND', '($IP = "127.1.2.4")', '($AY != 0)']]

  Can this be done without parsers? Perhaps with some variation of re or
split()? Has something like this already been written?


Thanks in advance:
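
One way the split described above might be sketched without a parser library is
to walk the string once, track parenthesis depth, and cut only where an operator
appears at depth 0. This is a minimal sketch, not an existing library function:
the names strip_outer_parens, split_top_level and split_nested are made up for
illustration. It assumes parentheses are balanced, that quoted values never
contain parentheses or operator text, and it does not handle operator
precedence (it splits on whichever operator it meets first at the top level).

```python
def strip_outer_parens(text):
    """Remove one pair of parentheses, but only if it wraps the whole string."""
    text = text.strip()
    if not (text.startswith('(') and text.endswith(')')):
        return text
    depth = 0
    for i, ch in enumerate(text):
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth == 0 and i != len(text) - 1:
                # The first '(' closes early, e.g. '(a) OR (b)': keep as-is.
                return text
    return text[1:-1].strip()


def split_top_level(text, operators=(' AND ', ' OR ', ' XOR ')):
    """Split on one operator at the outermost parenthesis level.

    Returns [op, part1, part2, ...], or [text] if no operator is found there.
    """
    inner = strip_outer_parens(text)
    depth = 0
    found = None      # the operator we are splitting on, once seen
    parts = []
    start = 0
    i = 0
    while i < len(inner):
        ch = inner[i]
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif depth == 0:
            for op in operators:
                # Only split on the first kind of operator encountered.
                if inner.startswith(op, i) and found in (None, op.strip()):
                    found = op.strip()
                    parts.append(inner[start:i].strip())
                    i += len(op) - 1
                    start = i + 1
                    break
        i += 1
    if found is None:
        return [text.strip()]
    parts.append(inner[start:].strip())
    return [found] + parts


def split_nested(text, operators=(' AND ', ' OR ', ' XOR ')):
    """Apply split_top_level repeatedly to build the nested list form."""
    result = split_top_level(text, operators)
    if len(result) == 1:
        return result[0]          # a leaf comparison, left as a string
    return [result[0]] + [split_nested(p, operators) for p in result[1:]]


if __name__ == '__main__':
    s = '((($IP = "127.1.2.3") AND ($AX < 15)) OR (($IP = "127.1.2.4") AND ($AY != 0)))'
    print(split_top_level(s))
    print(split_nested(s))
```

For the example string, split_top_level gives the one-level list and
split_nested gives the nested list shown in the question. Anything beyond
these assumptions (quoted parentheses, precedence between AND and OR at the
same level) is where a real parser starts to pay off.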




