Breaking up Strings correctly:

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Tue Apr 10 10:35:39 EDT 2007


On Tue, 10 Apr 2007 08:12:53 -0300, Michael Yanowitz
<m.yanowitz at kearfott.com> wrote:

> I guess what I was looking for was something simpler than parsing.
> I may actually use some of what you posted. But I am hoping that
> if given a string such as:
> '((($IP = "127.1.2.3") AND ($AX < 15)) OR (($IP = "127.1.2.4") AND ($AY != 0)))'
>   something like split(), where I can pass it something like
> [' AND ', ' OR ', ' XOR ']
> will split the string by AND, OR, or XOR.
>   BUT split it up in such a way to preserve the parentheses order, so that
> it will split on the outermost parenthesis.
>   So that the above string becomes:
> ['OR',  '(($IP = "127.1.2.3") AND ($AX < 15))', '(($IP = "127.1.2.4") AND ($AY != 0))']
>   No need to do this recursively; I can repeat the process, however, if I
> wish, on each string in the list and get:
> ['OR',  ['AND', '($IP = "127.1.2.3")', '($AX < 15)'], ['AND', '($IP = "127.1.2.4")', '($AY != 0)']]
>
> Can this be done without parsers?

This is exactly what parsers do. Sure, it can be done without a
pre-existing general-purpose parser, but then you'll be writing your own
specialized one by hand.
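
To give an idea of what "by hand" means here, the following is only a
minimal sketch, not a tested solution. It assumes that operators are
always written as ' AND ', ' OR ' or ' XOR ' with single spaces around
them, that parentheses are balanced, and that double-quoted strings
contain no escaped quotes. The names split_top_level, strip_outer_parens
and nest are made up for this example.

```python
OPERATORS = ("AND", "OR", "XOR")

def strip_outer_parens(text):
    """Drop one pair of parentheses if it encloses the whole string."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return text
    depth = 0
    in_quotes = False
    for i, ch in enumerate(text):
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0 and i < len(text) - 1:
                    return text  # the first "(" closes before the end
    return text[1:-1].strip()

def split_top_level(text, operators=OPERATORS):
    """Return [operator, operand, ...] for the outermost operator,
    or None when there is no operator outside the parentheses."""
    inner = strip_outer_parens(text)
    depth = 0
    in_quotes = False
    found = None
    parts = []
    start = 0
    i = 0
    while i < len(inner):
        ch = inner[i]
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0:
                for op in operators:
                    token = " %s " % op
                    if inner.startswith(token, i):
                        if found not in (None, op):
                            raise ValueError("mixed operators at one level")
                        found = op
                        parts.append(inner[start:i].strip())
                        i += len(token) - 1  # jump to the end of the operator
                        start = i + 1
                        break
        i += 1
    if found is None:
        return None
    parts.append(inner[start:].strip())
    return [found] + parts

def nest(text):
    """Apply split_top_level repeatedly to build the nested list."""
    parts = split_top_level(text)
    if parts is None:
        return text.strip()  # a plain condition; keep it as written
    return [parts[0]] + [nest(part) for part in parts[1:]]

expr = ('((($IP = "127.1.2.3") AND ($AX < 15)) OR '
        '(($IP = "127.1.2.4") AND ($AY != 0)))')
one_level = split_top_level(expr)
all_levels = nest(expr)
```

Note that this is already a small parser: it keeps track of the nesting
depth and of quoted strings, which is precisely the state that split()
and a plain regular expression do not have.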

> Perhaps with some variation of re or
> split.

Regular expressions cannot represent arbitrary expressions like yours:
parentheses nested to an arbitrary depth do not form a regular language,
and the re module has no support for recursive patterns.
If you know beforehand that all input has some fixed form, like "condition
AND condition OR condition AND condition", or at least a finite set of
fixed forms, it could be done with several regular expressions, one per
form. But I think that's much more work than using PyParsing or similar
tools.
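
As an illustration of the fixed-form case only, here is a sketch that
handles exactly one shape, "((cond AND cond) OR (cond AND cond))". The
COND pattern is an assumption about what a condition looks like (a $name,
one of a few comparison operators, then a quoted string or an integer);
any other shape or nesting depth simply fails to match.

```python
import re

# Hypothetical shape of a single condition, e.g. ($IP = "127.1.2.3")
COND = r'\(\$\w+ (?:!=|=|<|>) (?:"[^"]*"|\d+)\)'

# One fixed form: ((cond AND cond) OR (cond AND cond))
FIXED_FORM = re.compile(
    r'\(\((' + COND + r') AND (' + COND + r')\)'
    r' OR '
    r'\((' + COND + r') AND (' + COND + r')\)\)'
)

expr = ('((($IP = "127.1.2.3") AND ($AX < 15)) OR '
        '(($IP = "127.1.2.4") AND ($AY != 0)))')

match = FIXED_FORM.fullmatch(expr)
conditions = match.groups() if match else None

# One extra level of nesting is already outside this fixed form.
deeper = '(' + expr + ' AND ($AZ > 1))'
deeper_match = FIXED_FORM.fullmatch(deeper)
```

Every additional form needs its own pattern, which is why this approach
stops paying off quickly.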

If you have some bizarre constraints (parserphobia?) or for whatever  
reason don't want to use such tools, the infix evaluator posted yesterday  
by Gerard Flanagan could be an alternative (it only uses standard modules).

> Has something like this already been written?

Yes, hundreds of times, ever since programmable computers have existed:
they're known as "lexers" and "parsers" :)

-- 
Gabriel Genellina



