PyParsing and Headaches

Bytter bytter at gmail.com
Thu Nov 23 06:57:04 EST 2006


(This message has already been sent to the mailing list, but I'm not
sure it is arriving properly since it doesn't show up on Usenet, so I'm
posting it through here now.)

Chris,

Thanks for your quick answer. That changes a lot of stuff, and now I'm
able to do my parsing as I intended to.

Still, there's a remaining problem. By using Combine(), everything is
interpreted as a single token. What I need is for 'include_bool' and
'literal' to be parsed as separate tokens, but with no space allowed
between them...
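
For what it's worth, one approach that might give this behaviour is to
call leaveWhitespace() on the literal only, so that it has to start
exactly where the sign ends while the two stay separate tokens. This is
only a sketch: it assumes pyparsing's leaveWhitespace() method and the
simplified two-rule grammar from the quoted message, the parses() helper
is made up for illustration, and none of it is taken from either reply.

```python
from pyparsing import Combine, Optional, ParseException, Word, alphas, oneOf

include_bool = Optional(oneOf("+ -"))

# Combine() forbids the space, but glues everything into one token.
combined = Combine(include_bool + Word(alphas))

# Assumption: leaveWhitespace() on the literal alone stops pyparsing from
# skipping whitespace in front of it, so sign and word remain separate.
term = include_bool + Word(alphas).leaveWhitespace()


def parses(expr, text):
    """Hypothetical helper: return the token list, or None on no match."""
    try:
        return expr.parseString(text).asList()
    except ParseException:
        return None


print(parses(combined, "+a"))  # ['+a']
print(parses(term, "+a"))      # ['+', 'a']
print(parses(term, "+ a"))     # None
```

The sign stays optional, so a bare "a" should still parse as a
one-token term.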

Paul,

Thanks for your detailed explanation. One of the things I think is
missing from the documentation (or that I couldn't find easily) is the
kind of explanation you give about 'The Way of PyParsing'. For example,
it took me a while to understand that I could easily implement simple
recursions using OneOrMore(Group()). Or maybe it was all out there and
I didn't search enough...
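
To illustrate the OneOrMore(Group()) idea, here is a minimal sketch. It
assumes a simplified term (an optional sign followed by a word) rather
than the full grammar, and it is repetition rather than true recursion;
a genuinely recursive rule would presumably need pyparsing's Forward.

```python
from pyparsing import Group, OneOrMore, Optional, Word, alphas, oneOf

# Each term is an optional sign followed by a word; Group() keeps the
# tokens of one term together in their own sub-list.
term = Group(Optional(oneOf("+ -")) + Word(alphas))
terms = OneOrMore(term)

result = terms.parseString("+a -b c").asList()
print(result)  # [['+', 'a'], ['-', 'b'], ['c']]
```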

Still, fwiw, congratulations on the library. PyParsing allowed me to
do in just a couple of hours, including learning its API (minus
this little inconvenience), what would have taken me a couple of days
with, for example, ANTLR (in fact, I've already put ANTLR aside more
than once in the past in favour of a built-from-scratch parser).

Cheers,

Hugo Ferreira

On Nov 22, 7:50 pm, Chris Lambacher <c... at kateandchris.net> wrote:
> On Wed, Nov 22, 2006 at 11:17:52AM -0800, Bytter wrote:
> > Hi,
>
> > I'm trying to construct a parser, but I'm stuck with some basic
> > stuff... For example, I want to match the following:
>
> > letter = "A"..."Z" | "a"..."z"
> > literal = letter+
> > include_bool := "+" | "-"
> > term = [include_bool] literal
>
> > So I defined this as:
>
> > literal = Word(alphas)
> > include_bool = Optional(oneOf("+ -"))
> > term = include_bool + literal
>
> + here means that you allow a space.  You need to explicitly override this.
> Try:
>
> term = Combine(include_bool + literal)
>
>
>
> > The problem is that:
>
> > term.parseString("+a") -> (['+', 'a'], {}) # OK
> > term.parseString("+ a") -> (['+', 'a'], {}) # KO. It shouldn't
> > recognize any token, since I never said a space was allowed between
> > include_bool and literal.
>
> > Can anyone give me a hand here?
>
> > Cheers!
>
> > Hugo Ferreira
>
> > BTW, the following is the complete grammar I'm trying to implement with
> > pyparsing:
>
> > ## L ::= expr | expr L
> > ## expr ::= term | binary_expr
> > ## binary_expr ::= term " " binary_op " " term
> > ## binary_op ::= "*" | "OR" | "AND"
> > ## include_bool ::= "+" | "-"
> > ## term ::= ([include_bool] [modifier ":"] (literal | range)) | ("~" literal)
> > ## modifier ::= (letter | "_")+
> > ## literal ::= word | quoted_words
> > ## quoted_words ::= '"' word (" " word)* '"'
> > ## word ::= (letter | digit | "_")+
> > ## number ::= digit+
> > ## range ::= number (".." | "...") number
> > ## letter ::= "A"..."Z" | "a"..."z"
> > ## digit ::= "0"..."9"
>
> > And this is what I have so far:
>
> > word = Word(nums + alphas + "_")
> > binary_op = oneOf("* and or", caseless=True).setResultsName("operator")
> > include_bool = oneOf("+ -")
> > literal = (word | quotedString).setResultsName("literal")
> > modifier = Word(alphas + "_")
> > rng = Word(nums) + (Literal("..") | Literal("...")) + Word(nums)
> > term = ((Optional(include_bool) + Optional(modifier + ":") + (literal |
> > rng)) | ("~" + literal)).setResultsName("Term")
> > binary_expr = (term + binary_op + term).setResultsName("binary")
> > expr = (binary_expr | term).setResultsName("Expr")
> > L = OneOrMore(expr)
>
> > --
> > GPG Fingerprint: B0D7 1249 447D F5BB 22C5  5B9B 078C 2615 504B 7B85
> 
> > --
> >http://mail.python.org/mailman/listinfo/python-list
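
A side note on the quoted 'rng' rule: '|' in pyparsing takes the first
alternative that matches, so with Literal("..") listed before
Literal("...") the three-dot form may never get a chance. The sketch
below compares the two orderings. The names rng_short_first and
rng_long_first and the parses() helper are made up for illustration, and
the reordering is only a suggestion, not something from the thread.

```python
from pyparsing import Literal, ParseException, Word, nums

number = Word(nums)

# Order as in the quoted grammar: ".." is tried first and wins.
rng_short_first = number + (Literal("..") | Literal("...")) + number

# Hypothetical reordering: try the longer "..." before "..".
rng_long_first = number + (Literal("...") | Literal("..")) + number


def parses(expr, text):
    """Hypothetical helper: return the token list, or None on no match."""
    try:
        return expr.parseString(text).asList()
    except ParseException:
        return None


print(parses(rng_short_first, "1...5"))  # None
print(parses(rng_long_first, "1...5"))   # ['1', '...', '5']
print(parses(rng_long_first, "1..5"))    # ['1', '..', '5']
```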



