pyparsing: match empty line
Marek Kubica
marek at xivilization.net
Wed Sep 3 05:26:25 EDT 2008
Hi,
First of all a big thank you for your excellent library and of course
also for your extensive and enlightening answer!
> 1) Well done in resetting the default whitespace characters, since you
> are doing some parsing that is dependent on the presence of line ends.
> When you do this, it is useful to define an expression for end of line
> so that you can reference it where you explicitly expect to find line
> ends:
>
> EOL = LineEnd().suppress()
Ok, I hadn't thought of this. But since my program is not only a parser
but a long-running process, and setDefaultWhitespaceChars modifies a
global variable, I don't feel too comfortable with it. I could set the
whitespace on every element, but that is, as you will surely agree, quite
ugly. Do you accept patches? I'm thinking about some kind of factory class
which would automatically set the whitespace characters:
>>> factory = TokenFactory(' \t\r')
>>> word = factory.Word(alphas)
That way, one wouldn't need to set a global value which might interfere
with other pyparsing parsers running in the same process.
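A rough sketch of what I have in mind follows. TokenFactory is a
hypothetical class that does not exist in pyparsing; it only wraps the
existing per-element setWhitespaceChars call, and the tiny line grammar at
the end is made up to show the effect:

```python
import pyparsing as pp


class TokenFactory:
    """Hypothetical helper, not part of pyparsing.

    Builds pyparsing elements and sets their whitespace characters
    individually, so the global default is never touched.
    """

    def __init__(self, whitespace):
        self.whitespace = whitespace

    def __getattr__(self, name):
        # Only called for attributes not found normally, e.g. factory.Word.
        if name.startswith('_'):
            raise AttributeError(name)
        element_class = getattr(pp, name)

        def build(*args, **kwargs):
            element = element_class(*args, **kwargs)
            # Per-element setting; returns the element itself.
            element.setWhitespaceChars(self.whitespace)
            return element

        return build


factory = TokenFactory(' \t\r')
word = factory.Word(pp.alphas)
eol = factory.LineEnd().suppress()

# Newlines are no longer skipped silently, so they must be matched explicitly.
line = word + eol
result = pp.OneOrMore(line).parseString("foo\nbar\n").asList()
print(result)  # ['foo', 'bar']
```

The global pyparsing default still contains the newline afterwards, so
other parsers in the same process should not be affected.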
> parser = OneOrMore(watchname ^ pagebreak ^ leaveempty ^ EOL)
>
> This will now permit the second test to pass.
Right. It seems that working with whitespace requires a better
understanding than I had.
> 3) Your definition of pagebreak looks okay now, but I don't understand
> why your test containing 2 blank lines is only supposed to generate a
> single <PAGEBREAK>.
No, it should be one <PAGEBREAK> per blank line; now it works as expected.
> 4) leaveempty probably needs this parse action to be attached to it:
>
> leaveempty = Literal('EMPTY').setParseAction(replaceWith('<EMPTY>'))
I added this in the meantime. replaceWith is really a handy helper.
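In isolation it behaves like this. This is only a minimal check of the
expression quoted above, not my full grammar:

```python
from pyparsing import Literal, replaceWith

# The parse action swaps the matched token for a fixed replacement string.
leaveempty = Literal('EMPTY').setParseAction(replaceWith('<EMPTY>'))

result = leaveempty.parseString('EMPTY').asList()
print(result)  # ['<EMPTY>']
```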
> parser = OneOrMore(watchname | pagebreak | leaveempty | EOL)
>
> '|' operators generate MatchFirst expressions. MatchFirst will do
> short-circuit evaluation - the first expression that matches will be the
> one chosen as the matching alternative.
Okay, adjusted it.
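For the record, here is a small sketch of the difference between the two
operators as I understand it. The Literal/Word pair is made up for
illustration and is not part of my grammar:

```python
from pyparsing import Literal, Word, alphas

keyword = Literal('EMPTY')
anyword = Word(alphas)

# '|' builds a MatchFirst: the first alternative that matches wins,
# even though a later alternative could match more of the input.
first_match = (keyword | anyword).parseString('EMPTYISH').asList()

# '^' builds an Or: all alternatives are tried and the longest match wins.
longest_match = (keyword ^ anyword).parseString('EMPTYISH').asList()

print(first_match)    # ['EMPTY']
print(longest_match)  # ['EMPTYISH']
```

With '|' the order of the alternatives therefore matters, so the more
specific expressions have to come first.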
> If you have more pyparsing questions, you can also post them on the
> pyparsing wiki - the Discussion tab on the wiki Home page has become a
> running support forum - and there is also a Help/Discussion mailing
> list.
Which of these two would you prefer?
Thanks again, it works now just as I imagined!
regards,
Marek