pyparsing: match empty line

Marek Kubica marek at xivilization.net
Wed Sep 3 05:26:25 EDT 2008


Hi,

First of all, a big thank you for your excellent library and, of course,
also for your extensive and enlightening answer!

> 1) Well done in resetting the default whitespace characters, since you
> are doing some parsing that is dependent on the presence of line ends. 
> When you do this, it is useful to define an expression for end of line
> so that you can reference it where you explicitly expect to find line
> ends:
> 
>     EOL = LineEnd().suppress()
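
For reference, a minimal sketch of how such an EOL expression behaves. The
one-word `line` grammar is made up for illustration and is not part of my
actual parser:

```python
from pyparsing import LineEnd, Word, alphas

# EOL matches a line end but contributes no token to the results.
EOL = LineEnd().suppress()

# Hypothetical grammar for illustration: one word followed by a line end.
line = Word(alphas) + EOL

tokens = list(line.parseString("hello\n"))
print(tokens)  # ['hello'] - the line end is matched but suppressed
```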

OK, I hadn't thought of that. But my program is not only a parser but also
a long-running process, and since setDefaultWhitespaceChars modifies a
global setting, I don't feel too comfortable with it. I could set the
whitespace characters on every element, but as you will surely agree, that
is quite ugly. Do you accept patches? I'm thinking of some kind of factory
class which would set the whitespace characters automatically:

>>> factory = TokenFactory(' \t\r')
>>> word = factory.Word(alphas)

That way, one wouldn't need to set a global value which might interfere
with other pyparsing parsers running in the same process.
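
A minimal sketch of what such a factory could look like, assuming it simply
calls the existing per-element setWhitespaceChars method. TokenFactory and
its methods are hypothetical names and not part of pyparsing:

```python
from pyparsing import Literal, ParseException, Word, alphas


class TokenFactory:
    """Hypothetical helper (not part of pyparsing): builds elements that
    skip only the given whitespace characters."""

    def __init__(self, whitespace):
        self.whitespace = whitespace

    def _configure(self, element):
        # setWhitespaceChars changes only this element, not the global default.
        return element.setWhitespaceChars(self.whitespace)

    def Word(self, *args, **kwargs):
        return self._configure(Word(*args, **kwargs))

    def Literal(self, *args, **kwargs):
        return self._configure(Literal(*args, **kwargs))


factory = TokenFactory(' \t\r')
word = factory.Word(alphas)

# Leading spaces are still skipped ...
tokens = list(word.parseString("  abc"))

# ... but a leading newline is no longer treated as whitespace.
try:
    word.parseString("\nabc")
    newline_skipped = True
except ParseException:
    newline_skipped = False
```

This sketch only configures the leaf elements it creates. Expressions
combined with operators such as '|' or '^' are separate objects with their
own whitespace settings, so a real patch would probably have to cover those
as well.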

>     parser = OneOrMore(watchname ^ pagebreak ^ leaveempty ^ EOL)
> 
> This will now permit the second test to pass.

Right. It seems that working with whitespace requires a somewhat better
understanding than I had.

> 3) Your definition of pagebreak looks okay now, but I don't understand
> why your test containing 2 blank lines is only supposed to generate a
> single <PAGEBREAK>.

No, it should be one <PAGEBREAK> per blank line; it now works as expected.

> 4) leaveempty probably needs this parse action to be attached to it:
> 
>     leaveempty = Literal('EMPTY').setParseAction(replaceWith('<EMPTY>'))

I added this in the meantime. replaceWith is really a handy helper.
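
A short sketch of the effect, using the definition quoted above on a
made-up input string:

```python
from pyparsing import Literal, replaceWith

# The parse action swaps the matched text for a fixed replacement token.
leaveempty = Literal('EMPTY').setParseAction(replaceWith('<EMPTY>'))

tokens = list(leaveempty.parseString('EMPTY'))
print(tokens)  # ['<EMPTY>']
```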

>     parser = OneOrMore(watchname | pagebreak | leaveempty | EOL)
> 
> '|' operators generate MatchFirst expressions.  MatchFirst will do
> short-circuit evaluation - the first expression that matches will be the
> one chosen as the matching alternative.

Okay, adjusted it.
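
To make the difference between the two operators concrete, here is a small
sketch. The `keyword` and `name` expressions are made up for illustration
and are not taken from my parser:

```python
from pyparsing import Literal, Word, alphas

keyword = Literal("if")
name = Word(alphas)

# '|' builds a MatchFirst: alternatives are tried in order, first match wins.
first_match = keyword | name

# '^' builds an Or: all alternatives are tried, the longest match wins.
longest_match = keyword ^ name

first_tokens = list(first_match.parseString("iffy"))
longest_tokens = list(longest_match.parseString("iffy"))

print(first_tokens)    # ['if']
print(longest_tokens)  # ['iffy']
```

With '|', the order of the alternatives matters, so the more specific
expressions should generally come first.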

> If you have more pyparsing questions, you can also post them on the
> pyparsing wiki - the Discussion tab on the wiki Home page has become a
> running support forum - and there is also a Help/Discussion mailing
> list.

Which of these two would you prefer?

Thanks again, it works now just as I imagined!

regards,
Marek


