pyparsing and svg

Paul McGuire ptmcg at austin.rr.com
Thu Nov 8 05:08:43 EST 2007


On Nov 8, 3:14 am, Donn Ingle <donn.in... at gmail.com> wrote:

> float = nums + dot + nums

Should be:

float = Combine(Word(nums) + dot + Word(nums))

nums is a string that defines the set of numeric digits for composing
Word instances.  nums is not an expression by itself.

For that matter, I see in your later tests that some values have a
leading minus sign, so you should really go with:

float = Combine(Optional("-") + Word(nums) + dot + Word(nums))



Some other comments:

1. Read up on the Word class, you are not using it quite right.

command = Word("MLCZ")

will work with your test set, but it is not the form I would choose.
Word(characterstring) will match any "word" made up of the characters
in the input string.  So Word("MLCZ") will match
M
L
C
Z
MM
LC
MCZL
MMLCLLZCZLLM

I would suggest instead using:

command = Literal("M") | "L" | "C" | "Z"

or

command = oneOf("M L C Z")

2. Change comma to

comma = Literal(",").suppress()

The comma is important to the parsing process, but the ',' token is
not much use in the returned set of matched tokens, get rid of it (by
using suppress).

3. Group your expressions, such as

couple = Group(float + comma + float)

It will really simplify getting at the resulting parsed tokens.


4. What is the purpose of (couple + couple)?  This is sufficient:

phrase = OneOrMore(command + Group(OneOrMore(couple)) )

(Note use of Group to return the coord pairs as a sublist.)


5. Results names!

phrase = OneOrMore(command("command") + Group(OneOrMore(couple))
("coords") )

will allow you to access these fields by name instead of by index.
This will make your parser code *way* more readable.


-- Paul




More information about the Python-list mailing list