re beginner

Paul McGuire ptmcg at austin.rr._bogus_.com
Sun Jun 4 23:57:22 EDT 2006


"John Machin" <sjmachin at lexicon.net> wrote in message
news:448399AF.9030001 at lexicon.net...
> On 5/06/2006 10:07 AM, Paul McGuire wrote:
> > "John Machin" <sjmachin at lexicon.net> wrote in message
> > news:4483665A.206 at lexicon.net...
> >> Fantastic -- at least for the OP's carefully copied-and-pasted input.
> >> Meanwhile back in the real world, there might be problems with multiple
> >> tabs used for 'prettiness' instead of 1 tab, non-integer values, etc etc.
> >> In that case a loop approach that validated as it went and was able to
> >> report the position and contents of any invalid input might be better.
> >
> > Yeah, for that you'd need more like a real parser... hey, wait a minute!
> > What about pyparsing?!
> >
> > Here's a pyparsing version.  The definition of the parsing patterns
> > takes little more than the re definition does - the bulk of the rest of
> > the code is parsing/scanning the input and reporting the results.
> >
>
> [big snip]
>
> I didn't see any evidence of error handling in there anywhere.
>
>
Pyparsing has a certain amount of error reporting built in, raising a
ParseException when a mismatch occurs.

This particular "grammar" is actually pretty error-tolerant.  To force an
error, I replaced "One for the money" with "1 for the money", and here is
the exception reported by pyparsing, along with a diagnostic method,
markInputline:


stuff = ('Yellow hat\t2\tBlue shirt\t1\n'
         'White socks\t4\tGreen pants\t1\n'
         'Blue bag\t4\tNice perfume\t3\n'
         'Wrist watch\t7\tMobile phone\t4\n'
         'Wireless cord!\t2\tBuilding tools\t3\n'
         'One for the money\t7\tTwo for the show\t4')
badstuff = ('Yellow hat\t2\tBlue shirt\t1\n'
            'White socks\t4\tGreen pants\t1\n'
            'Blue bag\t4\tNice perfume\t3\n'
            'Wrist watch\t7\tMobile phone\t4\n'
            'Wireless cord!\t2\tBuilding tools\t3\n'
            '1 for the money\t7\tTwo for the show\t4')
# itemDesc and integer are the pyparsing expressions defined in the
# earlier (snipped) post.
pattern = dictOf( itemDesc, integer ) + stringEnd
print pattern.parseString(stuff)
print
try:
    print pattern.parseString(badstuff)
except ParseException, pe:
    print pe
    print pe.markInputline()

Gives:
[['Yellow hat', '2'], ['Blue shirt', '1'], ['White socks', '4'],
['Green pants', '1'], ['Blue bag', '4'], ['Nice perfume', '3'],
['Wrist watch', '7'], ['Mobile phone', '4'], ['Wireless cord!', '2'],
['Building tools', '3'], ['One for the money', '7'],
['Two for the show', '4']]

Expected stringEnd (at char 210), (line:6, col:1)
>!<1 for the money 7       Two for the show        4
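
For comparison, the loop approach John describes (validate as you go, and
report the position and contents of any bad input) might look roughly like
the sketch below.  It uses only the standard library and is written for
Python 3, not the Python 2 used above.  The names parse_pairs and
FieldError are hypothetical, and the rule that a description must start
with a letter is an assumption chosen so that "1 for the money" is
rejected; it is not taken from the grammar in the earlier post.

```python
import re


class FieldError(ValueError):
    """Hypothetical exception that records where the bad field was found."""

    def __init__(self, msg, lineno, col, text):
        super().__init__("%s (line:%d, col:%d): %r" % (msg, lineno, col, text))
        self.lineno, self.col, self.text = lineno, col, text


def parse_pairs(data):
    """Return [description, count] pairs, validating every field as it goes."""
    pairs = []
    for lineno, line in enumerate(data.split("\n"), 1):
        # Treat any run of tabs as one separator, so extra tabs used for
        # alignment are tolerated; remember each field's 1-based column.
        fields = [(m.start() + 1, m.group())
                  for m in re.finditer(r"[^\t]+", line)]
        if len(fields) % 2:
            raise FieldError("Odd number of fields", lineno, 1, line)
        for (dcol, desc), (ncol, num) in zip(fields[0::2], fields[1::2]):
            if not desc[0].isalpha():  # assumed rule, see the note above
                raise FieldError("Expected description", lineno, dcol, desc)
            if not num.isdigit():
                raise FieldError("Expected integer", lineno, ncol, num)
            pairs.append([desc, num])
    return pairs


badstuff = "Yellow hat\t2\tBlue shirt\t1\n1 for the money\t7\tTwo for the show\t4"
try:
    parse_pairs(badstuff)
except FieldError as err:
    print(err)  # Expected description (line:2, col:1): '1 for the money'
```

Unlike the pyparsing version, every rule here has to be spelled out by
hand, but the error names the offending field directly instead of
reporting that stringEnd was expected.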

-- Paul




