pyparsing question

hubritic colinlandrum at gmail.com
Tue Jan 1 18:32:17 EST 2008


I am trying to parse data that looks like this:

IDENTIFIER    TIMESTAMP   T  C   RESOURCE_NAME   DESCRIPTION
2BFA76F6     1208230607   T   S   SYSPROC                    SYSTEM SHUTDOWN BY USER
A6D1BD62   1215230807     I H                                            Firmware Event

My problem is that sometimes there is a RESOURCE_NAME and sometimes
not, so I wind up with "Firmware" as my RESOURCE_NAME and "Event" as
my DESCRIPTION.  The formatting seems to use a set number of spaces.

I have tried making RESOURCE_NAME an Optional(Word(alphanums)) and
DESCRIPTION OneOrMore(Word(alphas)) + LineEnd(). So the question is,
how can I avoid having the first word of DESCRIPTION sucked into
RESOURCE_NAME when that field should be blank?
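
To make the problem concrete, here is a minimal sketch of roughly that
grammar. Only the RESOURCE_NAME and DESCRIPTION expressions come from the
description above; the variable names and the expressions for the first
four fields are assumptions made up for illustration.

```python
from pyparsing import LineEnd, OneOrMore, Optional, Word, alphanums, alphas, nums

# Assumed definitions for the leading fields (not from the real grammar).
identifier = Word(alphanums)
timestamp = Word(nums)
type_code = Word(alphas, exact=1)
class_code = Word(alphas, exact=1)

# The two expressions described above.
resource_name = Optional(Word(alphanums))
description = OneOrMore(Word(alphas))

record = (identifier + timestamp + type_code + class_code
          + resource_name + description + LineEnd().suppress())

# A record whose RESOURCE_NAME column is blank.
line = "A6D1BD62   1215230807     I H              Firmware Event"
tokens = record.parseString(line).asList()

# Whitespace is skipped between tokens, so the Optional happily
# takes "Firmware" and only "Event" is left for the description.
print(tokens[4], "|", tokens[5:])
```

Because the run of spaces carries no meaning to the grammar, nothing
tells the Optional that the column it is looking at is empty.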


The data I have has a fixed number of characters per field, so I could
split it up that way, but wouldn't that defeat the purpose of using a
parser?  I am determined to become proficient with pyparsing, so I am
using it even when it could be considered overkill. It has gone past
mere utility now; this is a matter of principle!
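
For comparison, the fixed-width split would look something like the
sketch below. The column offsets in FIELDS are hypothetical (the sample
above lost its alignment in the mail), and split_fixed_width and
make_line are helper names invented for this example.

```python
# Hypothetical (name, start, end) column offsets; the real ones would
# have to be read off the actual report.
FIELDS = [
    ("identifier", 0, 11),
    ("timestamp", 11, 24),
    ("t", 24, 28),
    ("c", 28, 32),
    ("resource_name", 32, 48),
    ("description", 48, None),
]


def split_fixed_width(line):
    """Slice one record into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in FIELDS}


def make_line(*values):
    """Build a sample record padded to the hypothetical widths above."""
    widths = [end - start for _, start, end in FIELDS[:-1]]
    padded = [value.ljust(width) for value, width in zip(values, widths)]
    return "".join(padded) + values[-1]


with_resource = make_line(
    "2BFA76F6", "1208230607", "T", "S", "SYSPROC", "SYSTEM SHUTDOWN BY USER")
without_resource = make_line(
    "A6D1BD62", "1215230807", "I", "H", "", "Firmware Event")

for sample in (with_resource, without_resource):
    print(split_fixed_width(sample))
```

Slicing by position keeps a blank RESOURCE_NAME blank, since the column
boundaries rather than the whitespace decide where each field ends.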

thanks


