pyparsing question
Neil Cerutti
mr.cerutti at gmail.com
Tue Jan 1 18:54:54 EST 2008
On Jan 1, 2008 6:32 PM, hubritic <colinlandrum at gmail.com> wrote:
> I am trying to parse data that looks like this:
>
> IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
> 2BFA76F6 1208230607 T S SYSPROC SYSTEM SHUTDOWN BY USER
> A6D1BD62 1215230807 I H Firmware Event
>
> My problem is that sometimes there is a RESOURCE_NAME and sometimes
> not, so I wind up with "Firmware" as my RESOURCE_NAME and "Event" as
> my DESCRIPTION. The formatting seems to use a set number of spaces.
>
> The data I have has a fixed number of characters per field, so I could
> split it up that way, but wouldn't that defeat the purpose of using a
> parser? I am determined to become proficient with pyparsing so I am
> using it even when it could be considered overkill; thus, it has gone
> past mere utility now, this is a matter of principle!
If your data is really in fixed-size columns, then pyparsing is the wrong
tool.
There's no standard Python tool for reading and writing fixed-length field
"flatfile" data files, but it's pretty simple to use named slices to get at
the data.

identifier = slice(0, 8)
timestamp = slice(8, 18)
t = slice(18, 21)
c = slice(21, 24)
resource_name = slice(24, 35)
description = slice(35, None)  # from column 35 to the end of the line

for line in file:  # file: an already-open file object
    line = line.rstrip("\n")
    print("id:", line[identifier])
    print("timestamp:", line[timestamp])
    # ...etc...

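
To show how the slices cope with a missing RESOURCE_NAME, here is a minimal
self-contained sketch. The column offsets are the ones used above and may not
match the real data; the helper name parse_line, the FIELDS table, and the
padded sample rows are made up for illustration. A blank column simply comes
back as an empty string after stripping, so "Firmware Event" stays in the
description.

```python
# Named slices using the offsets from this message; the real column
# positions in the data may differ (assumption).
FIELDS = {
    "identifier": slice(0, 8),
    "timestamp": slice(8, 18),
    "t": slice(18, 21),
    "c": slice(21, 24),
    "resource_name": slice(24, 35),
    "description": slice(35, None),
}


def parse_line(line):
    """Split one fixed-width line into a dict of stripped field values."""
    line = line.rstrip("\n")
    return {name: line[span].strip() for name, span in FIELDS.items()}


# Hypothetical sample rows, padded to the widths assumed above
# (8 + 10 + 3 + 3 + 11 characters, then the description).
layout = "{:<8}{:<10}{:<3}{:<3}{:<11}{}"
with_resource = layout.format(
    "2BFA76F6", "1208230607", "T", "S", "SYSPROC", "SYSTEM SHUTDOWN BY USER")
without_resource = layout.format(
    "A6D1BD62", "1215230807", "I", "H", "", "Firmware Event")

for row in (with_resource, without_resource):
    print(parse_line(row))
```

Keeping the slices in one table means a changed column width is a one-line
edit, and every field is extracted the same way.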
--
Neil Cerutti