parsing of structured text

Robert Fendt robert.fendt at googlemail.com
Wed Oct 27 16:03:25 EDT 2010


Hi all,

I have to parse a file containing (slightly erroneous) vCal data. The
format of vCal/iCal is that of a structured ASCII file, not unlike XML
in a way. A vCal block contains information on a line-by-line basis,
with the possibility of sub-blocks (for events).

BEGIN:VCALENDAR
VERSION:1.0
BEGIN:VEVENT
...
END:VEVENT
BEGIN:VEVENT
...
END:VEVENT
END:VCALENDAR
BEGIN:VCALENDAR
VERSION:1.0
...
END:VCALENDAR

Were this C++, I would use an iterator approach, with classes for the
calendar and event blocks respectively, and pass an iterator pointing
to the current position in the file for deserialisation, getting a
new iterator back that points to the position just past the block.
That way I would decide what to do next based on the current line's
contents, i.e., implement a state machine of some sort.
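
To make that concrete, here is a rough sketch of how the same idea
might carry over to Python, where a single shared iterator plays the
role of the "current position". The name parse_block and the dict
layout are placeholders I made up for illustration, and the sketch
assumes every line has the simple KEY:VALUE shape shown above (no
folded lines, no property parameters).

```python
def parse_block(lines, name):
    """Consume lines from a shared iterator until an END line is seen.

    `parse_block` and the dict layout are hypothetical, for illustration.
    """
    lines = iter(lines)  # a no-op for iterators, so recursion shares state
    block = {"name": name, "fields": {}, "children": []}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        if key == "BEGIN":
            # Recurse on the same iterator; when the call returns, the
            # iterator is positioned just past the sub-block's END line.
            block["children"].append(parse_block(lines, value))
        elif key == "END":
            # Assumption: the END name is not checked against `name`,
            # to stay tolerant of slightly erroneous input.
            return block
        else:
            block["fields"][key] = value
    return block  # input ran out before an END line was found
```

Calling parse_block(input_file, "ROOT") would then return one
top-level dict whose "children" are the calendar blocks.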

While this approach is certainly possible in Python as well, I have
the nagging feeling that there should be a much cleaner, simpler
(i.e., "Pythonic") way to deal with such a problem. Ideally, the end
result would look something like the loop below, but I am a bit at a
loss right now as to how best to achieve it. Any suggestions?

for calendar_block in input_file:
  version = calendar_block.version
  num_events = len(calendar_block.events)
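
One way the loop above could perhaps be made to work is a generator
that yields one object per VCALENDAR block. This is only a sketch:
Calendar, Event and iter_calendars are names invented here, only
VEVENT sub-blocks are recognised, and lines are assumed to have the
plain KEY:VALUE shape from the example data.

```python
class Event:
    """Hypothetical container for the properties of one VEVENT block."""
    def __init__(self):
        self.properties = {}


class Calendar:
    """Hypothetical container for one VCALENDAR block."""
    def __init__(self):
        self.version = None
        self.events = []


def iter_calendars(lines):
    """Yield a Calendar for each BEGIN:VCALENDAR ... END:VCALENDAR block."""
    calendar = None  # calendar currently being read, if any
    event = None     # event currently being read, if any
    for raw in lines:
        key, _, value = raw.strip().partition(":")
        key, block = key.upper(), value.upper()
        if key == "BEGIN" and block == "VCALENDAR":
            calendar, event = Calendar(), None
        elif calendar is None:
            continue  # ignore stray lines outside any calendar
        elif key == "BEGIN" and block == "VEVENT":
            event = Event()
        elif key == "END" and block == "VEVENT":
            if event is not None:
                calendar.events.append(event)
            event = None
        elif key == "END" and block == "VCALENDAR":
            yield calendar
            calendar = None
        elif event is not None:
            event.properties[key] = value
        elif key == "VERSION":
            calendar.version = value
```

With this in place the loop header would read
"for calendar_block in iter_calendars(input_file):" and the two
attribute accesses in the body would stay as written.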


Thanks,
Robert


