Line Text Parsing

David Goodger goodger at python.org
Wed Feb 4 14:56:22 EST 2004


allanc wrote:
> Here's a sample of the records I need to parse:
> 
> 01508390019002      11284361000002SUGARPLUM
> 015083915549           SHORT ON LAST ORDER 
> 0150839220692 000002EA BMC   15 KG   001400
> 
> 1st Line is a (portion of) header record.
> 2nd Line is a text instruction record.
> 3rd Line is a Transaction Line Item record.

I wrote many programs to parse data very similar to this before
I generalized the algorithm (a line-oriented state machine) into
a module.  You can find the module (internally documented) at
http://docutils.sf.net/docutils/statemachine.py.
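
To give a rough idea of what "line-oriented state machine" means here,
below is a minimal standalone sketch in plain Python.  It does not use
the statemachine.py API.  The record layout is an assumption guessed
from the three sample lines (columns 0-6 as an order key, column 7 as a
record-type digit), and the names parse_orders and RECORD_TYPES are
hypothetical, so adjust them to the real file specification.

```python
# ASSUMPTION: columns 0-6 hold an order key and column 7 holds a
# record-type digit (0 = header, 1 = text instruction, 2 = line item).
# This is inferred from the three sample lines, not from a format spec.
RECORD_TYPES = {"0": "header", "1": "text", "2": "item"}

SAMPLE = [
    "01508390019002      11284361000002SUGARPLUM",
    "015083915549           SHORT ON LAST ORDER ",
    "0150839220692 000002EA BMC   15 KG   001400",
]


def parse_orders(lines):
    """Group records into orders, one line at a time.

    The machine has two states: "expecting a header" (current is None)
    and "inside an order" (current is the order being filled in).
    """
    orders = []
    current = None
    for number, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        kind = RECORD_TYPES.get(line[7:8])
        if kind is None:
            raise ValueError("line %d: unknown record type" % number)
        if kind == "header":
            # A header always starts a new order.
            current = {"key": line[:7], "header": line[8:],
                       "text": [], "items": []}
            orders.append(current)
        elif current is None:
            # Text or item records are only valid inside an order.
            raise ValueError("line %d: %s record before any header"
                             % (number, kind))
        elif kind == "text":
            current["text"].append(line[8:].strip())
        else:
            current["items"].append(line[8:])
    return orders
```

Calling parse_orders(SAMPLE) yields a single order holding one text
record and one line item.  The remaining fixed-width fields are left
unsplit because their column boundaries are not given in the post.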

Hope it helps!

-- 
David Goodger                               http://python.net/~goodger
For hire: http://python.net/~goodger/cv




