[Tutor] newbie text parsing question

Alan Trautman ATrautman@perryjudds.com
Wed, 28 Aug 2002 11:41:16 -0500


Again, if you are looking for a concept of how to do this rather than the
code, here is the approach I would use.

read the file
strip all newlines ('\n')
put newlines ('\n') after every colon
save the new file
open the new file
read every other line, inserting a comma in between each element
add a newline ('\n') at the end of a record
append this to your master file containing all the previously parsed items
repeat until all records are parsed
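
The steps above can be sketched roughly as follows. This is only a sketch under some assumptions, not a definitive implementation. One deviation from the list: breaking after every colon leaves each value on the same line as the next label, so the sketch splits on the known field labels with a regular expression instead of taking every other line. The names parse_record, LABELS, LABEL_RE and SAMPLE are made up for illustration, the labels are copied from the sample record in the original question, and the csv module is used for the final comma-separated line.

```python
import csv
import io
import re

# Field labels copied from the sample record in the original question
# (assumption: every record uses exactly these labels).
LABELS = [
    "Case Number", "Recall Notification Report", "Date Opened", "Date Closed",
    "Recall Class", "Press Release (Y/N)", "Domestic Est. Number", "Name",
    "Imported Product (Y/N)", "Foreign Estab. Number", "City", "State",
    "Country", "Product", "Problem", "Description",
    "Total Pounds Recalled", "Pounds Recovered",
]

# One pattern matching any label plus its colon; longest labels are tried first.
LABEL_RE = re.compile(
    "|".join(re.escape(label) + ":"
             for label in sorted(LABELS, key=len, reverse=True))
)


def parse_record(text):
    """Return the values of one record, in the order they appear."""
    flat = " ".join(text.split())   # drop newlines, collapse runs of spaces
    pieces = LABEL_RE.split(flat)   # text found between consecutive labels
    # pieces[0] is whatever precedes the first label, so it is skipped.
    return [piece.strip() for piece in pieces[1:]]


SAMPLE = """\
   Case Number: 076-2000  Recall Notification Report:  RNR076-2000
   Date Opened: 12/20/2000  Date Closed:  04/20/2001
   Recall Class:  1  Press Release (Y/N):  Y
   Domestic Est. Number:  02040  M     Name:  Harper's Country Ham
   Imported Product (Y/N):  Y      Foreign Estab. Number:  N/A
   City:  Clinton   State:  KY  Country:  USA
   Product:  Country Ham
   Problem:  BACTERIA  Description: LISTERIA
   Total Pounds Recalled:  10,400  Pounds Recovered:    7,561
"""

values = parse_record(SAMPLE)

# csv.writer quotes values such as 10,400 that contain a comma themselves.
out = io.StringIO()
csv.writer(out).writerow(values)
print(out.getvalue(), end="")
```

Note that collapsing whitespace also turns "02040  M" into "02040 M". To build the master file, the same csv.writer could be pointed at a file opened for appending instead of the in-memory buffer used here.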

Hope that helps. It's not really clever or smart, and it doesn't take
advantage of any special features of Python, but it should work.

You will have to add the extra step of splitting the records apart (I'm
hoping for your sake they are all the same length); you can then repeat the
process every x lines of the original file.
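
A minimal sketch of that splitting step, assuming every record really does occupy the same number of non-blank lines (nine in the sample from the original question). The function name split_records and its lines_per_record parameter are hypothetical, chosen only for this example.

```python
def split_records(lines, lines_per_record=9):
    """Group non-blank lines into fixed-size records, one string per record.

    Assumes each record spans exactly lines_per_record non-blank lines.
    """
    kept = [line.strip() for line in lines if line.strip()]
    return [
        " ".join(kept[i:i + lines_per_record])
        for i in range(0, len(kept), lines_per_record)
    ]


# Tiny demonstration with two-line records and a blank separator line.
demo = ["City:  Clinton\n", "Product:  Country Ham\n", "\n",
        "City:  Somewhere\n", "Product:  Something\n"]
records = split_records(demo, lines_per_record=2)
print(records)
```

If the records turn out not to be the same length, a safer variant would start a new record whenever a line begins with "Case Number:".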

Good luck 

Alan

-----Original Message-----
From: Rob [mailto:rob@uselesspython.com]
Sent: Wednesday, August 28, 2002 11:32 AM
To: Python Tutor
Subject: RE: [Tutor] newbie text parsing question


There are different ways to get to the solution you're after. Do you want to
code for a situation in which you know you will always expect the same
format to the file, or do you want to account for files that don't have
precisely the same format?

For instance, will you always have only one "Problem:" listed?

Do you already have a grasp of reading and writing files to your
satisfaction? The Tutorial that tends to ship with Python distributions
(and is also easily found at python.org) has a section demonstrating File I/O.

There are also lots of samples out there, at sites like the Vaults of
Parnassus, the Python Cookbook site, and Useless Python.

Rob

-----Original Message-----
From: tutor-admin@python.org [mailto:tutor-admin@python.org]On Behalf Of Ron
Nixon
Sent: Wednesday, August 28, 2002 10:57 AM
To: tutor@python.org
Subject: [Tutor] newbie text parsing question


I've got a file that looks like this:
   Case Number: 076-2000  Recall Notification Report:  RNR076-2000
   Date Opened: 12/20/2000  Date Closed:  04/20/2001
   Recall Class:  1  Press Release (Y/N):  Y
   Domestic Est. Number:  02040  M     Name:  Harper's Country Ham
   Imported Product (Y/N):  Y      Foreign Estab. Number:  N/A
   City:  Clinton   State:  KY  Country:  USA
   Product:  Country Ham
   Problem:  BACTERIA  Description: LISTERIA
   Total Pounds Recalled:  10,400  Pounds Recovered:    7,561

I'd like to be able to read all of the file and extract the data following
the Title and ":" to produce something like this:
076-2000,RNR076-2000,12/20/2000,04/20/2001,1,Y,02040  M,Harper's Country
Ham, etc.
that I can then import into a spreadsheet or database. I found nothing on
the Python.org site or in the Text Processing in Python book. Any ideas?
Thanks in advance.

Ron







_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor