Best way to parse file into db-type layout?

John Machin sjmachin at lexicon.net
Sun May 1 05:18:25 EDT 2005


On Sat, 30 Apr 2005 23:11:48 -0400, Steve Holden <steve at holdenweb.com>
wrote:

>John Machin wrote:

>> If the job at hand is simulating awk's file reading habits, yes then
>> fileinput is convenient. However if the job at hand involves anything
>> like real-world commercial data processing requirements then fileinput
>> is NOT convenient.
>> 
>Yet again, get real. If someone tells me that fileinput meets their
>requirements who am I (not to mention who are *you*) to say they should 
>invest extra effort in solving their problem some other way?

Michael Hoffmann has said that it meets his simple requirements. He
doesn't use its filelineno() and nextfile(), and says he wouldn't use
it if he needed that sort of functionality. I have no argument with
that.

If I genuinely thought that fileinput (or any other piece of software)
would not meet somebody's requirements, i.e. would not solve their
problem, then what should I do? Unless bound by blood ties or
contractual obligations, should I keep silent? In any case, who are
*you* to suggest I shouldn't express an opinion?


Back to fileinput: it complicates things when you want to do something
less simple, like some action at the end of each file -- you have to
poll for the first line of the next file, which means that the
end-of-each-file code has to be repeated after the main loop to cover
the last file. Further, you don't get to see empty files: fileinput
skips them without yielding a line. Hence the examples:

>
>> Example 1: Requirement is, for each input file, to display name of
>> file, number of records, and some data totals.
>> 
>> Example 2: Requirement is, if end of file occurs when not expected
>> (including, but not restricted to, the case of zero records) display
>> an error message and terminate abnormally.
>> 
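
To make Example 1 concrete, here is a minimal sketch of it written with
fileinput. The names per_file_totals and report are made up for
illustration, and the one-integer-per-line layout is an assumption, not
anybody's real file format. It shows the polling via isfirstline(), the
repeated end-of-file code after the loop, and the fact that an empty
input file never gets a report line.

```python
import fileinput


def report(name, count, total):
    # Hypothetical report format, for illustration only.
    return "%s: %d records, total %d" % (name, count, total)


def per_file_totals(paths):
    """Example 1 with fileinput: one report line per input file."""
    reports = []
    current, count, total = None, 0, 0
    with fileinput.input(files=paths) as fi:
        for line in fi:
            if fi.isfirstline():
                # Polling: the only sign that the previous file has ended
                # is that the first line of the next one has arrived.
                if current is not None:
                    reports.append(report(current, count, total))
                current, count, total = fi.filename(), 0, 0
            count += 1
            total += int(line)  # assumed layout: one integer per line
    # The end-of-each-file action has to be repeated here for the last file.
    if current is not None:
        reports.append(report(current, count, total))
    return reports
```

Given three files where the middle one is empty, this returns only two
report lines; nothing in the loop ever learns that the empty file
existed.
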
>Possibly these examples would have some force if they weren't simply 
>invented.

The only "invention" was *simplification* of genuine real-world
requirements.

Many entities periodically (often daily) receive remittances from
other entities with whom they do business. In parallel with the
remittance being paid into the recipient's bank account, a file is
sent containing details of the breakdown of the total money amount.
At the end of the file there is a trailer record, which is mandated to
contain the number of detail records and the total amount of money.

Checking the contents of the trailer record against (a) the bank
account and (b) calculated totals from the detail records is a real
requirement. So is ringing the alarm bells if end of file is detected
before the trailer record is detected (or there is any other evidence
that the file is defective). How could you possibly imagine that these
are "simply invented"?
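
For what it's worth, a rough sketch of check (b) and the premature-EOF
alarm, reading one file with plain open(). The record layout is
invented for the sketch: detail lines look like "D,<amount>" and the
trailer looks like "T,<count>,<total>", with amounts as whole numbers.
The names check_remittance and DefectiveFileError are likewise
hypothetical. Check (a), against the bank account, is outside what a
file reader can do and is not shown.

```python
class DefectiveFileError(Exception):
    """Raised when a remittance file fails its integrity checks."""


def check_remittance(path):
    """Return (count, total) if the trailer agrees with the detail records."""
    count = total = 0
    trailer = None
    with open(path) as f:
        for line in f:
            if trailer is not None:
                raise DefectiveFileError("%s: data after trailer record" % path)
            fields = line.strip().split(",")
            if fields[0] == "D" and len(fields) == 2:
                count += 1
                total += int(fields[1])
            elif fields[0] == "T" and len(fields) == 3:
                trailer = (int(fields[1]), int(fields[2]))
            else:
                raise DefectiveFileError("%s: unrecognised record" % path)
    if trailer is None:
        # Covers a truncated file and the zero-records case alike.
        raise DefectiveFileError("%s: end of file before trailer record" % path)
    if trailer != (count, total):
        raise DefectiveFileError(
            "%s: trailer says %r, details give %r" % (path, trailer, (count, total)))
    return count, total
```

Because end of file is handled in exactly one place per file, the
empty-file case falls out of the same test as the truncated-file case.
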

Perhaps we should just agree that we have differing perceptions of
reality, and move on.

Cheers,
John


