Best way to parse file into db-type layout?

John Machin sjmachin at lexicon.net
Fri Apr 29 20:17:57 EDT 2005


On Sat, 30 Apr 2005 00:40:50 +0100, Michael Hoffman
<cam.ac.uk at mh391.invalid> wrote:

>John Machin wrote:
> > [Michael Hoffman]:
>>>John Machin wrote:
>>>>[Michael Hoffman]:
>>>>
>>>>>for row in csv.reader(fileinput.input()):
>>>>
>>>>csv.reader requires that, if the first arg is a file, it be opened
>>>>in binary mode.
>>>
>>>fileinput.input() is not a file.
>> 
>> Hair-splitter.
>
>Is name-calling really necessary?

I beg your pardon. How does "Your point addresses the letter rather
than the spirit of the 'law'" sound?

>
>> It's an awk simulation and shouldn't be used for real-world data.
>
>I don't see why not, so long as your data is text.

Real-world data is not "text".

>
>>>I have tested this code and it works fine for the provided example.
>> 
>> Well I've got news for you: real-world data has embedded CRs, LFs and
>> (worst of all) ^Zs often enough, and you won't find them mentioned in
>> any documentation, nor find them in examples.
>
>That's nice. Well I agree with you, if the OP is concerned about embedded
>CRs, LFs and ^Zs in his data (and he is using Windows in the latter case),
>then he *definitely* shouldn't use fileinput.

And if the OP is naive enough not to be concerned, then it's OK, is
it?
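
To make the embedded-newline point concrete, here is a minimal sketch.
The sample data and all variable names are made up for illustration.
It contrasts treating every physical line as a record with letting
csv.reader see the untranslated stream. It is written for current
Python 3, where the documented advice is to open CSV files with
newline='' (the Python 2 advice at the time of this thread was binary
mode). It does not reproduce the Windows text-mode ^Z behaviour.

```python
import csv
import io

# Hypothetical sample data: the first data record has a quoted field
# containing an embedded LF; records are terminated by CRLF.
raw = 'id,comment\r\n1,"line one\nline two"\r\n2,plain\r\n'

# Naive approach: treat every physical line as one record.
naive_rows = [line.split(",") for line in raw.splitlines()]

# csv.reader over a stream with newline translation disabled
# (newline="" here plays the role of open(path, newline="")).
stream = io.StringIO(raw, newline="")
rows = list(csv.reader(stream))

print(len(naive_rows), "records from naive line splitting")
print(len(rows), "records from csv.reader")
print(rows[1])
```

The naive split breaks the quoted field in two and so counts one
record too many, while csv.reader keeps the embedded LF inside the
field where it belongs.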

>
>And otherwise, there's really no reason not to.

Except, perhaps, the reason stated in fileinput.py itself: 

"""
Performance: this module is unfortunately one of the slower ways of
processing large numbers of input lines.
"""


