Best way to parse file into db-type layout?
John Machin
sjmachin at lexicon.net
Fri Apr 29 20:17:57 EDT 2005
On Sat, 30 Apr 2005 00:40:50 +0100, Michael Hoffman
<cam.ac.uk at mh391.invalid> wrote:
>John Machin wrote:
>> [Michael Hoffman]:
>>>John Machin wrote:
>>>>[Michael Hoffman]:
>>>>
>>>>>for row in csv.reader(fileinput.input()):
>>>>
>>>>csv.reader requires that if the first arg is a file that it be opened
>>>>in binary mode.
>>>
>>>fileinput.input() is not a file.
>>
>> Hair-splitter.
>
>Is name-calling really necessary?
I beg your pardon. How does: "Your point addresses the letter rather
than the spirit of the 'law'" sound?
>
>> It's an awk simulation and shouldn't be used for real-world data.
>
>I don't see why not, so long as your data is text.
Real-world data is not "text".
>
>>>I have tested this code and it works fine for the provided example.
>>
>> Well I've got news for you: real-world data has embedded CRs, LFs and
>> (worst of all) ^Zs often enough, and you won't find them mentioned in
>> any documentation, nor find them in examples.
>
>That's nice. Well I agree with you, if the OP is concerned about embedded
>CRs, LFs and ^Zs in his data (and he is using Windows in the latter case),
>then he *definitely* shouldn't use fileinput.
And if the OP is naive enough not to be concerned, then it's OK, is
it?
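
To make the embedded-CR concern concrete, here is a minimal sketch. It is written for current Python 3, where the counterpart of the binary-mode rule is to open the file with newline='' (that is what the csv documentation asks for). The sample bytes and the variable names are made up for illustration; they are not the OP's data.

```python
import csv
import os
import tempfile

# Hypothetical sample: a header row, then a quoted field holding a bare CR.
raw = b'id,note\r\n1,"a\rb"\r\n'

fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as out:
    out.write(raw)

# newline='' hands the line endings to the csv module untranslated,
# so the embedded CR survives inside the quoted field.
with open(path, newline="") as handle:
    rows_untranslated = list(csv.reader(handle))

# Default text mode applies universal-newline translation first,
# so the embedded CR is turned into an LF before csv ever sees it.
with open(path) as handle:
    rows_translated = list(csv.reader(handle))

os.remove(path)

print(repr(rows_untranslated[1][1]))
print(repr(rows_translated[1][1]))
```

The second read raises no error; the field is simply altered on the way in, which is why this kind of problem tends not to show up when testing against a tidy example.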
>
>And otherwise, there's really no reason not to.
Except, perhaps, the reason stated in fileinput.py itself:
"""
Performance: this module is unfortunately one of the slower ways of
processing large numbers of input lines.
"""
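
If the attraction of fileinput is only that it strings several files together, one possible alternative is to open each file yourself and feed it to csv.reader. This is a sketch under assumptions, not a benchmark: rows_from_files is a hypothetical helper name, the demo file names and contents are invented, and the code assumes Python 3.

```python
import csv
import os
import tempfile


def rows_from_files(paths):
    """Yield CSV rows from each path in turn.

    Each file is opened with newline='' so that the csv module,
    not the text layer, decides what counts as a line ending.
    """
    for path in paths:
        with open(path, newline="") as handle:
            yield from csv.reader(handle)


# Demo with two small throwaway files (hypothetical contents).
tmpdir = tempfile.mkdtemp()
paths = []
for name, body in [("a.csv", b"1,x\r\n"), ("b.csv", b"2,y\r\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "wb") as out:
        out.write(body)
    paths.append(path)

all_rows = list(rows_from_files(paths))
print(all_rows)

# Clean up the throwaway files.
for path in paths:
    os.remove(path)
os.rmdir(tmpdir)
```

Whether this is measurably faster than fileinput on your data is something to time for yourself; the sketch only shows that the multi-file convenience does not require giving up control over how each file is opened.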