Best way to parse file into db-type layout?

Michael Hoffman cam.ac.uk at mh391.invalid
Sat Apr 30 06:35:05 EDT 2005


John Machin wrote:

> I beg your pardon. How does: "Your point addresses the letter rather
> than the spirit of the 'law'" sound?

Sure, thanks.

> Real-world data is not "text".

A lot of real-world data is. For example, almost all of the data I deal with
is text.

>>That's nice. Well I agree with you, if the OP is concerned about embedded
>>CRs, LFs and ^Zs in his data (and he is using Windows in the latter case),
>>then he *definitely* shouldn't use fileinput.
> 
> And if the OP is naive enough not to be concerned, then it's OK, is
> it?

It simply isn't a problem in some real-world problem domains. And if there
are control characters the OP didn't expect in the input, and csv loads it
without complaint, I would say that he is likely to have other problems once
he's processing it.
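
To make the disagreement concrete, here is a rough sketch (the sample data and function names are made up for illustration, not taken from the OP's file) of how a line-oriented reader and the csv module differ when a quoted field contains a line feed:

```python
import csv
import io

# Hypothetical sample: the comment field of record 1 contains an embedded LF.
SAMPLE = 'id,comment\r\n1,"first line\nsecond line"\r\n2,plain\r\n'


def rows_by_line(text):
    """Naive approach: treat every physical line as one record."""
    return [line.split(",") for line in text.splitlines()]


def rows_by_csv(text):
    """csv approach: a quoted field may span several physical lines."""
    # newline="" keeps line endings untranslated so csv can interpret them itself.
    return list(csv.reader(io.StringIO(text, newline="")))


line_rows = rows_by_line(SAMPLE)
csv_rows = rows_by_csv(SAMPLE)

print(len(line_rows), "records when split line by line")
print(len(csv_rows), "records when parsed with csv")
```

The line-by-line version splits the quoted field into two bogus records, while csv keeps it as one field. Whether that matters depends entirely on whether such fields can occur in the input.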

> Except, perhaps, the reason stated in fileinput.py itself: 
> 
> """
> Performance: this module is unfortunately one of the slower ways of
> processing large numbers of input lines.
> """

Fair enough, although Python is full of useful things that save the
programmer's time at the expense of that of the CPU, and this is
frequently considered a Good Thing.
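
As an illustration of the convenience I mean, here is a minimal sketch that assumes a reasonably recent Python 3 (where the object returned by fileinput.input() can be used in a with statement). The helper name and the temporary files are hypothetical, there only to make the example self-contained:

```python
import fileinput
import os
import tempfile


def numbered_lines(paths):
    """Yield (filename, line number within that file, line text) for every line."""
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # fileinput tracks the current file and per-file line number for us.
            yield stream.filename(), stream.filelineno(), line.rstrip("\n")


# Build two small throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, body in [("a.txt", "x\ny\n"), ("b.txt", "z\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as handle:
            handle.write(body)
        paths.append(path)

    records = [
        (os.path.basename(filename), lineno, text)
        for filename, lineno, text in numbered_lines(paths)
    ]

for record in records:
    print(record)
```

Doing the same by hand means an outer loop over the files, an inner loop over the lines, and a counter that has to be reset for each file. That bookkeeping is what fileinput saves, at some cost in speed.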

Let me ask you this: are you simply opposed to something like fileinput
in principle, or is it only because of (1) no binary mode and (2) poor
performance? Because those are both things that could be fixed. I think
fileinput is so useful that I'm willing to spend some time working on it
when I have some.
-- 
Michael Hoffman


