Where is the syntax for the dict() constructor ?!
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Fri Jul 6 05:18:58 EDT 2007
On Fri, 06 Jul 2007 08:34:55 +0200, Hendrik van Rooyen wrote:
> "John Machin" <sj,,,n at lexicon.net> wrote:
>
>>
>> I don't know what you mean by "requires more than one
>> character of lookahead" -- any non-Mickey-Mouse implementation of a
>> csv reader will use a finite state machine with about half-a-dozen
>> states, and data structures no more complicated than (1) completed
>> rows received so far (2) completed fields in current row (3) bytes in
>> current field. When a new input byte arrives, what to do can be
>> determined based on only that byte and the current state; no look-
>> ahead into the input stream is required, nor is any look-back into
>> those data structures.
>>
>
> True.
>
> You can even do it more simply - by writing a GetField() that
> scans for either the delimiter or end of line or end of file, and
> returns the "field" found, along with the delimiter that caused
> it to exit, and then writing a GetRecord() that repetitively calls
> the GetField and assembles the row record until the delimiter
> returned is either the end of line or the end of file, remembering
> that the returned field may be empty, and handling the cases based
> on the delimiter returned when it is.
>
> This also makes all the decisions based on the current character
> read, no lookahead as far as I can see.
>
> Also no state variables, no switch statements...
>
> Is this the method that you would call "Mickey Mouse"?
Maybe, because you've left out all handling of quoting and escape
characters here. Consider this:
erik,viking,"ham, spam and eggs","He said ""Ni!""","line one
line two"
That's a single record with 5 elements, even though it spans two lines:
1: erik
2: viking
3: ham, spam and eggs
4: He said "Ni!"
5: line one
line two
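
The breakdown above can be checked against the standard library's csv module. This is only a minimal sketch in Python 3 syntax, not code from this thread; the names `data`, `rows` and `naive` are made up for illustration. It also shows what a plain split on the delimiter, with no quote handling, does to the first physical line.

```python
import csv
import io

# The example record, including the doubled quotes and the embedded newline.
data = 'erik,viking,"ham, spam and eggs","He said ""Ni!""","line one\nline two"\n'

# newline="" keeps the embedded line break untouched for the csv reader.
rows = list(csv.reader(io.StringIO(data, newline="")))

for number, field in enumerate(rows[0], 1):
    print(f"{number}: {field}")

# A scan that only looks for the delimiter, ignoring quotes, cuts the
# quoted fields apart and never sees the second physical line.
naive = data.splitlines()[0].split(",")
print(len(rows), "record,", len(rows[0]), "fields; naive split:", len(naive), "pieces")
```

The reader returns one record even though the input contains two physical lines, which is exactly the case a delimiter-only GetField() would get wrong.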
Ciao,
Marc 'BlackJack' Rintsch