Where is the syntax for the dict() constructor ?!

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Fri Jul 6 05:18:58 EDT 2007


On Fri, 06 Jul 2007 08:34:55 +0200, Hendrik van Rooyen wrote:

> "John Machin" <sj,,,n at lexicon.net> wrote:
> 
>> 
>> I don't know what you mean by "requires more than one
>> character of lookahead" -- any non-Mickey-Mouse implementation of a
>> csv reader will use a finite state machine with about half-a-dozen
>> states, and data structures no more complicated than (1) completed
>> rows received so far (2) completed fields in current row (3) bytes in
>> current field. When a new input byte arrives, what to do can be
>> determined based on only that byte and the current state; no look-
>> ahead into the input stream is required, nor is any look-back into
>> those data structures.
>> 
> 
> True.
> 
> You can even do it more simply - by writing a GetField() that
> scans for either the delimiter or end of line or end of file, and 
> returns the "field" found, along with the delimiter that caused 
> it to exit, and then writing a GetRecord() that repetitively calls
> the GetField and assembles the row record until the delimiter 
> returned is either the end of line or the end of file, remembering 
> that the returned field may be empty, and handling the cases based 
> on the delimiter returned when it is.
> 
> This also makes all the decisions based on the current character
> read, no lookahead as far as I can see.
> 
> Also no state variables, no switch statements...
> 
> Is this the method that you would call "Mickey Mouse"?

Maybe, because you've left out all handling of quoting and escape
characters here.  Consider this:

erik,viking,"ham, spam and eggs","He said ""Ni!""","line one
line two"

That's one record with 5 fields:

1: erik
2: viking
3: ham, spam and eggs
4: He said "Ni!"
5: line one
   line two
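
For what it's worth, here is a minimal sketch of how the standard `csv` module copes with that record, next to a naive split on the delimiter. It assumes Python 3 and the default "excel" dialect; the names `data`, `rows` and `naive` are only made up for this illustration.

```python
import csv
import io

# The sample record from above; the last field contains an embedded newline.
data = 'erik,viking,"ham, spam and eggs","He said ""Ni!""","line one\nline two"\n'

# newline='' leaves line endings untouched so the csv module can deal
# with the line break inside the quoted field itself.
rows = list(csv.reader(io.StringIO(data, newline='')))

# One record, with the quotes and doubled quotes already resolved.
for number, field in enumerate(rows[0], 1):
    print(number, repr(field))

# For contrast: splitting the first physical line on commas ignores
# quoting, so it cuts "ham, spam and eggs" in two and loses "line two".
naive = data.splitlines()[0].split(',')
print(len(naive), naive)
```

The reader returns a single row of five fields, while the naive split breaks the first physical line into six pieces and never sees the second line as part of the record.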

Ciao,
	Marc 'BlackJack' Rintsch


