Parsing a file with iterators

Eddie Corns eddie at holyrood.ed.ac.uk
Fri Oct 17 12:26:35 EDT 2008


Luis Zarrabeitia <kyrie at uh.cu> writes:


>I need to parse a text file. The format is something like this:

>TYPE1 metadata
>data line 1
>data line 2
>...
>data line N
>TYPE2 metadata
>data line 1
>...
>TYPE3 metadata
>...

>And so on. The type and metadata determine how to parse the following data
>lines. When the parser fails to parse one of the lines, the next parser is
>chosen (or if there is no 'TYPE metadata' line there, an exception is thrown).

>This doesn't work:

>===
>for line in input:
>    parser = parser_from_string(line)
>    parser(input)
>===

>because when the parser iterates over the input, it can't know that it
>finished processing the section until it reads the next "TYPE" line
>(actually, until it reads the first line that it cannot parse, which if
>everything went well, should be the 'TYPE'), but once it reads it, it is no
>longer available to the outer loop. I wouldn't like to leak the internals of
>the parsers to the outside.

>What could I do?
>(to the curious: the format is a dialect of the E00 used in GIS)
>
>-- 
>Luis Zarrabeitia
>Facultad de Matemática y Computación, UH
>http://profesores.matcom.uh.cu/~kyrie




One simple way is to wrap your "input" iterator so that it supports pushing
values back onto the input stream: as soon as a parser reads a line it can't
handle, it pushes that line back, and the outer loop sees it next.

See http://code.activestate.com/recipes/502304/ for an example.
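
For illustration, here is a minimal sketch of the idea. It is not the code
from that recipe; the names PushbackIterator, push, parse_numbers and
parse_file are made up for this example, and parse_numbers stands in for
whatever parser_from_string would return (here it simply accepts lines that
are integers).

```python
class PushbackIterator:
    """Wrap any iterable and allow items to be pushed back onto it."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pushed = []  # stack of items handed back to the stream

    def __iter__(self):
        return self

    def __next__(self):
        # Items that were pushed back are served before the underlying input.
        if self._pushed:
            return self._pushed.pop()
        return next(self._it)

    def push(self, item):
        self._pushed.append(item)


def parse_numbers(lines):
    """Hypothetical section parser: consumes lines that are integers."""
    values = []
    for line in lines:
        try:
            values.append(int(line))
        except ValueError:
            lines.push(line)  # not ours: hand it back to the outer loop
            break
    return values


def parse_file(raw_lines):
    """Outer loop: read a 'TYPE metadata' line, then run a section parser."""
    lines = PushbackIterator(raw_lines)
    sections = []
    for header in lines:
        if not header.startswith("TYPE"):
            raise ValueError("expected a 'TYPE metadata' line, got %r" % header)
        sections.append((header, parse_numbers(lines)))
    return sections
```

The outer loop and the parser share the same iterator object, so the parser
needs to know only about push() and nothing about the outer loop, and the
outer loop needs to know nothing about the parser's internals. A line that no
parser accepts and that is not a TYPE line ends up back in the outer loop,
which raises the exception.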


