Parsing a file based on differing delimiters

Kylotan kylotan at hotmail.com
Wed Oct 22 19:53:24 EDT 2003


bokr at oz.net (Bengt Richter) wrote in message news:<bn5o45$gru$0 at 216.39.172.122>...

> A generator can look ahead by holding put-back info in its own state
> without yielding a result until it has decided what to do. It can read
> input line-wise and scan lines for patterns and store ambiguous info
> for re-analysis if backup is needed. You can go character by character
> or whip through lines of comments in bigger chunks, and recognize alternative
> patterns with regular expressions. There are lots of options.

Sadly none of these options seem obvious to me :)  Basically 90% of
the time, I know exactly what type to expect. Other times, I am gonna
get one of several things back, where sometimes one of those things is
actually part of something totally different, so I need to leave it
there for the next routine. How would that be done with a generator?
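
A minimal sketch of the put-back idea, under assumptions: the names
PushbackReader, push_back, read_numbers and read_name are made up for
illustration, and the input is assumed to be whitespace-separated
tokens. A routine that pulls a token it does not own pushes it back,
so the next routine sees it first.

```python
class PushbackReader:
    """Wrap any iterator (a generator included) and allow items to be put back."""

    def __init__(self, source):
        self._source = iter(source)
        self._pending = []  # pushed-back items, most recent last

    def __iter__(self):
        return self

    def __next__(self):
        # Serve pushed-back items before touching the underlying source.
        if self._pending:
            return self._pending.pop()
        return next(self._source)

    def push_back(self, item):
        self._pending.append(item)


def words(lines):
    """Generator yielding whitespace-separated tokens from an iterable of lines."""
    for line in lines:
        for word in line.split():
            yield word


def read_numbers(tokens):
    """Consume leading numeric tokens; leave the first non-number in place."""
    numbers = []
    for tok in tokens:
        if not tok.isdigit():
            tokens.push_back(tok)  # not ours: leave it for the next routine
            break
        numbers.append(int(tok))
    return numbers


def read_name(tokens):
    """Consume exactly one token and return it unchanged."""
    return next(tokens)


tokens = PushbackReader(words(["10 20 30", "sword 5"]))
numbers = read_numbers(tokens)  # stops at "sword" and puts it back
name = read_name(tokens)        # sees "sword" first
rest = read_numbers(tokens)     # picks up the trailing 5
```

The wrapper holds the put-back state, so each reading routine stays
a plain function that takes the token stream as its argument.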

> Communicating clearly and precisely should be more than enough justification IMO ;-)
> 
> What you've said above sounds like approximately:
> 
>     kylotan_file: ( string_text '~' | number WS | some_identifiers NL )*
> 
> If it's not that complicated, why not complete the picture?

Because it would be a fairly flat grammar where each non-terminal
symbol has a very long rule of almost exclusively terminal symbols
describing what it contains. There's no recursion and very little
iteration or alternation in it. With all this in mind, I'd rather
keep all the logic for reading and assigning values in one place
rather than going through a parser middleman which will complicate the
code. Traditional tokenizers and lexers are also of little use since
many of the tokens are context-dependent.
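
To make "context-dependent" concrete, here is a rough sketch in which
the calling code, not a lexer, decides which delimiter ends the next
field. Everything in it is an assumption for illustration: the
FieldReader class, its method names, and the sample record layout (a
'~'-terminated string, two whitespace-delimited numbers, then a line
of identifiers) are hypothetical, loosely following the grammar
quoted above.

```python
class FieldReader:
    """Cursor over a text buffer; the caller chooses how each field ends."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def skip_whitespace(self):
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1

    def read_string(self):
        """Read up to the next '~'; the string may span several lines."""
        self.skip_whitespace()
        end = self.text.find("~", self.pos)
        if end == -1:
            raise ValueError("unterminated string at offset %d" % self.pos)
        field = self.text[self.pos:end]
        self.pos = end + 1  # step over the tilde
        return field

    def read_number(self):
        """Read an integer that ends at the next whitespace character."""
        self.skip_whitespace()
        start = self.pos
        while self.pos < len(self.text) and not self.text[self.pos].isspace():
            self.pos += 1
        return int(self.text[start:self.pos])

    def read_line(self):
        """Read the remainder of the current non-blank line."""
        self.skip_whitespace()
        end = self.text.find("\n", self.pos)
        if end == -1:
            end = len(self.text)
        line = self.text[self.pos:end]
        self.pos = min(end + 1, len(self.text))
        return line


sample = "A rusty sword,\nnotched but sharp.~\n12 7\nweapon metal\n"
reader = FieldReader(sample)
description = reader.read_string()  # ends at '~'
weight = reader.read_number()       # ends at whitespace
value = reader.read_number()
flags = reader.read_line().split()  # ends at newline
```

The reading and the assigning happen in the same few lines, and the
same characters can mean different things depending on which read
method is called.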

> Look at Andrew Dalke's recent post for a number of ideas and code you might
> snip and adapt to your problem

All I found in a short search was something complex that appeared to
be an expression parser, which is not really what I need here.

Thanks,

Ben Sizer



