pyparsing Combine without merging sub-expressions

Steven Bethard steven.bethard at gmail.com
Sun Jan 21 23:20:26 EST 2007


Dennis Lee Bieber wrote:
> On Sat, 20 Jan 2007 13:49:52 -0700, Steven Bethard
> <steven.bethard at gmail.com> declaimed the following in comp.lang.python:
> 
>> Within a larger pyparsing grammar, I have something that looks like::
>>
>>      wsj/00/wsj_0003.mrg
>>
>> When parsing this, I'd like to keep around both the full string, and the 
>> AAA_NNNN substring of it, so I'd like something like::
>>
>>      >>> foo.parseString('wsj/00/wsj_0003.mrg')
>>      (['wsj/00/wsj_0003.mrg', 'wsj_0003'], {})
>>
> 	If working with file names/paths, why not use the functions in os.path?

Two reasons.  First, as I mentioned, this is within a larger pyparsing
grammar, so it's not easy to switch back and forth between the two.
Second, I do want to do some data validation (e.g. the name of the file
needs to be in a particular format), so I either need to post-process the
result of the os.path approach or just do it in pyparsing.
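
For illustration only, the os.path route with post-processing might look
roughly like the sketch below.  The helper name split_and_validate, the
FILE_ID pattern, and the check on the .mrg extension are assumptions made
for this example; the pattern is just one reading of the "AAA_NNNN"
format mentioned above.

```python
import os.path
import re

# Assumed format, read off the "AAA_NNNN" description: three letters,
# an underscore, then four digits.
FILE_ID = re.compile(r'[A-Za-z]{3}_[0-9]{4}')


def split_and_validate(path):
    """Return (path, file_id), or raise ValueError for a malformed name."""
    # os.path does the splitting but no validation of its own.
    stem, ext = os.path.splitext(os.path.basename(path))
    # The post-processing step: check the extension and the name format.
    if ext != '.mrg' or not FILE_ID.fullmatch(stem):
        raise ValueError('unexpected file name: %r' % path)
    return path, stem
```

This works in isolation, but the validation sits outside the grammar,
which is the awkward part when the path is only one piece of a larger
parse.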


>> But that then allows whitespace between the pieces of the path, which 
>> there shouldn't be::
>>
> 	If you didn't have whitespace coming in, there shouldn't be any
> going out. If you do, you likely have malformed data and probably should
> detect it earlier...

Well, that's the intention of using pyparsing here.  With a proper
grammar, pyparsing can detect the malformed data for me and raise an error.
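
As a rough sketch of that idea (one possible approach, not necessarily
the one this thread settles on): Combine matches its pieces without
intervening whitespace by default, and a parse action on it can return
both the full string and the file-name stem.  The names path_expr,
keep_both, and is_valid are made up for this example, and the exact
piece definitions are assumptions based on the sample path.  Newer
pyparsing releases also spell these methods set_parse_action and
parse_string.

```python
from pyparsing import Combine, ParseException, Word, alphas, nums


def keep_both(tokens):
    """Return the combined path plus the AAA_NNNN stem taken from it."""
    full = tokens[0]
    # Drop the directories, then the trailing '.mrg'.
    stem = full.rsplit('/', 1)[-1][:-len('.mrg')]
    return [full, stem]


# Combine requires adjacent matches, so 'wsj/00/ wsj_0003.mrg' is rejected.
# The piece definitions below are guesses from the single sample path.
path_expr = Combine(
    Word(alphas) + '/' + Word(nums) + '/'
    + Word(alphas, exact=3) + '_' + Word(nums, exact=4) + '.mrg'
)
path_expr.setParseAction(keep_both)


def is_valid(text):
    """Report whether text parses, instead of letting the error propagate."""
    try:
        path_expr.parseString(text)
    except ParseException:
        return False
    return True
```

Here path_expr.parseString('wsj/00/wsj_0003.mrg').asList() gives both
strings, while a path with a space in the middle raises ParseException.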

STeVe


