pyparsing Combine without merging sub-expressions
Steven Bethard
steven.bethard at gmail.com
Sun Jan 21 23:20:26 EST 2007
Dennis Lee Bieber wrote:
> On Sat, 20 Jan 2007 13:49:52 -0700, Steven Bethard
> <steven.bethard at gmail.com> declaimed the following in comp.lang.python:
>
>> Within a larger pyparsing grammar, I have something that looks like::
>>
>> wsj/00/wsj_0003.mrg
>>
>> When parsing this, I'd like to keep around both the full string, and the
>> AAA_NNNN substring of it, so I'd like something like::
>>
>> >>> foo.parseString('wsj/00/wsj_0003.mrg')
>> (['wsj/00/wsj_0003.mrg', 'wsj_0003'], {})
>>
> If working with file names/paths, why not use the functions in os.path?

Two reasons. First, as I mentioned, this is within a larger pyparsing
grammar so it's not as easy to switch back and forth between the two.
Second, I do want to do some data validation (e.g. the name of the file
needs to be in a particular format) so I either need to post-process the
os.path approach or just do it in pyparsing.
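
For comparison, here is a minimal sketch of what the os.path route with
post-processing might look like. It assumes the AAA_NNNN part is exactly
three lowercase letters, an underscore, and four digits, and that the
extension is always .mrg. The name split_and_validate is made up for
illustration. It uses posixpath so that the forward slashes are treated
the same way on any platform.

```python
import posixpath
import re

# Assumption: the file name is three lowercase letters, '_', four digits.
NAME_RE = re.compile(r"[a-z]{3}_[0-9]{4}")


def split_and_validate(path):
    """Return (full_path, base_name), or raise ValueError if malformed."""
    # Strip the directories, then split off the extension.
    base, ext = posixpath.splitext(posixpath.basename(path))
    # fullmatch rejects stray whitespace or extra characters in the name.
    if ext != ".mrg" or not NAME_RE.fullmatch(base):
        raise ValueError("malformed file name: %r" % path)
    return path, base


print(split_and_validate("wsj/00/wsj_0003.mrg"))
```

Note that this only validates the final path component, so the directory
parts would still need their own checks, and the result would have to be
fed back into the surrounding pyparsing grammar by hand.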

>> But that then allows whitespace between the pieces of the path, which
>> there shouldn't be::
>>
> If you didn't have whitespace coming in, there shouldn't be any
> going out. If you do, you likely have malformed data and probably should
> detect it earlier...

Well, that's the intention of using pyparsing here. With a proper
grammar, pyparsing can detect the malformed data for me and raise an error.

STeVe