pyparsing Combine without merging sub-expressions

Steven Bethard steven.bethard at gmail.com
Sun Jan 21 23:27:39 EST 2007


Paul McGuire wrote:
> Steven Bethard wrote:
>> Within a larger pyparsing grammar, I have something that looks like::
>>
>>      wsj/00/wsj_0003.mrg
>>
>> When parsing this, I'd like to keep around both the full string, and the
>> AAA_NNNN substring of it, so I'd like something like::
>>
>>      >>> foo.parseString('wsj/00/wsj_0003.mrg')
>>      (['wsj/00/wsj_0003.mrg', 'wsj_0003'], {})
>>
>> How do I go about this? I was using something like::
>>
>>      >>> digits = pp.Word(pp.nums)
>>      >>> alphas = pp.Word(pp.alphas)
>>      >>> wsj_name = pp.Combine(alphas + '_' + digits)
>>      >>> wsj_path = pp.Combine(alphas + '/' + digits + '/' + wsj_name +
>>      ... '.mrg')
[snip]
> BUT, if all you want is to be able to easily *access*
> that sub-field, then why not give it a results name?  Like this:
> 
>     wsj_name = pp.Combine(alphas + '_' + digits).setResultsName("name")
> 
> Leave everything else the same, but now you can access the name field
> independently from the rest of the combined tokens.

Works great.  Thanks!
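
For the archive, here is a minimal sketch of the suggestion above put together with my original grammar. It assumes pyparsing is imported as ``pp``, as in my snippet; the variable name ``result`` is only for illustration.

```python
import pyparsing as pp

digits = pp.Word(pp.nums)
alphas = pp.Word(pp.alphas)

# Name the inner Combine so its text stays reachable after the
# outer Combine has merged everything into one string.
wsj_name = pp.Combine(alphas + '_' + digits).setResultsName("name")
wsj_path = pp.Combine(alphas + '/' + digits + '/' + wsj_name + '.mrg')

result = wsj_path.parseString('wsj/00/wsj_0003.mrg')

print(result[0])       # the full combined path
print(result["name"])  # the named sub-field
```

The token list still holds only the single combined string, so this does not give the two-element list I first asked for, but the sub-field can be read by name.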

STeVe


