text processing

Paul McGuire ptmcg at austin.rr.com
Fri Sep 26 00:32:35 EDT 2008


On Sep 25, 9:51 am, "jitensha... at gmail.com" <jitensha... at gmail.com>
wrote:
> I have string like follow
> 12560/ABC,12567/BC,123,567,890/JK
>
> I want above string to group like as follow
> (12560,ABC)
> (12567,BC)
> (123,567,890,JK)
>
> i try regular expression i am able to get first two not the third one.
> can regular expression given data in different groups

Looks like each item is:
- one or more integers, comma-delimited
- a slash
- a word composed of alpha characters

And the whole thing is a comma-delimited list of such items.

Now to implement that in pyparsing:

>>> data = "12560/ABC,12567/BC,123,567,890/JK"
>>> from pyparsing import Suppress, delimitedList, Word, alphas, nums, Group
>>> SLASH = Suppress('/')
>>> dataitem = delimitedList(Word(nums)) + SLASH + Word(alphas)
>>> dataformat = delimitedList(Group(dataitem))
>>> list(map(tuple, dataformat.parseString(data)))
[('12560', 'ABC'), ('12567', 'BC'), ('123', '567', '890', 'JK')]

Wah-lah! (as one of my wife's 1st graders announced in one of his
school papers)
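
Since the original question asked whether a regular expression can do
this grouping, here is a minimal sketch of the same item description
using only the standard library's re module instead of pyparsing. The
names ITEM and parse_items are made up for this illustration. It assumes
the word is plain ASCII letters, and unlike a full grammar it does not
validate the whole string: text that does not match an item is skipped
silently.

```python
import re

data = "12560/ABC,12567/BC,123,567,890/JK"

# One item: comma-separated integers, a slash, then an alphabetic word.
# ITEM and parse_items are hypothetical names used only for this sketch.
ITEM = re.compile(r"(\d+(?:,\d+)*)/([A-Za-z]+)")

def parse_items(text):
    """Return one tuple per item: the integers (as strings), then the word."""
    return [tuple(numbers.split(",")) + (word,)
            for numbers, word in ITEM.findall(text)]

groups = parse_items(data)
print(groups)
```

The trick is to capture the whole run of integers as a single group and
split it on commas afterwards, because a repeated capture group keeps
only its last match.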

-- Paul




