Regex for repeated character?
Paul McGuire
ptmcg at austin.rr.com
Thu Jun 16 09:40:59 EDT 2005
Here is a brute-force pyparsing approach: define an alternation of all
possible Words, each made up of a single repeated letter.
There is also an alternate version that picks out just the repeats and
gives their locations in the input string:
from pyparsing import ZeroOrMore, MatchFirst, Word, alphas
print "group string by character repeats"
repeats = ZeroOrMore( MatchFirst( [ Word(a) for a in alphas ] ) )
test = "foo ooobaaazZZ"
print repeats.parseString(test)
print
print "find just the repeated characters"
repeats = MatchFirst( [ Word(a,min=2) for a in alphas ] )
test = "foo ooobaaazZZ"
for toks,loc,endloc in repeats.scanString(test):
    print toks,loc
Gives:
group string by character repeats
['f', 'oo', 'ooo', 'b', 'aaa', 'z', 'ZZ']
find just the repeated characters
['oo'] 1
['ooo'] 4
['aaa'] 8
['ZZ'] 12
(pyparsing skips whitespace by default, which is why there is no ' '
entry in the first list)
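For comparison, since the thread asks about a regex, here is a minimal sketch of the same two results using only the standard re module and a backreference. This is not the pyparsing approach above, just a cross-check. It assumes plain ASCII letters and case-sensitive matching, and the variable names are illustrative:

```python
import re

test = "foo ooobaaazZZ"

# Group the string into runs of a single repeated letter. Only letters are
# matched, so the space is skipped, much like pyparsing's whitespace skipping.
runs = [m.group(0) for m in re.finditer(r"([A-Za-z])\1*", test)]
print(runs)

# Keep only runs of two or more identical letters, with their start offsets.
repeats = [(m.group(0), m.start()) for m in re.finditer(r"([A-Za-z])\1+", test)]
for text, loc in repeats:
    print(text, loc)
```

The backreference \1 has to match exactly the character captured by the group, so 'z' and 'ZZ' come out as separate runs, the same as in the pyparsing output.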
Download pyparsing at http://pyparsing.sourceforge.net.
-- Paul