need simple parsing ability
Christopher T King
squirrel at WPI.EDU
Fri Jul 16 12:04:15 EDT 2004
On Fri, 16 Jul 2004, george young wrote:
> I need to read user input of a subset of these. The user will type a
> set of names separated by commas (with optional white space), but there
> may also be sequences indicated by a dash between two integers, e.g.:
>
> "9-11" meaning 9,10,11
> "foo_11-13" meaning foo_11, foo_12, and foo_13.
> "foo_9-11" meaning foo_9,foo_10,foo_11, or
> "bar09-11" meaning bar09,bar10,bar11
>
> (Yes, I have to deal with integers with and without leading zeros)
> [I'll proclaim inverse sequences like "foo_11-9" invalid]
> So a sample input might be:
>
> 9,foo7-9,2-4,xxx meaning 9,foo7,foo8,foo9,2,3,4,xxx
>
> The order of the resultant list of names is not important; I have
> to sort them later anyway.
The following should do the trick, using nothing more than the standard
re module:
---
import re

def expand(pattern):
    # Look for a trailing integer range such as "7-9" or "09-11".
    r = re.search(r'\d+-\d+$', pattern)
    if r is None:
        yield pattern
        return
    s, e = r.group().split('-')
    for n in xrange(int(s), int(e) + 1):
        # zfill keeps leading zeros, so "bar09-11" gives bar09, bar10, bar11.
        yield pattern[:r.start()] + str(n).zfill(len(s))

def expand_list(pattern_list):
    return [w for pattern in pattern_list.split(',')
              for w in expand(pattern)]

print expand_list('9,foo7-9,2-4,xxx')
---
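For anyone on Python 3, here is a rough sketch of the same idea. It is only an adaptation, not a tested drop-in: the names expand_range and expand_names are made up for this example, and I am assuming that the width of the first number sets the zero padding and that inverse ranges should raise an error, as the question suggests.

```python
import re

# Trailing integer range, e.g. the "09-11" in "bar09-11".
_RANGE = re.compile(r'(\d+)-(\d+)$')

def expand_range(pattern):
    """Yield every name described by one pattern such as 'foo7-9'."""
    m = _RANGE.search(pattern)
    if m is None:
        # No trailing range: the pattern is a plain name.
        yield pattern
        return
    start, end = m.groups()
    if int(start) > int(end):
        # Assumption: inverse ranges like "foo_11-9" are rejected.
        raise ValueError('inverse range: %r' % pattern)
    prefix = pattern[:m.start()]
    for n in range(int(start), int(end) + 1):
        # Pad to the width of the first number to keep leading zeros.
        yield prefix + str(n).zfill(len(start))

def expand_names(pattern_list):
    """Expand a comma-separated list of patterns into a flat list of names."""
    return [name for pattern in pattern_list.split(',')
            for name in expand_range(pattern)]

print(expand_names('9,foo7-9,2-4,xxx'))
# ['9', 'foo7', 'foo8', 'foo9', '2', '3', '4', 'xxx']
```

With this version, list(expand_range('bar09-11')) gives bar09, bar10, bar11, while 'foo_9-11' still gives foo_9, foo_10, foo_11.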
If you want the syntax to be a little more lenient, replace
"pattern_list.split(',')" in expand_list() with
"re.split(r'\s*,\s*', pattern_list)". This allows whitespace around the
commas.
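To illustrate the lenient split on its own, here is a small Python 3 sketch. The helper name split_names is hypothetical, and the extra strip() call is my own assumption: the regular expression only consumes whitespace next to a comma, so spaces at the very start or end of the input would otherwise stay attached to the first and last names.

```python
import re

def split_names(pattern_list):
    # Drop outer whitespace first, then split on commas together with
    # any whitespace on either side of them.
    return re.split(r'\s*,\s*', pattern_list.strip())

print(split_names(' 9, foo7-9 ,2-4,  xxx '))
# ['9', 'foo7-9', '2-4', 'xxx']
```

Each piece can then be handed to the range-expanding generator exactly as before.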
Note that because this uses generators, it won't work on Python versions
prior to 2.3 (on 2.2 you would need "from __future__ import generators").
Hope this helps!