need simple parsing ability
Jean Brouwers
JBrouwersAtProphICyDotCom at no.spam.net
Fri Jul 16 17:07:47 EDT 2004
A further (final?) update that also checks for some range errors.
/Jean Brouwers
ns = '9,2-4,xxx, bar, foo_6-11,x07-9, 0-1, 00-1'
# list of names and expanded names
fs = []
for n in ns.split(','):
    n = n.strip()
    r = n.split('-')
    if len(r) == 2:  # expand name with range
        h = r[0].rstrip('0123456789')  # header
        r[0] = r[0][len(h):]
        # range can't be empty
        if not (r[0] and r[1]):
            raise ValueError, 'empty range: ' + n
        # handle leading zeros
        if r[0] == '0' or r[0][0] != '0':
            h += '%d'
        else:
            w = [len(i) for i in r]
            if w[1] > w[0]:
                raise ValueError, 'wide range: ' + n
            h += '%%0%dd' % max(w)
        # check range
        r = [int(i, 10) for i in r]
        if r[0] > r[1]:
            raise ValueError, 'bad range: ' + n
        for i in range(r[0], r[1]+1):
            fs.append(h % i)
    else:  # simple name
        fs.append(n)
# remove duplicates
fs = dict([(n, i) for i, n in enumerate(fs)]).keys()
# sort, maybe
fs.sort()
print fs
>>> ['0', '00', '01', '1', '2', '3', '4', '9', 'bar', 'foo_10',
'foo_11', 'foo_6', 'foo_7', 'foo_8', 'foo_9', 'x07', 'x08', 'x09',
'xxx']
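
For anyone on a newer Python, here is a rough sketch of the same logic wrapped in a function. The name expand_names and the use of a set plus str.zfill are my own choices for illustration, not part of the code above, and it raises ValueError for the same three cases (empty, wide and bad ranges). Treat it as one possible port under those assumptions, not a tested replacement.

```python
def expand_names(text):
    """Expand 'foo_9-11' style items in a comma-separated string.

    Hypothetical helper: returns a sorted list of names without
    duplicates, raising ValueError on an invalid range.
    """
    names = set()
    for item in text.split(','):
        item = item.strip()
        parts = item.split('-')
        if len(parts) != 2:
            # simple name (no dash, or more than one dash)
            names.add(item)
            continue
        # header is everything before the trailing digits
        header = parts[0].rstrip('0123456789')
        first, last = parts[0][len(header):], parts[1]
        if not (first and last):
            raise ValueError('empty range: ' + item)
        if first == '0' or first[0] != '0':
            width = 0  # no leading zeros, so no padding
        else:
            if len(last) > len(first):
                raise ValueError('wide range: ' + item)
            width = len(first)  # pad to the width of the start value
        lo, hi = int(first, 10), int(last, 10)
        if lo > hi:
            raise ValueError('bad range: ' + item)
        for i in range(lo, hi + 1):
            names.add(header + str(i).zfill(width))
    return sorted(names)


print(expand_names('9,2-4,xxx, bar, foo_6-11,x07-9, 0-1, 00-1'))
```

Building the name with str.zfill instead of a '%d' format string also means a '%' in the header part of a name is harmless.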
In article <160720041008526140%JBrouwersAtProphICyDotCom at no.spam.net>,
Jean Brouwers <JBrouwersAtProphICyDotCom at no.spam.net> wrote:
> With two fixes, one bug and one typo:
>
> ns = '9,foo7-9,2-4,xxx,5, 6, 7, 8, 9, bar, foo_6, foo_10, foo_11'
>
> # list of plain, clean names
> ns = [n.strip() for n in ns.split(',')]
> # expand names with range
> fs = []
> for n in ns:
>     r = n.split('-')
>     if len(r) != 2: # simple name
>         fs.append(n)
>     else: # name with range
>         h = r[0].rstrip('0123456789') # header
>         for i in range(int(r[0][len(h):]), 1 + int(r[1])):
>             fs.append(h + str(i))
> # remove duplicates
> fs = dict([(n, i) for i, n in enumerate(fs)])
> fs = fs.keys()
> # sort, maybe
> fs.sort()
>
> print fs
>
>
> /Jean Brouwers
>
>
> In article <160720040947530644%JBrouwersAtProphICyDotCom at no.spam.net>,
> Jean Brouwers <JBrouwersAtProphICyDotCom at no.spam.net> wrote:
>
> > Here is one possible way to do that with just Python:
> >
> >
> > ns = '9,foo7-9,2-4,xxx,5, 6, 7, 8, 9, bar, foo_6, foo_10, foo_11'
> >
> > # list of plain, clean names
> > ns = [n.strip() for n in ns.split(',')]
> > # expand names with range
> > fs = []
> > for n in ns:
> >     r = n.split('-')
> >     if len(r) != 2: # simple name
> >         fs.append(n)
> >     else: # name with range
> >         h = r[0].rstrip('0123456789') # header
> >         for i in range(int(r[0][len(h):]), int(r[1])):
> >             fs.append(h + str(i))
> > # remove duplicitates
> > fs = dict([(n, i) for i, n in enumerate(fs)])
> > fs = fs.keys()
> > # sort
> > fs.sort()
> >
> > print fs
> >
> >
> > /Jean Brouwers
> >
> >
> >
> > In article <20040716111324.09267883.gry at ll.mit.edu>, george young
> > <gry at ll.mit.edu> wrote:
> >
> > > [python 2.3.3, x86 linux]
> > > For each run of my app, I have a known set of (<100) wafer names.
> > > Names are sometimes simply integers, sometimes a short string, and
> > > sometimes a short string followed by an integer, e.g.:
> > >
> > > 5, 6, 7, 8, 9, bar, foo_6, foo_7, foo_8, foo_9, foo_10, foo_11
> > >
> > > I need to read user input of a subset of these. The user will type a
> > > set of names separated by commas (with optional white space), but there
> > > may also be sequences indicated by a dash between two integers, e.g.:
> > >
> > > "9-11" meaning 9,10,11
> > > "foo_11-13" meaning foo_11, foo_12, and foo_13.
> > > "foo_9-11" meaning foo_9,foo_10,foo_11, or
> > > "bar09-11" meaning bar09,bar10,bar11
> > >
> > > (Yes, I have to deal with integers with and without leading zeros)
> > > [I'll proclaim inverse sequences like "foo_11-9" invalid]
> > > So a sample input might be:
> > >
> > > 9,foo7-9,2-4,xxx meaning 9,foo7,foo8,foo9,2,3,4,xxx
> > >
> > > The order of the resultant list of names is not important; I have
> > > to sort them later anyway.
> > >
> > > Fancy error recovery is not needed; an invalid input string will be
> > > peremptorily wiped from the screen with an annoyed beep.
> > >
> > > Can anyone suggest a clean way of doing this? I don't mind
> > > installing and importing some parsing package, as long as my code
> > > using it is clear and simple. Performance is not an issue.
> > >
> > >
> > > -- George Young