need simple parsing ability

Jean Brouwers JBrouwersAtProphICyDotCom at no.spam.net
Fri Jul 16 13:10:03 EDT 2004


With two fixes, one for a bug (the range now includes its upper bound)
and one for a typo in a comment:

ns = '9,foo7-9,2-4,xxx,5, 6, 7, 8, 9, bar, foo_6, foo_10, foo_11'

 # list of plain, clean names
ns = [n.strip() for n in ns.split(',')]
 # expand names with range
fs = []
for n in ns:
    r = n.split('-')
    if len(r) != 2:  # simple name
        fs.append(n)
    else: # name with range
        h = r[0].rstrip('0123456789')  # header
        for i in range(int(r[0][len(h):]), 1 + int(r[1])):
            # zfill pads to the width of the start number (keeps bar09)
            fs.append(h + str(i).zfill(len(r[0]) - len(h)))
 # remove duplicates
fs = dict([(n, i) for i, n in enumerate(fs)])
fs = fs.keys()
 # sort, maybe
fs.sort()

print fs
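
For anyone on a newer Python (3.x), the same approach can be wrapped in
a function.  This is only a sketch, not tested against the real wafer
names: expand_names is just an illustrative name, padding to the width
of the start number is my reading of the "leading zeros" requirement,
and inverse ranges like "foo_11-9" raise ValueError so the caller can
beep and wipe the input.

```python
def expand_names(text):
    """Expand e.g. 'foo7-9, 2-4, xxx' into a sorted list of unique names."""
    names = set()                            # a set drops duplicates
    for item in text.split(','):
        item = item.strip()
        parts = item.split('-')
        if len(parts) != 2:                  # simple name, keep as typed
            names.add(item)
            continue
        first, last = parts
        head = first.rstrip('0123456789')    # non-numeric prefix, may be ''
        digits = first[len(head):]           # start number as typed, e.g. '09'
        start, stop = int(digits), int(last) # ValueError if not integers
        if start > stop:                     # assumption: treat as invalid
            raise ValueError('inverse range: %r' % item)
        for i in range(start, stop + 1):     # upper bound is inclusive
            names.add(head + str(i).zfill(len(digits)))
    return sorted(names)                     # plain string sort


print(expand_names('9,foo7-9,2-4,xxx'))
print(expand_names('bar09-11'))
```

The first call should print ['2', '3', '4', '9', 'foo7', 'foo8', 'foo9',
'xxx'] and the second ['bar09', 'bar10', 'bar11'].  Note the sort is a
plain string sort, so 'foo_10' comes before 'foo_9'.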


/Jean Brouwers


In article <160720040947530644%JBrouwersAtProphICyDotCom at no.spam.net>,
Jean Brouwers <JBrouwersAtProphICyDotCom at no.spam.net> wrote:

>  Here is one possible way to do that with just Python:
> 
> 
>  ns = '9,foo7-9,2-4,xxx,5, 6, 7, 8, 9, bar, foo_6, foo_10, foo_11'
> 
>  # list of plain, clean names
> ns = [n.strip() for n in ns.split(',')]
>  # expand names with range
> fs = []
> for n in ns:
>     r = n.split('-')
>     if len(r) != 2:  # simple name
>         fs.append(n)
>     else: # name with range
>         h = r[0].rstrip('0123456789')  # header
>         for i in range(int(r[0][len(h):]), int(r[1])):
>             fs.append(h + str(i))
>  # remove duplicitates
> fs = dict([(n, i) for i, n in enumerate(fs)])
> fs = fs.keys()
>  # sort
> fs.sort()
> 
> print fs
> 
> 
> /Jean Brouwers
> 
> 
> 
> In article <20040716111324.09267883.gry at ll.mit.edu>, george young
> <gry at ll.mit.edu> wrote:
> 
> > [python 2.3.3, x86 linux]
> > For each run of my app, I have a known set of (<100) wafer names.
> > Names are sometimes simply integers, sometimes a short string, and
> > sometimes a short string followed by an integer, e.g.:
> > 
> >   5, 6, 7, 8, 9, bar, foo_6, foo_7, foo_8, foo_9, foo_10, foo_11
> > 
> > I need to read user input of a subset of these.  The user will type a
> > set of names separated by commas (with optional white space), but there
> > may also be sequences indicated by a dash between two integers, e.g.: 
> > 
> >    "9-11"       meaning 9,10,11
> >    "foo_11-13"  meaning foo_11, foo_12, and foo_13.
> >    "foo_9-11"   meaning foo_9,foo_10,foo_11, or 
> >    "bar09-11"   meaning bar09,bar10,bar11
> > 
> > (Yes, I have to deal with integers with and without leading zeros)
> > [I'll proclaim inverse sequences like "foo_11-9" invalid]
> > So a sample input might be:
> > 
> >    9,foo7-9,2-4,xxx   meaning 9,foo7,foo8,foo9,2,3,4,xxx
> > 
> > The order of the resultant list of names is not important; I have
> > to sort them later anyway.
> > 
> > Fancy error recovery is not needed; an invalid input string will be
> > peremptorily wiped from the screen with an annoyed beep.
> > 
> > Can anyone suggest a clean way of doing this?  I don't mind
> > installing and importing some parsing package, as long as my code
> > using it is clear and simple.  Performance is not an issue.
> > 
> > 
> > -- George Young


