Help me use re better

Alex Martelli aleaxit at yahoo.com
Tue Apr 24 10:02:48 EDT 2001


"P Browning" <glpb at eis.bris.ac.uk> wrote in message
news:GCAAFM.7Jv at bath.ac.uk...
> I've read AMK's RE HowTo but I think I'm missing something
> obvious when it comes to substitutions. Any offers for
> a more elegant solution to the program below gratefully

Elegance is in the eye of the beholder, but...:

> import string,re
>
> directors = ['Prof A B Looney','Dr C D E Ftang','Ms H I J K Biscuit Barrel']
> # I want no spaces between the initials
> # Prof AB Looney
> # Dr CDE Ftang
> # Ms HIJK Biscuit Barrel
>
> match_initials = re.compile(r'([A-Z] )+')

This may not match *QUITE* what you want -- if the honorific, or
any part of the name other than the last, ever ends with a capital
letter, the pattern will match that letter too, which you don't
want.  It may be best to stick a word-boundary marker before the
initial:

match_initials = re.compile(r'(\b[A-Z] )+')
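
To see what the word boundary buys you, here is a small sketch comparing
the two patterns.  The name 'MD A B Looney' is made up for illustration
(it is not from the original list); its honorific ends with a capital,
which is exactly the case described above.

```python
import re

# The original pattern and the word-boundary variant, side by side.
loose = re.compile(r'([A-Z] )+')
strict = re.compile(r'(\b[A-Z] )+')

# Hypothetical name whose honorific ends with a capital letter.
name = 'MD A B Looney'

loose_match = loose.search(name).group(0)
strict_match = strict.search(name).group(0)

print(repr(loose_match))   # 'D A B ' -- picks up the D from 'MD'
print(repr(strict_match))  # 'A B '   -- only the stand-alone initials
```

With \b in place, a capital only counts as an initial when it starts
a word, so the trailing 'D' of 'MD' is left alone.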


Anyway, now add:

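def nospaces(matchobj):
    # the match includes the space after the last initial, so put
    # one space back after squeezing the others out
    return matchobj.group(0).replace(' ','') + ' '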


> print
> for director in directors:
>     s = match_initials.search(director)
>     inits = s.group(0)
>     new_inits = string.replace(inits,' ','')
>     new_director = string.replace(director,inits,new_inits + ' ')
>     print director,new_director

This then becomes:

for director in directors:
    new_director = match_initials.sub(nospaces, director)

plus of course whatever "print"s you want.
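
Put together, the whole thing might look like the sketch below.  It is
written for a current Python (print as a function, no string module),
which is an assumption on my part since the code in this thread predates
that.  Note that the match includes the space after the last initial,
so the callback here adds a single space back after stripping; that
matches the "Prof AB Looney" output asked for in the question.

```python
import re

# Word-boundary version of the pattern: a run of single capitals,
# each followed by a space.
match_initials = re.compile(r'(\b[A-Z] )+')

def nospaces(matchobj):
    # Remove every space in the matched run of initials, then restore
    # the one space that separates the last initial from the surname.
    return matchobj.group(0).replace(' ', '') + ' '

directors = ['Prof A B Looney', 'Dr C D E Ftang', 'Ms H I J K Biscuit Barrel']

# sub() calls nospaces once per match and splices in its return value.
new_directors = [match_initials.sub(nospaces, d) for d in directors]

for old, new in zip(directors, new_directors):
    print(old, '->', new)
```

Passing a function to sub() is what removes the need for the separate
search/replace steps: the function gets the match object and returns
the replacement text.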


Alex





