[Tutor] Re: A Demolished Function
Christopher Smith
csmith@blakeschool.org
Sun, 21 Apr 2002 16:21:55 -0500
>Danny Yoo wrote:
>>After writing this function, though, I still feel tense and
>>apprehensive.
>>Does anyone see any improvements one could make to make the function
>>easier to read? Any criticism or dissension would be great. Thanks!
>
Kirby replied:
>Some inconsequential coding changes -- not necessarily easier
>to read:
>
>def conservativeSplit(regex, stuff):
>    """Split 'stuff' along 'regex' seams."""
>    fragments = []
>    while 1:
>        match = regex.search(stuff)
>        if not match: break
>        begin, end = match.span()
>        if not begin == 0:
>            fragments.append(stuff[0 : begin])
>        fragments.append(stuff[begin : end])
>        stuff = stuff[end :]
>    if stuff: fragments.append(stuff)
>    return fragments
>
>Kirby
I was revisiting string splitting and the sorting of titles that contain
numbers, and I realized that the "conservative split" is already a
regex option: just enclose your pattern in parentheses and the
matched text will be retained in the split:
>>> import re
>>> t="five sir four sir three sir two sir one sir"
>>> sir=re.compile(r'(sir)')
>>> sir.split(t)
['five ', 'sir', ' four ', 'sir', ' three ', 'sir', ' two ', 'sir',
 ' one ', 'sir', '']
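As a small sketch of the difference the parentheses make (assuming
Python 3 and the module-level re.split; the variable names here are
only illustrative), the same string can be split with and without a
capturing group. Note the trailing empty string, which appears because
the text ends with a match:

```python
import re

t = "five sir four sir three sir two sir one sir"

# Without a group, the separators are dropped from the result.
plain = re.split(r'sir', t)

# With a capturing group, each matched separator is kept as its own item.
kept = re.split(r'(sir)', t)

# Both results end with '' because t ends with a match; filter it if unwanted.
nonempty = [piece for piece in kept if piece]

print(plain)
print(kept)
print(nonempty)
```

For this input, filtering out the empty string should leave the same
list that the loop-based conservativeSplit quoted above builds, since
that function never appends an empty fragment.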
Another way to sort titles containing numbers is to split out (and retain)
the numbers, convert the numbers to integers, and then sort the split-up
titles. This will cause the numbers to be treated as numbers instead of
strings. Joining the split titles back together again brings you back to
the original title.
Here's a demo using Danny's original data:
import re

def sortTitles(b):
    """The strings in list b are split apart on numbers (integer runs)
    before they are sorted so the strings get sorted according to
    numerical values when they occur (rather than text values) so
    2 will follow 1, for example, rather than 11.  The list is
    sorted in place."""
    # the () around the pattern preserves the pattern in the split
    integer=re.compile(r'(\d+)')
    for j in range(len(b)):
        title=integer.split(b[j])
        #
        # Every *other* element will be an integer which was found;
        # convert it to an int
        #
        for i in range(1,len(title),2):
            title[i]=int(title[i])
        b[j]=title[:]
    #
    # Now sort and rejoin the elements of the title
    #
    b.sort()
    for i in range(len(b)):
        for j in range(1,len(b[i]),2):
            b[i][j]=str(b[i][j])
        b[i]=''.join(b[i])

if __name__ == '__main__':
    book_titles = ['The 10th Annual Python Proceedings',
                   'The 9th Annual Python Proceedings',
                   'The 40th Annual Python Proceedings',
                   '3.1415... A Beautiful Number',
                   '3.14... The Complexity of Infinity',
                   'TAOCP Volume 3: Sorting and Searching',
                   'TAOCP Volume 2: Seminumerical Algorithms',
                   'The Hitchhiker\'s Guide to the Galaxy']
    # Parenthesized single-argument print works as a statement or a function
    print("Here is a list of my books, in sorted order:")
    sortTitles(book_titles)
    print(book_titles)
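
The same split-and-convert idea can also be written as a sort key,
which avoids rewriting the list and then joining the titles back
together. This is only a sketch, assuming a Python version whose
sorted() accepts a key argument (Python 3 here); natural_key is a
hypothetical helper name, not something from the demo above:

```python
import re

_INTEGER = re.compile(r'(\d+)')

def natural_key(title):
    """Return the title split on digit runs, with the runs converted to int."""
    parts = _INTEGER.split(title)
    # With one capturing group, the odd-indexed items are the digit runs.
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

titles = ['The 10th Annual Python Proceedings',
          'The 9th Annual Python Proceedings',
          'TAOCP Volume 3: Sorting and Searching',
          'TAOCP Volume 2: Seminumerical Algorithms']

# The original strings are left untouched; only the keys are compared.
ordered = sorted(titles, key=natural_key)
for title in ordered:
    print(title)
```

Because every key alternates string, int, string, ... starting with a
string, items at the same position always have the same type, so the
comparison does not mix strings with integers.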
/c