Extract the middle N chars of a string

boB Stepp robertvstepp at gmail.com
Sat May 21 00:00:39 EDT 2016


On Wed, May 18, 2016 at 10:47 AM, Steven D'Aprano <steve at pearwood.info> wrote:

> Getting the middle N seems like it ought to be easy:
>
> s[N//2:-N//2]
>
> but that is wrong. It's not even the right length!
>
> py> s = 'aardvark'
> py> s[5//2:-5//2]
> 'rdv'
>
>
> So after spending a ridiculous amount of time on what seemed like it ought
> to be a trivial function, and an embarrassingly large number of off-by-one
> and off-by-I-don't-even errors, I eventually came up with this:
>
> def mid(string, n):
>     """Return middle n chars of string."""
>     L = len(string)
>     if n <= 0:
>         return ''
>     elif n < L:
>         Lr = L % 2
>         a, ar = divmod(L-n, 2)
>         b, br = divmod(L+n, 2)
>         a += Lr*ar
>         b += Lr*br
>         string = string[a:b]
>     return string

As some of you know, I usually post on the Tutor list while attempting
to learn Python as time permits.  I had to try my hand at this problem
as a learning opportunity.  I hope you don't mind if I explain how I
got to my solution and welcome your critiques, so I may improve.  I
chose to cheat my answers to the right; I did not think about the
possibility of alternating the sides to allot the extra character
(when needed) to average things out until I read everyone's answers
after getting my own.

I started by considering two strings, s_even = '0123456789' and s_odd =
'123456789', with trial values of n = 4 and n = 5 for the number of
characters to extract.  This gave me the following four desired
outputs to replicate:

1)  s_even with n = 5.  Desired output:  '34567' (Cheating right.)  =>
Slice s_even[3:8]
2)  s_even with n = 4.  Desired output:  '3456' (Exact.)  => Slice s_even[3:7]
3)  s_odd with n = 5.  Desired output:  '34567' (Exact.)  => Slice s_odd[2:7]
4)  s_odd with n = 4.  Desired output:  '4567' (Cheating right.)  =>
Slice s_odd[3:7]

Starting to generalize to get the desired indices for each case:

1)  (len(s_even)//2 - n//2):(len(s_even)//2 + n//2 + 1)
2)  (len(s_even)//2 - n//2):(len(s_even)//2 + n//2)
3)  (len(s_odd)//2 - n//2):(len(s_odd)//2 + n//2 + 1)
4)  (len(s_odd)//2 + 1 - n//2):(len(s_odd)//2 + n//2 + 1)
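
As a quick sanity check, the four index expressions above can be
evaluated directly and compared against the desired outputs.  This is
only a sketch; the name `cases` is made up for the check and is not part
of the solution itself:

```python
s_even = '0123456789'
s_odd = '123456789'

# (string, n, start index, end index, desired output) for cases 1-4.
cases = [
    (s_even, 5, len(s_even)//2 - 5//2,    len(s_even)//2 + 5//2 + 1, '34567'),
    (s_even, 4, len(s_even)//2 - 4//2,    len(s_even)//2 + 4//2,     '3456'),
    (s_odd,  5, len(s_odd)//2 - 5//2,     len(s_odd)//2 + 5//2 + 1,  '34567'),
    (s_odd,  4, len(s_odd)//2 + 1 - 4//2, len(s_odd)//2 + 4//2 + 1,  '4567'),
]

for s, n, start, end, desired in cases:
    # Each slice should give the desired text and be exactly n chars long.
    assert s[start:end] == desired
    assert len(s[start:end]) == n
```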

Looking at the starting index for each case, only case (4) has an
extra 1.  In table form:

        n even    n odd
s_even    0         0
s_odd     1         0

To duplicate this I came up with the expression:  (len(s)%2) * (1 - n%2)

Similarly, for the ending slice index, all cases have an extra "+ 1"
except for case (2), with the following table:

        n even    n odd
s_even    0         1
s_odd     1         1

And the expression:  1 - ((len(s) + 1)%2) * ((n + 1)%2)
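
Both offset expressions can be checked against their tables with a few
lines.  The helper names start_offset and end_offset are hypothetical,
chosen just for this check, and each % is parenthesized separately so
that operator precedence cannot get in the way:

```python
def start_offset(length, n):
    """Extra amount added to the starting index (hypothetical helper)."""
    return (length % 2) * (1 - n % 2)

def end_offset(length, n):
    """Extra amount added to the ending index (hypothetical helper)."""
    return 1 - ((length + 1) % 2) * ((n + 1) % 2)

# Rows: even length (10), odd length (9).  Columns: n even (4), n odd (5).
start_table = [[start_offset(length, n) for n in (4, 5)] for length in (10, 9)]
end_table = [[end_offset(length, n) for n in (4, 5)] for length in (10, 9)]

assert start_table == [[0, 0], [1, 0]]
assert end_table == [[0, 1], [1, 1]]
```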

All this was scribbled onto scratch paper, so I hope I did not make
any typos!  This led me to the following code:

py3: def mid(s, n):
...     index0_offset = (len(s)%2) * (1 - n%2)
...     index1_offset = 1 - ((len(s) + 1)%2) * ((n + 1)%2)
...     index0 = len(s)//2 - n//2 + index0_offset
...     index1 = len(s)//2 + n//2 + index1_offset
...     return s[index0:index1]
...
py3: s = '0123456789'
py3: n = 5
py3: mid(s, n)
'34567'
py3: n = 4
py3: mid(s, n)
'3456'
py3: s = '123456789'
py3: n = 5
py3: mid(s, n)
'34567'
py3: n = 4
py3: mid(s, n)
'4567'
py3: s = 'aardvark'
py3: n = 5
py3: mid(s, n)
'rdvar'

This also returns an empty string for values of n <= 0.
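
A small check of that claim, together with one caveat: when n is larger
than len(s), index0 goes negative, and Python treats a negative slice
index as counting from the end, so the result is not the whole string.
The function is repeated here only so that the snippet runs on its own:

```python
def mid(s, n):
    index0_offset = (len(s)%2) * (1 - n%2)
    index1_offset = 1 - ((len(s) + 1)%2) * ((n + 1)%2)
    index0 = len(s)//2 - n//2 + index0_offset
    index1 = len(s)//2 + n//2 + index1_offset
    return s[index0:index1]

# n <= 0 gives an empty string for both even- and odd-length strings.
for s in ('0123456789', '123456789'):
    for n in range(-12, 1):
        assert mid(s, n) == ''

# Caveat: n > len(s) makes index0 negative, which wraps around.
assert mid('abc', 5) == 'c'   # index0 == -1, index1 == 4
```

If the whole string is wanted in that case (which is what Steve's
version returns), clamping n with something like n = min(n, len(s))
before computing the indices would be one way to get it.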

As far as I can tell, my solution works, given that I am consistently
cheating right:  I ran it on all of Steve's examples and got what I
expected.  But I am not sure my code adequately conveys to the casual
reader what I am doing.  Thoughts?
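
One possible way to make the intent more obvious, offered only as a
sketch:  for 0 <= n <= len(s), the two offset expressions appear to
collapse into a single starting index, (len(s) - n + 1)//2, with the
ending index being that plus n.  The name mid_simple and the clamping of
n are assumptions made for this sketch, not something taken from the
thread.  The loop compares it by brute force with the original mid():

```python
def mid(s, n):
    """Original version from this post, repeated for comparison."""
    index0_offset = (len(s)%2) * (1 - n%2)
    index1_offset = 1 - ((len(s) + 1)%2) * ((n + 1)%2)
    index0 = len(s)//2 - n//2 + index0_offset
    index1 = len(s)//2 + n//2 + index1_offset
    return s[index0:index1]

def mid_simple(s, n):
    """Return the middle n chars of s, cheating right (hypothetical rewrite)."""
    n = max(0, min(n, len(s)))      # assumption: clamp n into 0..len(s)
    start = (len(s) - n + 1) // 2   # "+ 1" shifts right when len(s) - n is odd
    return s[start:start + n]

# Brute-force comparison for every valid n on strings of length 0..20.
for length in range(21):
    s = 'abcdefghijklmnopqrstuvwxyz'[:length]
    for n in range(length + 1):
        assert mid_simple(s, n) == mid(s, n)
```

Because of the clamping, mid_simple('abc', 5) returns 'abc' rather than
wrapping around, which is a deliberate difference from mid().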

TIA!
boB


