Extract the middle N chars of a string
boB Stepp
robertvstepp at gmail.com
Sat May 21 00:00:39 EDT 2016
On Wed, May 18, 2016 at 10:47 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> Getting the middle N seems like it ought to be easy:
>
> s[N//2:-N//2]
>
> but that is wrong. It's not even the right length!
>
> py> s = 'aardvark'
> py> s[5//2:-5//2]
> 'rdv'
>
>
> So after spending a ridiculous amount of time on what seemed like it ought
> to be a trivial function, and an embarrassingly large number of off-by-one
> and off-by-I-don't-even errors, I eventually came up with this:
>
> def mid(string, n):
>     """Return middle n chars of string."""
>     L = len(string)
>     if n <= 0:
>         return ''
>     elif n < L:
>         Lr = L % 2
>         a, ar = divmod(L-n, 2)
>         b, br = divmod(L+n, 2)
>         a += Lr*ar
>         b += Lr*br
>         string = string[a:b]
>     return string
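The wrong length in the quoted first attempt can be traced to how Python evaluates the negative bound. A minimal sketch, reusing the 'aardvark' example from the quote (the names start and stop are only for illustration):

```python
s = 'aardvark'
N = 5

# Unary minus binds tighter than //, so -N//2 is (-N)//2, and floor
# division rounds toward negative infinity: (-5)//2 is -3, not -2.
start = N // 2
stop = -N // 2
print(start, stop)          # 2 -3
print(s[start:stop])        # rdv
print(len(s[start:stop]))   # 3
```

So the slice drops two characters on the left but three on the right, which leaves only three of the eight characters instead of five.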
As some of you know, I usually post on the Tutor list while attempting
to learn Python as time permits. I had to try my hand at this problem
as a learning opportunity. I hope you don't mind if I explain how I
got to my solution and welcome your critiques, so I may improve. I
chose to cheat my answers to the right; I did not think about the
possibility of alternating the sides to allot the extra character
(when needed) to average things out until I read everyone's answers
after getting my own.
I started by considering two strings, s_even = '0123456789' and s_odd =
'123456789', with trial values of n = 4 and n = 5 for how many
characters to extract. This gave me the following four desired
outputs to replicate:
1) s_even with n = 5. Desired output: '34567' (Cheating right.) =>
Slice s_even[3:8]
2) s_even with n = 4. Desired output: '3456' (Exact.) => Slice s_even[3:7]
3) s_odd with n = 5. Desired output: '34567' (Exact.) => Slice s_odd[2:7]
4) s_odd with n = 4. Desired output: '4567' (Cheating right.) =>
Slice s_odd[3:7]
Starting to generalize to get the desired indices for each case:
1) (len(s_even)//2 - n//2):(len(s_even)//2 + n//2 + 1)
2) (len(s_even)//2 - n//2):(len(s_even)//2 + n//2)
3) (len(s_odd)//2 - n//2):(len(s_odd)//2 + n//2 + 1)
4) (len(s_odd)//2 + 1 - n//2):(len(s_odd)//2 + n//2 + 1)
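The four index expressions can be checked against the desired outputs listed earlier. A minimal sketch; the cases list and the variable names are illustrative only, not part of the original working:

```python
s_even = '0123456789'
s_odd = '123456789'

# (string, n, start, stop, desired) for cases 1 to 4, with the index
# expressions written out as in the list above.
cases = [
    (s_even, 5, len(s_even)//2 - 5//2, len(s_even)//2 + 5//2 + 1, '34567'),
    (s_even, 4, len(s_even)//2 - 4//2, len(s_even)//2 + 4//2, '3456'),
    (s_odd, 5, len(s_odd)//2 - 5//2, len(s_odd)//2 + 5//2 + 1, '34567'),
    (s_odd, 4, len(s_odd)//2 + 1 - 4//2, len(s_odd)//2 + 4//2 + 1, '4567'),
]

results = [s[start:stop] for s, n, start, stop, desired in cases]
for (s, n, start, stop, desired), got in zip(cases, results):
    # Print the slice bounds, the extracted text, and whether it matches.
    print(n, (start, stop), got, got == desired)
```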
Looking at the starting index for each case, only case (4) has an
extra 1. In table form:

          n even   n odd
s_even      0        0
s_odd       1        0
To duplicate this I came up with the expression: (len(s)%2) * (1 - n%2)
Similarly, for the ending slice index, all cases have an extra "+ 1"
except for case (2), with the following table:

          n even   n odd
s_even      0        1
s_odd       1        1
And the expression: 1 - ((len(s) + 1)%2) * ((n + 1)%2)
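Both offset expressions can be evaluated for the four parity combinations to confirm they reproduce the two tables. A minimal sketch; the function names start_offset and stop_offset are hypothetical and chosen only for this check:

```python
def start_offset(length, n):
    # Extra 1 on the starting index: only for odd length with even n.
    return (length % 2) * (1 - n % 2)

def stop_offset(length, n):
    # Extra 1 on the ending index: every case except even length
    # with even n.
    return 1 - ((length + 1) % 2) * ((n + 1) % 2)

# Rows: even length (10), then odd length (9).
# Columns: n even (4), then n odd (5).
start_table = [[start_offset(L, n) for n in (4, 5)] for L in (10, 9)]
stop_table = [[stop_offset(L, n) for n in (4, 5)] for L in (10, 9)]
print(start_table)   # [[0, 0], [1, 0]]
print(stop_table)    # [[0, 1], [1, 1]]
```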
All this was scribbled onto scratch paper, so I hope I did not make
any typos! This led me to the following code:
py3: def mid(s, n):
...     index0_offset = (len(s)%2) * (1 - n%2)
...     index1_offset = 1 - ((len(s) + 1)%2) * ((n + 1)%2)
...     index0 = len(s)//2 - n//2 + index0_offset
...     index1 = len(s)//2 + n//2 + index1_offset
...     return s[index0:index1]
...
py3: s = '0123456789'
py3: n = 5
py3: mid(s, n)
'34567'
py3: n = 4
py3: mid(s, n)
'3456'
py3: s = '123456789'
py3: n = 5
py3: mid(s, n)
'34567'
py3: n = 4
py3: mid(s, n)
'4567'
py3: s = 'aardvark'
py3: n = 5
py3: mid(s, n)
'rdvar'
This also returns an empty string for values of n <= 0.
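The session above covers only a handful of inputs, so a brute-force check over a range of n is a cheap way to gain confidence. A minimal sketch, repeating mid() so it runs on its own; it assumes the three test strings from the session and deliberately stops at n = len(s), since larger values of n are not exercised in this post:

```python
def mid(s, n):
    # Same function as in the session above.
    index0_offset = (len(s) % 2) * (1 - n % 2)
    index1_offset = 1 - ((len(s) + 1) % 2) * ((n + 1) % 2)
    index0 = len(s)//2 - n//2 + index0_offset
    index1 = len(s)//2 + n//2 + index1_offset
    return s[index0:index1]

for s in ('0123456789', '123456789', 'aardvark'):
    for n in range(-len(s), len(s) + 1):
        result = mid(s, n)
        # n <= 0 should give '', otherwise exactly n characters.
        assert len(result) == max(n, 0), (s, n, result)
        # The result must be a contiguous piece of s.
        assert result in s, (s, n, result)

print("all checks passed")
```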
As far as I can tell, my solution works, given that I consistently
cheat right: I ran it on all of Steve's examples and got what I
expected. But I am not sure my code adequately conveys what I am
doing to the casual reader. Thoughts?
TIA!
boB