[Tutor] Query: lists
Cameron Simpson
cs at cskk.id.au
Tue Aug 14 17:38:01 EDT 2018
On 14Aug2018 18:11, Deepti K <kdeepti2013 at gmail.com> wrote:
> when I pass ['bbb', 'ccc', 'axx', 'xzz', 'xaa'] as words to the below
>function, it picks up only 'xzz' and not 'xaa'
>
>def front_x(words):
>  # +++your code here+++
>  a = []
>  b = []
>  for z in words:
>    if z.startswith('x'):
>      words.remove(z)
>      b.append(z)
>      print 'z is', z
>  print 'original', sorted(words)
>  print 'new', sorted(b)
>  print sorted(b) + sorted(words)
That is because you are making a common mistake which applies to almost any
data structure, but is particularly easy with lists and loops: you are
modifying the list _while_ iterating over it.
After you go:
    words.remove(z)
all the elements _after_ z (i.e. those after 'xzz', which here is just ['xaa'])
are moved down the list by one position.
In your particular case, that means that 'xaa' is now at index 3, and the next
iteration of the loop would have picked up position 4. Therefore the loop
doesn't get to see the value 'xaa'.
A "for" loop and almost anything that "iterates" over a data structure does not
work by taking a copy of that structure ahead of time, and looping over the
values. This is normal, because a data structure may be of any size - you do
not want to "make a copy of all the values" by default - that can be
arbitrarily expensive.
Instead, a for loop obtains an "iterator" of what you ask it to loop over. The
iterator for a list effectively has a reference to the list (in order to obtain
the values) and a notion of where in the list it is up to (i.e. a list index, a
counter starting at 0 for the first element and incrementing until it reaches
the length of the list).
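You can watch this happen with the built-in iter() and next() functions, which
are what a "for" loop uses behind the scenes. This is only a small sketch (the
list "letters" is made up for illustration, and it uses print() calls so it
also runs on Python 3), but it suggests that the iterator keeps an index into
the live list rather than a copy of it:

```python
letters = ['a', 'b', 'c']
it = iter(letters)        # the iterator refers to the list itself, not a copy
first = next(it)          # 'a'; the iterator's internal index is now 1
letters.insert(0, 'new')  # the list is now ['new', 'a', 'b', 'c']
second = next(it)         # index 1 of the modified list holds 'a' again
print(first)
print(second)
```

Both prints show 'a', because the insert moved 'a' into the slot the iterator
was about to look at.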
So when you run "for z in words", the iterator is up to index 3 when you reach
"xzz". So words[3] == "xzz". After you remove "xzz", words[3] == "xaa" and in
this case there is no longer a words[4] at all because the list is shortened.
So the next loop iteration never inspects that value. Even if the list had more
values, the loop would still skip the "xaa" value.
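If you want to see the skipping for yourself, here is a rough sketch that
records which values the loop actually visits. The helper name
"visited_while_removing" and the extra 'yyy' element are my own inventions for
the demonstration, not part of your exercise:

```python
def visited_while_removing(words):
    """Hypothetical helper: list the values the loop sees while it removes x-words."""
    seen = []
    for z in words:
        seen.append(z)
        if z.startswith('x'):
            words.remove(z)   # shifts every later element down by one
    return seen

words = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']
seen = visited_while_removing(words)
print(seen)    # 'xaa' never appears here
print(words)   # 'xaa' is still in the list

# The same thing happens when more values follow: 'xaa' is skipped, 'yyy' is not.
longer = ['bbb', 'ccc', 'axx', 'xzz', 'xaa', 'yyy']
seen_longer = visited_while_removing(longer)
print(seen_longer)
```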
You should perhaps ask yourself: why am I removing values from "words"?
If you're just trying to obtain the values starting with "x" you do not need to
modify words because you're already collecting the values you want in "b".
If you're trying to partition words into values starting with "x" and values
not starting with "x", you're better off making a separate collection for the
"not starting with x" values. And that has me wondering what the unused list
"a" in your code was for originally.
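As a sketch of that second approach, assuming the goal is what your final print
suggests (the "x" words first, each group sorted), something like this would do
it without touching "words" at all. The names "x_words" and "other_words" are
just my choice, and I've made it return the result rather than print it, which
is also an assumption on my part:

```python
def front_x(words):
    # Collect each group in its own new list; "words" itself is never modified.
    x_words = []
    other_words = []
    for z in words:
        if z.startswith('x'):
            x_words.append(z)
        else:
            other_words.append(z)
    return sorted(x_words) + sorted(other_words)

my_words = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']
result = front_x(my_words)
print(result)
print(my_words)   # still the original five values, in the original order
```

Because nothing is removed during the loop, every value gets inspected exactly
once.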
As a matter of principle, functions that "compute a value" (in your case, a
list of the values starting with "x") should try not to modify what they are
given as parameters. When you pass values to Python functions, you are passing
a reference, not a new copy. If a function modifies that reference's _content_,
as you do when you go "words.remove(z)", you're modifying the original.
Try running this code:
    my_words = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']
    print 'words before =', my_words
    front_x(my_words)
    print 'words after =', my_words
You will find that "my_words" has been modified. This is called a "side
effect", where calling a function affects something outside it. It is usually
undesirable.
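To make the difference concrete, here is a small contrast between a function
with a side effect and one without. Both function names are hypothetical, made
up only for this comparison:

```python
def drop_x_in_place(words):
    # Side effect: slice assignment replaces the contents of the caller's list.
    words[:] = [w for w in words if not w.startswith('x')]

def without_x(words):
    # No side effect: builds and returns a new list, leaving the argument alone.
    return [w for w in words if not w.startswith('x')]

my_words = ['bbb', 'xzz', 'xaa']
kept = without_x(my_words)
print(my_words)   # unchanged by without_x()

drop_x_in_place(my_words)
print(my_words)   # changed, even though nothing was returned or assigned
```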
Cheers,
Cameron Simpson <cs at cskk.id.au>