[Tutor] Semantic Error: Trying to access elements of list and append to empty list with for loop

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Jun 2 17:31:04 EDT 2016


On 2 June 2016 at 19:19, Olaoluwa Thomas <thomasolaoluwa at gmail.com> wrote:
> Thanks, everyone, for your help.
>
> The objective was to extract all words from each line and place them in a
> list IF they didn't already exist in it.
> I sorted it out by adding little bits of everyone's suggestions.

Well done. It looks like you have it working now which is good.

Now that you've told us that you only wanted to store *unique* words,
I thought I'd tell you about Python's set data type. A set is a
container like a list, but it is unordered and never holds duplicates.
You can create a set by using curly brackets {} rather than the square
brackets [] used for a list (an empty {} makes a dict, though, so use
set() for an empty set), e.g.:

>>> myset = {3,2,6}
>>> myset
set([2, 3, 6])

Notice first that when we print the set out it doesn't print the
elements in the same order that we put them in. This is because a set
doesn't care about the order of the elements. Also, a set only ever
stores one copy of each item that you add, so:

>>> myset.add(-1)
>>> myset
set([2, 3, -1, 6])
>>> myset.add(6)
>>> myset
set([2, 3, -1, 6])

The add method adds an element but when we add an element that's
already in the set it doesn't get added a second time: a set only
stores unique elements (which is what you want to do). So you wrote:

> lst = list()
> for line in fhand:
>     words = line.split()
>     for word in words:
>         if word in lst:
>             continue
>         else:
>             lst.append(word)
> lst.sort()
> print lst

Using a set we could instead write:

unique_words = set()
for line in fhand:
    words = line.split()
    for word in words:
        unique_words.add(word)
lst = sorted(unique_words)
print lst

This is made simpler because we didn't need to check if the word was
already in the set (the set.add method takes care of this for us).
However since a set doesn't have an "order" it doesn't have a sort
method. If we want a sorted list we can use the sorted function to get
that from the set.
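
As a self-contained sketch of the same idea: the sample lines below are
made up and stand in for fhand, which isn't shown here. It should run
under Python 2.7 or Python 3.

```python
# Hypothetical sample data standing in for the lines of the file (fhand).
sample_lines = [
    "the quick brown fox",
    "jumps over the lazy dog",
    "the dog barks",
]

unique_words = set()
for line in sample_lines:
    for word in line.split():
        # add() does nothing if the word is already in the set
        unique_words.add(word)

# A set has no sort method, but sorted() accepts any iterable
# and always returns a new list.
lst = sorted(unique_words)
print(lst)
```

Because sorted returns a list, lst can be indexed and sliced as usual
even though unique_words cannot.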

There is also a set method, "update", which adds many elements at once
from e.g. a list like words, so we can do:

unique_words = set()
for line in fhand:
    words = line.split()
    unique_words.update(words)
lst = sorted(unique_words)
print lst
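
To check that update gives the same result as calling add in a loop,
here is a small sketch with made-up data (the names via_add, via_update
and chars are only for illustration). One thing to watch for: update
accepts any iterable, so passing it a bare string adds the individual
characters rather than the whole word.

```python
words = "but soft what light but soft".split()

# Build the set one element at a time with add().
via_add = set()
for word in words:
    via_add.add(word)

# Build the set in one call with update().
via_update = set()
via_update.update(words)

print(via_add == via_update)

# update() iterates over its argument, so a bare string is
# treated as a sequence of characters:
chars = set()
chars.update("soft")
print(sorted(chars))
```

If you want to add a single word, use add; use update when you already
have a list (or other iterable) of words.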

I've mentioned two big differences between a set and a list: a set is
unordered and only stores unique elements. There is another
significant difference which is about how long it takes the computer
to perform certain operations with sets vs lists. In particular,
consider this part of your code:

        if word in lst:
            continue
        else:
            lst.append(word)

Testing if word is "in" a list can take a lot longer than testing if
it is "in" a set. If you need to test many times whether different
objects are "in" a list then it can often make your program a lot
slower than if you used a set. You can understand this intuitively by
thinking that "if word in lst" requires the computer to loop through
all items of the list comparing them with word. With sets the computer
has a cleverer way of doing this that is much faster when the set/list
is large (look up hash tables if you're interested).
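
One way to see this difference without relying on timings is to count
how many equality comparisons each "in" test performs. The sketch below
is only an illustration: the Word class is a hypothetical helper written
for this example, not something from your program. It wraps a string and
counts calls to ==.

```python
class Word(object):
    """Wraps a string and counts how often == is called on it."""
    eq_calls = 0

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        Word.eq_calls += 1
        return self.text == other.text

    def __hash__(self):
        # Equal words must have equal hashes for set lookups to work.
        return hash(self.text)


words = [Word("word%d" % i) for i in range(1000)]
as_list = list(words)
as_set = set(words)

# A new object that is equal to the last item in the list.
probe = Word("word999")

Word.eq_calls = 0
found_in_list = probe in as_list
list_comparisons = Word.eq_calls

Word.eq_calls = 0
found_in_set = probe in as_set
set_comparisons = Word.eq_calls

print("list: found=%s, comparisons=%d" % (found_in_list, list_comparisons))
print("set:  found=%s, comparisons=%d" % (found_in_set, set_comparisons))
```

The list has to compare probe against every item until it reaches the
match at the end, so its count equals the length of the list. The set
uses the hash to go almost straight to the right place, so its count is
typically just one no matter how many items it holds.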


--
Oscar

