Review Request of Python Code

Matt Wheeler m at funkyhat.org
Wed Mar 9 07:06:31 EST 2016


I'm just going to focus on a couple of lines as others are already
looking at the whole thing:

On 9 March 2016 at 04:18,  <subhabangalore at gmail.com> wrote:
> [snip].........
>                 if word in a4:
>                     [stuff]
>                 elif word not in a4:
>                     [other stuff]
>                 else:
>                     print "None"

This is bad for a couple of reasons.

Most obviously, your `else: print "None"` case can never be reached.
word not in a4 is the inverse of word in a4.
That also means for the `not` case the entire a4 list is scanned
*twice*, and the second time is completely pointless.

This can be simplified to
                 if word in a4:
                     [stuff]
                 else:
                     [other stuff]

But we can still do better. A list is a poor choice for this kind of
lookup, as Python has no way to find elements other than by checking
them one after another. (given (one of the) name(s) you've given it
sounds a bit like "dictionary" I assume it contains rather a lot of
items)

If you change one other line:

> dict_word=dict_read.split()
> a4=dict_word #Assignment for code.

a4=set(dict_word)
#(this could of course be shortened further but I'll leave that to you/others)

You'll probably see a massive speedup in your code, possibly even
dwarfing the speedup you see from more sensible database access like
INADA Naoki suggested (though you should definitely still do that
too!), especially if your word list is very large.

This is because the set type uses a hashmap internally, making lookups
for matches extremely fast, compared to scanning through the list.


-- 
Matt Wheeler
http://funkyh.at



More information about the Python-list mailing list