Here I am again, same old arguments

Steven D'Aprano steve at REMOVETHIScyber.com.au
Sun Oct 9 08:15:13 EDT 2005


On Sun, 09 Oct 2005 07:02:52 +0000, CJ wrote:

>    Okay, same program, different issue. Thanks to the help that I was 
> given I was able to complete my program to find variables in a list that 
> were repeated, and display them once, and how many times they appeared in 
> the list. And it worked great! 
> 
>    But, being the perfectionist that I am, I wanted to make the proggie 
> allow any size of list, and not have to be recoded every time. So step 
> one was to not make the program reliant on the list itself being of X 
> length all the time.

First off -- don't use a for loop with an index as you are doing.
 
> #setup variables
> grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
> grubrpt=grub
> cntro=0
> cntrt=0
> rpt=0
> skipped=0

You are doing too much manual work! Let Python do the lion's share of the
work for you!
 
> #set up for variable length of grub
> ttllen=len(grub)-1

Why are you subtracting one from the length of the list?

> print "The heck is this for loop doing?" 
> for point in range(0,ttllen,1):

Using point as a loop index is generally a bad idea. The result coming
from range is not a point, it is an integer, so why call it a point?

You are also over-specifying the input arguments to range. If the step
size is one, you don't need to specify it -- that's the default. You just
make it harder to read, for no reason. Likewise the initial starting value
of zero. Just use range(ttllen).

This, by the way, will return a list [0, 1, 2, ... , length of list -
TWO] because you already subtracted one from the length.

>     print "Here's Grub=",grub
>     print "And grubrpt=",grubrpt
>     grubrpt[point]="blk"

As others have pointed out, grub and grubrpt are both names for the same
list. Changing one changes the other.


> #Makes sure that there are not multiple prints. 
> def alrdy_dn(grub,grubrpt):
>     if grub[cntro] in grubrpt:

Ew!!! Global variables!!!

Bad programmer! No biscuit!!!

*wink*

Global variables are almost always a BAD idea.

>         return grubrpt
>     else:
>         print grub[cntro],"appears in list",rpt,"times."
>         grubrpt[grubrpt.index("blk")]=grub[cntro] return grubrpt

This is a strange function. What exactly is it meant to do? It
combines user interface (printing the number of times each item appears)
and functionality (counting the number of times each item appears) and
side effects (changing the list), before returning one of the input
arguments again.

At least two of those things (counting the items, and printing the
results) should be separated into different functions for ease of
comprehension.

I'm going to skip the rest of your code, because I don't understand it and
am too lazy, er, I mean busy, to spend the time trying to decipher it.
Especially since the function you are trying to duplicate manually is so
easy to do if you work with Python instead of against it.

def count_item(L, item):
    """Count the number of times item appears in list L."""
    return L.count(item)

Or wait... that's too easy :-)

If you want to roll your own, then do it like this:

def count_item(L, item):
    """Count the number of times item appears in list L by reinventing
    the wheel."""
    n = 0
    for obj in L:
        if obj == item:
            n += 1
    return n

Notice that we don't change the list at any time. Why change it? That just
adds complexity to our program and adds extra places to make bugs. Of
which you have many :-)

Now you use it like this:

grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
for item in grub:
    n = count_item(grub, item)
    print item, "appears in list", n, "times."


And you are done.

No, not quite -- my code has a bug in it. You want to print the count for
each *unique* item. Mine prints the count for each item, regardless of
whether it is unique or not. So what we need to keep track of which items
have been counted before. Here is one way of doing it:

grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
already_seen = []
for item in grub:
    if item not in already_seen:
        n = count_item(grub, item)
        print item, "appears in list", n, "times."
        already_seen.append(item)

Notice that rather than *deleting* from a copy of the original list, we
*add* to a new list that started off empty.

Here is another way:
    
grub=[3,25,3,5,3,"a","a","BOB",3,3,45,36,26,25,"a",3,3,3,"bob","BOB",67]
unique_counts = {} # use a dictionary, not a list
for item in grub:
    n = count_item(grub, item)
    unique_counts[item] = n
for item, n in unique_counts.items():
        print item, "appears in list", n, "times."

The second version has a disadvantage that the objects in your list can't
be lists themselves, because lists can't be keys of dictionaries.

It also has what appears to be a disadvantage that it stores the item
count in the dictionary, even if that count has already been stored.
Wasted work! But wait... it might not be wasted. It takes work to test if
your item has already been seen. That work might be more than it would
take to just blindly store the result even if it is there.

Think of this real world equivalent. You tried to send a fax to somebody,
but you aren't sure if it went through correctly. What's the cheapest way
to make sure? You could call them up on the phone and ask, but that costs
money and time. It could be cheaper and quicker to just re-fax the
document. 

> Also, I do realize that there is an easier way to do this, I just
> created a little project for myself to learn the basics of the language.

This is good, but keep in mind that "the basics" of Python include tools
to do things you are trying to do by hand.

For instance, if you find yourself writing a loop like this:

counter = 0
while counter < some_value:
    do_something_with(counter)
    counter = counter + 1

you almost always want to change that to:

for counter in range(some_value):
    do_something_with(counter)

If you find a loop like this:

for indx in range(len(some_list)):
    obj = some_list[indx]
    do_something_with(obj)

you almost always want to write it like this:

for obj in some_list:
    do_something_with(obj)

Don't fight the language -- you aren't programming in C or Java now, this
is Python and there is usually an easier way to do something.

*wink*

Hope this is of use to you,


-- 
Steven.




More information about the Python-list mailing list