Looking for feedback on weighted voting algorithm

Thu Apr 14 16:48:12 EDT 2016

On Thu, Apr 14, 2016, 7:37 PM justin walters <walters.justin01 at gmail.com>
wrote:

> On Apr 14, 2016 9:41 AM, "Martin A. Brown" <martin at linux-ip.net> wrote:
> >
> >
> > Greetings Justin,
> >
> > >    score = sum_of_votes/num_of_votes
> >
> > >votes = [(72, 4), (96, 3), (48, 2), (53, 1), (26, 4), (31, 3), (68, 2),
> (91, 1)]
> >
> > >Specifically, I'm wondering if this is a good algorithm for
> > >weighted voting. Essentially a vote is weighted by the number of
> > >votes it counts as. I realize that this is an extremely simple
> > >algorithm, but I was wondering if anyone had suggestions on how to
> > >improve it.
> >
> > I snipped most of your code.  I don't see anything wrong with your
> > overall approach.  I will make one suggestion: watch out for
> > DivisionByZero.
> >
> >     try:
> >         score = sum_of_votes / num_of_votes
> >     except ZeroDivisionError:
> >         score = float('nan')
> >
> > In your example data, all of the weights were integers, which means
> > that a simple mean function would work, as well, if you expanded the
> > votes to an alternate representation:
> >
> >   votes = [72, 72, 72, 72, 96, 96, 96, 48, 48, 53, 26, 26, 26, 26,
> >            31, 31, 31, 68, 68, 91]
> >
> > But, don't bother!
> >
> > Your function can handle votes that have a float weight:
> >
> >   >>> weight([(4, 1.3), (1, 1),])
> >   2.695652173913044
> >
> > Have fun!
> >
> > -Martin
> >
> > --
> > Martin A. Brown
> > http://linux-ip.net/
>
> Thanks Martin!
>
> I'll add the check for division by zero. Didn't think about that. I think
> I'm going to sanitize input anyways, but always better to be safe than
> sorry.
>

I suggest not worrying about sanitizing inputs. If someone provides bad
data, Python will do the right thing: stop the program and print an
explanation of what went wrong, often a more helpful message than one you'd
write. Use error handling mostly for when you want to do something *other*
than stop the program.

I'm not sure I'd use NaN instead of raise division by zero error. NaNs can
be troublesome for downstream code that might not notice until it gets
confusing. A div-by-zero error is clear and easier to track down because of
the traceback.

What do you think of using list comprehensions?

    weighted_sum = sum(rating * weight for rating, weight in votes)
    total_weights = sum(weight for rating, weight in votes)
    score = weighted_sum / total_weights

It's two loops as I wrote it, which is instinctively slower, but it might
actually execute faster because of the built-in sum vs a regular for loop.
Not sure.

>