[Tutor] What's going on in this Python code from Programming Collective Intelligence?

bluepresley bluepresley at gastonia.com
Sun Jul 7 05:18:35 CEST 2013


I'm reading the book Programming Collective Intelligence by Toby Segaran.
I'm having a lot of difficulty understanding some of the code from
chapter four (the code for this chapter is available online at
https://github.com/cataska/programming-collective-intelligence-code/blob/master/chapter4/searchengine.py
and starts at line 172), specifically starting with this function:

def getscoredlist(self,rows,wordids):
    totalscores=dict([(row[0],0) for row in rows])

    # This is where you'll later put the scoring functions
    weights=[]

    for (weight,scores) in weights:
      for url in totalscores:
        totalscores[url]+=weight*scores[url]

    return totalscores

What does this mean?
totalscores=dict([(row[0],0) for row in rows])

I have a lot of experience with PHP and Objective-C.  If anyone is familiar
with PHP or another language could you please provide the equivalent? I
think that would really help me understand better.
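
For illustration, here is a minimal sketch of what that line appears to do,
written out next to an explicit loop. The rows data is made up (it assumes
each row is a tuple whose first element is the id), and the name "expanded"
is hypothetical; neither comes from the book.

```python
# Hypothetical rows: (urlid, location of word 1, location of word 2)
rows = [(1, 4, 9), (2, 0, 7), (1, 12, 30)]

# The line from the book: build a list of (key, value) pairs, then hand it to dict()
totalscores = dict([(row[0], 0) for row in rows])

# The same result written as an explicit loop
expanded = {}
for row in rows:
    expanded[row[0]] = 0  # repeated ids just overwrite the same key

print(totalscores)
print(expanded)
```

In PHP terms this is roughly an associative array keyed by row[0], with every
value initialised to 0.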


The function just before that, getmatchingrows, provides the arguments for
getscoredlist. "rows" is the set of rows from a database query; "wordids" is
the list of word ids searched for that generated that result set. That
function (getmatchingrows) returns 2 variables simultaneously. I'm
unfamiliar with this. What's going on there?
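
As a sketch of how a Python function can hand back two values at once: the
values are packed into a single tuple, and the caller can unpack it into two
names. The function get_matching_rows_stub and its data are hypothetical
stand-ins, not the book's getmatchingrows.

```python
def get_matching_rows_stub():
    # Hypothetical stand-in for getmatchingrows; the data is made up
    rows = [(1, 4, 9), (2, 0, 7)]
    wordids = [101, 102]
    return rows, wordids  # the comma packs both values into one tuple


# Without unpacking, the caller receives a single 2-element tuple
result = get_matching_rows_stub()

# With unpacking, each element of the tuple is bound to its own name
rows, wordids = get_matching_rows_stub()
```

The closest PHP idiom would be returning an array and reading it with list().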

Also, as far as I can tell from the getmatchingrows code, it returns a
multidimensional array of database results with row[0] being the urlid (NOT
the url), and other indices correspond to the word id location.

In getscoredlist, totalscores[url] doesn't make sense. Where is [url]
coming from? Could they have meant to say urlid here?
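
A small sketch of the loop may help here. Iterating over a dictionary with
"for ... in" yields its keys, so the loop variable takes on whatever values
row[0] held, regardless of what the variable is called. The rows and the
single weights entry are made up for illustration (in the function as quoted,
weights starts out empty).

```python
# Hypothetical rows: (urlid, location of word 1, location of word 2)
rows = [(1, 4, 9), (2, 0, 7)]
totalscores = dict([(row[0], 0) for row in rows])

# Hypothetical entry: (weight, {key: score}); the quoted function has weights=[]
weights = [(1.0, {1: 0.5, 2: 0.25})]

for (weight, scores) in weights:
    for url in totalscores:  # iterating a dict yields its keys (the row[0] values)
        totalscores[url] += weight * scores[url]

print(totalscores)
```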

This chapter is also available online for free from O'Reilly.  Here is the
page that talks specifically about this part.

Any help understanding what this code in this part of this book is doing
would be greatly appreciated.

Thanks,
Blue

http://my.safaribooksonline.com/book/web-development/9780596529321/4dot-searching-and-ranking/querying#X2ludGVybmFsX0h0bWxWaWV3P3htbGlkPTk3ODA1OTY1MjkzMjElMkZjb250ZW50YmFzZWRfcmFua2luZyZxdWVyeT0=