Trouble writing to database: RSS-reader

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Mon Jan 21 13:41:50 EST 2008


On Mon, 21 Jan 2008 14:12:43 -0200, Arne <arne.k.h at gmail.com> wrote:

> I try to make a rss-reader in python just for fun, and I'm almost
> finished. I don't have any syntax-errors, but when i run my program,
> nothing happends.
>
> This program is supposed to download a .xml-file, save the contents in
> a buffer-file(buffer.txt) and parse the file looking for start-tags.
> When it has found a start tag, it asumes that the content (between the
> start-tag and the end-tag) is on the same line, so then it removes the
> start-tag and the end-tag and saves the content and put it into a
> database.

That's an unwarranted assumption and will not hold for many feeds; you
should use a proper XML parser instead (using ElementTree, for example, is
even easier than your sequence of find and replace).
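
To make the ElementTree suggestion concrete, here is a minimal sketch, not a drop-in replacement for the original program. It is written for Python 3, and the sample feed, the clean() helper and the parse_items() name are all made up for illustration. Note that the first title spans two lines, which the line-by-line approach would miss.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the first <title> deliberately spans two lines.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Made-up data for illustration</description>
    <item>
      <title>First
        post</title>
      <link>http://example.com/1</link>
      <description>Hello</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
      <description>World</description>
    </item>
  </channel>
</rss>"""


def clean(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join((text or "").split())


def parse_items(xml_text):
    """Return a list of (title, link, description) tuples, one per <item>."""
    root = ET.fromstring(xml_text)
    return [
        (clean(item.findtext("title")),
         clean(item.findtext("link")),
         clean(item.findtext("description")))
        for item in root.iter("item")
    ]


for entry in parse_items(SAMPLE_FEED):
    print(entry)
```

Because the parser works on elements rather than on lines, it does not matter how the feed happens to be wrapped.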

> The problem is that i cant find the data in the database! If i watch
> my program while im running it, i can see that it sucsessfuly
> downloads the .xml-file from the web and saves it in the buffer.

Ok. So the problem should be either when you read the buffer again, when  
processing it, or when saving in the database.
It's very strange to create the table each time you want to save anything,  
but this gives you another clue: the table is created and remains empty,  
else the select statement in print_rss would have failed. So you know that  
those lines are executed. Now, the print statement is your friend:

         self.buffer = file('buffer.txt')
         for line in self.buffer.readline():
             print "line=",line # add this and see what you get
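
If that print shows one character at a time, a likely explanation is that readline() returns a single string, and iterating over a string yields its characters. The sketch below (Python 3, with io.StringIO standing in for the real buffer file and made-up feed lines) shows the difference between iterating over readline() and iterating over the file object itself.

```python
import io

# Stand-in for the opened buffer file; the contents are invented for illustration.
buffer = io.StringIO(
    "<title>First post</title>\n"
    "<link>http://example.com/1</link>\n"
)

# readline() returns ONE line as a string, so this loop walks over characters.
pieces = list(buffer.readline())

# Rewind, then iterate over the file object itself to get whole lines.
buffer.seek(0)
lines = list(buffer)

print(pieces[:7])
print(lines)
```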

Once you get your code working, it's time to analyze it. I think someone
told you "in Python, you have to use self. everywhere" and you took it
literally. Let's see:

     def update_buffer(self):
         self.buffer = file('buffer.txt', 'w')
         self.temp_buffer = urllib2.urlopen(self.rssurl).read()
         self.buffer.write(self.temp_buffer)
         self.buffer.close()

All those "self." are unneeded and wrong. You *can*, and *should*, use
local variables. Perhaps it's a bit hard to grasp at first, but local
variables, instance attributes and global variables are different things
used for different purposes. I'll try an example: you [an object] have a
diary, where you record things that you have to remember [your instance
attributes, or "data members" as they are called in other languages]. You
also carry a tiny notepad in your pocket, where you make a few notes while
you are doing something, but you always throw away the page once the job
is finished [local variables]. Your brothers, sisters and parents [other
objects] use the same scheme, but there is a whiteboard in the kitchen
where important things that all of you have to know are recorded [global
variables] (anybody can read and write on the board).
Now, back to the code: why "self." everywhere? Let's see, self.buffer is a
file: opened, written, and closed, all inside the same function. Once it's
closed, there is no need to keep a reference to the file elsewhere. It's
disposable, like your notepad pages: use a local variable instead. In fact,
*all* your variables should be locals; the *only* things you should keep
inside your object are rssurl and the database location, and perhaps
temp_buffer (with another, more meaningful name, rssdata for example).
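
As an illustration of the diary/notepad split, update_buffer could look roughly like the sketch below. It is written for Python 3 (urllib.request instead of urllib2, open instead of file). The class name RSSReader, the dbpath attribute and the buffer_path parameter are assumptions made for the example, not names taken from the original program.

```python
import urllib.request


class RSSReader:
    def __init__(self, rssurl, dbpath, buffer_path="buffer.txt"):
        # Diary entries: things the object must remember between method calls.
        self.rssurl = rssurl
        self.dbpath = dbpath            # hypothetical name for the database location
        self.buffer_path = buffer_path  # hypothetical parameter, added for flexibility
        self.rssdata = None

    def update_buffer(self):
        # Notepad pages: locals that are thrown away when the method returns.
        with urllib.request.urlopen(self.rssurl) as response:
            data = response.read()
        with open(self.buffer_path, "wb") as buffer:
            buffer.write(data)
        # Only the downloaded data is worth keeping on the object.
        self.rssdata = data
```

The file object and the HTTP response never become attributes; only rssdata survives the call.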

Other -more or less random- remarks:

             if self.titleStored == True and self.linkStored == True and descriptionStored == True:

Don't compare against True/False. Just use their boolean value:

             if titleStored and linkStored and descriptionStored:

Your code resets those flags at *every* line read, and since a line
contains at most one tag, they will never be True at the same time. You
should reset the flags only after you have got all three items and written
them to the database.
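
A rough sketch of that control flow, in Python 3, is shown below. It keeps the original one-tag-per-line assumption only to show where the reset belongs, and it appends to a list instead of writing to a database. The names tag_content, collect_items and FIELDS are invented for the example, and a dict plays the role of the three flags.

```python
FIELDS = ("title", "link", "description")


def tag_content(line, tag):
    """Return the text between <tag> and </tag> if both are on this line, else None."""
    start, end = "<%s>" % tag, "</%s>" % tag
    if line.startswith(start) and line.endswith(end):
        return line[len(start):len(line) - len(end)]
    return None


def collect_items(lines):
    """Group consecutive title/link/description lines into tuples."""
    items = []
    found = {}  # what has been seen so far; NOT cleared on every line
    for line in lines:
        line = line.strip()
        for tag in FIELDS:
            content = tag_content(line, tag)
            if content is not None:
                found[tag] = content
        if all(tag in found for tag in FIELDS):
            items.append(tuple(found[tag] for tag in FIELDS))
            found = {}  # reset only after a complete item has been stored
    return items
```

In the real program the append would be replaced by the database insert.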

The RSS feed, after being read, is available in self.temp_buffer; why do
you read it again from the buffer file? If you want to iterate over the
individual lines, use:

     for line in self.temp_buffer.splitlines():

-- 
Gabriel Genellina



