Trouble writing to database: RSS-reader

Bruno Desthuilliers bruno.42.desthuilliers at wtf.websiteburo.oops.com
Mon Jan 21 13:15:29 EST 2008


Arne wrote:
> Hi!
> 
> I'm trying to make an RSS reader in Python just for fun, and I'm almost
> finished.

Bad news: you're not.

> I don't have any syntax errors, but when I run my program,
> nothing happens.
> 
> This program is supposed to download a .xml-file, save the contents in
> a buffer-file(buffer.txt) and parse the file looking for start-tags.
> When it has found a start tag, it assumes that the content (between the
> start-tag and the end-tag) is on the same line,

A very hazardous assumption. You can more safely assume this will 
almost never be the case. FWIW, don't assume *anything* wrt/ newlines 
when it comes to XML - you can even have newlines between two attributes 
of the same tag...

> so then it removes the
> start-tag and the end-tag and saves the content and puts it into a
> database.
> 
> The problem is that I can't find the data in the database! If I watch
> my program while I'm running it, I can see that it successfully
> downloads the .xml-file from the web and saves it in the buffer.
> 
> But I don't think that I save the data in the correct way, so it would
> be nice if someone had some time to help me.
> 
> Full code: http://pastebin.com/m56487698
> Saving to database: http://pastebin.com/m7ec69e1b
> Retrieving from database: http://pastebin.com/m714c3ef8

1/ you don't need to make each and every variable an attribute of the 
class - only use attributes for what constitutes the object's state (ie: 
what needs to be maintained between different, possibly unrelated method 
calls). In your update_sql method, for example, besides self.connection 
and _possibly_ self.cursor, you don't need any attributes - local 
variables are enough.

2/ you don't need these <xxx>Stored variables at all - just reset 
title/link/description to None *when needed* (cf below), then test these 
variables against None.

3/ learn to use if/elif properly !-)

4/ *big* logic flaw (and probably the main cause of your problem): on 
*each* iteration, you reset your <xxx>Stored flags to False - whether 
you stored something in the database or not. Since you don't expect to 
have all three pieces of data on a single line (another wrong 
assumption: you might get a whole rss stream as one single big line), I 
bet you never write anything into the database.
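To illustrate points 2/ to 4/, here's a minimal sketch of the bookkeeping. It assumes your parsing step yields (tag, text) pairs; collect_items and the events list are hypothetical names, not taken from your code. The three variables are only reset once a complete item has been collected:

```python
def collect_items(events):
    """Group (tag, text) pairs into (title, link, description) tuples.

    `events` is a hypothetical iterable of (tag, text) pairs standing in
    for whatever the parsing step produces.
    """
    items = []
    title = link = description = None
    for tag, text in events:
        if tag == "title":
            title = text
        elif tag == "link":
            link = text
        elif tag == "description":
            description = text

        # Store and reset only once all three values have been seen.
        if None not in (title, link, description):
            items.append((title, link, description))
            title = link = description = None
    return items


events = [
    ("title", "First post"),
    ("link", "http://example.com/1"),
    ("description", "Hello"),
    ("title", "Second post"),
    ("link", "http://example.com/2"),
    ("description", "World"),
]
print(collect_items(events))
```

In your program, the items.append() call is where the database insert would go.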

5/ other big flaw: either use an autoincrement for your primary key - 
and *don't* pass any value for it in your query - or provide a 
*unique* id yourself.
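A minimal sketch of the autoincrement option, assuming SQLite through the standard sqlite3 module (I don't know which database you actually use). The items table and its column names are made up for the example:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# In SQLite, an INTEGER PRIMARY KEY column gets a value automatically
# when the INSERT does not supply one.
cursor.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, title TEXT, link TEXT, description TEXT)"
)

insert = "INSERT INTO items (title, link, description) VALUES (?, ?, ?)"

# Name the columns explicitly and leave the id out of the query.
cursor.execute(insert, ("First post", "http://example.com/1", "Hello"))
first_id = cursor.lastrowid
cursor.execute(insert, ("Second post", "http://example.com/2", "World"))
second_id = cursor.lastrowid
connection.commit()

print(first_id, second_id)
```

Other databases spell this differently (AUTO_INCREMENT, SERIAL, ...), so check the documentation for yours.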

6/ FWIW, also learn to properly use the DB api - don't build your SQL 
query using string formatting, but pass the arguments as a tuple, IOW:

# bad (the query is built with string formatting):
cursor.execute(
     '''INSERT INTO main VALUES(null, %s, %s, %s)'''
     % (title, link, description)
)

# good (assuming you're using an autoincrementing key for your id) :
cursor.execute(
     "INSERT INTO main VALUES(<X>, <X>, <X>)",
     (title, link, description)
)

NB : replace <X> with the appropriate placeholder for your database - cf 
your db module documentation (usually either '?' or '%s')

This will make the db module properly escape and convert values.
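Here's a runnable sketch of the "good" form, assuming the standard sqlite3 module, whose placeholder is '?'. The items table and its columns are hypothetical. The values contain quotes that would break a query built with string formatting:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, title TEXT, link TEXT, description TEXT)"
)

title = "It's a feed"
link = "http://example.com/feed"
description = 'Quotes " and \' are stored as-is'

# The values are passed separately from the SQL text, so the db module
# takes care of quoting and conversion.
cursor.execute(
    "INSERT INTO items (title, link, description) VALUES (?, ?, ?)",
    (title, link, description),
)
connection.commit()

cursor.execute("SELECT title, link, description FROM items")
row = cursor.fetchone()
print(row)
```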

7/ str.replace() doesn't modify the string in-place (Python strings are 
immutable), but returns a new string. So you want:
   line = line.replace('x', 'y')
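A short illustration of the difference, using a made-up line:

```python
line = "<title>Some headline</title>"

# The return value is discarded here, so `line` is left as it was.
line.replace("<title>", "")
unchanged = line

# Rebinding the name keeps the result.
line = line.replace("<title>", "").replace("</title>", "")

print(unchanged)
print(line)
```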

8/ you don't need to explicitly call connection.commit on each and 
every statement, and you don't need to call it at all on SELECT 
statements !-)
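As a sketch (again assuming sqlite3 and a hypothetical items table), a whole batch of inserts can be committed once, and the SELECT that follows needs no commit:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, title TEXT, link TEXT, description TEXT)"
)

rows = [
    ("First post", "http://example.com/1", "Hello"),
    ("Second post", "http://example.com/2", "World"),
]
cursor.executemany(
    "INSERT INTO items (title, link, description) VALUES (?, ?, ?)",
    rows,
)
# One commit for the whole batch of writes.
connection.commit()

# Reading does not change anything, so there is nothing to commit.
cursor.execute("SELECT COUNT(*) FROM items")
count = cursor.fetchone()[0]
print(count)
```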

9/ have you tried calling print_rss *twice* on the same instance ?-)

10/ are you sure it's useful to open the same 'buffer.txt' file for 
writing *twice* (once in __init__, the other in update_sql) ? BTW, use 
open(), not file().

11/ are you sure you need to use this buffer file at all ?
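For instance, xml.etree.ElementTree.parse() accepts any file-like object, so the object returned when opening the URL can be handed to it directly, without a buffer file. In this sketch an io.BytesIO with made-up content stands in for the network response:

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for the file-like object you get back when opening the URL.
response = io.BytesIO(
    b"<rss><channel><item><title>First post</title></item></channel></rss>"
)

# No intermediate file: the parser reads straight from the response.
tree = ET.parse(response)
titles = [element.text for element in tree.iter("title")]
print(titles)
```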

12/ are you really *sure* you want to *destroy* your table and recreate 
it each time you call your script ?
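If the answer is no, one option is to create the table only when it's missing. A minimal sketch, assuming sqlite3; ensure_table and the items schema are hypothetical:

```python
import sqlite3


def ensure_table(connection):
    """Create the (hypothetical) items table unless it already exists."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, title TEXT, link TEXT, description TEXT)"
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
ensure_table(connection)
connection.execute(
    "INSERT INTO items (title, link, description) VALUES (?, ?, ?)",
    ("First post", "http://example.com/1", "Hello"),
)
connection.commit()

# Calling the setup again leaves the existing rows alone.
ensure_table(connection)
count = connection.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)
```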

> And yes, I know that there are RSS parsers already built, but this is
> only for learning.

This should not prevent you from learning how to properly parse XML 
(hint: with an XML parser). XML is *not* a line-oriented format, so you 
just won't get anywhere trying to parse it this way.
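A minimal sketch with the standard library's xml.etree.ElementTree, one possible parser among several. The feed below is made up, and deliberately puts newlines between attributes, inside a tag's content, and nowhere at all in the second item:

```python
import xml.etree.ElementTree as ET

rss = """<?xml version="1.0"?>
<rss
    version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>
        First post
      </title>
      <link>http://example.com/1</link>
      <description>Hello</description>
    </item>
    <item><title>Second post</title><link>http://example.com/2</link><description>World</description></item>
  </channel>
</rss>
"""

root = ET.fromstring(rss)

items = []
for item in root.iter("item"):
    # findtext() returns the text of the first matching child element.
    title = item.findtext("title", default="").strip()
    link = item.findtext("link", default="").strip()
    description = item.findtext("description", default="").strip()
    items.append((title, link, description))

print(items)
```

The parser doesn't care where the line breaks fall, which is the whole point.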



HTH


