Trouble writing to database: RSS-reader
Bruno Desthuilliers
bruno.42.desthuilliers at wtf.websiteburo.oops.com
Mon Jan 21 13:15:29 EST 2008
Arne wrote:
> Hi!
>
> I try to make a rss-reader in python just for fun, and I'm almost
> finished.
Bad news : you're not.
> I don't have any syntax-errors, but when i run my program,
> nothing happends.
>
> This program is supposed to download a .xml-file, save the contents in
> a buffer-file(buffer.txt) and parse the file looking for start-tags.
> When it has found a start tag, it asumes that the content (between the
> start-tag and the end-tag) is on the same line,
Very hazardous assumption. You can more safely assume this will
almost never be the case. FWIW, don't assume *anything* wrt/ newlines
when it comes to XML - you can even have newlines between two attributes
of the same tag...
> so then it removes the
> start-tag and the end-tag and saves the content and put it into a
> database.
>
> The problem is that i cant find the data in the database! If i watch
> my program while im running it, i can see that it sucsessfuly
> downloads the .xml-file from the web and saves it in the buffer.
>
> But I dont think that i save the data in the correct way, so it would
> be nice if someone had some time to help me.
>
> Full code: http://pastebin.com/m56487698
> Saving to database: http://pastebin.com/m7ec69e1b
> Retrieving from database: http://pastebin.com/m714c3ef8
1/ you don't need to make each and every variable an attribute of the
class - only use attributes for what constitutes the object state (ie:
what needs to be maintained between different, possibly unrelated method
calls). In your update_sql method, for example, besides self.connection
and _possibly_ self.cursor, you don't need any attribute - local
variables are enough.
2/ you don't need these <xxx>Stored variables at all - just reset
title/link/description to None *when needed* (cf below), then test these
variables against None.
3/ learn to use if/elif properly !-)
4/ *big* logic flaw (and probably the first cause of your problem): on
*each* iteration, you reset your <xxx>Stored flags to False - whether
you stored something in the database or not. Since you don't expect to
have all three values on a single line (another wrong assumption: you
might get a whole rss stream as one single big line), the three flags
are never True at the same time, so I bet you never write anything into
the database.
5/ other big flaw: either use an autoincrement for your primary key -
and *don't* pass any value for it in your query - or provide a
(*unique*) id yourself.
6/ FWIW, also learn to properly use the DB api - don't build your SQL
query using string formatting, but pass the argument as a tuple, IOW:
# bad:
cursor.execute(
'''INSERT INTO main VALUES(null, %s, %s, %s)'''
% (title, link, description)
)
# good (assuming you're using an autoincrementing key for your id) :
cursor.execute(
"INSERT INTO main VALUES(<X>, <X>, <X>)",
(title, link, description)
)
NB : replace <X> with the appropriate placeholder for your database - cf
your db module documentation (usually either '?' or '%s')
This will make the db module properly escape and convert values.
7/ str.replace() doesn't modify the string in-place (Python strings are
immutable), but returns a new string. So you want:
line = line.replace('x', 'y')
8/ you don't need to explicitly call connection.commit on each and
every statement, and you don't need to call it at all on SELECT
statements !-)
9/ have you tried calling print_rss *twice* on the same instance ?-)
10/ are you sure it's useful to open the same 'buffer.txt' file for
writing *twice* (once in __init__, the other in update_sql)? BTW, use
open(), not file().
11/ are you sure you need to use this buffer file at all ?
12/ are you really *sure* you want to *destroy* your table and recreate
it each time you call your script ?
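To make points 5, 6, 8 and 12 a bit more concrete, here is a minimal
sketch. It assumes the standard-library sqlite3 module (whose
placeholder is '?'); your actual db module may differ. The column names
and the store_items helper are hypothetical, not taken from your code:

```python
import sqlite3


def store_items(connection, items):
    """Insert (title, link, description) tuples, then commit once."""
    cursor = connection.cursor()
    # Create the table only if it is missing, instead of dropping and
    # recreating it on every run (hypothetical column layout).
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS main ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT, link TEXT, description TEXT)"
    )
    # Name the columns and leave id out so the database assigns it.
    # The values are passed separately, so the module handles escaping.
    cursor.executemany(
        "INSERT INTO main (title, link, description) VALUES (?, ?, ?)",
        items,
    )
    connection.commit()  # a single commit for the whole batch
    cursor.close()


connection = sqlite3.connect(":memory:")  # in-memory database for the demo
store_items(connection, [
    ("First post", "http://example.com/1", "It's a 'quoted' description"),
    ("Second post", "http://example.com/2", "Another one"),
])
rows = connection.execute("SELECT id, title FROM main ORDER BY id").fetchall()
print(rows)
```

Note that the description containing quotes goes in untouched, which is
exactly what string formatting would have broken.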
> And yes, I know that there is rss-parseres already built, but this is
> only for learning.
This should not prevent you from learning how to properly parse XML
(hint: with an XML parser). XML is *not* a line-oriented format, so you
just can't get anywhere trying to parse it this way.
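As an illustration, here is a minimal sketch using the standard-library
xml.etree.ElementTree module (one possible parser among several). The
sample feed and the parse_items / clean helpers are made up for the
example, and it assumes a plain RSS 2.0 layout with <item> elements:

```python
import xml.etree.ElementTree as ET

# Made-up feed: one item is spread over several lines, the other sits
# on a single line. A line-based reader would mishandle both.
RSS = """<rss version="2.0"><channel><title>Example feed</title>
<item><title>First
post</title><link>http://example.com/1</link>
<description>Spans
two lines</description></item><item><title>Second post</title><link>http://example.com/2</link><description>On one line</description></item>
</channel></rss>"""


def clean(text):
    """Collapse runs of whitespace (including newlines) to single spaces."""
    return " ".join(text.split())


def parse_items(xml_text):
    """Return a list of (title, link, description) tuples."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append((
            clean(item.findtext("title", default="")),
            clean(item.findtext("link", default="")),
            clean(item.findtext("description", default="")),
        ))
    return items


items = parse_items(RSS)
for title, link, description in items:
    print(title, link, description)
```

The parser does not care where the newlines are, so the same code works
whether the feed arrives as one big line or nicely indented.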
HTH