Efficient: put Content of HTML file into mysql database

Fabian López fabian at syameses.com
Mon Nov 19 14:03:58 EST 2007


Thanks Jesse,
We have a web server hosting several domains, and I need to crawl a number of
URLs, so I decided to first download each page to a file using:
f = urllib.urlretrieve(url,'doc.html')
After that, I will read the content, including all the HTML tags, and save it
in our database so that we can work with the text.
Is that enough? It is a very simple web crawler. Could I save the content
without first downloading it to a file?
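
A minimal sketch of the "no intermediate file" idea, under stated assumptions:
it is written for Python 3 (where urlopen lives in urllib.request), the table
name `pages` and its columns are hypothetical, and sqlite3 is used only as a
stand-in so the example runs without a MySQL server. With a MySQL driver such
as MySQLdb or PyMySQL, the connection would come from that driver and the
placeholder would stay at "%s". The helper names fetch_html and store_page are
illustrative, not part of any library.

```python
import sqlite3
import urllib.parse
import urllib.request


def fetch_html(url, timeout=10):
    """Return the page body as bytes, read straight into memory (no doc.html)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def store_page(conn, url, html, placeholder="%s"):
    """Insert one page through a DB-API 2.0 connection.

    The values are passed as parameters, so the driver handles quoting and
    the HTML is never pasted into the SQL text itself.
    `pages (url, content)` is a hypothetical table.
    """
    sql = "INSERT INTO pages (url, content) VALUES ({0}, {0})".format(placeholder)
    cur = conn.cursor()
    cur.execute(sql, (url, html))
    conn.commit()


# Demo with stand-ins: an in-memory SQLite database instead of MySQL, and a
# data: URL instead of a real site, so nothing touches the network.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE pages (url TEXT, content BLOB)")
demo_url = "data:text/html," + urllib.parse.quote("<p>hello</p>")
store_page(demo_conn, demo_url, fetch_html(demo_url), placeholder="?")
```

Passing the HTML as a query parameter also avoids problems with quotes inside
the markup. For very big pages, the column type and the MySQL server's
max_allowed_packet setting limit how much a single INSERT can carry.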
Thanks
Fabian

2007/11/19, Jesse Jaggars <cynshard at gmail.com>:
>
> Fabian López wrote:
> > Hi colleagues,
> > do you know the most efficient way to put the content of an HTML file
> > into a MySQL database? Could it be this one?
> > 1.- I have the HTML document on my hard disk.
> > 2.- Then I open the file (maybe with fopen?).
> > 3.- Read the content (fread or similar).
> > 4.- Write all the content into an SQL statement.
> >
> > What happens if the HTML file is very big?
> >
> >
> > Thanks!
> > Fabian
> >
> >
> I don't understand why you would want to store an entire static HTML
> page in the database. All that accomplishes is adding the overhead of
> calling MySQL too.
>
> If you want to serve HTML, just let your web server serve it.
>
> Is there more to the situation that would help me and others
> understand why you want to do this?
>