which db should I use?

Jim Richardson warlock at eskimo.com
Tue May 14 01:50:27 EDT 2002



On Mon, 13 May 2002 21:23:51 -0400,
 Peter Hansen <peter at engcorp.com> wrote:
> Jim Richardson wrote:
>> 
>> > No need to use grep, unless you want to.  Normally you would want to
>> > index the files, so that a search for key words becomes an extremely
>> > fast operation.  Even in a database...
>> 
>> Wouldn't this simply be a db-like approach? How would I go about
>> learning about this in Python? I would rather not have a few
>> hundred MB of data in the newsspool *and* in some database. Would it be
>> possible to simply have an "index" file that would give me the same
>> search functions as SQL, separate from the actual spool?
> 
> I'm not sure.  What's a newsspool? :-)
>

Leafnode, the news server I use, stores each post in its own file, in
one directory per newsgroup. I have collected about 6 months' worth of
posts for a newsgroup, and I want to be able to retrieve posts that fit
certain criteria, such as all posts by so-and-so where the subject
includes fnord, or all posts between certain dates where the word
fooble was mentioned. I know this is basically what Google Groups
does, but I want it locally for a single newsgroup. My first idea was a
db table with each message being a row (I *think* I have the terminology
right). But I suspect indexing the existing files might be just as
effective, and take up less space.
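
To make the "index the existing files" idea concrete, here is a minimal
sketch, not a finished tool. It assumes a modern Python 3 with the
standard email package, and that the group directory holds one message
per file as described above. The names index_spool, body_text and
search are made up for illustration; they are not part of Leafnode or
of any library.

```python
import email
import os
import re
from collections import defaultdict
from email.utils import parsedate_to_datetime

WORD_RE = re.compile(r"[a-z0-9]+")


def body_text(msg):
    """Collect the text parts of a message as one string."""
    parts = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)
            if payload:
                # latin-1 never fails to decode; good enough for a sketch
                parts.append(payload.decode("latin-1"))
    return "\n".join(parts)


def index_spool(group_dir):
    """Return (headers, word_index) for every message file in group_dir."""
    headers = {}                   # filename -> {"from", "subject", "date"}
    word_index = defaultdict(set)  # word -> set of filenames mentioning it
    for name in sorted(os.listdir(group_dir)):
        path = os.path.join(group_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            msg = email.message_from_binary_file(f)
        try:
            date = parsedate_to_datetime(msg.get("Date", "")).date()
        except (TypeError, ValueError):
            date = None  # missing or unparseable Date header
        headers[name] = {
            "from": msg.get("From", ""),
            "subject": msg.get("Subject", ""),
            "date": date,
        }
        for word in WORD_RE.findall(body_text(msg).lower()):
            word_index[word].add(name)
    return headers, word_index


def search(headers, word_index, author=None, subject=None, word=None,
           since=None, until=None):
    """Return filenames matching all given criteria (dates are datetime.date)."""
    # With a word given, start from the index instead of scanning every body.
    if word:
        candidates = word_index.get(word.lower(), set())
    else:
        candidates = set(headers)
    hits = []
    for name in sorted(candidates):
        h = headers[name]
        if author and author.lower() not in h["from"].lower():
            continue
        if subject and subject.lower() not in h["subject"].lower():
            continue
        if since and (h["date"] is None or h["date"] < since):
            continue
        if until and (h["date"] is None or h["date"] > until):
            continue
        hits.append(name)
    return hits
```

The index only stores file names, so the posts themselves stay in the
spool and are not duplicated. To avoid rebuilding it on every run, the
two dictionaries could be saved to a separate file (with pickle or
shelve, for example), which is roughly the "index file" I was asking
about.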


> I would not normally think that "searching text" is the first thing
> that comes to mind when one thinks of SQL and relational databases.
> A full-text indexing application, on the other hand, sounds like what
> you want.  I'm sure there are some notes on using Python for that
> somewhere...

I will be checking Parnassus etc. later.

> 
> What parts of SQL do you expect to use to do this searching?  Maybe
> that will give us a hint what you have in mind, and ideas whether
> there might be a better approach.
> 
> (I'm sure SQL can do something like this, but it might be little
> better performance-wise than doing "if item in list" turns out
> to be in Python... that is, very slow since it just does a 
> brute-force search from start to finish.)
> 
> -Peter

I would also like to get some familiarity with SQL, but that is a
secondary issue and I can always do that in another project.




-- 
Jim Richardson
	Anarchist, pagan and proud of it
http://www.eskimo.com/~warlock
Linux, from watches to supercomputers, for grandmas and geeks. 


