help with making my code more efficient
Roy Smith
roy at panix.com
Thu Dec 20 22:30:31 EST 2012
In article <cc869959-c568-4490-b45f-7855c6841575 at googlegroups.com>,
"Larry.Martell at gmail.com" <Larry.Martell at gmail.com> wrote:
> On Thursday, December 20, 2012 5:38:03 PM UTC-7, Chris Angelico wrote:
> > On Fri, Dec 21, 2012 at 11:19 AM, Larry.Martell at gmail.com
> > <Larry.Martell at gmail.com> wrote:
> > > This code works, but it takes way too long to run - e.g. when cdata has
> > > 600,000 elements (which is typical for my app) it takes 2 hours for this
> > > to run.
> > >
> > > Can anyone give me some suggestions on speeding this up?
> >
> > It sounds like you may have enough data to want to not keep it all in
> > memory. Have you considered switching to a database? You could then
> > execute SQL queries against it.
>
> It came from a database. Originally I was getting just the data I wanted
> using SQL, but that was taking too long also. I was selecting just the
> messages I wanted, then for each one of those doing another query to get the
> data within the time diff of each. That was resulting in tens of thousands of
> queries. So I changed it to pull all the potential matches at once and then
> process it in python.
If you're doing free-text matching, an SQL database may not be the right
tool. I suspect you want to be looking at some kind of text search
engine, such as http://lucene.apache.org/ or http://xapian.org/.
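
To make the suggestion concrete, here is a minimal sketch of the inverted
index idea that such search engines are built around. It is plain Python,
not the Lucene or Xapian API; build_index, search, and the sample messages
are hypothetical names and data made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample messages, not the original poster's data.
messages = {
    1: "disk error on node a",
    2: "network timeout on node b",
    3: "disk full on node b",
}
index = build_index(messages)
print(sorted(search(index, "disk node")))
```

The index is built once, and each lookup then touches only the sets for
the query words instead of scanning every message. A real engine adds
tokenizing, stemming, ranking, and on-disk storage on top of this, so
treat the sketch as an illustration of the lookup pattern only.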