A faster way of finding historical highs/lows

David Fraser davidf at sjsoft.com
Fri Jun 11 17:42:02 EDT 2004


Eamonn Sullivan wrote:
> Peter Hansen <peter at engcorp.com> wrote in message news:<SN2dnT9ob92aPFTdRVn-gQ at powergate.ca>...
> 
>>Eamonn Sullivan wrote:
>>
>>
>>>1. Find the most recent date when there was an equal or higher (or
>>>lower) value than X. 
>>
>>The fastest algorithm might depend on how you use the data, as well.
>>For example, do you update the data often, and search it rarely,
>>or update it rarely and do the search very often?  If searching
>>many times between updates, some preprocessing will likely make things
>>go much faster.
>>
>>Both of your examples sound to me like they would benefit by
>>using sort(), then a binary search.  Sort is very fast relative
>>to things like the Python loops you are doing, so using it to
>>prepare the data before the search can be a good step.
>>
>>-Peter
> 
> 
> Thanks for this. At the moment, the software answers a few questions
> (highest/lowest, longest streak, etc.) once per retrieval of data. The
> database retrieval, though, is *by far* the biggest time sapper, so a
> little preprocessing would be almost unnoticeable in comparison,
> probably (depends on how much, of course).
> 
> So, I'm guessing I could sort on the particular price field I'm using
> (decorate-sort-undecorate), and then just find the most recent date
> among the subset of data that meets the criteria (higher or lower). Is
> that what you mean? By binary search, do you mean further reorganizing
> the data into a binary tree using date?

If you have this in a relational database, you might find that the 
database can answer the question quicker for you, using indexes if 
neccessary ...
   select max(xxx) from yyy where zzz > 40
with an index on xxx and zzz will usually be done quickly internally by 
the database, and then you just get the result returned rather than 
having to process it

David



More information about the Python-list mailing list