fastest data structure for retrieving objects identified by (x, y) tuple?

Joshua Landau joshua.landau.ws at gmail.com
Wed Oct 3 18:47:39 EDT 2012


On 3 October 2012 23:30, Benjamin Jessup <bsj at abzinc.com> wrote:

> I have a group of objects identified by unique (x,y) pairs and I want to
> find out an object's "neighbors" in a matrix of size 2400 x 2400.
>        #############
>        #obj#   #   #
>        #############
>        #   #   #obj#      3 x 3 Example
>        #############
>        #   #   #   #
>        #############
> There is either a neighbor, or a null value. I always know the (x,y) pair
> to check the neighbors of, so is doing,
> >> obj = grid[x][y] #lists, doesn't scale with num of objects
> or,
> >> obj = grid.get((x,y),None) #dictionary, scales with num of objects
> the fastest? I can't seem to find a conclusion by testing each alone, then
> in the full environment. Is it that, depending on the number of objects,
> each has an advantage?
>
> I know the fastest way to retrieve them would be to have them store
> pointers to their neighbors, then use those for retrieval. When large
> numbers of objects are changing their (x,y) pairs, rebuilding the pointers
> is too slow


You really are asking the wrong question.

If most of the data is None, then a sparse matrix is best. Otherwise, lists
make the most sense.
*However, *what you really want is... a matrix. Look for a sparse matrix
type (
http://stackoverflow.com/questions/1053928/python-numpy-very-large-matrices)
if you need sparse, and otherwise Numpy's matrix should do fine.

The thing is this:
"I can't seem to find a conclusion by testing each alone, then in the full
environment. Is it that, depending on the number of objects, each has an
advantage?"

If you can't tell, don't bother. Dictionaries will have a memory advantage
with sparse matrices, lists in the other cases. That's the important part,
as look-up is fast for both*. If you need faster than these builtins, use
Numpy or SciPy.

If you really need optimization help, profile and ask the *right* question
to this list.

* To actually answer the question:
They both have O(1) look-up time, although dictionaries have a theoretical *
worst* case of O(N). Being simpler, it's faster to index a list. However,
you're doing that twice, so it's probably around even.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20121003/42ad5522/attachment.html>


More information about the Python-list mailing list