List performance and CSV

jepler at unpythonic.net
Sat Oct 8 12:01:42 EDT 2005


You'll probably see a slight speed increase with something like
    for a in CustomersToMatch:
        for b in Customers:
            if a[2] == b[2]:
                a[1] = b[1]
                break
But a really fast approach is to use a dictionary or other structure
that turns the inner loop into a fast lookup, not a slow loop through
the 'Customers' list.  Preparing the dictionary would look like
    custmap = {}
    for c in Customers:
        k = c[2]
        # keep only the first row seen for each key, like the 'break' above
        if k in custmap: continue
        custmap[k] = c
and the loop to update would look like
    for a in CustomersToMatch:
        try:
            a[1] = custmap[a[2]][1]
        except KeyError:
            # no customer has this key; leave the row unchanged
            continue

(all code is untested)
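
For anyone who wants to try it, here is a minimal self-contained sketch of both approaches run side by side. The sample rows and the function names match_nested and match_with_dict are made up for illustration. The only things carried over from the snippets above are that column 2 is the key to match on and column 1 is the value to copy.

```python
import copy


def match_nested(customers_to_match, customers):
    """Scan the whole customers list for every row to match: O(m*n)."""
    for a in customers_to_match:
        for b in customers:
            if a[2] == b[2]:
                a[1] = b[1]
                break  # first match wins


def match_with_dict(customers_to_match, customers):
    """Build the lookup once, then do one dict access per row: O(m+n)."""
    custmap = {}
    for c in customers:
        # setdefault keeps the first row seen for each key
        custmap.setdefault(c[2], c)
    for a in customers_to_match:
        match = custmap.get(a[2])
        if match is not None:
            a[1] = match[1]


# Hypothetical sample data, laid out as [id, name, key]
customers = [
    [1, "Alice", "k1"],
    [2, "Bob", "k2"],
    [3, "Alice (duplicate key)", "k1"],
]
to_match = [
    [10, None, "k2"],
    [11, None, "k1"],
    [12, None, "k9"],  # no customer has this key
]

rows_nested = copy.deepcopy(to_match)
rows_dict = copy.deepcopy(to_match)
match_nested(rows_nested, customers)
match_with_dict(rows_dict, customers)
```

Both versions should leave the rows in the same state, with the duplicate key resolved in favour of the first customer and the unmatched row untouched.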

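In "big-O" terms, I believe this changes the complexity from O(m*n) to O(m+n), where m is len(CustomersToMatch) and n is len(Customers), since a dictionary lookup takes constant time on average.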
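
To make the difference concrete, the rough sketch below counts steps instead of timing anything. It assumes m rows to match against n customers and takes the worst case for the nested loop, where no key matches. The function names and the sizes are invented for the illustration, and only the keys are modelled, not whole rows.

```python
def nested_steps(to_match_keys, customer_keys):
    """Count key comparisons made by the nested-loop approach."""
    steps = 0
    for a in to_match_keys:
        for b in customer_keys:
            steps += 1
            if a == b:
                break
    return steps


def dict_steps(to_match_keys, customer_keys):
    """Count dict insert attempts plus lookups made by the dictionary approach."""
    steps = 0
    custmap = {}
    for k in customer_keys:
        steps += 1
        custmap.setdefault(k, True)
    found = 0
    for a in to_match_keys:
        steps += 1
        if a in custmap:
            found += 1
    return steps


m, n = 200, 1000                 # hypothetical sizes
customer_keys = list(range(n))
to_match_keys = [-1] * m         # worst case: nothing matches

nested = nested_steps(to_match_keys, customer_keys)   # m * n comparisons
hashed = dict_steps(to_match_keys, customer_keys)     # n inserts + m lookups
print(nested, hashed)
```

Each dictionary operation costs more than a single comparison, so the real speedup will be smaller than the ratio of the two counts, but the gap keeps growing as the lists get longer.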

Jeff