hard memory limits

John Machin sjmachin at lexicon.net
Fri May 6 23:32:36 EDT 2005


On Sat, 07 May 2005 02:29:48 GMT, bokr at oz.net (Bengt Richter) wrote:

>On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <mauriceling at acm.org> wrote:
>>
>>It doesn't seem to help. I'm thinking that it might be a SOAPpy 
>>problem. The allocation fails when I grab a list of more than 150k 
>>elements through SOAP, but allocating a 1-million-element list is fine 
>>in Python.
>>
>>Now I have a performance problem...
>>
>>Say I have 3 lists (20K elements, 1G elements, and 0 elements); call 
>>them 'a', 'b', and 'c'. I want to filter everything that is in 'b' but 
>>not in 'a' into 'c'...
>>
>> >>> a = range(1, 100000, 5)
>> >>> b = range(0, 1000000)
>> >>> c = []
>> >>> for i in b:
>>...     if i not in a: c.append(i)
>>...
>>
>>This takes forever to complete. Is there any way to optimize this?
>>
>Checking whether something is in a list means testing equality against
>its elements one by one, which on average touches half the list. Checking
>for membership in a set should be much faster at any significant size.
>I.e., just changing to
>
>      a = set(range(1, 100000, 5))
>
>should help. I assume those aren't examples of your real data ;-)
>You must have a lot of memory if you are keeping 1G elements there and
>copying a significant portion of them. Could you do this file-to-file,
>keeping only 'a' in memory? Perhaps page-file thrashing is part of the
>time problem?
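
A minimal sketch of the set-based filtering Bengt suggests, using the
sizes from the example above (the name 'a_set' is illustrative, and this
assumes Python 2.4+, where set is a builtin):

    a = range(1, 100000, 5)               # ~20K elements
    b = range(0, 1000000)                 # 1M elements
    a_set = set(a)                        # one-time conversion, O(len(a))
    # each 'i not in a_set' test is O(1) on average, versus O(len(a))
    # for the list, so this is a single fast pass over b:
    c = [i for i in b if i not in a_set]

If ordering and duplicates in 'b' don't matter, set(b) - a_set computes
the same difference in one call, at the cost of building a second set.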

Since when was 1000000 == 1G??

Maurice, is this mucking about with 1M or 1G lists in the same
exercise as the "vm_malloc fails when allocating a 20K-element list"
problem? Again, it would help if you gave us a bit more detail. You
haven't even posted the actual *PYTHON* error message and stack trace
that you got from the original problem. In fact,
there's a possible interpretation that the (system?) malloc merely
prints the vm_malloc message and staggers on somehow ...

Regards,
John


