hard memory limits

Bengt Richter bokr at oz.net
Fri May 6 22:29:48 EDT 2005


On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <mauriceling at acm.org> wrote:

>James Stroud wrote:
>
>> Sorry Maurice, apparently in bash it's "ulimit" (no n). I don't use bash, so I 
>> don't know all of the differences offhand. Try that.
>> 
>> James
>
>Thanks guys,
>
>It doesn't seem to help. I'm thinking that it might be a SOAPpy 
>problem. The allocation fails when I grab a list of more than 150k 
>elements through SOAP, but allocating a 1-million-element list is fine 
>in Python.
>
>Now I have a performance problem...
>
>Say I have 3 lists (20K elements, 1G elements, and 0 elements), call 
>them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in 
>'a' into 'c'...
>
> >>> a = range(1, 100000, 5)
> >>> b = range(0, 1000000)
> >>> c = []
> >>> for i in b:
>...     if i not in a: c.append(i)
>...
>
>This takes forever to complete. Is there any way to optimize this?
>
Checking whether something is in a list means, on average, comparing it
for equality against half the elements of the list. Checking for
membership in a set is a hash lookup, so it should be much faster for
any set/list of significant size. I.e., just changing to
      a = set(range(1, 100000, 5))

should help. I assume those aren't examples of your real data ;-)
You must have a lot of memory if you are keeping 1G elements there and
copying a significant portion of them. Could you do this file-to-file,
keeping only 'a' in memory? Perhaps page-file thrashing is part of the
time problem?
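
E.g., something along these lines (untested sketch, using the small
example sizes from your post rather than your real data):

      a = set(range(1, 100000, 5))
      b = range(0, 1000000)
      c = [i for i in b if i not in a]   # set lookup instead of list scan

And if 'b' really is too big to keep in memory, you could read it from a
file and write the survivors straight out, keeping only the set 'a' in
memory (the file names here are made up; this assumes one integer per
line):

      a = set(int(line) for line in open('a.txt'))
      out = open('c.txt', 'w')
      for line in open('b.txt'):
          if int(line) not in a:   # set lookup, not a list scan
              out.write(line)
      out.close()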

Regards,
Bengt Richter


