hard memory limits

Maurice LING mauriceling at acm.org
Sat May 7 20:45:43 EDT 2005


Bengt Richter wrote:

> On Sat, 07 May 2005 14:03:34 +1000, Maurice LING <mauriceling at acm.org> wrote:
> 
> 
>>John Machin wrote:
>>
>>>On Sat, 07 May 2005 02:29:48 GMT, bokr at oz.net (Bengt Richter) wrote:
>>>
>>>
>>>
>>>>On Sat, 07 May 2005 11:08:31 +1000, Maurice LING <mauriceling at acm.org> wrote:
>>>>
>>>>
>>>>>It doesn't seem to help. I'm thinking that it might be a SOAPpy 
>>>>>problem. The allocation fails when I grab a list of more than 150k 
>>>>>elements through SOAP but allocating a 1 million element list is fine in 
>>>>>python.
>>>>>
>>>>>Now I have a performance problem...
>>>>>
>>>>>Say I have 3 lists (20K elements, 1G elements, and 0 elements), call 
>>>>>them 'a', 'b', and 'c'. I want to filter all that is in 'b' but not in 
>>>>>'a' into 'c'...
>>>>>
>>>>>
>>>>>
>>>>> >>> a = range(1, 100000, 5)
>>>>> >>> b = range(0, 1000000)
>>>>> >>> c = []
>>>>> >>> for i in b:
>>>>> ...     if i not in a: c.append(i)
>>>>> ...
>>>>>
>>>>>This takes forever to complete. Is there any way to optimize this?
>>>>>
>>>>
>>>>Checking whether something is in a list may average checking equality with
>>>>each element in half the list. Checking for membership in a set should
>>>>be much faster for any significant size set/list. I.e., just changing to
>>>>
>>>>    a = set(range(1, 100000, 5))
>>>>
>>>>should help. I assume those aren't examples of your real data ;-)
>>>>You must have a lot of memory if you are keeping 1G elements there and
>>>>copying a significant portion of them. Do you need to do this file-to-file,
>>>>keeping a in memory? Perhaps page-file thrashing is part of the time problem?
>>>
>>>
>>>Since when was 1000000 == 1G??
>>>
>>>Maurice, is this mucking about with 1M or 1G lists in the same
>>>exercise as the "vm_malloc fails when allocating a 20K-element list"
>>>problem? Again, it might be a good idea if you gave us a little bit
>>>more detail. You haven't even posted the actual *PYTHON* error message
>>>and stack trace that you got from the original problem. In fact,
>>>there's a possible interpretation that the (system?) malloc merely
>>>prints the vm_malloc message and staggers on somehow ...
>>>
>>>Regards,
>>>John
>>
>>This is the exact error message:
>>
>>*** malloc: vm_allocate(size=9203712) failed (error code=3)
>>*** malloc[489]: error: Can't allocate region
>>
>>Nothing else. No stack trace, NOTHING.
>>
> 
> 1. Can you post minimal exact code that produces the above exact error message?
> 2. Will you? ;-)
> 
> Regards,
> Bengt Richter

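For reference, the set-based filter Bengt suggested in the quoted text can be sketched as below. This is only an illustration using the toy ranges from my earlier post, not my real data. Note that the built-in set() needs Python 2.4 or later; on 2.3 the equivalent would be Set from the sets module.

```python
# Same toy data as in the quoted example: 'a' holds the values to exclude,
# 'b' holds the candidates to filter.
a = set(range(1, 100000, 5))   # a set gives average constant-time membership tests
b = range(0, 1000000)

# Keep every element of b that is not in a, preserving the order of b.
c = [i for i in b if i not in a]
```

Each "i not in a" test against a list scans the list element by element, so the original loop does on the order of len(a) * len(b) comparisons; with a set the whole filter is roughly one hash lookup per element of b.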
I've re-tried the minimal code mimicking the error in interactive mode 
and got this:

 >>> from SOAPpy import WSDL
 >>> serv = WSDL.Proxy('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v1.1/eutils.wsdl')
 >>> result = serv.run_eSearch(db='pubmed', term='mouse', retmax=500000)
*** malloc: vm_allocate(size=9121792) failed (error code=3)
*** malloc[901]: error: Can't allocate region
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 453, in __call__
    return self.__r_call(*args, **kw)
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 475, in __r_call
    self.__hd, self.__ma)
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 347, in __call
    config = self.config)
  File "/sw/lib/python2.3/site-packages/SOAPpy/Client.py", line 212, in call
    data = r.getfile().read(message_len)
  File "/sw/lib/python2.3/socket.py", line 301, in read
    data = self._sock.recv(recv_size)
MemoryError
 >>>


When I change retmax to 150000, it works nicely.

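Since a smaller retmax goes through, one possible workaround is to fetch the results in several smaller requests and join them afterwards. The sketch below is untested against the real service: fetch_in_batches, fetch_page and fake_fetch are hypothetical names made up for illustration, and fake_fetch only stands in for the SOAP call so the example runs without a network.

```python
def fetch_in_batches(fetch_page, total, batch_size=100000):
    """Collect up to `total` results by calling fetch_page(start, count) repeatedly.

    fetch_page is a hypothetical callable that returns a list of at most
    `count` items beginning at offset `start`.
    """
    results = []
    start = 0
    while start < total:
        count = min(batch_size, total - start)
        page = fetch_page(start, count)
        if not page:              # no more results available; stop early
            break
        results.extend(page)
        start += len(page)
    return results


# Stand-in for the real service call (assumption: 500000 matching records).
def fake_fetch(start, count):
    data = range(500000)
    return list(data[start:start + count])


ids = fetch_in_batches(fake_fetch, total=500000, batch_size=100000)
```

With SOAPpy, fetch_page might wrap serv.run_eSearch and pass the offset as retstart alongside retmax. ESearch documents a retstart parameter, but whether the SOAP interface accepts it this way is an assumption on my part. This keeps each response small; the combined list still has to fit in memory.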