[Patches] [ python-Patches-1484793 ] urllib2: resolves extremely slow import (of "everything")

SourceForge.net noreply at sourceforge.net
Thu May 18 07:54:51 CEST 2006


Patches item #1484793, was opened at 2006-05-09 15:59
Message generated for change (Comment added) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1484793&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: Python 2.4
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: kxroberto (kxroberto)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2: resolves extremely slow import (of "everything")

Initial Comment:
This supersedes the old patch #1053150 (written for an older
Python; it was stopped: "Jeremy doesn't like the idea").
Its purpose is to import the expensive modules behind
urllib2 late.

I'm recommending this again now, as the situation has
become almost unacceptable in the meantime.

In Py24, simply importing the original urllib2 costs up
to a second on my slower machines. The startup time of
some of my bigger apps/scripts goes mainly into importing
urllib2. More than half of that time goes into importing
cookielib (according to profiler runs). It's almost
unusable now in CGI scripts.

New modules have been added to urllib2 in the meantime,
and worst of all, cookielib was inserted into urllib2 in
the same old style: "import everything at the top of the
file in a kind of C-#include manner".

Python offers excellent dynamic modularization of code.
That should be exploited for such an expensive
virtualization module as urllib2. There are usually
only very few locations where the sub-modules are
referenced. This patch also makes it possible to strip
off unnecessary modules (down to _MozillaCookieJar!) for
cx_freeze/py2exe distribution.
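
A minimal sketch of the late-import idea described above.
It is not taken from the patch: the function name
guess_type_lazily is hypothetical, and mimetypes is used
only as a stand-in for any expensive module that is
referenced in just one or two places.

```python
def guess_type_lazily(url):
    """Return the guessed MIME type for url.

    The import happens inside the function, so the cost of
    loading mimetypes is paid on the first call rather than
    when the enclosing module is imported.
    """
    import mimetypes  # deferred import (illustrative stand-in)
    return mimetypes.guess_type(url)[0]
```

After the first call the module is cached in sys.modules, so
later calls only pay for a dictionary lookup.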

( I have had this patch on my list for a long time and
apply it regularly after each Python installation. )

--

As a side effect of this import-all practice, a lazy
cookielib dependency crept into the normal Request
constructor code:
"origin_req_host = cookielib.request_host(self)"

I'd recommend copying/moving this simple tool function
request_host into urllib2 in order to resolve the
cookielib dependency completely (not done so far in
the patch).
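
A rough sketch of what such a standalone helper might look
like. This is an assumption for illustration, not the
cookielib source: the name request_host_from_url is
hypothetical, it takes a URL string and an optional Host
header value instead of a Request object, and it is written
against Python 3's urllib.parse so that it runs on a current
interpreter.

```python
import re
from urllib.parse import urlparse

# Matches a trailing ":<port>" so it can be stripped from the host.
_CUT_PORT_RE = re.compile(r":\d+$")

def request_host_from_url(url, host_header=""):
    """Return the lower-cased host of url without any port.

    Falls back to host_header when the URL carries no host part.
    """
    host = urlparse(url).netloc
    if host == "":
        host = host_header
    host = _CUT_PORT_RE.sub("", host, 1)
    return host.lower()
```

With a helper of this shape living next to Request, the
constructor would no longer need to touch the cookie module
at all.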



-robert

----------------------------------------------------------------------

>Comment By: Georg Brandl (gbrandl)
Date: 2006-05-18 05:54

Message:
Logged In: YES 
user_id=849994

Jim: Note that I didn't apply the patch from here, but only
added lazy-loading of ftplib, cookielib and mimetypes.

----------------------------------------------------------------------

Comment By: Jim Jewett (jimjjewett)
Date: 2006-05-17 22:44

Message:
Logged In: YES 
user_id=764593

Note that lazy importing can interact very badly with 
threads.

Why did you change the signature of OpenerDirector._open?  
The base class ignores the data, but subclasses may not.

Removing the SSL guard "if hasattr(httplib, 'HTTPS')" is 
questionable, since the ssl library is external and must be 
compiled separately, and therefore may not exist on some 
platforms even without other source customizations.
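
For illustration, the kind of guard in question can be
sketched as follows. The function name available_schemes is
hypothetical, and the sketch uses Python 3's http.client
(the successor of httplib), where the HTTPSConnection
attribute is only defined when the ssl module could be
imported.

```python
import http.client

def available_schemes(client_module=http.client):
    """List the URL schemes the given client module can serve.

    HTTPS is only offered when the module actually exposes an
    HTTPS connection class, i.e. when SSL support was built.
    """
    schemes = ["http"]
    if hasattr(client_module, "HTTPSConnection"):
        schemes.append("https")
    return schemes
```

Dropping the hasattr check would make the HTTPS branch
unconditional and fail on builds without SSL support.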



----------------------------------------------------------------------

Comment By: Georg Brandl (gbrandl)
Date: 2006-05-17 15:17

Message:
Logged In: YES 
user_id=849994

Fixed in rev. 46029.

----------------------------------------------------------------------
