urllib and bypass proxy

Thu Apr 3 14:21:34 EDT 2008

Under MS Windows, I ran into a problem with the proxy bypass
specification. In Windows, the bypass specification for the proxy
uses semi-colons to delimit entries. Mine happened to have two
semi-colons back-to-back. Internet Explorer handles this just fine, but
urllib treats it as "always bypass the proxy". (I'm using Python
2.5.2.)

The double semi-colon is turned into an empty-string entry, and the
matching code at the bottom of urllib.py checks each entry against the
host name. An empty string can always be found in a host name, so
urllib always chooses to bypass the proxy.
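
To illustrate the cause, here is a small sketch. The bypass string and
host name are made up for the example, and I am assuming the entries are
compared with re.match, as the regex-style masking in urllib.py suggests.

```python
import re

# Hypothetical bypass string with two semi-colons back-to-back.
override = "localhost;;*.example.com"

# Splitting on ";" leaves an empty string between the two semi-colons.
entries = override.split(";")

# An empty pattern matches at the start of any host name.
empty_matches = re.match("", "www.python.org", re.I) is not None

print(entries)
print(empty_matches)
```

So the empty entry alone is enough to make every host look like a
bypass match.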

Of course the fix is to get rid of the double semi-colon in the bypass
settings in Internet Explorer (which I did). But it took me an hour to
track this down (first time using urllib). Perhaps a better fix
would be to test for the empty string and continue the loop in that
case. From urllib.py, with the proposed test added:

        # now check if we match one of the registry values.
        for test in proxyOverride:
            if test == "": continue
            test = test.replace(".", r"\.")     # mask dots
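
For anyone who wants to try the idea outside of urllib, here is a
self-contained sketch of the loop with the empty-string test. The
function name bypasses_proxy and the sample settings are hypothetical,
and this is a simplified stand-in, not the actual urllib code.

```python
import re

def bypasses_proxy(override, host):
    """Return True if host matches an entry in a ';'-delimited bypass string."""
    for test in override.split(";"):
        if test == "":
            continue                        # skip empty entries, e.g. from ";;"
        test = test.replace(".", r"\.")     # mask dots
        test = test.replace("*", r".*")     # change glob sequence
        test = test.replace("?", r".")      # change glob char
        if re.match(test, host, re.I):
            return True
    return False

# Hypothetical settings containing a double semi-colon.
print(bypasses_proxy("localhost;;*.example.com", "www.python.org"))        # False
print(bypasses_proxy("localhost;;*.example.com", "intranet.example.com"))  # True
```

With the test in place, the empty entry is ignored and only the real
entries decide whether the proxy is bypassed.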

This is not really a bug but rather a way to be more consistent with
Internet Explorer. If this has value, should I submit a bug report, or
will someone else?