urllib.request with proxy and HTTPS

Pavel Volkov sailor at lists.xtsubasa.org
Fri Jun 30 03:53:17 EDT 2017


Hello,
I'm trying to make an HTTPS request with urllib.

OS: Gentoo
Python: 3.6.1
openssl: 1.0.2l

This is my test code:

===== CODE BLOCK BEGIN =====
import ssl
import urllib.request
from lxml import etree

PROXY = 'proxy.vpn.local:9999'
URL = "https://google.com"

proxy = urllib.request.ProxyHandler({'http': PROXY})

#context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_1)
context = ssl.SSLContext()
context.verify_mode = ssl.CERT_REQUIRED
context.check_hostname = True

secure_handler = urllib.request.HTTPSHandler(context = context)
opener = urllib.request.build_opener(proxy, secure_handler)
opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0')]

response = opener.open(URL)
tree = etree.parse(response, parser=etree.HTMLParser())
print(tree.docinfo.doctype)
===== CODE BLOCK END =====


My first problem is that the request fails with a CERTIFICATE_VERIFY_FAILED error.
I've found that something similar happens on macOS, because Python there
installs its own set of trusted CAs.
But this isn't macOS, and I can fetch HTTPS normally with curl and other
tools.


===== TRACE BLOCK BEGIN =====
Traceback (most recent call last):
  File "/usr/lib64/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/usr/lib64/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/usr/lib64/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/usr/lib64/python3.6/ssl.py", line 401, in wrap_socket
    _context=self, _session=session)
  File "/usr/lib64/python3.6/ssl.py", line 808, in __init__
    self.do_handshake()
  File "/usr/lib64/python3.6/ssl.py", line 1061, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/lib64/python3.6/ssl.py", line 683, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./https_test.py", line 21, in <module>
    response = opener.open(URL)
  File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/usr/lib64/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/lib64/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>
===== TRACE BLOCK END =====
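
One likely cause of the error above, offered as an assumption since it has not been checked on this Gentoo machine: a context created with a bare ssl.SSLContext() starts with an empty trust store, so CERT_REQUIRED has nothing to verify the server certificate against. A minimal sketch of the usual alternative, ssl.create_default_context(), which loads the system's default CA certificates and enables hostname checking (the helper name make_verifying_context is made up for illustration):

```python
import ssl


def make_verifying_context():
    """Return a client context that trusts the system's default CAs."""
    # create_default_context() sets CERT_REQUIRED and check_hostname=True,
    # and loads the default CA certificates when no cafile/capath is given.
    return ssl.create_default_context()


# For comparison: a context built by hand starts with no CA certificates
# until load_default_certs() or load_verify_locations() is called.
bare = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
bare_ca_count = bare.cert_store_stats()['x509_ca']

context = make_verifying_context()
```

The resulting context can be passed to urllib.request.HTTPSHandler(context=context) exactly as in the test code. Calling context.load_default_certs() on the hand-built context should have a similar effect.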

The second problem is that the proxy is used for HTTP requests, but HTTPS
requests make a direct connection (verified with tcpdump).

I've read at docs.python.org that previous versions of Python couldn't
handle HTTPS through a proxy, but that shortcoming seems to be gone now.
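
A plausible reason for the direct HTTPS connection, again an assumption rather than a confirmed diagnosis: ProxyHandler only acts on the URL schemes present as keys in its mapping, and the test code supplies only 'http'. A sketch that maps both schemes to the same proxy (build_proxied_opener is a hypothetical helper name; the proxy address is the one from the test code):

```python
import ssl
import urllib.request

PROXY = 'proxy.vpn.local:9999'  # proxy address taken from the test code


def build_proxied_opener(proxy, context):
    """Build an opener that sends both HTTP and HTTPS through `proxy`."""
    # One entry per URL scheme; with an 'https' entry urllib should tunnel
    # HTTPS requests through the proxy instead of connecting directly.
    proxy_handler = urllib.request.ProxyHandler({'http': proxy, 'https': proxy})
    secure_handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(proxy_handler, secure_handler)


# Nothing is sent over the network here; the opener is only constructed.
opener = build_proxied_opener(PROXY, ssl.create_default_context())

# ProxyHandler creates one "<scheme>_open" method per key in the mapping,
# so a handler given only 'http' has no hook for https:// URLs.
http_only = urllib.request.ProxyHandler({'http': PROXY})
both = urllib.request.ProxyHandler({'http': PROXY, 'https': PROXY})
```

opener.open(URL) would then be called as before. If this is indeed the cause, tcpdump should show the HTTPS request going to the proxy rather than to the target host.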

Please help :)



