[issue37355] SSLSocket.read does a GIL round-trip for every 16KB TLS record
Josh Snyder
report at bugs.python.org
Thu Jun 20 14:42:49 EDT 2019
New submission from Josh Snyder <hashbrowncipher at gmail.com>:
Background:
SSLSocket.read drops the GIL and performs exactly one successful call to OpenSSL's `SSL_read`, whose documentation states "At most the contents of one record will be returned". TLS records are at most 16KB, so high throughput (especially multithreaded) TLS reception can become bottlenecked on the GIL.
Proposal:
For non-blocking sockets, call SSL_read in a loop until the user-supplied limit is reached or no bytes are available on the socket. I don't know of a way to safely improve performance for blocking sockets.
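For illustration only, the proposed loop can be approximated at the Python level. This is a sketch of the idea, not the C change in my branch: `read_available` is a hypothetical helper name, and it assumes a non-blocking socket whose `recv_into` raises `ssl.SSLWantReadError` when no complete record is buffered. Note that this version still re-acquires the GIL once per record, which is exactly the cost the proposal would move into C.

```python
import ssl


def read_available(sock, buf):
    """Fill buf from a non-blocking SSLSocket-like object.

    Hypothetical helper: keeps calling recv_into (one TLS record per call)
    until buf is full, the peer closes, or no more data is ready.
    """
    view = memoryview(buf)
    total = 0
    while total < len(view):
        try:
            n = sock.recv_into(view[total:])
        except ssl.SSLWantReadError:
            if total == 0:
                raise  # nothing was read at all; let the caller wait on the socket
            break  # return what has been collected so far
        if n == 0:
            break  # peer closed the connection
        total += n
    return total
```

A caller would typically invoke this after the socket is reported readable by `select` or an event loop, and treat `ssl.SSLWantReadError` as "try again later".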
Initial testing:
I performed initial testing using 32 threads pinned to 16 cores, downloading and re-assembling a single 140270 MB file from a "real world" TLS sender. This resulted in a 4x increase in throughput, a 6.6x reduction in voluntary context switches, and a 3.5x reduction in system time. User time did increase by 43%, so the overall reduction in CPU time (user + system) was only 2.67x.
                              before         after
wall clock time (s)     :     29.637         7.116
user time (s)           :      8.793        12.584
system time (s)         :    105.118        30.010
user + system time (s)  :    113.911        42.594
cpu utilization (%)     :        384           599
voluntary switches      :  1,653,065       248,484
speed (MB/s)            :       4733         19712
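The ratios quoted above follow directly from the table. A short sketch of the arithmetic, using only the measured values (the variable names are mine, and CPU utilization is assumed to be CPU time divided by wall clock time):

```python
# Measurements copied from the table above.
before = {"wall": 29.637, "user": 8.793, "system": 105.118,
          "switches": 1_653_065, "speed": 4733}
after = {"wall": 7.116, "user": 12.584, "system": 30.010,
         "switches": 248_484, "speed": 19712}

throughput_gain = after["speed"] / before["speed"]            # about 4.2x
switch_reduction = before["switches"] / after["switches"]     # about 6.65x
system_reduction = before["system"] / after["system"]         # about 3.5x
user_increase_pct = (after["user"] / before["user"] - 1) * 100  # about 43%

cpu_before = before["user"] + before["system"]
cpu_after = after["user"] + after["system"]
cpu_reduction = cpu_before / cpu_after                        # about 2.67x

# Assumed definition: utilization = CPU time / wall clock time, in percent.
util_before = cpu_before / before["wall"] * 100
util_after = cpu_after / after["wall"] * 100

print(f"throughput x{throughput_gain:.2f}, cpu time /{cpu_reduction:.2f}")
```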
My git branch (currently a draft) is at https://github.com/hashbrowncipher/cpython/commits/faster_tls
----------
assignee: christian.heimes
components: SSL
messages: 346156
nosy: christian.heimes, josnyder
priority: normal
severity: normal
status: open
title: SSLSocket.read does a GIL round-trip for every 16KB TLS record
type: performance
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37355>
_______________________________________