From peter.xihong.wang at intel.com  Tue Sep 12 12:35:27 2017
From: peter.xihong.wang at intel.com (Wang, Peter Xihong)
Date: Tue, 12 Sep 2017 16:35:27 +0000
Subject: [Speed] Python Performance Benchmark Suite revision request
Message-ID: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>

Hi All,

I have a couple of inputs on the benchmark suite.

I am currently using the Python Benchmark Suite
https://github.com/python/performance for a customer and running into some
issues. Due to specific circumstances, the network speed/bandwidth must be
limited. As a result, the real-time package checking and downloading step
prior to the actual benchmark run was never able to finish (it always times
out somewhere during the download).

My request: is there a way to get a zipped (or git-cloned) version of this
benchmark suite, and to stop the real-time package downloading prior to every
run? In addition to the network speed limit, we have to run the benchmark on
multiple machines. Waiting for the same source code to be downloaded
repeatedly for every run is really painful, and not very efficient. The
current implementation does not guarantee that the exact same benchmarks are
used for all runs. I feel a git tag should be used to enforce consistency.

We used The Grand Unified Python Benchmark Suite
https://hg.python.org/benchmarks in the past, and found that one very easy to
use, with far fewer dependencies; it could simply be zipped and deployed
easily.

Thanks,

Peter

From victor.stinner at gmail.com  Wed Sep 13 18:48:22 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 14 Sep 2017 00:48:22 +0200
Subject: [Speed] Python Performance Benchmark Suite revision request
In-Reply-To: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
References: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
Message-ID:

Hi,

2017-09-12 18:35 GMT+02:00 Wang, Peter Xihong:
> I am currently using the Python Benchmark Suite
> https://github.com/python/performance for a customer and running into some
> issues. Due to specific circumstances, the network speed/bandwidth must be
> limited. As a result, the real-time package checking and downloading step
> prior to the actual benchmark run was never able to finish (it always times
> out somewhere during the download).

I guess that you are talking about the creation of the virtual
environment. In the Git repository, I see these dependencies
(performance/requirements.txt):
---
setuptools>=18.5
pip>=6.0
wheel
backports-abc==0.5
singledispatch==3.4.0.3
certifi==2017.4.17
lockfile==0.12.2; python_version < '3.0'
pydns==2.3.6; python_version < '3.0'
MarkupSafe==1.0
webencodings==0.5.1
mpmath==0.19
statistics==1.0.3.5; python_version < '3.4'
docutils==0.13.1
six==1.10.0
perf==1.4
Chameleon==3.1
Django==1.11.3
Genshi==0.7
Mako==1.0.6
SQLAlchemy==1.1.11
dulwich==0.17.3
mercurial==4.2.2; python_version < '3.0'
html5lib==0.999999999
pathlib2==2.3.0; python_version < '3.4'
pyaes==1.6.0
spambayes==1.1b2; python_version < '3.0'
sympy==1.0
tornado==4.5.1
psutil==5.2.2
---

> My request: is there a way to get a zipped (or git-cloned) version of this
> benchmark suite, and to stop the real-time package downloading prior to
> every run? In addition to the network speed limit, we have to run the
> benchmark on multiple machines. Waiting for the same source code to be
> downloaded repeatedly for every run is really painful, and not very
> efficient.

You can try to create the virtual environment manually:

$ python3 -m venv myenv
$ myenv/bin/python -m pip install -r performance/requirements.txt

And then run benchmarks using:

$ python3 -m performance run --venv myenv/ -b telco --fast -v

Or more directly using:

$ myenv/bin/python -m performance run --inside-venv ...

Instead of getting dependencies from the Internet, you should be able to
download them locally and install them from there. Example to install Django
(but still getting the dependencies of Django from the Internet) in a new
virtual environment:

$ wget https://pypi.python.org/packages/fe/ca/a7b35a0f5088f26b1ef3c7add57161a7d387a4cbd30db01c1091aa87e207/Django-1.11.3-py2.py3-none-any.whl
# link found at https://pypi.python.org/pypi/Django/1.11.3
$ python3 -m venv bla
$ bla/bin/python -m pip install Django-1.11.3-py2.py3-none-any.whl

There are likely tools to automate these steps.

If you have exactly the same hardware, you *might* copy the full virtual
environment. But I never tried to do that.
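For what it's worth, copying the environment might look something like this
(a rough, untested sketch: it assumes the same Python version on both
machines, that the archive is unpacked at the same absolute path, and that
the performance checkout is also present on the benchmark machine):

# on the machine with network access
$ python3 -m venv myenv
$ myenv/bin/python -m pip install -r performance/requirements.txt
$ tar czf myenv.tar.gz myenv/

# on each benchmark machine, after copying myenv.tar.gz over
# (unpack at the same absolute path as on the source machine)
$ tar xzf myenv.tar.gz
$ python3 -m performance run --venv myenv/ -b telco --fast -v

The scripts inside a venv hard-code the absolute path of their Python
interpreter, which is why the path has to match; if it doesn't, rebuilding
the environment is safer.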
> The current implementation does not guarantee that the exact same
> benchmarks are used for all runs. I feel a git tag should be used to
> enforce consistency.

Dependencies are pinned by their exact version: package==x.y.z. Indirect
dependencies are also pinned, for example: backports-abc==0.5. I'm not a pip
expert, so maybe we can enhance that. But I wouldn't say that "there is no
guarantee".

IMHO it's more reliable than the old benchmark suite since sys.path is now
under control: benchmarks are run in a "known" environment. PYTHONPATH is
ignored. The python_startup benchmark is not impacted by .pth files on your
sys.path.

> We used The Grand Unified Python Benchmark Suite
> https://hg.python.org/benchmarks in the past, and found that one very easy
> to use, with far fewer dependencies; it could simply be zipped and deployed
> easily.

Yeah, you are right. But it was more complex to add new dependencies and to
update existing ones. As a consequence, we tested software that was 5 years
old... Not really relevant.

Victor

From ncoghlan at gmail.com  Wed Sep 13 19:19:56 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 14 Sep 2017 09:19:56 +1000
Subject: [Speed] Python Performance Benchmark Suite revision request
In-Reply-To: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
References: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
Message-ID:

On 14 September 2017 at 08:48, Victor Stinner wrote:
> There are likely tools to automate these steps.

wagon, for example:
https://github.com/cloudify-cosmo/wagon#create-packages

(Although then you have to bootstrap wagon in the destination environment to
handle the install process)

>> We used The Grand Unified Python Benchmark Suite
>> https://hg.python.org/benchmarks in the past, and found that one very easy
>> to use, with far fewer dependencies; it could simply be zipped and
>> deployed easily.
>
> Yeah, you are right. But it was more complex to add new dependencies and to
> update existing ones. As a consequence, we tested software that was 5 years
> old... Not really relevant.

It would be useful to provide instructions in the README on how to:

1. Use "pip download --no-binary :all: -r requirements.txt -d archive_dir"
   to get the dependencies on an internet connected machine
2. Use "pip install --no-index --find-links archive_dir -r requirements.txt"
   to install from the downloaded archive directory instead of the internet
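Concretely, that could look something like the following (an untested sketch;
"archive_dir" is just an example name, and it assumes a performance checkout
is present on both machines):

# on an internet connected machine
$ pip download --no-binary :all: -r performance/requirements.txt -d archive_dir
$ tar czf archive_dir.tar.gz archive_dir/

# on the offline benchmark machine, after copying archive_dir.tar.gz over
$ tar xzf archive_dir.tar.gz
$ python3 -m venv myenv
$ myenv/bin/python -m pip install --no-index --find-links archive_dir -r performance/requirements.txt
$ python3 -m performance run --venv myenv/ -b telco --fast -v

Since --no-binary :all: fetches source distributions rather than wheels, the
same archive_dir should work regardless of the target platform, at the cost
of building everything from source during the install (so build tools need to
be present on the benchmark machine).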
Use "pip install --no-index --find-links archive_dir -r requirements.txt" to install from the unpacked archive instead of the internet Potentially, performance could gain a subcommand to do the initial download, and a "performance venv create" option to specify a local directory to use instead of the index server when installing dependencies. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Sep 13 19:52:23 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 14 Sep 2017 01:52:23 +0200 Subject: [Speed] Python Performance Benchmark Suite revision request In-Reply-To: References: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com> Message-ID: 2017-09-14 1:19 GMT+02:00 Nick Coghlan : > It would be useful to provide (...) Contributions as pull requests are welcome ;-) Victor