[Speed] Python Performance Benchmark Suite revision request

Wed Sep 13 18:48:22 EDT 2017

Hi,

2017-09-12 18:35 GMT+02:00 Wang, Peter Xihong <peter.xihong.wang at intel.com>:
> I am currently using the Python Benchmark Suite
> https://github.com/python/performance for a customer and running into some
> issues.  Due to specific circumstance, the net speed/bandwidth must be
> limited.  As a result, the real time package checking and pulling step prior
> to the actual benchmark run was never able to finish (always time out
> somewhere during the bit pulling).

I guess that you are talking about the creation of the virtual environment.

In the Git repository, I see these dependencies (performance/requirements.txt):
---
setuptools>=18.5
pip>=6.0
wheel
backports-abc==0.5
singledispatch==3.4.0.3
certifi==2017.4.17
lockfile==0.12.2; python_version < '3.0'
pydns==2.3.6; python_version < '3.0'
MarkupSafe==1.0
webencodings==0.5.1
mpmath==0.19
statistics==1.0.3.5; python_version < '3.4'
docutils==0.13.1
six==1.10.0
perf==1.4
Chameleon==3.1
Django==1.11.3
Genshi==0.7
Mako==1.0.6
SQLAlchemy==1.1.11
dulwich==0.17.3
mercurial==4.2.2; python_version < '3.0'
html5lib==0.999999999
pathlib2==2.3.0; python_version < '3.4'
pyaes==1.6.0
spambayes==1.1b2; python_version < '3.0'
sympy==1.0
tornado==4.5.1
psutil==5.2.2
---

> My request: is there a way to get a zipped (or git cloned version) of this
> benchmark, and stop the package pulling in real time prior to every run?  In
> addition to net speed limit, we have to run the benchmark on multiple
> machines.  Trying to wait for the same source codes being downloaded
> repeatedly for every run is really painfully, and not very efficient.

You can try to create the virtual environment manually:

$ python3 -m venv myenv
$ myenv/bin/python -m pip install -r performance/requirements.txt

And then run benchmarks using:

$ python3 -m performance run --venv myenv/ -b telco --fast -v

Or more directly using:

$ myenv/bin/python -m performance run --inside-venv ...

Instead of fetching dependencies from the Internet on every run, you
should be able to download them once and install them from local files.

Example to install Django (but still get dependencies of Django from
the Internet) in a new virtual environment:

$ wget https://pypi.python.org/packages/fe/ca/a7b35a0f5088f26b1ef3c7add57161a7d387a4cbd30db01c1091aa87e207/Django-1.11.3-py2.py3-none-any.whl
# link found at https://pypi.python.org/pypi/Django/1.11.3
$ python3 -m venv bla
$ bla/bin/python -m pip install Django-1.11.3-py2.py3-none-any.whl

There are likely tools to automate these steps.
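
One possible way to automate this is pip itself: "pip download" fetches
packages into a local directory, and "pip install --no-index
--find-links" installs from that directory without contacting PyPI. The
sketch below only builds the two command lines. The helper names, the
"wheelhouse" directory and the "myenv" path are assumptions for
illustration, not part of the performance project. Note that "pip
download" resolves packages for the interpreter that runs it, so the
download should be done with the same Python version and platform as the
benchmark machines, and it needs a reasonably recent pip (8.0 or later,
as far as I know).

```python
import subprocess
import sys


def download_cmd(requirements, wheelhouse, python=sys.executable):
    """Build the command that fetches every listed package into wheelhouse."""
    return [python, "-m", "pip", "download",
            "-r", requirements, "-d", wheelhouse]


def offline_install_cmd(requirements, wheelhouse, venv_python):
    """Build the command that installs from wheelhouse only, never from PyPI."""
    return [venv_python, "-m", "pip", "install",
            "--no-index", "--find-links", wheelhouse,
            "-r", requirements]


if __name__ == "__main__":
    # Hypothetical paths; adjust them to your checkout and environment.
    reqs = "performance/requirements.txt"
    wheelhouse = "wheelhouse"
    if sys.argv[1:] == ["download"]:
        # Run once on a machine with Internet access.
        subprocess.check_call(download_cmd(reqs, wheelhouse))
    elif sys.argv[1:] == ["install"]:
        # Run on each benchmark machine after copying the wheelhouse
        # directory and creating the virtual environment "myenv".
        subprocess.check_call(
            offline_install_cmd(reqs, wheelhouse, "myenv/bin/python"))
```

The wheelhouse directory can then be zipped together with the Git
checkout and copied to every machine, and the benchmarks run with
--venv myenv/ as shown earlier.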

If you have exactly the same hardware, you *might* be able to copy the
full virtual environment. But I have never tried that.

>  The
> current implementation does not guarantee the exact same benchmarks are used
> for all runs.  I feel git tag should be used to enforce consistency.

Dependencies are pinned by their exact version: package==x.y.z.
Indirect dependencies are also pinned, for example: backports-abc==0.5.

I'm not a pip expert, so maybe we can enhance that. But I wouldn't say
that "there is no guarantee".
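
To see how far the pinning goes, a small check can list the requirement
lines that do not use "==". This is only a sketch: the function name is
made up for illustration, the parsing is deliberately naive (it strips
comments and environment markers and nothing more), and the sample is a
subset of the requirements.txt quoted above.

```python
def unpinned_requirements(lines):
    """Return the requirement specs that are not pinned with '=='."""
    unpinned = []
    for line in lines:
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Ignore the environment marker after ';'.
        spec = line.split(";", 1)[0].strip()
        if "==" not in spec:
            unpinned.append(spec)
    return unpinned


sample = [
    "setuptools>=18.5",
    "pip>=6.0",
    "wheel",
    "backports-abc==0.5",
    "lockfile==0.12.2; python_version < '3.0'",
    "perf==1.4",
]
print(unpinned_requirements(sample))
# prints ['setuptools>=18.5', 'pip>=6.0', 'wheel']
```

In the list quoted above, only the packaging tools (setuptools, pip,
wheel) are left unpinned; every other entry uses an exact version.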

IMHO it's more reliable than the old benchmark suite since the
"sys.path" is now under control: benchmarks are run in a "known"
environment. PYTHONPATH is ignored. The python_startup benchmark is
not impacted by .pth files of your sys.path.

> We used The Grand Unified Python Benchmark Suite
> https://hg.python.org/benchmarks in the past, and found that one was very
> easy to use, with far less dependency, and can be simply zipped and deployed
> easily.

Yeah, you are right. But it was more complex to add new dependencies
and update existing ones. As a consequence, we tested software that
was 5 years old... Not really relevant.

Victor