From peter.xihong.wang at intel.com  Tue Sep 12 12:35:27 2017
From: peter.xihong.wang at intel.com (Wang, Peter Xihong)
Date: Tue, 12 Sep 2017 16:35:27 +0000
Subject: [Speed] Python Performance Benchmark Suite revision request
Message-ID: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>

Hi All,

I have a couple of inputs on the benchmark suite.

I am currently using the Python Benchmark Suite
https://github.com/python/performance for a customer and running into some
issues. Due to specific circumstances, the network speed/bandwidth must be
limited. As a result, the real-time package checking and downloading step
prior to the actual benchmark run was never able to finish (it always times
out somewhere during the download).

My request: is there a way to get a zipped (or git-cloned) version of this
benchmark suite, and to stop the real-time package downloading prior to every
run? In addition to the network speed limit, we have to run the benchmark on
multiple machines. Waiting for the same source code to be downloaded
repeatedly for every run is really painful, and not very efficient. The
current implementation does not guarantee that the exact same benchmarks are
used for all runs. I feel a git tag should be used to enforce consistency.

We used The Grand Unified Python Benchmark Suite
https://hg.python.org/benchmarks in the past, and found that one very easy to
use, with far fewer dependencies; it could simply be zipped and deployed
easily.

Thanks,

Peter

From victor.stinner at gmail.com  Wed Sep 13 18:48:22 2017
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 14 Sep 2017 00:48:22 +0200
Subject: [Speed] Python Performance Benchmark Suite revision request
In-Reply-To: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
References: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
Message-ID:

Hi,

2017-09-12 18:35 GMT+02:00 Wang, Peter Xihong:
> I am currently using the Python Benchmark Suite
> https://github.com/python/performance for a customer and running into some
> issues. Due to specific circumstances, the network speed/bandwidth must be
> limited. As a result, the real-time package checking and downloading step
> prior to the actual benchmark run was never able to finish (it always times
> out somewhere during the download).

I guess that you are talking about the creation of the virtual
environment. In the Git repository, I see these dependencies
(performance/requirements.txt):
---
setuptools>=18.5
pip>=6.0
wheel
backports-abc==0.5
singledispatch==3.4.0.3
certifi==2017.4.17
lockfile==0.12.2; python_version < '3.0'
pydns==2.3.6; python_version < '3.0'
MarkupSafe==1.0
webencodings==0.5.1
mpmath==0.19
statistics==1.0.3.5; python_version < '3.4'
docutils==0.13.1
six==1.10.0
perf==1.4
Chameleon==3.1
Django==1.11.3
Genshi==0.7
Mako==1.0.6
SQLAlchemy==1.1.11
dulwich==0.17.3
mercurial==4.2.2; python_version < '3.0'
html5lib==0.999999999
pathlib2==2.3.0; python_version < '3.4'
pyaes==1.6.0
spambayes==1.1b2; python_version < '3.0'
sympy==1.0
tornado==4.5.1
psutil==5.2.2
---

> My request: is there a way to get a zipped (or git-cloned) version of this
> benchmark suite, and to stop the real-time package downloading prior to
> every run? In addition to the network speed limit, we have to run the
> benchmark on multiple machines. Waiting for the same source code to be
> downloaded repeatedly for every run is really painful, and not very
> efficient.

You can try to create the virtual environment manually:

$ python3 -m venv myenv
$ myenv/bin/python -m pip install -r performance/requirements.txt

And then run benchmarks using:

$ python3 -m performance run --venv myenv/ -b telco --fast -v

Or more directly using:

$ myenv/bin/python -m performance run --inside-venv ...

Instead of getting dependencies from the Internet, you should be able to
download them locally and install them from there. Example to install Django
(but still getting the dependencies of Django from the Internet) in a new
virtual environment:

$ wget https://pypi.python.org/packages/fe/ca/a7b35a0f5088f26b1ef3c7add57161a7d387a4cbd30db01c1091aa87e207/Django-1.11.3-py2.py3-none-any.whl
# link found at https://pypi.python.org/pypi/Django/1.11.3
$ python3 -m venv bla
$ bla/bin/python -m pip install Django-1.11.3-py2.py3-none-any.whl

There are likely tools to automate these steps.

If you have exactly the same hardware, you *might* copy the full virtual
environment. But I never tried to do that.
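For what it's worth, copying the environment might look something like this
(a rough, untested sketch: it assumes the same Python version on both
machines, that the archive is unpacked at the same absolute path, and that
the performance checkout is also present on the benchmark machine):

# on the machine with network access
$ python3 -m venv myenv
$ myenv/bin/python -m pip install -r performance/requirements.txt
$ tar czf myenv.tar.gz myenv/

# on each benchmark machine, after copying myenv.tar.gz over
# (unpack at the same absolute path as on the source machine)
$ tar xzf myenv.tar.gz
$ python3 -m performance run --venv myenv/ -b telco --fast -v

The scripts inside a venv hard-code the absolute path of their Python
interpreter, which is why the path has to match; if it doesn't, rebuilding
the environment is safer.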
> The current implementation does not guarantee that the exact same
> benchmarks are used for all runs. I feel a git tag should be used to
> enforce consistency.

Dependencies are pinned by their exact version: package==x.y.z. Indirect
dependencies are also pinned, for example: backports-abc==0.5. I'm not a pip
expert, so maybe we can enhance that. But I wouldn't say that "there is no
guarantee".

IMHO it's more reliable than the old benchmark suite since sys.path is now
under control: benchmarks are run in a "known" environment. PYTHONPATH is
ignored. The python_startup benchmark is not impacted by .pth files on your
sys.path.

> We used The Grand Unified Python Benchmark Suite
> https://hg.python.org/benchmarks in the past, and found that one very easy
> to use, with far fewer dependencies; it could simply be zipped and deployed
> easily.

Yeah, you are right. But it was more complex to add new dependencies and to
update existing ones. As a consequence, we tested software that was 5 years
old... Not really relevant.

Victor

From ncoghlan at gmail.com  Wed Sep 13 19:19:56 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 14 Sep 2017 09:19:56 +1000
Subject: [Speed] Python Performance Benchmark Suite revision request
In-Reply-To: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
References: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com>
Message-ID:

On 14 September 2017 at 08:48, Victor Stinner wrote:
> There are likely tools to automate these steps.

wagon, for example:
https://github.com/cloudify-cosmo/wagon#create-packages

(Although then you have to bootstrap wagon in the destination environment to
handle the install process)

>> We used The Grand Unified Python Benchmark Suite
>> https://hg.python.org/benchmarks in the past, and found that one very easy
>> to use, with far fewer dependencies; it could simply be zipped and
>> deployed easily.
>
> Yeah, you are right. But it was more complex to add new dependencies and to
> update existing ones. As a consequence, we tested software that was 5 years
> old... Not really relevant.

It would be useful to provide instructions in the README on how to:

1. Use "pip download --no-binary :all: -r requirements.txt -d archive_dir"
   to get the dependencies on an internet connected machine
2. Use "pip install --no-index --find-links archive_dir -r requirements.txt"
   to install from the downloaded archive directory instead of the internet
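Concretely, that could look something like the following (an untested sketch;
"archive_dir" is just an example name, and it assumes a performance checkout
is present on both machines):

# on an internet connected machine
$ pip download --no-binary :all: -r performance/requirements.txt -d archive_dir
$ tar czf archive_dir.tar.gz archive_dir/

# on the offline benchmark machine, after copying archive_dir.tar.gz over
$ tar xzf archive_dir.tar.gz
$ python3 -m venv myenv
$ myenv/bin/python -m pip install --no-index --find-links archive_dir -r performance/requirements.txt
$ python3 -m performance run --venv myenv/ -b telco --fast -v

Since --no-binary :all: fetches source distributions rather than wheels, the
same archive_dir should work regardless of the target platform, at the cost
of building everything from source during the install (so build tools need to
be present on the benchmark machine).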
Use "pip install --no-index --find-links archive_dir -r requirements.txt" to install from the unpacked archive instead of the internet Potentially, performance could gain a subcommand to do the initial download, and a "performance venv create" option to specify a local directory to use instead of the index server when installing dependencies. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Sep 13 19:52:23 2017 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 14 Sep 2017 01:52:23 +0200 Subject: [Speed] Python Performance Benchmark Suite revision request In-Reply-To: References: <371EBC7881C7844EAAF5556BFF21BCCC86EFD2F4@ORSMSX105.amr.corp.intel.com> Message-ID: 2017-09-14 1:19 GMT+02:00 Nick Coghlan : > It would be useful to provide (...) Contributions as pull requests are welcome ;-) Victor