[Python-checkins] distutils2: Rename the "pypi" package to "index", refactor index.dist.
tarek.ziade
python-checkins at python.org
Sun Aug 8 11:50:46 CEST 2010
tarek.ziade pushed 795db3d79331 to distutils2:
http://hg.python.org/distutils2/rev/795db3d79331
changeset: 447:795db3d79331
user: Alexis Metaireau <ametaireau at gmail.com>
date: Thu Jul 15 14:01:17 2010 +0200
summary: Rename the "pypi" package to "index", refactor index.dist.
files: docs/source/index.rst, docs/source/projects-index.client.rst, docs/source/projects-index.dist.rst, docs/source/projects-index.rst, docs/source/projects-index.simple.rst, docs/source/projects-index.xmlrpc.rst, docs/source/pypi.rst, src/distutils2/index/__init__.py, src/distutils2/index/base.py, src/distutils2/index/dist.py, src/distutils2/index/errors.py, src/distutils2/index/simple.py, src/distutils2/pypi/__init__.py, src/distutils2/pypi/dist.py, src/distutils2/pypi/errors.py, src/distutils2/pypi/simple.py, src/distutils2/tests/test_index_dist.py, src/distutils2/tests/test_index_simple.py, src/distutils2/tests/test_pypi_dist.py, src/distutils2/tests/test_pypi_simple.py
diff --git a/docs/source/index.rst b/docs/source/index.rst
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -16,7 +16,7 @@
depgraph
new_commands
test_framework
- pypi
+ projects-index
version
Indices and tables
diff --git a/docs/source/pypi.rst b/docs/source/projects-index.client.rst
rename from docs/source/pypi.rst
rename to docs/source/projects-index.client.rst
--- a/docs/source/pypi.rst
+++ b/docs/source/projects-index.client.rst
@@ -1,195 +1,24 @@
-=========================================
-Tools to query PyPI: the PyPI package
-=========================================
+===============================
+High level API to Query indexes
+===============================
-Distutils2 comes with a module (eg. `distutils2.pypi`) which contains
-facilities to access the Python Package Index (named "pypi", and avalaible on
-the url `http://pypi.python.org`.
+Distutils2 provides a high-level API to query indexes and search for releases and
+distributions, no matter which underlying API you want to use.
-There is two ways to retrieve data from pypi: using the *simple* API, and using
-*XML-RPC*. The first one is in fact a set of HTML pages avalaible at
-`http://pypi.python.org/simple/`, and the second one contains a set of XML-RPC
-methods. In order to reduce the overload caused by running distant methods on
-the pypi server (by using the XML-RPC methods), the best way to retrieve
-informations is by using the simple API, when it contains the information you
-need.
-
-Distutils2 provides two python modules to ease the work with those two APIs:
-`distutils2.pypi.simple` and `distutils2.pypi.xmlrpc`. Both of them depends on
-another python module: `distutils2.pypi.dist`.
-
-
-Requesting information via the "simple" API `distutils2.pypi.simple`
-====================================================================
-
-`distutils2.pypi.simple` can process the Python Package Index and return and
-download urls of distributions, for specific versions or latests, but it also
-can process external html pages, with the goal to find *pypi unhosted* versions
-of python distributions.
-
-You should use `distutils2.pypi.simple` for:
-
- * Search distributions by name and versions.
- * Process pypi external pages.
- * Download distributions by name and versions.
-
-And should not be used to:
-
- * Things that will end up in too long index processing (like "finding all
- distributions with a specific version, no matters the name")
+The aim of this module is to choose the best way to query the API, using
+XML-RPC as little as possible and the simple index whenever possible.
API
-----
+===
-Here is a complete overview of the APIs of the SimpleIndex class.
+The client comes with the common methods "find", "get" and "download", which
+help to query the servers and return
-.. autoclass:: distutils2.pypi.simple.SimpleIndex
- :members:
+:class:`distutils2.index.dist.ReleaseInfo`, and
+:class:`distutils2.index.dist.ReleasesList` objects.
-Usage Exemples
----------------
+XXX TODO Autoclass here.
-To help you understand how using the `SimpleIndex` class, here are some basic
-usages.
+Examples
+=========
-Request PyPI to get a specific distribution
-++++++++++++++++++++++++++++++++++++++++++++
-
-Supposing you want to scan the PyPI index to get a list of distributions for
-the "foobar" project. You can use the "find" method for that::
-
- >>> from distutils2.pypi import SimpleIndex
- >>> client = SimpleIndex()
- >>> client.find("foobar")
- [<PyPIDistribution "Foobar 1.1">, <PyPIDistribution "Foobar 1.2">]
-
-Note that you also can request the client about specific versions, using version
-specifiers (described in `PEP 345
-<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
-
- >>> client.find("foobar < 1.2")
- [<PyPIDistribution "foobar 1.1">, ]
-
-`find` returns a list of distributions, but you also can get the last
-distribution (the more up to date) that fullfil your requirements, like this::
-
- >>> client.get("foobar < 1.2")
- <PyPIDistribution "foobar 1.1">
-
-Download distributions
-+++++++++++++++++++++++
-
-As it can get the urls of distributions provided by PyPI, the `SimpleIndex`
-client also can download the distributions and put it for you in a temporary
-destination::
-
- >>> client.download("foobar")
- /tmp/temp_dir/foobar-1.2.tar.gz
-
-You also can specify the directory you want to download to::
-
- >>> client.download("foobar", "/path/to/my/dir")
- /path/to/my/dir/foobar-1.2.tar.gz
-
-While downloading, the md5 of the archive will be checked, if not matches, it
-will try another time, then if fails again, raise `MD5HashDoesNotMatchError`.
-
-Internally, that's not the SimpleIndex which download the distributions, but the
-`PyPIDistribution` class. Please refer to this documentation for more details.
-
-Following PyPI external links
-++++++++++++++++++++++++++++++
-
-The default behavior for distutils2 is to *not* follow the links provided
-by HTML pages in the "simple index", to find distributions related
-downloads.
-
-It's possible to tell the PyPIClient to follow external links by setting the
-`follow_externals` attribute, on instanciation or after::
-
- >>> client = SimpleIndex(follow_externals=True)
-
-or ::
-
- >>> client = SimpleIndex()
- >>> client.follow_externals = True
-
-Working with external indexes, and mirrors
-+++++++++++++++++++++++++++++++++++++++++++
-
-The default `SimpleIndex` behavior is to rely on the Python Package index stored
-on PyPI (http://pypi.python.org/simple).
-
-As you can need to work with a local index, or private indexes, you can specify
-it using the index_url parameter::
-
- >>> client = SimpleIndex(index_url="file://filesystem/path/")
-
-or ::
-
- >>> client = SimpleIndex(index_url="http://some.specific.url/")
-
-You also can specify mirrors to fallback on in case the first index_url you
-provided doesnt respond, or not correctly. The default behavior for
-`SimpleIndex` is to use the list provided by Python.org DNS records, as
-described in the :pep:`381` about mirroring infrastructure.
-
-If you don't want to rely on these, you could specify the list of mirrors you
-want to try by specifying the `mirrors` attribute. It's a simple iterable::
-
- >>> mirrors = ["http://first.mirror","http://second.mirror"]
- >>> client = SimpleIndex(mirrors=mirrors)
-
-
-Requesting informations via XML-RPC (`distutils2.pypi.XmlRpcIndex`)
-==========================================================================
-
-The other method to request the Python package index, is using the XML-RPC
-methods. Distutils2 provides a simple wrapper around `xmlrpclib
-<http://docs.python.org/library/xmlrpclib.html>`_, that can return you
-`PyPIDistribution` objects.
-
-::
- >>> from distutils2.pypi import XmlRpcIndex()
- >>> client = XmlRpcIndex()
-
-
-PyPI Distributions
-==================
-
-Both `SimpleIndex` and `XmlRpcIndex` classes works with the classes provided
-in the `pypi.dist` package.
-
-`PyPIDistribution`
-------------------
-
-`PyPIDistribution` is a simple class that defines the following attributes:
-
-:name:
- The name of the package. `foobar` in our exemples here
-:version:
- The version of the package
-:location:
- If the files from the archive has been downloaded, here is the path where
- you can find them.
-:url:
- The url of the distribution
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistribution
- :members:
-
-`PyPIDistributions`
--------------------
-
-The `dist` module also provides another class, to work with lists of
-`PyPIDistribution` classes. It allow to filter results and is used as a
-container of
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistributions
- :members:
-
-At a higher level
-=================
-
-XXX : A description about a wraper around PyPI simple and XmlRpc Indexes
-(PyPIIndex ?)
diff --git a/docs/source/projects-index.dist.rst b/docs/source/projects-index.dist.rst
new file mode 100644
--- /dev/null
+++ b/docs/source/projects-index.dist.rst
@@ -0,0 +1,87 @@
+==================================================
+Representation of information coming from indexes
+==================================================
+
+Information coming from indexes is represented by the classes present in the
+`dist` module.
+
+.. note:: Keep in mind that each project (eg. FooBar) can have several
+ releases (eg. 1.1, 1.2, 1.3), and each of these releases can be
+ provided in multiple distributions (eg. a source distribution,
+ a binary one, etc).
+
+APIs
+====
+
+ReleaseInfo
+------------
+
+Each release has a project name, a project version and contains project
+metadata. In addition, releases contain the distributions too.
+
+This information is stored in :class:`distutils2.index.dist.ReleaseInfo`
+objects.
+
+.. autoclass:: distutils2.index.dist.ReleaseInfo
+ :members:
+
+DistInfo
+---------
+
+:class:`distutils2.index.dist.DistInfo` is a simple class that contains
+information related to distributions. It's mainly about the URLs where those
+distributions can be found.
+
+.. autoclass:: distutils2.index.dist.DistInfo
+ :members:
+
+ReleasesList
+------------
+
+The `dist` module also provides another class, to work with lists of
+:class:`distutils2.index.dist.ReleaseInfo` objects. It allows filtering
+and ordering results.
+
+.. autoclass:: distutils2.index.dist.ReleasesList
+ :members:
+
+Example usages
+===============
+
+Build a list of releases, and order them
+----------------------------------------
+
+Assuming we have a list of releases::
+
+ >>> from distutils2.index.dist import ReleasesList, ReleaseInfo
+ >>> fb10 = ReleaseInfo("FooBar", "1.0")
+ >>> fb11 = ReleaseInfo("FooBar", "1.1")
+ >>> fb11a = ReleaseInfo("FooBar", "1.1a1")
+ >>> releases = ReleasesList("FooBar", [fb11, fb11a, fb10])
+ >>> releases.sort_releases()
+ >>> releases.get_versions()
+ ['1.1', '1.1a1', '1.0']
+ >>> releases.add_release("1.2a1")
+ >>> releases.get_versions()
+ ['1.1', '1.1a1', '1.0', '1.2a1']
+ >>> releases.sort_releases()
+ >>> releases.get_versions()
+ ['1.2a1', '1.1', '1.1a1', '1.0']
+ >>> releases.sort_releases(prefer_final=True)
+ >>> releases.get_versions()
+ ['1.1', '1.0', '1.2a1', '1.1a1']
+
+
+Add distribution related informations to releases
+-------------------------------------------------
+
+It's easy to add distribution information to releases::
+
+ >>> from distutils2.index.dist import ReleaseInfo
+ >>> r = ReleaseInfo("FooBar", "1.0")
+ >>> r.add_distribution("sdist", url="http://example.org/foobar-1.0.tar.gz")
+ >>> r.dists
+ {'sdist': FooBar 1.0 sdist}
+ >>> r['sdist'].url
+ {'url': 'http://example.org/foobar-1.0.tar.gz', 'hashname': None, 'hashval':
+ None, 'is_external': True}
+
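The ordering shown in the first example above (`sort_releases` with and without `prefer_final`) can be reproduced with a small standalone sketch. It does not use distutils2: `parse_version` and `sort_versions` are hypothetical helpers, and the parser only understands dotted numbers with an optional `a`/`b`/`c`/`rc` suffix. That is an assumption made for illustration, not the behaviour of `NormalizedVersion`.

```python
import re

# Hypothetical, simplified pattern: "1.2", "1.1a1", "2.0rc2".
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|c|rc)(\d+))?$")


def parse_version(text):
    """Return (release_tuple, is_final, pre_release) for a version string."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError("unsupported version: %r" % text)
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2) is None:
        # No pre-release suffix: this is a final version.
        return release, True, ("", 0)
    return release, False, (match.group(2), int(match.group(3)))


def sort_versions(versions, prefer_final=False):
    """Sort newest first; with prefer_final, list final versions first."""
    def key(text):
        release, is_final, pre = parse_version(text)
        if prefer_final:
            return (is_final, release, pre)
        return (release, is_final, pre)
    return sorted(versions, key=key, reverse=True)


versions = ["1.1", "1.1a1", "1.0", "1.2a1"]
print(sort_versions(versions))
print(sort_versions(versions, prefer_final=True))
```

With `prefer_final=True` the "is final" flag becomes the most significant part of the sort key, which is why `1.2a1` drops behind both final versions.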
diff --git a/docs/source/pypi.rst b/docs/source/projects-index.rst
copy from docs/source/pypi.rst
copy to docs/source/projects-index.rst
--- a/docs/source/pypi.rst
+++ b/docs/source/projects-index.rst
@@ -1,195 +1,28 @@
-=========================================
-Tools to query PyPI: the PyPI package
-=========================================
+===================================
+Query Python Package Indexes (PyPI)
+===================================
-Distutils2 comes with a module (eg. `distutils2.pypi`) which contains
-facilities to access the Python Package Index (named "pypi", and avalaible on
-the url `http://pypi.python.org`.
+Distutils2 provides facilities to access Python package information stored in
+indexes. The main Python Package Index is available at http://pypi.python.org.
-There is two ways to retrieve data from pypi: using the *simple* API, and using
-*XML-RPC*. The first one is in fact a set of HTML pages avalaible at
+.. note:: The tools provided in distutils2 are not limited to querying PyPI, and
+ can be used for other indexes, if they respect the same interfaces.
+
+There are two ways to retrieve data from these indexes: using the *simple* API,
+and using *XML-RPC*. The first one is a set of HTML pages available at
`http://pypi.python.org/simple/`, and the second one contains a set of XML-RPC
-methods. In order to reduce the overload caused by running distant methods on
-the pypi server (by using the XML-RPC methods), the best way to retrieve
-informations is by using the simple API, when it contains the information you
-need.
+methods.
-Distutils2 provides two python modules to ease the work with those two APIs:
-`distutils2.pypi.simple` and `distutils2.pypi.xmlrpc`. Both of them depends on
-another python module: `distutils2.pypi.dist`.
+If you don't care about which API to use, the best thing to do is to let
+distutils2 decide this for you, by using :class:`distutils2.index.Client`.
+Of course, you can also rely on :class:`distutils2.index.simple.Crawler` and
+:class:`distutils2.index.xmlrpc.Client` if you need to use these specific APIs.
-Requesting information via the "simple" API `distutils2.pypi.simple`
-====================================================================
+.. toctree::
+ :maxdepth: 2
-`distutils2.pypi.simple` can process the Python Package Index and return and
-download urls of distributions, for specific versions or latests, but it also
-can process external html pages, with the goal to find *pypi unhosted* versions
-of python distributions.
-
-You should use `distutils2.pypi.simple` for:
-
- * Search distributions by name and versions.
- * Process pypi external pages.
- * Download distributions by name and versions.
-
-And should not be used to:
-
- * Things that will end up in too long index processing (like "finding all
- distributions with a specific version, no matters the name")
-
-API
-----
-
-Here is a complete overview of the APIs of the SimpleIndex class.
-
-.. autoclass:: distutils2.pypi.simple.SimpleIndex
- :members:
-
-Usage Exemples
----------------
-
-To help you understand how using the `SimpleIndex` class, here are some basic
-usages.
-
-Request PyPI to get a specific distribution
-++++++++++++++++++++++++++++++++++++++++++++
-
-Supposing you want to scan the PyPI index to get a list of distributions for
-the "foobar" project. You can use the "find" method for that::
-
- >>> from distutils2.pypi import SimpleIndex
- >>> client = SimpleIndex()
- >>> client.find("foobar")
- [<PyPIDistribution "Foobar 1.1">, <PyPIDistribution "Foobar 1.2">]
-
-Note that you also can request the client about specific versions, using version
-specifiers (described in `PEP 345
-<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
-
- >>> client.find("foobar < 1.2")
- [<PyPIDistribution "foobar 1.1">, ]
-
-`find` returns a list of distributions, but you also can get the last
-distribution (the more up to date) that fullfil your requirements, like this::
-
- >>> client.get("foobar < 1.2")
- <PyPIDistribution "foobar 1.1">
-
-Download distributions
-+++++++++++++++++++++++
-
-As it can get the urls of distributions provided by PyPI, the `SimpleIndex`
-client also can download the distributions and put it for you in a temporary
-destination::
-
- >>> client.download("foobar")
- /tmp/temp_dir/foobar-1.2.tar.gz
-
-You also can specify the directory you want to download to::
-
- >>> client.download("foobar", "/path/to/my/dir")
- /path/to/my/dir/foobar-1.2.tar.gz
-
-While downloading, the md5 of the archive will be checked, if not matches, it
-will try another time, then if fails again, raise `MD5HashDoesNotMatchError`.
-
-Internally, that's not the SimpleIndex which download the distributions, but the
-`PyPIDistribution` class. Please refer to this documentation for more details.
-
-Following PyPI external links
-++++++++++++++++++++++++++++++
-
-The default behavior for distutils2 is to *not* follow the links provided
-by HTML pages in the "simple index", to find distributions related
-downloads.
-
-It's possible to tell the PyPIClient to follow external links by setting the
-`follow_externals` attribute, on instanciation or after::
-
- >>> client = SimpleIndex(follow_externals=True)
-
-or ::
-
- >>> client = SimpleIndex()
- >>> client.follow_externals = True
-
-Working with external indexes, and mirrors
-+++++++++++++++++++++++++++++++++++++++++++
-
-The default `SimpleIndex` behavior is to rely on the Python Package index stored
-on PyPI (http://pypi.python.org/simple).
-
-As you can need to work with a local index, or private indexes, you can specify
-it using the index_url parameter::
-
- >>> client = SimpleIndex(index_url="file://filesystem/path/")
-
-or ::
-
- >>> client = SimpleIndex(index_url="http://some.specific.url/")
-
-You also can specify mirrors to fallback on in case the first index_url you
-provided doesnt respond, or not correctly. The default behavior for
-`SimpleIndex` is to use the list provided by Python.org DNS records, as
-described in the :pep:`381` about mirroring infrastructure.
-
-If you don't want to rely on these, you could specify the list of mirrors you
-want to try by specifying the `mirrors` attribute. It's a simple iterable::
-
- >>> mirrors = ["http://first.mirror","http://second.mirror"]
- >>> client = SimpleIndex(mirrors=mirrors)
-
-
-Requesting informations via XML-RPC (`distutils2.pypi.XmlRpcIndex`)
-==========================================================================
-
-The other method to request the Python package index, is using the XML-RPC
-methods. Distutils2 provides a simple wrapper around `xmlrpclib
-<http://docs.python.org/library/xmlrpclib.html>`_, that can return you
-`PyPIDistribution` objects.
-
-::
- >>> from distutils2.pypi import XmlRpcIndex()
- >>> client = XmlRpcIndex()
-
-
-PyPI Distributions
-==================
-
-Both `SimpleIndex` and `XmlRpcIndex` classes works with the classes provided
-in the `pypi.dist` package.
-
-`PyPIDistribution`
-------------------
-
-`PyPIDistribution` is a simple class that defines the following attributes:
-
-:name:
- The name of the package. `foobar` in our exemples here
-:version:
- The version of the package
-:location:
- If the files from the archive has been downloaded, here is the path where
- you can find them.
-:url:
- The url of the distribution
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistribution
- :members:
-
-`PyPIDistributions`
--------------------
-
-The `dist` module also provides another class, to work with lists of
-`PyPIDistribution` classes. It allow to filter results and is used as a
-container of
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistributions
- :members:
-
-At a higher level
-=================
-
-XXX : A description about a wraper around PyPI simple and XmlRpc Indexes
-(PyPIIndex ?)
+ projects-index.client.rst
+ projects-index.dist.rst
+ projects-index.simple.rst
+ projects-index.xmlrpc.rst
diff --git a/docs/source/projects-index.simple.rst b/docs/source/projects-index.simple.rst
new file mode 100644
--- /dev/null
+++ b/docs/source/projects-index.simple.rst
@@ -0,0 +1,121 @@
+=========================================
+Querying indexes via the simple index API
+=========================================
+
+`distutils2.index.simple` can process Python Package Indexes, and provides
+useful information about distributions. It can also crawl local indexes, for
+instance.
+
+You should use `distutils2.index.simple` to:
+
+ * Search for distributions by name and version.
+ * Process pages external to the index.
+ * Download distributions by name and version.
+
+It should not be used for:
+
+ * Tasks that would require processing too much of the index (like "finding
+ all distributions with a specific version, whatever the name").
+
+API
+---
+
+.. autoclass:: distutils2.index.simple.Crawler
+ :members:
+
+
+Usage Examples
+---------------
+
+To help you understand how to use the `Crawler` class, here are some basic
+usage examples.
+
+Request the simple index to get a specific distribution
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+Suppose you want to scan an index to get a list of distributions for
+the "foobar" project. You can use the "find" method for that.
+The find method browses the project page and returns :class:`ReleaseInfo`
+objects for each link it finds that points to a download. ::
+
+ >>> from distutils2.index.simple import Crawler
+ >>> client = Crawler()
+ >>> client.find("FooBar")
+ [<ReleaseInfo "Foobar 1.1">, <ReleaseInfo "Foobar 1.2">]
+
+You can also query the client for specific versions, using version
+specifiers (described in `PEP 345
+<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
+
+ >>> client.find("FooBar < 1.2")
+ [<ReleaseInfo "FooBar 1.1">, ]
+
+`find` returns a list of :class:`ReleaseInfo` objects, but you can also get the
+best distribution that fulfills your requirements, using "get"::
+
+ >>> client.get("FooBar < 1.2")
+ <ReleaseInfo "FooBar 1.1">
+
+Download distributions
++++++++++++++++++++++++
+
+Since it can get the URLs of distributions provided by PyPI, the `Crawler`
+client can also download the distributions and put them in a temporary
+destination for you::
+
+ >>> client.download("foobar")
+ /tmp/temp_dir/foobar-1.2.tar.gz
+
+You also can specify the directory you want to download to::
+
+ >>> client.download("foobar", "/path/to/my/dir")
+ /path/to/my/dir/foobar-1.2.tar.gz
+
+While downloading, the md5 of the archive is checked. If it does not match, the
+download is tried once more; if it fails again, `MD5HashDoesNotMatchError` is raised.
+
+Internally, it is not the `Crawler` that downloads the distributions, but the
+`DistInfo` class. Please refer to its documentation for more details.
+
+Following PyPI external links
+++++++++++++++++++++++++++++++
+
+The default behavior for distutils2 is to *not* follow the links provided
+by HTML pages in the "simple index" to find downloads related to
+distributions.
+
+It's possible to tell the `Crawler` to follow external links by setting the
+`follow_externals` attribute, at instantiation or afterwards::
+
+ >>> client = Crawler(follow_externals=True)
+
+or ::
+
+ >>> client = Crawler()
+ >>> client.follow_externals = True
+
+Working with external indexes, and mirrors
++++++++++++++++++++++++++++++++++++++++++++
+
+The default `Crawler` behavior is to rely on the Python Package index stored
+on PyPI (http://pypi.python.org/simple).
+
+Since you may need to work with a local index, or private indexes, you can
+specify it using the index_url parameter::
+
+ >>> client = Crawler(index_url="file://filesystem/path/")
+
+or ::
+
+ >>> client = Crawler(index_url="http://some.specific.url/")
+
+You can also specify mirrors to fall back on in case the first index_url you
+provided doesn't respond, or responds incorrectly. The default behavior for
+`Crawler` is to use the list provided by Python.org DNS records, as
+described in :pep:`381` about the mirroring infrastructure.
+
+If you don't want to rely on these, you can specify the list of mirrors you
+want to try by setting the `mirrors` attribute. It's a simple iterable::
+
+ >>> mirrors = ["http://first.mirror","http://second.mirror"]
+ >>> client = Crawler(mirrors=mirrors)
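The mirror fallback described at the end of this file can be pictured with a standalone sketch. Nothing here comes from distutils2: `first_working` and `fake_fetch` are hypothetical names, and the "network" is simulated so that the example runs offline. The crawler's real retry logic may differ.

```python
def first_working(urls, fetch):
    """Return (url, result) for the first URL that fetch() can handle."""
    failed = []
    for url in urls:
        try:
            return url, fetch(url)
        except IOError:
            # This index or mirror did not answer; try the next one.
            failed.append(url)
    raise IOError("no index responded: %r" % failed)


def fake_fetch(url):
    """Stand-in for an HTTP request; only the second mirror 'responds'."""
    if url != "http://second.mirror":
        raise IOError("unreachable: " + url)
    return "<html>simple index</html>"


index_url = "http://some.specific.url/"
mirrors = ["http://first.mirror", "http://second.mirror"]
url, page = first_working([index_url] + mirrors, fake_fetch)
print(url)
```

The main index is tried first and the mirrors are only consulted in order after a failure, which matches the "fall back" wording of the documentation.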
diff --git a/docs/source/projects-index.xmlrpc.rst b/docs/source/projects-index.xmlrpc.rst
new file mode 100644
--- /dev/null
+++ b/docs/source/projects-index.xmlrpc.rst
@@ -0,0 +1,22 @@
+=========================
+Query indexes via XML-RPC
+=========================
+
+Indexes can be queried by using XML-RPC calls, and Distutils2 provides a simple
+way to use these methods.
+
+The :class:`distutils2.index.xmlrpc.Client` class has some specific features,
+which will be described here.
+
+You should use XML-RPC for:
+
+ * XXX TODO
+
+API
+====
+::
+
+ >>> from distutils2.index import XmlRpcClient
+ >>> client = XmlRpcClient()
+
+Usage examples
+===============
diff --git a/src/distutils2/pypi/__init__.py b/src/distutils2/index/__init__.py
rename from src/distutils2/pypi/__init__.py
rename to src/distutils2/index/__init__.py
--- a/src/distutils2/pypi/__init__.py
+++ b/src/distutils2/index/__init__.py
@@ -1,6 +1,6 @@
-"""distutils2.pypi
+"""distutils2.index
-Package containing ways to interact with the PyPI APIs.
+Package containing ways to interact with Index APIs.
"""
__all__ = ['simple',
diff --git a/src/distutils2/index/base.py b/src/distutils2/index/base.py
new file mode 100644
--- /dev/null
+++ b/src/distutils2/index/base.py
@@ -0,0 +1,88 @@
+from distutils2.version import VersionPredicate
+from distutils2.index.errors import DistributionNotFound
+
+
+class IndexClient(object):
+    """Base class containing common index client methods"""
+
+    def _search_for_releases(self, requirements):
+        """To be redefined in child classes"""
+        return NotImplemented
+
+    def find(self, requirements, prefer_final=None):
+        """Browse the PyPI to find distributions that fulfill the given
+        requirements.
+
+        :param requirements: A project name and its distribution, using
+                             version specifiers, as described in PEP 345.
+        :type requirements:  You can pass either a version.VersionPredicate
+                             or a string.
+        :param prefer_final: if the version is not mentioned in requirements,
+                             and the last version is not a "final" one
+                             (alpha, beta, etc.), pick up the last final
+                             version.
+        """
+        requirements = self._get_version_predicate(requirements)
+        prefer_final = self._get_prefer_final(prefer_final)
+
+        # internally, rely on the "_search_for_releases" method
+        dists = self._search_for_releases(requirements)
+        if dists:
+            dists = dists.filter(requirements)
+            dists.sort_releases(prefer_final=prefer_final)
+        return dists
+
+    def get(self, requirements, prefer_final=None):
+        """Return only one release that fulfills the given requirements.
+
+        :param requirements: A project name and its distribution, using
+                             version specifiers, as described in PEP 345.
+        :type requirements:  You can pass either a version.VersionPredicate
+                             or a string.
+        :param prefer_final: if the version is not mentioned in requirements,
+                             and the last version is not a "final" one
+                             (alpha, beta, etc.), pick up the last final
+                             version.
+        """
+        predicate = self._get_version_predicate(requirements)
+
+        # internally, rely on the "_get_release" method
+        dist = self._get_release(predicate, prefer_final=prefer_final)
+        if not dist:
+            raise DistributionNotFound(requirements)
+        return dist
+
+    def download(self, requirements, temp_path=None, prefer_final=None,
+                 prefer_source=True):
+        """Download the distribution, using the requirements.
+
+        If more than one distribution matches the requirements, use the last
+        version.
+        Download the distribution, and put it in the temp_path. If no temp_path
+        is given, create and return one.
+
+        Returns the complete absolute path to the downloaded archive.
+
+        :param requirements: The same as the requirements parameter of `find`.
+
+        You can specify the prefer_final argument here. If not, the default
+        one will be used.
+        """
+        return self.get(requirements, prefer_final)\
+                   .download(prefer_source=prefer_source, path=temp_path)
+
+    def _get_version_predicate(self, requirements):
+        """Return a VersionPredicate object, from a string or an already
+        existing object.
+        """
+        if isinstance(requirements, str):
+            requirements = VersionPredicate(requirements)
+        return requirements
+
+    def _get_prefer_final(self, prefer_final=None):
+        """Return the given prefer_final value, or the instance default if
+        none was given."""
+        if prefer_final:
+            return prefer_final
+        else:
+            return self._prefer_final
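One detail of `_get_prefer_final` above is worth illustrating: the `if prefer_final:` test treats an explicit `False` the same way as `None`, so a caller cannot switch off an instance default of `True`. The standalone sketch below is not part of the changeset; `resolve_truthy` and `resolve_prefer_final` are hypothetical helpers that compare the truthiness test with a test against `None`, assuming callers want an explicit `False` to be honoured.

```python
def resolve_truthy(prefer_final, default):
    """Mirror the truthiness test: False and None both fall back to default."""
    if prefer_final:
        return prefer_final
    return default


def resolve_prefer_final(prefer_final, default):
    """Fall back to default only when no value was given (None)."""
    if prefer_final is not None:
        return prefer_final
    return default


print(resolve_truthy(False, default=True))
print(resolve_prefer_final(False, default=True))
print(resolve_prefer_final(None, default=True))
```

Whether the stricter variant is wanted depends on how `_prefer_final` is meant to be overridden, which this changeset does not state.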
diff --git a/src/distutils2/pypi/dist.py b/src/distutils2/index/dist.py
rename from src/distutils2/pypi/dist.py
rename to src/distutils2/index/dist.py
--- a/src/distutils2/pypi/dist.py
+++ b/src/distutils2/index/dist.py
@@ -1,98 +1,184 @@
-"""distutils2.pypi.dist
+"""distutils2.index.dist
-Provides the PyPIDistribution class thats represents a distribution retrieved
-on PyPI.
+Provides useful classes to represent the releases and distributions retrieved
+from indexes.
+
+A project can have several releases (=versions) and each release can have
+several distributions (sdist, bdist).
+
+The release contains the metadata-related information (see PEP 345), and the
+distributions contain download-related information.
+
"""
import re
+import tempfile
+import urllib
import urlparse
-import urllib
-import tempfile
-from operator import attrgetter
try:
import hashlib
except ImportError:
from distutils2._backport import hashlib
+from distutils2.index.errors import (HashDoesNotMatch, UnsupportedHashName,
+ CantParseArchiveName)
from distutils2.version import suggest_normalized_version, NormalizedVersion
-from distutils2.pypi.errors import HashDoesNotMatch, UnsupportedHashName
+from distutils2.metadata import DistributionMetadata
EXTENSIONS = ".tar.gz .tar.bz2 .tar .zip .tgz .egg".split()
MD5_HASH = re.compile(r'^.*#md5=([a-f0-9]+)$')
+DIST_TYPES = ['bdist', 'sdist']
-class PyPIDistribution(object):
- """Represents a distribution retrieved from PyPI.
+class ReleaseInfo(object):
+ """Represent a release of a project (a project with a specific version).
+ The release contains the metadata information related to this specific
+ version, and is also a container for distribution-related information.
- This is a simple container for various attributes as name, version,
- downloaded_location, url etc.
-
- The PyPIDistribution class is used by the pypi.*Index class to return
- information about distributions.
+ See the DistInfo class for more information about distributions.
"""
- @classmethod
- def from_url(cls, url, probable_dist_name=None, is_external=True):
- """Build a Distribution from a url archive (egg or zip or tgz).
-
- :param url: complete url of the distribution
- :param probable_dist_name: A probable name of the distribution.
- :param is_external: Tell if the url commes from an index or from
- an external URL.
+ def __init__(self, name, version, metadata=None, hidden=False, **kwargs):
"""
- # if the url contains a md5 hash, get it.
- md5_hash = None
- match = MD5_HASH.match(url)
- if match is not None:
- md5_hash = match.group(1)
- # remove the hash
- url = url.replace("#md5=%s" % md5_hash, "")
-
- # parse the archive name to find dist name and version
- archive_name = urlparse.urlparse(url)[2].split('/')[-1]
- extension_matched = False
- # remove the extension from the name
- for ext in EXTENSIONS:
- if archive_name.endswith(ext):
- archive_name = archive_name[:-len(ext)]
- extension_matched = True
-
- name, version = split_archive_name(archive_name)
- if extension_matched is True:
- return PyPIDistribution(name, version, url=url, url_hashname="md5",
- url_hashval=md5_hash,
- url_is_external=is_external)
-
- def __init__(self, name, version, type=None, url=None, url_hashname=None,
- url_hashval=None, url_is_external=True):
- """Create a new instance of PyPIDistribution.
-
:param name: the name of the distribution
:param version: the version of the distribution
- :param type: the type of the dist (eg. source, bin-*, etc.)
- :param url: URL where we found this distribution
- :param url_hashname: the name of the hash we want to use. Refer to the
- hashlib.new documentation for more information.
- :param url_hashval: the hash value.
- :param url_is_external: we need to know if the provided url comes from an
- index browsing, or from an external resource.
-
+ :param metadata: the metadata fields of the release.
+ :type metadata: dict
+ :param kwargs: optional arguments for a new distribution.
"""
self.name = name
self.version = NormalizedVersion(version)
- self.type = type
+ self.metadata = DistributionMetadata() # XXX from_dict=metadata)
+ self.dists = {}
+ self.hidden = hidden
+
+ if 'dist_type' in kwargs:
+ dist_type = kwargs.pop('dist_type')
+ self.add_distribution(dist_type, **kwargs)
+
+ @property
+ def is_final(self):
+ """proxy to version.is_final"""
+ return self.version.is_final
+
+ def add_distribution(self, dist_type='sdist', **params):
+ """Add distribution information to this release.
+ If distribution information is already set for this distribution type,
+ add the given url paths to the distribution. This can be useful when
+ some of them fail to download.
+
+ :param dist_type: the distribution type (eg. "sdist", "bdist", etc.)
+ :param params: the fields to be passed to the distribution object
+ (see the :class:`DistInfo` constructor).
+ """
+ if dist_type not in DIST_TYPES:
+ raise ValueError(dist_type)
+ if dist_type in self.dists:
+ self.dists[dist_type].add_url(**params)
+ else:
+ self.dists[dist_type] = DistInfo(self, dist_type, **params)
+
+ def get_distribution(self, dist_type=None, prefer_source=True):
+ """Return a distribution.
+
+ If dist_type is set, look up that distribution type, acting as an
+ alias of __getitem__.
+
+ If prefer_source is True, look for a source distribution first and, if
+ there is none, return one of the existing distributions.
+ """
+ if len(self.dists) == 0:
+ raise LookupError()
+ if dist_type:
+ return self[dist_type]
+ if prefer_source:
+ if "sdist" in self.dists:
+ dist = self["sdist"]
+ else:
+ dist = self.dists.values()[0]
+ return dist
+
+ def download(self, temp_path=None, prefer_source=True):
+ """Download the distribution, using the requirements.
+
+ If more than one distribution matches the requirements, use the last
+ version.
+ Download the distribution, and put it in the temp_path. If no temp_path
+ is given, create and return one.
+
+ Returns the complete absolute path to the downloaded archive.
+ """
+ return self.get_distribution(prefer_source=prefer_source)\
+ .download(path=temp_path)
+
+ def __getitem__(self, item):
+ """distributions are available using release["sdist"]"""
+ return self.dists[item]
+
+ def _check_is_comparable(self, other):
+ if not isinstance(other, ReleaseInfo):
+ raise TypeError("cannot compare %s and %s"
+ % (type(self).__name__, type(other).__name__))
+ elif self.name != other.name:
+ raise TypeError("cannot compare %s and %s"
+ % (self.name, other.name))
+
+ def __eq__(self, other):
+ self._check_is_comparable(other)
+ return self.version == other.version
+
+ def __lt__(self, other):
+ self._check_is_comparable(other)
+ return self.version < other.version
+
+ def __ne__(self, other):
+ return not self.__eq__(other)
+
+ def __gt__(self, other):
+ return not (self.__lt__(other) or self.__eq__(other))
+
+ def __le__(self, other):
+ return self.__eq__(other) or self.__lt__(other)
+
+ def __ge__(self, other):
+ return self.__eq__(other) or self.__gt__(other)
+
+ # See http://docs.python.org/reference/datamodel#object.__hash__
+ __hash__ = object.__hash__
+
+
+class DistInfo(object):
+ """Represents a distribution retrieved from an index (sdist, bdist, ...)
+ """
+
+ def __init__(self, release, dist_type=None, url=None, hashname=None,
+ hashval=None, is_external=True):
+ """Create a new instance of DistInfo.
+
+ :param release: the release this DistInfo belongs to.
+ :param dist_type: the type of the dist (eg. source, bin-*, etc.)
+ :param url: URL where we found this distribution
+ :param hashname: the name of the hash we want to use. Refer to the
+ hashlib.new documentation for more information.
+ :param hashval: the hash value.
+ :param is_external: we need to know if the provided url comes from
+ an index browsing, or from an external resource.
+
+ """
+ self.release = release
+ self.dist_type = dist_type
# set the downloaded path to None by default. The goal here
# is to not download distributions multiple times
self.downloaded_location = None
- # We store urls in dict, because we need to have a bit more informations
+ # We store urls in dicts, because we need a bit more information
# than the simple URL. It will be used later to find the good url to
# use.
- # We have two _url* attributes: _url and _urls. _urls contains a list of
- # dict for the different urls, and _url contains the choosen url, in
+ # We have two _url* attributes: _url and urls. urls contains a list
+ # of dicts for the different urls, and _url contains the chosen url, in
# order to dont make the selection process multiple times.
- self._urls = []
+ self.urls = []
self._url = None
- self.add_url(url, url_hashname, url_hashval, url_is_external)
+ self.add_url(url, hashname, hashval, is_external)
def add_url(self, url, hashname=None, hashval=None, is_external=True):
"""Add a new url to the list of urls"""
@@ -101,15 +187,15 @@
hashlib.new(hashname)
except ValueError:
raise UnsupportedHashName(hashname)
-
- self._urls.append({
- 'url': url,
- 'hashname': hashname,
- 'hashval': hashval,
- 'is_external': is_external,
- })
- # reset the url selection process
- self._url = None
+ if not url in [u['url'] for u in self.urls]:
+ self.urls.append({
+ 'url': url,
+ 'hashname': hashname,
+ 'hashval': hashval,
+ 'is_external': is_external,
+ })
+ # reset the url selection process
+ self._url = None
@property
def url(self):
@@ -118,24 +204,19 @@
# If there is more than one internal or external, return the first
# one.
if self._url is None:
- if len(self._urls) > 1:
- internals_urls = [u for u in self._urls \
+ if len(self.urls) > 1:
+ internals_urls = [u for u in self.urls \
if u['is_external'] == False]
if len(internals_urls) >= 1:
self._url = internals_urls[0]
if self._url is None:
- self._url = self._urls[0]
+ self._url = self.urls[0]
return self._url
@property
def is_source(self):
"""return if the distribution is a source one or not"""
- return self.type == 'source'
-
- @property
- def is_final(self):
- """proxy to version.is_final"""
- return self.version.is_final
+ return self.dist_type == 'sdist'
def download(self, path=None):
"""Download the distribution to a path, and return it.
@@ -169,113 +250,140 @@
% (hashval.hexdigest(), expected_hashval))
def __repr__(self):
- return "%s %s %s %s" \
- % (self.__class__.__name__, self.name, self.version,
- self.type or "")
+ return "%s %s %s" % (
+ self.release.name, self.release.version, self.dist_type or "")
- def _check_is_comparable(self, other):
- if not isinstance(other, PyPIDistribution):
- raise TypeError("cannot compare %s and %s"
- % (type(self).__name__, type(other).__name__))
- elif self.name != other.name:
- raise TypeError("cannot compare %s and %s"
- % (self.name, other.name))
- def __eq__(self, other):
- self._check_is_comparable(other)
- return self.version == other.version
+class ReleasesList(list):
+ """A container of ReleaseInfo objects.
- def __lt__(self, other):
- self._check_is_comparable(other)
- return self.version < other.version
-
- def __ne__(self, other):
- return not self.__eq__(other)
-
- def __gt__(self, other):
- return not (self.__lt__(other) or self.__eq__(other))
-
- def __le__(self, other):
- return self.__eq__(other) or self.__lt__(other)
-
- def __ge__(self, other):
- return self.__eq__(other) or self.__gt__(other)
-
- # See http://docs.python.org/reference/datamodel#object.__hash__
- __hash__ = object.__hash__
-
-
-class PyPIDistributions(list):
- """A container of PyPIDistribution objects.
-
- Contains methods and facilities to sort and filter distributions.
+ Provides useful methods and facilities to sort and filter releases.
"""
- def __init__(self, list=[]):
- # To disable the ability to pass lists on instanciation
- super(PyPIDistributions, self).__init__()
+ def __init__(self, name, list=[], contains_hidden=False):
+ super(ReleasesList, self).__init__()
for item in list:
self.append(item)
+ self.name = name
+ self.contains_hidden = contains_hidden
+
+ def filter(self, predicate):
+ """Filter and return a subset of releases matching the given predicate.
+ """
+ return ReleasesList(self.name, [release for release in self
+ if release.name == predicate.name
+ and predicate.match(release.version)])
- def filter(self, predicate):
- """Filter the distributions and return a subset of distributions that
- match the given predicate
+ def get_last(self, predicate, prefer_final=None):
+ """Return the "last" release that satisfies the given predicate.
+
+ "last" is defined by the version number of the releases; you can also
+ set the prefer_final parameter to True or False to change the ordering.
"""
- return PyPIDistributions(
- [dist for dist in self if dist.name == predicate.name and
- predicate.match(dist.version)])
+ releases = self.filter(predicate)
+ releases.sort_releases(prefer_final, reverse=True)
+ return releases[0]
- def get_last(self, predicate, prefer_source=None, prefer_final=None):
- """Return the most up to date version, that satisfy the given
- predicate
+ def add_release(self, version=None, dist_type='sdist', release=None,
+ **dist_args):
+ """Add a release to the list.
+
+ The release can be passed in the `release` parameter, and in this case,
+ it will be crawled to extract the useful information if necessary, or
+ the release information can be directly passed in the `version` and
+ `dist_type` arguments.
+
+ Other keyword arguments can be provided, and will be forwarded to the
+ distribution creation (eg. the arguments of the DistInfo constructor).
"""
- distributions = self.filter(predicate)
- distributions.sort_distributions(prefer_source, prefer_final, reverse=True)
- return distributions[0]
+ if release:
+ if release.name != self.name:
+ raise ValueError(release.name)
+ version = '%s' % release.version
+ for dist in release.dists.values():
+ for url in dist.urls:
+ self.add_release(version, dist.dist_type, **url)
+ else:
+ matches = [r for r in self if '%s' % r.version == version
+ and r.name == self.name]
+ if not matches:
+ release = ReleaseInfo(self.name, version)
+ self.append(release)
+ else:
+ release = matches[0]
- def get_same_name_and_version(self):
- """Return lists of PyPIDistribution objects that refer to the same
- name and version number. This do not consider the type (source, binary,
- etc.)"""
- processed = []
- duplicates = []
- for dist in self:
- if (dist.name, dist.version) not in processed:
- processed.append((dist.name, dist.version))
- found_duplicates = [d for d in self if d.name == dist.name and
- d.version == dist.version]
- if len(found_duplicates) > 1:
- duplicates.append(found_duplicates)
- return duplicates
+ release.add_distribution(dist_type=dist_type, **dist_args)
- def append(self, o):
- """Append a new distribution to the list.
+ def sort_releases(self, prefer_final=False, reverse=True, *args, **kwargs):
+ """Sort the results with the given properties.
- If a distribution with the same name and version exists, just grab the
- URL informations and add a new new url for the existing one.
+ The `prefer_final` argument can be used to specify whether final
+ distributions (eg. not dev, beta or alpha) should be preferred or not.
+
+ Results can be inverted by using `reverse`.
+
+ Any other parameter provided will be forwarded to the list.sort call.
+ You cannot redefine the key argument of "sort" here, as it is used
+ internally to sort the releases.
"""
- similar_dists = [d for d in self if d.name == o.name and
- d.version == o.version and d.type == o.type]
- if len(similar_dists) > 0:
- dist = similar_dists[0]
- dist.add_url(**o.url)
- else:
- super(PyPIDistributions, self).append(o)
-
- def sort_distributions(self, prefer_source=True, prefer_final=False,
- reverse=True, *args, **kwargs):
- """order the results with the given properties"""
sort_by = []
if prefer_final:
sort_by.append("is_final")
sort_by.append("version")
- if prefer_source:
- sort_by.append("is_source")
-
- super(PyPIDistributions, self).sort(
+ super(ReleasesList, self).sort(
key=lambda i: [getattr(i, arg) for arg in sort_by],
reverse=reverse, *args, **kwargs)
+
+ def get_release(self, version):
+ """Return a release from its version.
+ """
+ matches = [r for r in self if "%s" % r.version == version]
+ if len(matches) != 1:
+ raise KeyError(version)
+ return matches[0]
+
+ def get_versions(self):
+ """Return the list of release versions contained in this list"""
+ return ["%s" % r.version for r in self]
+
+
+def get_infos_from_url(url, probable_dist_name=None, is_external=True):
+ """Get useful information from a URL.
+
+ Return a dict of (name, version, url, hashname, hashval, is_external, dist_type)
+
+ :param url: complete url of the distribution
+ :param probable_dist_name: A probable name of the project.
+ :param is_external: Tell if the url comes from an index or from
+ an external URL.
+ """
+ # if the url contains a md5 hash, get it.
+ md5_hash = None
+ match = MD5_HASH.match(url)
+ if match is not None:
+ md5_hash = match.group(1)
+ # remove the hash
+ url = url.replace("#md5=%s" % md5_hash, "")
+
+ # parse the archive name to find dist name and version
+ archive_name = urlparse.urlparse(url)[2].split('/')[-1]
+ extension_matched = False
+ # remove the extension from the name
+ for ext in EXTENSIONS:
+ if archive_name.endswith(ext):
+ archive_name = archive_name[:-len(ext)]
+ extension_matched = True
+
+ name, version = split_archive_name(archive_name)
+ if extension_matched is True:
+ return {'name': name,
+ 'version': version,
+ 'url': url,
+ 'hashname': "md5",
+ 'hashval': md5_hash,
+ 'is_external': is_external,
+ 'dist_type': 'sdist'}
def split_archive_name(archive_name, probable_name=None):
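For readers following the refactoring, the parsing done by get_infos_from_url can be sketched as a standalone Python 3 function. This is only an illustration: MD5_FRAGMENT, ARCHIVE_EXTENSIONS and sketch_infos_from_url are hypothetical names, and the naive split on the last dash stands in for the real MD5_HASH, EXTENSIONS and split_archive_name, which are not shown in this hunk and handle more cases (for instance "-rc2" suffixes).

```python
import re
from urllib.parse import urlparse

# Hypothetical stand-ins for the module-level MD5_HASH and EXTENSIONS constants.
MD5_FRAGMENT = re.compile(r"^(?P<url>.*)#md5=(?P<md5>[a-f0-9]+)$")
ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tar", ".zip", ".tgz", ".egg")


def sketch_infos_from_url(url, is_external=True):
    """Return a dict describing the archive behind `url`, or None."""
    # If the url carries an md5 fragment, keep the hash and strip it.
    md5_hash = None
    match = MD5_FRAGMENT.match(url)
    if match is not None:
        md5_hash = match.group("md5")
        url = match.group("url")

    # The archive name is the last component of the url path.
    archive_name = urlparse(url).path.split("/")[-1]
    for ext in ARCHIVE_EXTENSIONS:
        if archive_name.endswith(ext):
            archive_name = archive_name[:-len(ext)]
            break
    else:
        # No known extension: not a distribution archive.
        return None

    # Naive assumption: the version is whatever follows the last dash.
    name, _, version = archive_name.rpartition("-")
    if not name:
        return None
    return {
        "name": name.lower(),
        "version": version,
        "url": url,
        "hashname": "md5",
        "hashval": md5_hash,
        "is_external": is_external,
        "dist_type": "sdist",
    }
```

The returned keys mirror the dict built at the end of get_infos_from_url, so the result could be passed on in the same way the crawler passes release_info to _register_release.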
diff --git a/src/distutils2/pypi/errors.py b/src/distutils2/index/errors.py
rename from src/distutils2/pypi/errors.py
rename to src/distutils2/index/errors.py
--- a/src/distutils2/pypi/errors.py
+++ b/src/distutils2/index/errors.py
@@ -5,19 +5,19 @@
from distutils2.errors import DistutilsError
-class PyPIError(DistutilsError):
- """The base class for errors of the pypi python package."""
+class IndexError(DistutilsError):
+ """The base class for errors of the index python package."""
-class DistributionNotFound(PyPIError):
+class DistributionNotFound(IndexError):
"""No distribution match the given requirements."""
-class CantParseArchiveName(PyPIError):
+class CantParseArchiveName(IndexError):
"""An archive name can't be parsed to find distribution name and version"""
-class DownloadError(PyPIError):
+class DownloadError(IndexError):
"""An error has occurs while downloading"""
@@ -25,9 +25,9 @@
"""Compared hashes does not match"""
-class UnsupportedHashName(PyPIError):
+class UnsupportedHashName(IndexError):
"""A unsupported hashname has been used"""
-class UnableToDownload(PyPIError):
+class UnableToDownload(IndexError):
"""All mirrors have been tried, without success"""
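One side effect of the rename worth keeping in mind: a class called IndexError shadows the builtin of the same name in every module that defines or imports it. The Python 3 sketch below illustrates the effect with stand-in classes (DistutilsError here is a local placeholder, not the real distutils2.errors class, and the helper functions are hypothetical).

```python
import builtins


class DistutilsError(Exception):
    """Local stand-in for distutils2.errors.DistutilsError."""


class IndexError(DistutilsError):  # shadows the builtin in this namespace
    """Local stand-in for the renamed base class of the index package."""


def first_or_none(items):
    try:
        return items[0]
    except IndexError:
        # This names the class defined above, so the builtin IndexError
        # raised by an out-of-range lookup is NOT caught here.
        return None


def first_or_none_fixed(items):
    try:
        return items[0]
    except builtins.IndexError:
        # Referring to the builtin explicitly restores the usual behaviour.
        return None
```

Code that imports the index errors and also needs to catch out-of-range lookups would have to refer to the builtin explicitly, as first_or_none_fixed does, or import the errors module under a qualified name.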
diff --git a/src/distutils2/pypi/simple.py b/src/distutils2/index/simple.py
rename from src/distutils2/pypi/simple.py
rename to src/distutils2/index/simple.py
--- a/src/distutils2/pypi/simple.py
+++ b/src/distutils2/index/simple.py
@@ -1,6 +1,6 @@
-"""pypi.simple
+"""index.simple
-Contains the class "SimpleIndex", a simple spider to find and retrieve
+Contains the class "Crawler", a simple spider to find and retrieve
distributions on the Python Package Index, using it's "simple" API,
avalaible at http://pypi.python.org/simple/
"""
@@ -12,16 +12,16 @@
import urllib2
import urlparse
-from distutils2.version import VersionPredicate
-from distutils2.pypi.dist import (PyPIDistribution, PyPIDistributions,
- EXTENSIONS)
-from distutils2.pypi.errors import (PyPIError, DistributionNotFound,
- DownloadError, UnableToDownload)
+from distutils2.index.base import IndexClient
+from distutils2.index.dist import (ReleasesList, EXTENSIONS,
+ get_infos_from_url)
+from distutils2.index.errors import (IndexError, DownloadError,
+ UnableToDownload)
from distutils2 import __version__ as __distutils2_version__
# -- Constants -----------------------------------------------
-PYPI_DEFAULT_INDEX_URL = "http://pypi.python.org/simple/"
-PYPI_DEFAULT_MIRROR_URL = "mirrors.pypi.python.org"
+DEFAULT_INDEX_URL = "http://pypi.python.org/simple/"
+DEFAULT_MIRROR_URL = "mirrors.pypi.python.org"
DEFAULT_HOSTS = ("*",)
SOCKET_TIMEOUT = 15
USER_AGENT = "Python-urllib/%s distutils2/%s" % (
@@ -58,8 +58,8 @@
return _socket_timeout
-class SimpleIndex(object):
- """Provides useful tools to request the Python Package Index simple API
+class Crawler(IndexClient):
+ """Provides useful tools to request the Python Package Index simple API.
:param index_url: the url of the simple index to search on.
:param follow_externals: tell if following external links is needed or
@@ -69,8 +69,6 @@
hosts.
:param follow_externals: tell if following external links is needed or
not. Default is False.
- :param prefer_source: if there is binary and source distributions, the
- source prevails.
:param prefer_final: if the version is not mentioned, and the last
version is not a "final" one (alpha, beta, etc.),
pick up the last final version.
@@ -81,10 +79,10 @@
:param timeout: time in seconds to consider a url has timeouted.
"""
- def __init__(self, index_url=PYPI_DEFAULT_INDEX_URL, hosts=DEFAULT_HOSTS,
- follow_externals=False, prefer_source=True,
- prefer_final=False, mirrors_url=PYPI_DEFAULT_MIRROR_URL,
- mirrors=None, timeout=SOCKET_TIMEOUT):
+ def __init__(self, index_url=DEFAULT_INDEX_URL, hosts=DEFAULT_HOSTS,
+ follow_externals=False, prefer_final=False,
+ mirrors_url=DEFAULT_MIRROR_URL, mirrors=None,
+ timeout=SOCKET_TIMEOUT):
self.follow_externals = follow_externals
if not index_url.endswith("/"):
@@ -99,7 +97,6 @@
self._index_urls.extend(mirrors)
self._current_index_url = 0
self._timeout = timeout
- self._prefer_source = prefer_source
self._prefer_final = prefer_final
# create a regexp to match all given hosts
@@ -109,86 +106,26 @@
# scanning them multple time (eg. if there is multiple pages pointing
# on one)
self._processed_urls = []
- self._distributions = {}
-
- def find(self, requirements, prefer_source=None, prefer_final=None):
- """Browse the PyPI to find distributions that fullfil the given
- requirements.
-
- :param requirements: A project name and it's distribution, using
- version specifiers, as described in PEP345.
- :type requirements: You can pass either a version.VersionPredicate
- or a string.
- :param prefer_source: if there is binary and source distributions, the
- source prevails.
- :param prefer_final: if the version is not mentioned, and the last
- version is not a "final" one (alpha, beta, etc.),
- pick up the last final version.
- """
- requirements = self._get_version_predicate(requirements)
- if prefer_source is None:
- prefer_source = self._prefer_source
- if prefer_final is None:
- prefer_final = self._prefer_final
-
- # process the index for this project
- self._process_pypi_page(requirements.name)
-
- # filter with requirements and return the results
- if requirements.name in self._distributions:
- dists = self._distributions[requirements.name].filter(requirements)
- dists.sort_distributions(prefer_source=prefer_source,
- prefer_final=prefer_final)
- else:
- dists = []
-
- return dists
-
- def get(self, requirements, *args, **kwargs):
- """Browse the PyPI index to find distributions that fullfil the
- given requirements, and return the most recent one.
-
- You can specify prefer_final and prefer_source arguments here.
- If not, the default one will be used.
- """
- predicate = self._get_version_predicate(requirements)
- dists = self.find(predicate, *args, **kwargs)
-
- if len(dists) == 0:
- raise DistributionNotFound(requirements)
-
- return dists.get_last(predicate)
-
- def download(self, requirements, temp_path=None, *args, **kwargs):
- """Download the distribution, using the requirements.
-
- If more than one distribution match the requirements, use the last
- version.
- Download the distribution, and put it in the temp_path. If no temp_path
- is given, creates and return one.
-
- Returns the complete absolute path to the downloaded archive.
-
- :param requirements: The same as the find attribute of `find`.
-
- You can specify prefer_final and prefer_source arguments here.
- If not, the default one will be used.
- """
- return self.get(requirements, *args, **kwargs)\
- .download(path=temp_path)
-
- def _get_version_predicate(self, requirements):
- """Return a VersionPredicate object, from a string or an already
- existing object.
- """
- if isinstance(requirements, str):
- requirements = VersionPredicate(requirements)
- return requirements
-
+ self._releases = {}
+
@property
def index_url(self):
return self._index_urls[self._current_index_url]
+ def _search_for_releases(self, requirements):
+ """Search for distributions and return a ReleasesList object containing
+ the results.
+ """
+ # process the index page for the project name, searching for
+ # distributions.
+ self._process_index_page(requirements.name)
+ return self._releases.setdefault(requirements.name,
+ ReleasesList(requirements.name))
+
+ def _get_release(self, requirements, prefer_final):
+ """Return only one release that fulfills the given requirements"""
+ return self.find(requirements, prefer_final).get_last(requirements)
+
def _switch_to_next_mirror(self):
"""Switch to the next mirror (eg. point self.index_url to the next
url.
@@ -228,18 +165,33 @@
return True
return False
- def _register_dist(self, dist):
- """Register a distribution as a part of fetched distributions for
- SimpleIndex.
+ def _register_release(self, release=None, release_info={}):
+ """Register a new release.
- Return the PyPIDistributions object for the specified project name
+ Either a release or a dict of release_info can be provided; the
+ preferred (ie. quicker) way is the dict one.
+
+ Return the list of existing releases for the given project.
"""
- # Internally, check if a entry exists with the project name, if not,
- # create a new one, and if exists, add the dist to the pool.
- if not dist.name in self._distributions:
- self._distributions[dist.name] = PyPIDistributions()
- self._distributions[dist.name].append(dist)
- return self._distributions[dist.name]
+ # Check if the project already has a list of releases (referring to
+ # the project name). If not, create a new release list.
+ # Then, add the release to the list.
+ if release:
+ name = release.name
+ else:
+ name = release_info['name']
+ if not name in self._releases:
+ self._releases[name] = ReleasesList(name)
+
+ if release:
+ self._releases[name].add_release(release=release)
+ else:
+ name = release_info.pop('name')
+ version = release_info.pop('version')
+ dist_type = release_info.pop('dist_type')
+ self._releases[name].add_release(version, dist_type,
+ **release_info)
+ return self._releases[name]
def _process_url(self, url, project_name=None, follow_links=True):
"""Process an url and search for distributions packages.
@@ -264,9 +216,9 @@
if self._is_distribution(link) or is_download:
self._processed_urls.append(link)
# it's a distribution, so create a dist object
- dist = PyPIDistribution.from_url(link, project_name,
- is_external=not self.index_url in url)
- self._register_dist(dist)
+ infos = get_infos_from_url(link, project_name,
+ is_external=not self.index_url in url)
+ self._register_release(release_info=infos)
else:
if self._is_browsable(link) and follow_links:
self._process_url(link, project_name,
@@ -307,7 +259,7 @@
if self._is_browsable(url):
yield (url, False)
- def _process_pypi_page(self, name):
+ def _process_index_page(self, name):
"""Find and process a PyPI page for the given project name.
:param name: the name of the project to find the page
@@ -320,8 +272,8 @@
# if an error occurs, try with the next index_url
# (provided by the mirrors)
self._switch_to_next_mirror()
- self._distributions.clear()
- self._process_pypi_page(name)
+ self._releases.clear()
+ self._process_index_page(name)
@socket_timeout()
def _open_url(self, url):
@@ -366,7 +318,7 @@
return fp
except (ValueError, httplib.InvalidURL), v:
msg = ' '.join([str(arg) for arg in v.args])
- raise PyPIError('%s %s' % (url, msg))
+ raise IndexError('%s %s' % (url, msg))
except urllib2.HTTPError, v:
return v
except urllib2.URLError, v:
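The bookkeeping that _register_release delegates to ReleasesList.add_release, ReleaseInfo.add_distribution and DistInfo.add_url can be summarised with a much smaller model. The Python 3 sketch below is a simplification under stated assumptions: SketchReleases and register_release are hypothetical names, and plain dicts replace the list-based ReleasesList and the ReleaseInfo/DistInfo objects of the patch.

```python
class SketchReleases(dict):
    """Maps version -> {dist_type: [url_info, ...]} for one project."""

    def __init__(self, name):
        super().__init__()
        self.name = name

    def add_release(self, version, dist_type="sdist", **url_info):
        # One entry per version, one sub-entry per distribution type.
        dists = self.setdefault(version, {})
        urls = dists.setdefault(dist_type, [])
        # As in DistInfo.add_url, an already known url is not added twice.
        if url_info.get("url") not in [u.get("url") for u in urls]:
            urls.append(url_info)


def register_release(registry, release_info):
    """Record `release_info` in `registry` and return the project's releases."""
    # Work on a copy so that pop() does not empty the caller's dict.
    info = dict(release_info)
    name = info.pop("name")
    version = info.pop("version")
    dist_type = info.pop("dist_type")
    releases = registry.setdefault(name, SketchReleases(name))
    releases.add_release(version, dist_type, **info)
    return releases
```

Registering two urls for the same version and distribution type keeps a single release with two urls, while a new distribution type for an existing version adds a distribution rather than a release, which is the behaviour test_add_release checks further down.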
diff --git a/src/distutils2/tests/test_pypi_dist.py b/src/distutils2/tests/test_index_dist.py
rename from src/distutils2/tests/test_pypi_dist.py
rename to src/distutils2/tests/test_index_dist.py
--- a/src/distutils2/tests/test_pypi_dist.py
+++ b/src/distutils2/tests/test_index_dist.py
@@ -1,76 +1,93 @@
-"""Tests for the distutils2.pypi.dist module."""
+"""Tests for the distutils2.index.dist module."""
import os
-import shutil
-import tempfile
from distutils2.tests.pypi_server import use_pypi_server
from distutils2.tests import run_unittest
from distutils2.tests.support import unittest, TempdirManager
from distutils2.version import VersionPredicate
-from distutils2.pypi.errors import HashDoesNotMatch, UnsupportedHashName
-from distutils2.pypi.dist import (PyPIDistribution as Dist,
- PyPIDistributions as Dists,
- split_archive_name)
+from distutils2.index.errors import HashDoesNotMatch, UnsupportedHashName
+from distutils2.index.dist import (ReleaseInfo, ReleasesList, DistInfo,
+ split_archive_name, get_infos_from_url)
-class TestPyPIDistribution(TempdirManager,
- unittest.TestCase):
- """Tests the pypi.dist.PyPIDistribution class"""
+def Dist(*args, **kwargs):
+ # DistInfo takes a release as a first parameter, avoid this in tests.
+ return DistInfo(None, *args, **kwargs)
+
+
+class TestReleaseInfo(unittest.TestCase):
def test_instanciation(self):
- # Test the Distribution class provides us the good attributes when
+ # Test the ReleaseInfo class provides us the right attributes when
# given on construction
- dist = Dist("FooBar", "1.1")
- self.assertEqual("FooBar", dist.name)
- self.assertEqual("1.1", "%s" % dist.version)
+ release = ReleaseInfo("FooBar", "1.1")
+ self.assertEqual("FooBar", release.name)
+ self.assertEqual("1.1", "%s" % release.version)
- def test_create_from_url(self):
- # Test that the Distribution object can be built from a single URL
+ def test_add_dist(self):
+ # empty distribution type should assume "sdist"
+ release = ReleaseInfo("FooBar", "1.1")
+ release.add_distribution(url="http://example.org/")
+ # should not fail
+ release['sdist']
+
+ def test_get_unknown_distribution(self):
+ # should raise a KeyError
+ pass
+
+ def test_get_infos_from_url(self):
+ # Test that the URLs are parsed the right way
url_list = {
'FooBar-1.1.0.tar.gz': {
'name': 'foobar', # lowercase the name
- 'version': '1.1',
+ 'version': '1.1.0',
},
'Foo-Bar-1.1.0.zip': {
'name': 'foo-bar', # keep the dash
- 'version': '1.1',
+ 'version': '1.1.0',
},
'foobar-1.1b2.tar.gz#md5=123123123123123': {
'name': 'foobar',
'version': '1.1b2',
- 'url': {
- 'url': 'http://test.tld/foobar-1.1b2.tar.gz', # no hash
- 'hashval': '123123123123123',
- 'hashname': 'md5',
- }
+ 'url': 'http://example.org/foobar-1.1b2.tar.gz', # no hash
+ 'hashval': '123123123123123',
+ 'hashname': 'md5',
},
'foobar-1.1-rc2.tar.gz': { # use suggested name
'name': 'foobar',
'version': '1.1c2',
- 'url': {
- 'url': 'http://test.tld/foobar-1.1-rc2.tar.gz',
- }
+ 'url': 'http://example.org/foobar-1.1-rc2.tar.gz',
}
}
for url, attributes in url_list.items():
- dist = Dist.from_url("http://test.tld/" + url)
- for attribute, value in attributes.items():
- if isinstance(value, dict):
- mylist = getattr(dist, attribute)
- for val in value.keys():
- self.assertEqual(value[val], mylist[val])
+ # for each url
+ infos = get_infos_from_url("http://example.org/" + url)
+ for attribute, expected in attributes.items():
+ got = infos.get(attribute)
+ if attribute == "version":
+ self.assertEqual("%s" % got, expected)
else:
- if attribute == "version":
- self.assertEqual("%s" % getattr(dist, "version"), value)
- else:
- self.assertEqual(getattr(dist, attribute), value)
+ self.assertEqual(got, expected)
+
+ def test_split_archive_name(self):
+ # Test we can split the archive names
+ names = {
+ 'foo-bar-baz-1.0-rc2': ('foo-bar-baz', '1.0c2'),
+ 'foo-bar-baz-1.0': ('foo-bar-baz', '1.0'),
+ 'foobarbaz-1.0': ('foobarbaz', '1.0'),
+ }
+ for name, results in names.items():
+ self.assertEqual(results, split_archive_name(name))
+
+
+class TestDistInfo(TempdirManager, unittest.TestCase):
def test_get_url(self):
# Test that the url property works well
- d = Dist("foobar", "1.1", url="test_url")
+ d = Dist(url="test_url")
self.assertDictEqual(d.url, {
"url": "test_url",
"is_external": True,
@@ -87,13 +104,13 @@
"hashname": None,
"hashval": None,
})
- self.assertEqual(2, len(d._urls))
+ self.assertEqual(2, len(d.urls))
def test_comparaison(self):
- # Test that we can compare PyPIDistributions
- foo1 = Dist("foo", "1.0")
- foo2 = Dist("foo", "2.0")
- bar = Dist("bar", "2.0")
+ # Test that we can compare ReleaseInfo objects
+ foo1 = ReleaseInfo("foo", "1.0")
+ foo2 = ReleaseInfo("foo", "2.0")
+ bar = ReleaseInfo("bar", "2.0")
# assert we use the version to compare
self.assertTrue(foo1 < foo2)
self.assertFalse(foo1 > foo2)
@@ -102,16 +119,6 @@
# assert we can't compare dists with different names
self.assertRaises(TypeError, foo1.__eq__, bar)
- def test_split_archive_name(self):
- # Test we can split the archive names
- names = {
- 'foo-bar-baz-1.0-rc2': ('foo-bar-baz', '1.0c2'),
- 'foo-bar-baz-1.0': ('foo-bar-baz', '1.0'),
- 'foobarbaz-1.0': ('foobarbaz', '1.0'),
- }
- for name, results in names.items():
- self.assertEqual(results, split_archive_name(name))
-
@use_pypi_server("downloads_with_md5")
def test_download(self, server):
# Download is possible, and the md5 is checked if given
@@ -120,19 +127,18 @@
url = "%s/simple/foobar/foobar-0.1.tar.gz" % server.full_address
# check md5 if given
- dist = Dist("FooBar", "0.1", url=url,
- url_hashname="md5", url_hashval="d41d8cd98f00b204e9800998ecf8427e")
+ dist = Dist(url=url, hashname="md5",
+ hashval="d41d8cd98f00b204e9800998ecf8427e")
add_to_tmpdirs(dist.download())
# a wrong md5 fails
- dist2 = Dist("FooBar", "0.1", url=url,
- url_hashname="md5", url_hashval="wrongmd5")
+ dist2 = Dist(url=url, hashname="md5", hashval="wrongmd5")
self.assertRaises(HashDoesNotMatch, dist2.download)
add_to_tmpdirs(dist2.downloaded_location)
# we can omit the md5 hash
- dist3 = Dist("FooBar", "0.1", url=url)
+ dist3 = Dist(url=url)
add_to_tmpdirs(dist3.download())
# and specify a temporary location
@@ -141,106 +147,104 @@
dist3.download(path=path1)
# and for a new one
path2_base = self.mkdtemp()
- dist4 = Dist("FooBar", "0.1", url=url)
+ dist4 = Dist(url=url)
path2 = dist4.download(path=path2_base)
self.assertTrue(path2_base in path2)
def test_hashname(self):
# Invalid hashnames raises an exception on assignation
- Dist("FooBar", "0.1", url_hashname="md5", url_hashval="value")
+ Dist(hashname="md5", hashval="value")
- self.assertRaises(UnsupportedHashName, Dist, "FooBar", "0.1",
- url_hashname="invalid_hashname", url_hashval="value")
+ self.assertRaises(UnsupportedHashName, Dist,
+ hashname="invalid_hashname",
+ hashval="value")
-class TestPyPIDistributions(unittest.TestCase):
+class TestReleasesList(unittest.TestCase):
def test_filter(self):
# Test we filter the distributions the right way, using version
# predicate match method
- dists = Dists((
- Dist("FooBar", "1.1"),
- Dist("FooBar", "1.1.1"),
- Dist("FooBar", "1.2"),
- Dist("FooBar", "1.2.1"),
+ releases = ReleasesList('FooBar', (
+ ReleaseInfo("FooBar", "1.1"),
+ ReleaseInfo("FooBar", "1.1.1"),
+ ReleaseInfo("FooBar", "1.2"),
+ ReleaseInfo("FooBar", "1.2.1"),
))
- filtered = dists.filter(VersionPredicate("FooBar (<1.2)"))
- self.assertNotIn(dists[2], filtered)
- self.assertNotIn(dists[3], filtered)
- self.assertIn(dists[0], filtered)
- self.assertIn(dists[1], filtered)
+ filtered = releases.filter(VersionPredicate("FooBar (<1.2)"))
+ self.assertNotIn(releases[2], filtered)
+ self.assertNotIn(releases[3], filtered)
+ self.assertIn(releases[0], filtered)
+ self.assertIn(releases[1], filtered)
- def test_append(self):
+ def test_add_release(self):
# When adding a new item to the list, the behavior is to test if
- # a distribution with the same name and version number already exists,
- # and if so, to add url informations to the existing PyPIDistribution
+ # a release with the same name and version number already exists,
+ # and if so, to add a new distribution for it. If the distribution type
+ # is already defined too, add url informations to the existing DistInfo
# object.
- # If no object matches, just add "normally" the object to the list.
- dists = Dists([
- Dist("FooBar", "1.1", url="external_url", type="source"),
+ releases = ReleasesList("FooBar", [
+ ReleaseInfo("FooBar", "1.1", url="external_url",
+ dist_type="sdist"),
])
- self.assertEqual(1, len(dists))
- dists.append(Dist("FooBar", "1.1", url="internal_url",
- url_is_external=False, type="source"))
- self.assertEqual(1, len(dists))
- self.assertEqual(2, len(dists[0]._urls))
+ self.assertEqual(1, len(releases))
+ releases.add_release(release=ReleaseInfo("FooBar", "1.1",
+ url="internal_url",
+ is_external=False,
+ dist_type="sdist"))
+ self.assertEqual(1, len(releases))
+ self.assertEqual(2, len(releases[0]['sdist'].urls))
- dists.append(Dist("Foobar", "1.1.1", type="source"))
- self.assertEqual(2, len(dists))
+ releases.add_release(release=ReleaseInfo("FooBar", "1.1.1",
+ dist_type="sdist"))
+ self.assertEqual(2, len(releases))
# when adding a distribution with a different type, a new distribution
# has to be added.
- dists.append(Dist("Foobar", "1.1.1", type="binary"))
- self.assertEqual(3, len(dists))
+ releases.add_release(release=ReleaseInfo("FooBar", "1.1.1",
+ dist_type="bdist"))
+ self.assertEqual(2, len(releases))
+ self.assertEqual(2, len(releases[1].dists))
def test_prefer_final(self):
# Can order the distributions using prefer_final
- fb10 = Dist("FooBar", "1.0") # final distribution
- fb11a = Dist("FooBar", "1.1a1") # alpha
- fb12a = Dist("FooBar", "1.2a1") # alpha
- fb12b = Dist("FooBar", "1.2b1") # beta
- dists = Dists([fb10, fb11a, fb12a, fb12b])
+ fb10 = ReleaseInfo("FooBar", "1.0") # final distribution
+ fb11a = ReleaseInfo("FooBar", "1.1a1") # alpha
+ fb12a = ReleaseInfo("FooBar", "1.2a1") # alpha
+ fb12b = ReleaseInfo("FooBar", "1.2b1") # beta
+ dists = ReleasesList("FooBar", [fb10, fb11a, fb12a, fb12b])
- dists.sort_distributions(prefer_final=True)
+ dists.sort_releases(prefer_final=True)
self.assertEqual(fb10, dists[0])
- dists.sort_distributions(prefer_final=False)
+ dists.sort_releases(prefer_final=False)
self.assertEqual(fb12b, dists[0])
- def test_prefer_source(self):
- # Ordering support prefer_source
- fb_source = Dist("FooBar", "1.0", type="source")
- fb_binary = Dist("FooBar", "1.0", type="binary")
- fb2_binary = Dist("FooBar", "2.0", type="binary")
- dists = Dists([fb_binary, fb_source])
-
- dists.sort_distributions(prefer_source=True)
- self.assertEqual(fb_source, dists[0])
-
- dists.sort_distributions(prefer_source=False)
- self.assertEqual(fb_binary, dists[0])
-
- dists.append(fb2_binary)
- dists.sort_distributions(prefer_source=True)
- self.assertEqual(fb2_binary, dists[0])
-
- def test_get_same_name_and_version(self):
- # PyPIDistributions can return a list of "duplicates"
- fb_source = Dist("FooBar", "1.0", type="source")
- fb_binary = Dist("FooBar", "1.0", type="binary")
- fb2_binary = Dist("FooBar", "2.0", type="binary")
- dists = Dists([fb_binary, fb_source, fb2_binary])
- duplicates = dists.get_same_name_and_version()
- self.assertTrue(1, len(duplicates))
- self.assertIn(fb_source, duplicates[0])
+# def test_prefer_source(self):
+# # Ordering support prefer_source
+# fb_source = Dist("FooBar", "1.0", type="source")
+# fb_binary = Dist("FooBar", "1.0", type="binary")
+# fb2_binary = Dist("FooBar", "2.0", type="binary")
+# dists = ReleasesList([fb_binary, fb_source])
+#
+# dists.sort_distributions(prefer_source=True)
+# self.assertEqual(fb_source, dists[0])
+#
+# dists.sort_distributions(prefer_source=False)
+# self.assertEqual(fb_binary, dists[0])
+#
+# dists.append(fb2_binary)
+# dists.sort_distributions(prefer_source=True)
+# self.assertEqual(fb2_binary, dists[0])
def test_suite():
suite = unittest.TestSuite()
- suite.addTest(unittest.makeSuite(TestPyPIDistribution))
- suite.addTest(unittest.makeSuite(TestPyPIDistributions))
+ suite.addTest(unittest.makeSuite(TestDistInfo))
+ suite.addTest(unittest.makeSuite(TestReleaseInfo))
+ suite.addTest(unittest.makeSuite(TestReleasesList))
return suite
if __name__ == '__main__':
diff --git a/src/distutils2/tests/test_pypi_simple.py b/src/distutils2/tests/test_index_simple.py
rename from src/distutils2/tests/test_pypi_simple.py
rename to src/distutils2/tests/test_index_simple.py
--- a/src/distutils2/tests/test_pypi_simple.py
+++ b/src/distutils2/tests/test_index_simple.py
@@ -3,38 +3,34 @@
"""
import sys
import os
-import shutil
-import tempfile
import urllib2
-from distutils2.pypi import simple
-from distutils2.tests import support, run_unittest
+from distutils2.index.simple import Crawler
+from distutils2.tests import support
from distutils2.tests.support import unittest
from distutils2.tests.pypi_server import (use_pypi_server, PyPIServer,
PYPI_DEFAULT_STATIC_PATH)
-class PyPISimpleTestCase(support.TempdirManager,
- unittest.TestCase):
+class SimpleCrawlerTestCase(support.TempdirManager, unittest.TestCase):
- def _get_simple_index(self, server, base_url="/simple/", hosts=None,
+ def _get_simple_crawler(self, server, base_url="/simple/", hosts=None,
*args, **kwargs):
- """Build and return a SimpleSimpleIndex instance, with the test server
+ """Build and return a SimpleIndex instance, with the test server
urls
"""
if hosts is None:
hosts = (server.full_address.strip("http://"),)
kwargs['hosts'] = hosts
- if not 'mirrors' in kwargs:
- kwargs['mirrors'] = [] # to speed up tests
- return simple.SimpleIndex(server.full_address + base_url, *args,
+ return Crawler(server.full_address + base_url, *args,
**kwargs)
- def test_bad_urls(self):
- index = simple.SimpleIndex(mirrors=[])
+ @use_pypi_server()
+ def test_bad_urls(self, server):
+ crawler = Crawler()
url = 'http://127.0.0.1:0/nonesuch/test_simple'
try:
- v = index._open_url(url)
+ v = crawler._open_url(url)
except Exception, v:
self.assertTrue(url in str(v))
else:
@@ -43,10 +39,10 @@
# issue 16
# easy_install inquant.contentmirror.plone breaks because of a typo
# in its home URL
- index = simple.SimpleIndex(hosts=('www.example.com',), mirrors=[])
+ crawler = Crawler(hosts=('example.org',))
url = 'url:%20https://svn.plone.org/svn/collective/inquant.contentmirror.plone/trunk'
try:
- v = index._open_url(url)
+ v = crawler._open_url(url)
except Exception, v:
self.assertTrue(url in str(v))
else:
@@ -58,10 +54,10 @@
old_urlopen = urllib2.urlopen
urllib2.urlopen = _urlopen
- url = 'http://example.com'
+ url = 'http://example.org'
try:
try:
- v = index._open_url(url)
+ v = crawler._open_url(url)
except Exception, v:
self.assertTrue('line' in str(v))
else:
@@ -72,92 +68,91 @@
# issue 20
url = 'http://http://svn.pythonpaste.org/Paste/wphp/trunk'
try:
- index._open_url(url)
+ crawler._open_url(url)
except Exception, v:
self.assertTrue('nonnumeric port' in str(v))
# issue #160
if sys.version_info[0] == 2 and sys.version_info[1] == 7:
# this should not fail
- url = 'http://example.com'
+ url = server.full_address
page = ('<a href="http://www.famfamfam.com]('
'http://www.famfamfam.com/">')
- index._process_url(url, page)
+ crawler._process_url(url, page)
@use_pypi_server("test_found_links")
def test_found_links(self, server):
- # Browse the index, asking for a specified distribution version
+ # Browse the index, asking for a specified release version
# The PyPI index contains links for version 1.0, 1.1, 2.0 and 2.0.1
- index = self._get_simple_index(server)
- last_distribution = index.get("foobar")
+ crawler = self._get_simple_crawler(server)
+ last_release = crawler.get("foobar")
# we have scanned the index page
self.assertIn(server.full_address + "/simple/foobar/",
- index._processed_urls)
+ crawler._processed_urls)
- # we have found 4 distributions in this page
- self.assertEqual(len(index._distributions["foobar"]), 4)
+ # we have found 4 releases in this page
+ self.assertEqual(len(crawler._releases["foobar"]), 4)
# and returned the most recent one
- self.assertEqual("%s" % last_distribution.version, '2.0.1')
+ self.assertEqual("%s" % last_release.version, '2.0.1')
def test_is_browsable(self):
- index = simple.SimpleIndex(follow_externals=False, mirrors=[])
- self.assertTrue(index._is_browsable(index.index_url + "test"))
+ crawler = Crawler(follow_externals=False)
+ self.assertTrue(crawler._is_browsable(crawler.index_url + "test"))
# Now, when following externals, we can have a list of hosts to trust.
# and don't follow other external links than the one described here.
- index = simple.SimpleIndex(hosts=["pypi.python.org", "test.org"],
- follow_externals=True, mirrors=[])
+ crawler = Crawler(hosts=["pypi.python.org", "example.org"],
+ follow_externals=True)
good_urls = (
"http://pypi.python.org/foo/bar",
"http://pypi.python.org/simple/foobar",
- "http://test.org",
- "http://test.org/",
- "http://test.org/simple/",
+ "http://example.org",
+ "http://example.org/",
+ "http://example.org/simple/",
)
bad_urls = (
"http://python.org",
- "http://test.tld",
+ "http://example.tld",
)
for url in good_urls:
- self.assertTrue(index._is_browsable(url))
+ self.assertTrue(crawler._is_browsable(url))
for url in bad_urls:
- self.assertFalse(index._is_browsable(url))
+ self.assertFalse(crawler._is_browsable(url))
# allow all hosts
- index = simple.SimpleIndex(follow_externals=True, hosts=("*",),
- mirrors=[])
- self.assertTrue(index._is_browsable("http://an-external.link/path"))
- self.assertTrue(index._is_browsable("pypi.test.tld/a/path"))
+ crawler = Crawler(follow_externals=True, hosts=("*",))
+ self.assertTrue(crawler._is_browsable("http://an-external.link/path"))
+ self.assertTrue(crawler._is_browsable("pypi.example.org/a/path"))
# specify a list of hosts we want to allow
- index = simple.SimpleIndex(follow_externals=True,
- hosts=("*.test.tld",), mirrors=[])
- self.assertFalse(index._is_browsable("http://an-external.link/path"))
- self.assertTrue(index._is_browsable("http://pypi.test.tld/a/path"))
+ crawler = Crawler(follow_externals=True,
+ hosts=("*.example.org",))
+ self.assertFalse(crawler._is_browsable("http://an-external.link/path"))
+ self.assertTrue(crawler._is_browsable("http://pypi.example.org/a/path"))
@use_pypi_server("with_externals")
- def test_restrict_hosts(self, server):
+ def test_follow_externals(self, server):
# Include external pages
# Try to request the package index, which contains links to "externals"
# resources. They have to be scanned too.
- index = self._get_simple_index(server, follow_externals=True)
- index.get("foobar")
+ crawler = self._get_simple_crawler(server, follow_externals=True)
+ crawler.get("foobar")
self.assertIn(server.full_address + "/external/external.html",
- index._processed_urls)
+ crawler._processed_urls)
@use_pypi_server("with_real_externals")
def test_restrict_hosts(self, server):
# Only use a list of allowed hosts is possible
# Test that telling the simple pyPI client to not retrieve external
# works
- index = self._get_simple_index(server, follow_externals=False)
- index.get("foobar")
+ crawler = self._get_simple_crawler(server, follow_externals=False)
+ crawler.get("foobar")
self.assertNotIn(server.full_address + "/external/external.html",
- index._processed_urls)
+ crawler._processed_urls)
@use_pypi_server(static_filesystem_paths=["with_externals"],
static_uri_paths=["simple", "external"])
@@ -171,23 +166,26 @@
# - someone manually copies this link (with the md5 in the url) onto
# an external page accessible from the package page.
# - someone reuploads the package (with a different md5)
- # - while easy_installing, an MD5 error occurs because the external link
- # is used
+ # - while easy_installing, an MD5 error occurs because the external
+ # link is used
# -> The index should use the link from pypi, not the external one.
# start an index server
index_url = server.full_address + '/simple/'
# scan a test index
- index = simple.SimpleIndex(index_url, follow_externals=True, mirrors=[])
- dists = index.find("foobar")
+ crawler = Crawler(index_url, follow_externals=True)
+ releases = crawler.find("foobar")
server.stop()
# we have only one link, because links are compared without md5
- self.assertEqual(len(dists), 1)
+ self.assertEqual(1, len(releases))
+ self.assertEqual(1, len(releases[0].dists))
# the link should be from the index
- self.assertEqual('12345678901234567', dists[0].url['hashval'])
- self.assertEqual('md5', dists[0].url['hashname'])
+ self.assertEqual(2, len(releases[0].dists['sdist'].urls))
+ self.assertEqual('12345678901234567',
+ releases[0].dists['sdist'].url['hashval'])
+ self.assertEqual('md5', releases[0].dists['sdist'].url['hashname'])
@use_pypi_server(static_filesystem_paths=["with_norel_links"],
static_uri_paths=["simple", "external"])
@@ -197,22 +195,22 @@
# to not be processed by the package index, while processing "pages".
# process the pages
- index = self._get_simple_index(server, follow_externals=True)
- index.find("foobar")
+ crawler = self._get_simple_crawler(server, follow_externals=True)
+ crawler.find("foobar")
# now it should have processed only pages with links rel="download"
# and rel="homepage"
self.assertIn("%s/simple/foobar/" % server.full_address,
- index._processed_urls) # it's the simple index page
+ crawler._processed_urls) # it's the simple index page
self.assertIn("%s/external/homepage.html" % server.full_address,
- index._processed_urls) # the external homepage is rel="homepage"
+ crawler._processed_urls) # the external homepage is rel="homepage"
self.assertNotIn("%s/external/nonrel.html" % server.full_address,
- index._processed_urls) # this link contains no rel=*
+ crawler._processed_urls) # this link contains no rel=*
self.assertNotIn("%s/unrelated-0.2.tar.gz" % server.full_address,
- index._processed_urls) # linked from simple index (no rel)
+ crawler._processed_urls) # linked from simple index (no rel)
self.assertIn("%s/foobar-0.1.tar.gz" % server.full_address,
- index._processed_urls) # linked from simple index (rel)
+ crawler._processed_urls) # linked from simple index (rel)
self.assertIn("%s/foobar-2.0.tar.gz" % server.full_address,
- index._processed_urls) # linked from external homepage (rel)
+ crawler._processed_urls) # linked from external homepage (rel)
def test_uses_mirrors(self):
# When the main repository seems down, try using the given mirrors"""
@@ -222,18 +220,18 @@
try:
# create the index using both servers
- index = simple.SimpleIndex(server.full_address + "/simple/",
+ crawler = Crawler(server.full_address + "/simple/",
hosts=('*',), timeout=1, # set the timeout to 1s for the tests
- mirrors=[mirror.full_address + "/simple/",])
+ mirrors=[mirror.full_address + "/simple/", ])
# this should not raise a timeout
- self.assertEqual(4, len(index.find("foo")))
+ self.assertEqual(4, len(crawler.find("foo")))
finally:
mirror.stop()
def test_simple_link_matcher(self):
# Test that the simple link matcher yields the right links"""
- index = simple.SimpleIndex(follow_externals=False, mirrors=[])
+ crawler = Crawler(follow_externals=False)
# Here, we define:
# 1. one link that must be followed, cause it's a download one
@@ -241,27 +239,27 @@
# returns false for it.
# 3. one link that must be followed cause it's a homepage that is
# browsable
- self.assertTrue(index._is_browsable("%stest" % index.index_url))
- self.assertFalse(index._is_browsable("http://dl-link2"))
+ self.assertTrue(crawler._is_browsable("%stest" % crawler.index_url))
+ self.assertFalse(crawler._is_browsable("http://dl-link2"))
content = """
<a href="http://dl-link1" rel="download">download_link1</a>
<a href="http://dl-link2" rel="homepage">homepage_link1</a>
<a href="%stest" rel="homepage">homepage_link2</a>
- """ % index.index_url
+ """ % crawler.index_url
# Test that the simple link matcher yield the good links.
- generator = index._simple_link_matcher(content, index.index_url)
+ generator = crawler._simple_link_matcher(content, crawler.index_url)
self.assertEqual(('http://dl-link1', True), generator.next())
- self.assertEqual(('%stest' % index.index_url, False),
+ self.assertEqual(('%stest' % crawler.index_url, False),
generator.next())
self.assertRaises(StopIteration, generator.next)
# Follow the external links is possible
- index.follow_externals = True
- generator = index._simple_link_matcher(content, index.index_url)
+ crawler.follow_externals = True
+ generator = crawler._simple_link_matcher(content, crawler.index_url)
self.assertEqual(('http://dl-link1', True), generator.next())
self.assertEqual(('http://dl-link2', False), generator.next())
- self.assertEqual(('%stest' % index.index_url, False),
+ self.assertEqual(('%stest' % crawler.index_url, False),
generator.next())
self.assertRaises(StopIteration, generator.next)
@@ -269,19 +267,19 @@
# Test that we can browse local files"""
index_path = os.sep.join(["file://" + PYPI_DEFAULT_STATIC_PATH,
"test_found_links", "simple"])
- index = simple.SimpleIndex(index_path)
- dists = index.find("foobar")
+ crawler = Crawler(index_path)
+ dists = crawler.find("foobar")
self.assertEqual(4, len(dists))
def test_get_link_matcher(self):
- crawler = simple.SimpleIndex("http://example.org")
+ crawler = Crawler("http://example.org")
self.assertEqual('_simple_link_matcher', crawler._get_link_matcher(
"http://example.org/some/file").__name__)
self.assertEqual('_default_link_matcher', crawler._get_link_matcher(
"http://other-url").__name__)
def test_default_link_matcher(self):
- crawler = simple.SimpleIndex("http://example.org", mirrors=[])
+ crawler = Crawler("http://example.org", mirrors=[])
crawler.follow_externals = True
crawler._is_browsable = lambda *args:True
base_url = "http://example.org/some/file/"
@@ -297,7 +295,7 @@
self.assertIn('http://example.org/some/download', found_links)
def test_suite():
- return unittest.makeSuite(PyPISimpleTestCase)
+ return unittest.makeSuite(SimpleCrawlerTestCase)
if __name__ == '__main__':
unittest.main(defaultTest="test_suite")
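
Note on test_is_browsable: the assertions in that test pin down a host
allow-list with shell-style wildcards ("*", "*.example.org"). The sketch
below is one way such a check could behave. It is not the distutils2
implementation: the function name is_browsable, its signature, and the use
of fnmatch are assumptions made for illustration, and it is written for
Python 3 (urllib.parse) rather than the Python 2 used in the changeset.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def is_browsable(url, index_url, hosts, follow_externals):
    """Hypothetical stand-in for Crawler._is_browsable (illustration only).

    A URL is browsable when it lives under the index itself, or when
    external links are followed and its host matches an allowed pattern.
    """
    # Pages under the index URL are always browsable.
    if url.startswith(index_url):
        return True
    # Without follow_externals, nothing outside the index is browsed.
    if not follow_externals:
        return False
    # Compare the host part against each shell-style pattern.
    host = urlparse(url).netloc
    return any(fnmatch(host, pattern) for pattern in hosts)


if __name__ == "__main__":
    index_url = "http://pypi.python.org/simple/"
    allowed = ["pypi.python.org", "example.org"]
    for url in ("http://example.org/simple/", "http://example.tld"):
        print(url, is_browsable(url, index_url, allowed, True))
```

With hosts=("*.example.org",) the same function accepts
http://pypi.example.org/a/path and rejects http://an-external.link/path,
which matches the expectations spelled out in the test.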
--
Repository URL: http://hg.python.org/distutils2