[Python-checkins] distutils2: Merge from Alexis
tarek.ziade
python-checkins at python.org
Sun Aug 8 11:50:47 CEST 2010
tarek.ziade pushed 251a40b381cb to distutils2:
http://hg.python.org/distutils2/rev/251a40b381cb
changeset: 475:251a40b381cb
parent: 443:0f543ed42707
parent: 474:437bda319048
user: Éric Araujo <merwok at netwok.org>
date: Thu Aug 05 16:54:59 2010 +0200
summary: Merge from Alexis
files: docs/source/index.rst, docs/source/pypi.rst, src/distutils2/index/dist.py, src/distutils2/index/simple.py, src/distutils2/index/wrapper.py, src/distutils2/index/xmlrpc.py, src/distutils2/metadata.py, src/distutils2/pypi/__init__.py, src/distutils2/pypi/dist.py, src/distutils2/pypi/errors.py, src/distutils2/pypi/simple.py, src/distutils2/tests/pypi_server.py, src/distutils2/tests/test_index_dist.py, src/distutils2/tests/test_index_simple.py, src/distutils2/tests/test_pypi_dist.py, src/distutils2/tests/test_pypi_simple.py, src/distutils2/util.py
diff --git a/docs/source/index.rst b/docs/source/index.rst
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -17,7 +17,7 @@
commands
command_hooks
test_framework
- pypi
+ projects-index
version
Indices and tables
diff --git a/docs/source/pypi.rst b/docs/source/projects-index.client.rst
rename from docs/source/pypi.rst
rename to docs/source/projects-index.client.rst
--- a/docs/source/pypi.rst
+++ b/docs/source/projects-index.client.rst
@@ -1,195 +1,20 @@
-=========================================
-Tools to query PyPI: the PyPI package
-=========================================
+===============================
+High-level API to query indexes
+===============================
-Distutils2 comes with a module (eg. `distutils2.pypi`) which contains
-facilities to access the Python Package Index (named "pypi", and avalaible on
-the url `http://pypi.python.org`.
+Distutils2 provides a high-level API to query indexes and search for releases
+and distributions, no matter which underlying API you want to use.
-There is two ways to retrieve data from pypi: using the *simple* API, and using
-*XML-RPC*. The first one is in fact a set of HTML pages avalaible at
-`http://pypi.python.org/simple/`, and the second one contains a set of XML-RPC
-methods. In order to reduce the overload caused by running distant methods on
-the pypi server (by using the XML-RPC methods), the best way to retrieve
-informations is by using the simple API, when it contains the information you
-need.
-
-Distutils2 provides two python modules to ease the work with those two APIs:
-`distutils2.pypi.simple` and `distutils2.pypi.xmlrpc`. Both of them depends on
-another python module: `distutils2.pypi.dist`.
-
-
-Requesting information via the "simple" API `distutils2.pypi.simple`
-====================================================================
-
-`distutils2.pypi.simple` can process the Python Package Index and return and
-download urls of distributions, for specific versions or latests, but it also
-can process external html pages, with the goal to find *pypi unhosted* versions
-of python distributions.
-
-You should use `distutils2.pypi.simple` for:
-
- * Search distributions by name and versions.
- * Process pypi external pages.
- * Download distributions by name and versions.
-
-And should not be used to:
-
- * Things that will end up in too long index processing (like "finding all
- distributions with a specific version, no matters the name")
+The aim of this module is to choose the best way to query the indexes, using
+XML-RPC as little as possible and preferring the simple index when it suffices.
API
-----
+===
-Here is a complete overview of the APIs of the SimpleIndex class.
+The client comes with the common methods "find_projects", "get_release" and
+"get_releases", which help to query the servers and return
+:class:`distutils2.index.dist.ReleaseInfo`, and
+:class:`distutils2.index.dist.ReleasesList` objects.
-.. autoclass:: distutils2.pypi.simple.SimpleIndex
+.. autoclass:: distutils2.index.wrapper.ClientWrapper
:members:
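The wrapper's fallback behaviour can be pictured with a short, self-contained sketch. All class names here are hypothetical stand-ins, not the actual distutils2 API: the wrapper tries the cheap simple-index client first and falls back to XML-RPC when that client cannot answer.

```python
class SimpleUnavailable(Exception):
    """Raised by the simple client when it cannot answer."""

class FakeSimpleClient:
    def get_releases(self, name):
        # simulate a simple-index failure
        raise SimpleUnavailable(name)

class FakeXmlRpcClient:
    def get_releases(self, name):
        return ["%s 1.0" % name]

class ClientWrapperSketch:
    """Try the cheap simple index first, fall back to XML-RPC."""
    def __init__(self, simple, xmlrpc):
        self._clients = [simple, xmlrpc]

    def get_releases(self, name):
        last_error = None
        for client in self._clients:
            try:
                return client.get_releases(name)
            except SimpleUnavailable as exc:
                last_error = exc
        raise last_error

wrapper = ClientWrapperSketch(FakeSimpleClient(), FakeXmlRpcClient())
print(wrapper.get_releases("FooBar"))  # falls back to the XML-RPC stand-in
```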
-
-Usage Exemples
----------------
-
-To help you understand how using the `SimpleIndex` class, here are some basic
-usages.
-
-Request PyPI to get a specific distribution
-++++++++++++++++++++++++++++++++++++++++++++
-
-Supposing you want to scan the PyPI index to get a list of distributions for
-the "foobar" project. You can use the "find" method for that::
-
- >>> from distutils2.pypi import SimpleIndex
- >>> client = SimpleIndex()
- >>> client.find("foobar")
- [<PyPIDistribution "Foobar 1.1">, <PyPIDistribution "Foobar 1.2">]
-
-Note that you also can request the client about specific versions, using version
-specifiers (described in `PEP 345
-<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
-
- >>> client.find("foobar < 1.2")
- [<PyPIDistribution "foobar 1.1">, ]
-
-`find` returns a list of distributions, but you also can get the last
-distribution (the more up to date) that fullfil your requirements, like this::
-
- >>> client.get("foobar < 1.2")
- <PyPIDistribution "foobar 1.1">
-
-Download distributions
-+++++++++++++++++++++++
-
-As it can get the urls of distributions provided by PyPI, the `SimpleIndex`
-client also can download the distributions and put it for you in a temporary
-destination::
-
- >>> client.download("foobar")
- /tmp/temp_dir/foobar-1.2.tar.gz
-
-You also can specify the directory you want to download to::
-
- >>> client.download("foobar", "/path/to/my/dir")
- /path/to/my/dir/foobar-1.2.tar.gz
-
-While downloading, the md5 of the archive will be checked, if not matches, it
-will try another time, then if fails again, raise `MD5HashDoesNotMatchError`.
-
-Internally, that's not the SimpleIndex which download the distributions, but the
-`PyPIDistribution` class. Please refer to this documentation for more details.
-
-Following PyPI external links
-++++++++++++++++++++++++++++++
-
-The default behavior for distutils2 is to *not* follow the links provided
-by HTML pages in the "simple index", to find distributions related
-downloads.
-
-It's possible to tell the PyPIClient to follow external links by setting the
-`follow_externals` attribute, on instanciation or after::
-
- >>> client = SimpleIndex(follow_externals=True)
-
-or ::
-
- >>> client = SimpleIndex()
- >>> client.follow_externals = True
-
-Working with external indexes, and mirrors
-+++++++++++++++++++++++++++++++++++++++++++
-
-The default `SimpleIndex` behavior is to rely on the Python Package index stored
-on PyPI (http://pypi.python.org/simple).
-
-As you can need to work with a local index, or private indexes, you can specify
-it using the index_url parameter::
-
- >>> client = SimpleIndex(index_url="file://filesystem/path/")
-
-or ::
-
- >>> client = SimpleIndex(index_url="http://some.specific.url/")
-
-You also can specify mirrors to fallback on in case the first index_url you
-provided doesnt respond, or not correctly. The default behavior for
-`SimpleIndex` is to use the list provided by Python.org DNS records, as
-described in the :pep:`381` about mirroring infrastructure.
-
-If you don't want to rely on these, you could specify the list of mirrors you
-want to try by specifying the `mirrors` attribute. It's a simple iterable::
-
- >>> mirrors = ["http://first.mirror","http://second.mirror"]
- >>> client = SimpleIndex(mirrors=mirrors)
-
-
-Requesting informations via XML-RPC (`distutils2.pypi.XmlRpcIndex`)
-==========================================================================
-
-The other method to request the Python package index, is using the XML-RPC
-methods. Distutils2 provides a simple wrapper around `xmlrpclib
-<http://docs.python.org/library/xmlrpclib.html>`_, that can return you
-`PyPIDistribution` objects.
-
-::
- >>> from distutils2.pypi import XmlRpcIndex()
- >>> client = XmlRpcIndex()
-
-
-PyPI Distributions
-==================
-
-Both `SimpleIndex` and `XmlRpcIndex` classes works with the classes provided
-in the `pypi.dist` package.
-
-`PyPIDistribution`
-------------------
-
-`PyPIDistribution` is a simple class that defines the following attributes:
-
-:name:
- The name of the package. `foobar` in our exemples here
-:version:
- The version of the package
-:location:
- If the files from the archive has been downloaded, here is the path where
- you can find them.
-:url:
- The url of the distribution
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistribution
- :members:
-
-`PyPIDistributions`
--------------------
-
-The `dist` module also provides another class, to work with lists of
-`PyPIDistribution` classes. It allow to filter results and is used as a
-container of
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistributions
- :members:
-
-At a higher level
-=================
-
-XXX : A description about a wraper around PyPI simple and XmlRpc Indexes
-(PyPIIndex ?)
diff --git a/docs/source/projects-index.dist.rst b/docs/source/projects-index.dist.rst
new file mode 100644
--- /dev/null
+++ b/docs/source/projects-index.dist.rst
@@ -0,0 +1,109 @@
+=================================================
+Representation of information coming from indexes
+=================================================
+
+Information coming from indexes is represented by the classes present in the
+`dist` module.
+
+APIs
+====
+
+Keep in mind that each project (eg. FooBar) can have several releases
+(eg. 1.1, 1.2, 1.3), and each of these releases can be provided in multiple
+distributions (eg. a source distribution, a binary one, etc).
+
+ReleaseInfo
+------------
+
+Each release has a project name, a project version and project metadata. In
+addition, releases contain their distributions.
+
+This information is stored in :class:`distutils2.index.dist.ReleaseInfo`
+objects.
+
+.. autoclass:: distutils2.index.dist.ReleaseInfo
+ :members:
+
+DistInfo
+---------
+
+:class:`distutils2.index.dist.DistInfo` is a simple class that contains
+information related to distributions. It's mainly about the URLs where those
+distributions can be found.
+
+.. autoclass:: distutils2.index.dist.DistInfo
+ :members:
+
+ReleasesList
+------------
+
+The `dist` module also provides another class to work with lists of
+:class:`distutils2.index.dist.ReleaseInfo` objects. It allows filtering
+and ordering of results.
+
+.. autoclass:: distutils2.index.dist.ReleasesList
+ :members:
+
+Example usages
+===============
+
+Build a list of releases, and order them
+----------------------------------------
+
+Assuming we have a list of releases::
+
+ >>> from distutils2.index.dist import ReleasesList, ReleaseInfo
+ >>> fb10 = ReleaseInfo("FooBar", "1.0")
+ >>> fb11 = ReleaseInfo("FooBar", "1.1")
+ >>> fb11a = ReleaseInfo("FooBar", "1.1a1")
+ >>> releases = ReleasesList("FooBar", [fb11, fb11a, fb10])
+ >>> releases.sort_releases()
+ >>> releases.get_versions()
+ ['1.1', '1.1a1', '1.0']
+ >>> releases.add_release("1.2a1")
+ >>> releases.get_versions()
+ ['1.1', '1.1a1', '1.0', '1.2a1']
+ >>> releases.sort_releases()
+ >>> releases.get_versions()
+ ['1.2a1', '1.1', '1.1a1', '1.0']
+ >>> releases.sort_releases(prefer_final=True)
+ >>> releases.get_versions()
+ ['1.1', '1.0', '1.2a1', '1.1a1']
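The `prefer_final` ordering shown above can be mimicked with a tiny standalone sort key. This is a simplification that only understands versions like "1.0" and "1.2a1", not the full distutils2 version scheme:

```python
import re

def version_key(version):
    # split "1.2a1" into its numeric part and a final/pre-release flag
    m = re.match(r'^(\d+(?:\.\d+)*)([a-z]+\d*)?$', version)
    nums = tuple(int(part) for part in m.group(1).split('.'))
    is_final = m.group(2) is None
    return nums, is_final

def sort_releases(versions, prefer_final=False):
    if prefer_final:
        # final releases first, each group ordered newest-first
        key = lambda v: (version_key(v)[1], version_key(v)[0])
    else:
        # plain newest-first; a final release beats its own pre-releases
        key = version_key
    return sorted(versions, key=key, reverse=True)

versions = ['1.1', '1.1a1', '1.0', '1.2a1']
print(sort_releases(versions))                     # ['1.2a1', '1.1', '1.1a1', '1.0']
print(sort_releases(versions, prefer_final=True))  # ['1.1', '1.0', '1.2a1', '1.1a1']
```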
+
+
+Add distribution related informations to releases
+-------------------------------------------------
+
+It's easy to add distribution information to releases::
+
+ >>> from distutils2.index.dist import ReleaseInfo
+ >>> r = ReleaseInfo("FooBar", "1.0")
+ >>> r.add_distribution("sdist", url="http://example.org/foobar-1.0.tar.gz")
+ >>> r.dists
+ {'sdist': FooBar 1.0 sdist}
+ >>> r['sdist'].url
+ {'url': 'http://example.org/foobar-1.0.tar.gz', 'hashname': None, 'hashval':
+ None, 'is_external': True}
+
+Attributes Lazy loading
+-----------------------
+
+To abstract as much as possible the way information is queried from the
+indexes, attributes and release information can be retrieved "on demand", in a
+"lazy" way.
+
+For instance, if you have a release instance that does not contain the metadata
+attribute, it can be built on the fly when accessed::
+
+ >>> r = Release("FooBar", "1.1")
+ >>> print r._metadata
+ None # metadata field is actually set to "None"
+ >>> r.metadata
+ <Metadata for FooBar 1.1>
+
+In this way, it's possible to retrieve a project's releases, release metadata
+and release distribution information.
+
+Internally, this is possible because when information about projects, releases
+or distributions is retrieved for the first time, a reference to the client
+used is stored in the objects. When an undefined field is accessed later, that
+client is used to fetch the missing information.
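The mechanism described above — store the client at creation time, fetch missing fields on first access — can be sketched in a few lines. `FakeClient` and the string metadata are illustrative placeholders, not the real distutils2 classes:

```python
class FakeClient:
    """Stand-in for an index client; counts how often it is queried."""
    def __init__(self):
        self.calls = 0

    def get_metadata(self, name, version):
        self.calls += 1
        return "<Metadata for %s %s>" % (name, version)

class Release:
    """Sketch of lazy metadata loading."""
    def __init__(self, name, version, index=None):
        self.name = name
        self.version = version
        self._index = index      # reference to the client, stored at creation
        self._metadata = None    # nothing fetched yet

    @property
    def metadata(self):
        # fetch from the stored client on first access only
        if self._metadata is None and self._index is not None:
            self._metadata = self._index.get_metadata(self.name, self.version)
        return self._metadata

client = FakeClient()
r = Release("FooBar", "1.1", index=client)
print(r._metadata)   # None: not fetched yet
print(r.metadata)    # <Metadata for FooBar 1.1>
print(client.calls)  # 1: further accesses reuse the cached value
```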
diff --git a/docs/source/pypi.rst b/docs/source/projects-index.rst
copy from docs/source/pypi.rst
copy to docs/source/projects-index.rst
--- a/docs/source/pypi.rst
+++ b/docs/source/projects-index.rst
@@ -1,195 +1,28 @@
-=========================================
-Tools to query PyPI: the PyPI package
-=========================================
+===================================
+Query Python Package Indexes (PyPI)
+===================================
-Distutils2 comes with a module (eg. `distutils2.pypi`) which contains
-facilities to access the Python Package Index (named "pypi", and avalaible on
-the url `http://pypi.python.org`.
+Distutils2 provides facilities to access python package information stored in
+indexes. The main Python Package Index is available at http://pypi.python.org.
-There is two ways to retrieve data from pypi: using the *simple* API, and using
-*XML-RPC*. The first one is in fact a set of HTML pages avalaible at
+.. note:: The tools provided in distutils2 are not limited to querying pypi,
+   and can be used for other indexes, if they respect the same interfaces.
+
+There are two ways to retrieve data from these indexes: using the *simple*
+API, and using *XML-RPC*. The first one is a set of HTML pages available at
`http://pypi.python.org/simple/`, and the second one contains a set of XML-RPC
-methods. In order to reduce the overload caused by running distant methods on
-the pypi server (by using the XML-RPC methods), the best way to retrieve
-informations is by using the simple API, when it contains the information you
-need.
+methods.
-Distutils2 provides two python modules to ease the work with those two APIs:
-`distutils2.pypi.simple` and `distutils2.pypi.xmlrpc`. Both of them depends on
-another python module: `distutils2.pypi.dist`.
+If you don't care about which API to use, the best thing to do is to let
+distutils2 decide for you, by using :class:`distutils2.index.ClientWrapper`.
+Of course, you can also rely on :class:`distutils2.index.simple.Crawler` and
+:class:`distutils2.index.xmlrpc.Client` if you need to use these specific APIs.
-Requesting information via the "simple" API `distutils2.pypi.simple`
-====================================================================
+.. toctree::
+ :maxdepth: 2
-`distutils2.pypi.simple` can process the Python Package Index and return and
-download urls of distributions, for specific versions or latests, but it also
-can process external html pages, with the goal to find *pypi unhosted* versions
-of python distributions.
-
-You should use `distutils2.pypi.simple` for:
-
- * Search distributions by name and versions.
- * Process pypi external pages.
- * Download distributions by name and versions.
-
-And should not be used to:
-
- * Things that will end up in too long index processing (like "finding all
- distributions with a specific version, no matters the name")
-
-API
-----
-
-Here is a complete overview of the APIs of the SimpleIndex class.
-
-.. autoclass:: distutils2.pypi.simple.SimpleIndex
- :members:
-
-Usage Exemples
----------------
-
-To help you understand how using the `SimpleIndex` class, here are some basic
-usages.
-
-Request PyPI to get a specific distribution
-++++++++++++++++++++++++++++++++++++++++++++
-
-Supposing you want to scan the PyPI index to get a list of distributions for
-the "foobar" project. You can use the "find" method for that::
-
- >>> from distutils2.pypi import SimpleIndex
- >>> client = SimpleIndex()
- >>> client.find("foobar")
- [<PyPIDistribution "Foobar 1.1">, <PyPIDistribution "Foobar 1.2">]
-
-Note that you also can request the client about specific versions, using version
-specifiers (described in `PEP 345
-<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
-
- >>> client.find("foobar < 1.2")
- [<PyPIDistribution "foobar 1.1">, ]
-
-`find` returns a list of distributions, but you also can get the last
-distribution (the more up to date) that fullfil your requirements, like this::
-
- >>> client.get("foobar < 1.2")
- <PyPIDistribution "foobar 1.1">
-
-Download distributions
-+++++++++++++++++++++++
-
-As it can get the urls of distributions provided by PyPI, the `SimpleIndex`
-client also can download the distributions and put it for you in a temporary
-destination::
-
- >>> client.download("foobar")
- /tmp/temp_dir/foobar-1.2.tar.gz
-
-You also can specify the directory you want to download to::
-
- >>> client.download("foobar", "/path/to/my/dir")
- /path/to/my/dir/foobar-1.2.tar.gz
-
-While downloading, the md5 of the archive will be checked, if not matches, it
-will try another time, then if fails again, raise `MD5HashDoesNotMatchError`.
-
-Internally, that's not the SimpleIndex which download the distributions, but the
-`PyPIDistribution` class. Please refer to this documentation for more details.
-
-Following PyPI external links
-++++++++++++++++++++++++++++++
-
-The default behavior for distutils2 is to *not* follow the links provided
-by HTML pages in the "simple index", to find distributions related
-downloads.
-
-It's possible to tell the PyPIClient to follow external links by setting the
-`follow_externals` attribute, on instanciation or after::
-
- >>> client = SimpleIndex(follow_externals=True)
-
-or ::
-
- >>> client = SimpleIndex()
- >>> client.follow_externals = True
-
-Working with external indexes, and mirrors
-+++++++++++++++++++++++++++++++++++++++++++
-
-The default `SimpleIndex` behavior is to rely on the Python Package index stored
-on PyPI (http://pypi.python.org/simple).
-
-As you can need to work with a local index, or private indexes, you can specify
-it using the index_url parameter::
-
- >>> client = SimpleIndex(index_url="file://filesystem/path/")
-
-or ::
-
- >>> client = SimpleIndex(index_url="http://some.specific.url/")
-
-You also can specify mirrors to fallback on in case the first index_url you
-provided doesnt respond, or not correctly. The default behavior for
-`SimpleIndex` is to use the list provided by Python.org DNS records, as
-described in the :pep:`381` about mirroring infrastructure.
-
-If you don't want to rely on these, you could specify the list of mirrors you
-want to try by specifying the `mirrors` attribute. It's a simple iterable::
-
- >>> mirrors = ["http://first.mirror","http://second.mirror"]
- >>> client = SimpleIndex(mirrors=mirrors)
-
-
-Requesting informations via XML-RPC (`distutils2.pypi.XmlRpcIndex`)
-==========================================================================
-
-The other method to request the Python package index, is using the XML-RPC
-methods. Distutils2 provides a simple wrapper around `xmlrpclib
-<http://docs.python.org/library/xmlrpclib.html>`_, that can return you
-`PyPIDistribution` objects.
-
-::
- >>> from distutils2.pypi import XmlRpcIndex()
- >>> client = XmlRpcIndex()
-
-
-PyPI Distributions
-==================
-
-Both `SimpleIndex` and `XmlRpcIndex` classes works with the classes provided
-in the `pypi.dist` package.
-
-`PyPIDistribution`
-------------------
-
-`PyPIDistribution` is a simple class that defines the following attributes:
-
-:name:
- The name of the package. `foobar` in our exemples here
-:version:
- The version of the package
-:location:
- If the files from the archive has been downloaded, here is the path where
- you can find them.
-:url:
- The url of the distribution
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistribution
- :members:
-
-`PyPIDistributions`
--------------------
-
-The `dist` module also provides another class, to work with lists of
-`PyPIDistribution` classes. It allow to filter results and is used as a
-container of
-
-.. autoclass:: distutils2.pypi.dist.PyPIDistributions
- :members:
-
-At a higher level
-=================
-
-XXX : A description about a wraper around PyPI simple and XmlRpc Indexes
-(PyPIIndex ?)
+ projects-index.client.rst
+ projects-index.dist.rst
+ projects-index.simple.rst
+ projects-index.xmlrpc.rst
diff --git a/docs/source/projects-index.simple.rst b/docs/source/projects-index.simple.rst
new file mode 100644
--- /dev/null
+++ b/docs/source/projects-index.simple.rst
@@ -0,0 +1,131 @@
+=========================================
+Querying indexes via the simple index API
+=========================================
+
+`distutils2.index.simple` can process Python Package Indexes, and provides
+useful information about distributions. It can also crawl local indexes, for
+instance.
+
+You should use `distutils2.index.simple` for:
+
+ * Searching for distributions by name and version.
+ * Processing index external pages.
+ * Downloading distributions by name and version.
+
+And it should not be used for:
+
+ * Tasks that would require overly long index processing (like "finding all
+   distributions with a specific version, no matter the name").
+
+API
+---
+
+.. autoclass:: distutils2.index.simple.Crawler
+ :members:
+
+
+Usage Examples
+---------------
+
+To help you understand how to use the `Crawler` class, here are some basic
+usage examples.
+
+Request the simple index to get a specific distribution
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+Suppose you want to scan an index to get a list of distributions for
+the "foobar" project. You can use the "get_releases" method for that.
+The get_releases method will browse the project page, and return a
+:class:`ReleaseInfo` object for each link found that points to a download. ::
+
+ >>> from distutils2.index.simple import Crawler
+ >>> crawler = Crawler()
+ >>> crawler.get_releases("FooBar")
+ [<ReleaseInfo "Foobar 1.1">, <ReleaseInfo "Foobar 1.2">]
+
+Note that you can also request specific versions from the client, using version
+specifiers (described in `PEP 345
+<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::
+
+ >>> client.get_releases("FooBar < 1.2")
+ [<ReleaseInfo "FooBar 1.1">, ]
+
+`get_releases` returns a list of :class:`ReleaseInfo` objects, but you can
+also get the single best release that fulfills your requirements, using
+"get_release"::
+
+ >>> client.get_release("FooBar < 1.2")
+ <ReleaseInfo "FooBar 1.1">
+
+Download distributions
++++++++++++++++++++++++
+
+Since it can get the URLs of distributions provided by PyPI, the `Crawler`
+client can also download the distributions and put them in a temporary
+destination for you::
+
+ >>> client.download("foobar")
+ /tmp/temp_dir/foobar-1.2.tar.gz
+
+You can also specify the directory you want to download to::
+
+ >>> client.download("foobar", "/path/to/my/dir")
+ /path/to/my/dir/foobar-1.2.tar.gz
+
+While downloading, the md5 hash of the archive is checked; if it does not
+match, the download is retried once, and if it fails again,
+`MD5HashDoesNotMatchError` is raised.
+
+Internally, it is not the `Crawler` that downloads the distributions, but the
+`DistributionInfo` class. Please refer to its documentation for more details.
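The check-and-retry behaviour can be sketched independently of distutils2. The exception name matches the one mentioned above; `fetch` is a hypothetical callable returning the archive bytes:

```python
import hashlib

class MD5HashDoesNotMatchError(Exception):
    pass

def download_checked(fetch, expected_md5, retries=1):
    """Fetch archive bytes and verify their md5; retry once, then give up."""
    for attempt in range(retries + 1):
        data = fetch()
        if hashlib.md5(data).hexdigest() == expected_md5:
            return data
    raise MD5HashDoesNotMatchError(expected_md5)

payload = b"foobar-1.2 archive bytes"
checksum = hashlib.md5(payload).hexdigest()
print(download_checked(lambda: payload, checksum) == payload)  # True
```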
+
+Following PyPI external links
+++++++++++++++++++++++++++++++
+
+The default behavior for distutils2 is to *not* follow the links provided
+by HTML pages in the "simple index", when looking for distribution-related
+downloads.
+
+It's possible to tell the crawler to follow external links by setting the
+`follow_externals` attribute, at instantiation time or afterwards::
+
+ >>> client = Crawler(follow_externals=True)
+
+or ::
+
+ >>> client = Crawler()
+ >>> client.follow_externals = True
+
+Working with external indexes, and mirrors
++++++++++++++++++++++++++++++++++++++++++++
+
+The default `Crawler` behavior is to rely on the Python Package Index stored
+on PyPI (http://pypi.python.org/simple).
+
+Since you may need to work with a local index, or private indexes, you can
+specify it using the index_url parameter::
+
+ >>> client = Crawler(index_url="file://filesystem/path/")
+
+or ::
+
+ >>> client = Crawler(index_url="http://some.specific.url/")
+
+You can also specify mirrors to fall back on in case the first index_url you
+provided does not respond, or responds incorrectly. The default behavior for
+`Crawler` is to use the mirror list provided by Python.org DNS records, as
+described in :pep:`381` about mirroring infrastructure.
+
+If you don't want to rely on these, you can specify the list of mirrors to
+try via the `mirrors` attribute. It's a simple iterable::
+
+ >>> mirrors = ["http://first.mirror","http://second.mirror"]
+ >>> client = Crawler(mirrors=mirrors)
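The fallback loop behind this feature can be sketched in isolation. `fetch` and `IndexUnreachable` are illustrative names, not the real distutils2 internals: each URL is tried in order until one answers:

```python
class IndexUnreachable(Exception):
    """Raised when no index URL and no mirror could be reached."""

def query_with_mirrors(urls, fetch):
    """Try each index URL in turn, returning the first successful answer."""
    errors = []
    for url in urls:
        try:
            return fetch(url)
        except IOError as exc:
            # remember the failure and move on to the next mirror
            errors.append((url, exc))
    raise IndexUnreachable(errors)

def fake_fetch(url):
    # simulate the first mirror being down
    if "second" in url:
        return "ok from %s" % url
    raise IOError("unreachable: %s" % url)

mirrors = ["http://first.mirror", "http://second.mirror"]
print(query_with_mirrors(mirrors, fake_fetch))  # ok from http://second.mirror
```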
+
+Searching in the simple index
++++++++++++++++++++++++++++++
+
+It's possible to search for projects with specific names in the package index.
+Assuming you want to find all projects containing the "Grail" keyword::
+
+ >>> client.search(name="grail")
+ ["holy grail", "unholy grail", "grail"]
+
diff --git a/docs/source/projects-index.xmlrpc.rst b/docs/source/projects-index.xmlrpc.rst
new file mode 100644
--- /dev/null
+++ b/docs/source/projects-index.xmlrpc.rst
@@ -0,0 +1,149 @@
+=========================
+Query indexes via XML-RPC
+=========================
+
+Indexes can be queried using XML-RPC calls, and Distutils2 provides a simple
+way to interface with XML-RPC.
+
+You should **use** XML-RPC when:
+
+ * Searching the index for projects **on fields other than project
+   names**. For instance, you can search for projects based on the
+   author_email field.
+ * Searching all the versions that have existed for a project.
+ * You want to retrieve metadata information from releases or
+   distributions.
+
+You should **avoid using** XML-RPC method calls when:
+
+ * Retrieving the latest version of a project.
+ * Getting projects with a specific name and version.
+ * The simple index can satisfy your needs.
+
+When dealing with indexes, keep in mind that the index queriers will always
+return :class:`distutils2.index.ReleaseInfo` and
+:class:`distutils2.index.ReleasesList` objects.
+
+Some methods here share common APIs with the ones you can find in
+:class:`distutils2.index.simple`; internally,
+:class:`distutils2.index.xmlrpc.Client` inherits from
+:class:`distutils2.index.client`.
+
+API
+====
+
+.. autoclass:: distutils2.index.xmlrpc.Client
+ :members:
+
+Usage examples
+===============
+
+The use cases described here are those that are not common to the other
+clients. If you want to see all the methods, please refer to the API or to the
+usage examples described in :class:`distutils2.index.client.Client`.
+
+Finding releases
+----------------
+
+It's a common use case to search for "things" within the index.
+We can basically search for projects by their name, which is the
+most used way for users (eg. "give me the last version of the FooBar project").
+This can be accomplished using the following syntax::
+
+ >>> client = xmlrpc.Client()
+ >>> client.get_release("Foobar (<= 1.3))
+ <FooBar 1.2.1>
+ >>> client.get_releases("FooBar (<= 1.3)")
+ [FooBar 1.1, FooBar 1.1.1, FooBar 1.2, FooBar 1.2.1]
+
+And we can also search on specific fields::
+
+ >>> client.search_projects(field=value)
+
+You can specify the operator to use; the default is "or"::
+
+ >>> client.search_projects(field=value, operator="and")
+
+The specific fields you can search are:
+
+ * name
+ * version
+ * author
+ * author_email
+ * maintainer
+ * maintainer_email
+ * home_page
+ * license
+ * summary
+ * description
+ * keywords
+ * platform
+ * download_url
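The and/or semantics can be illustrated with a small local filter over project records. The dict-based data and this helper are illustrative, not the XML-RPC wire format:

```python
def search_projects(projects, operator="or", **fields):
    """Filter project dicts on case-insensitive substring field matches,
    combining the criteria with "and" or "or" (default "or")."""
    combine = all if operator == "and" else any

    def matches(project):
        return combine(value.lower() in project.get(field, "").lower()
                       for field, value in fields.items())

    return [p["name"] for p in projects if matches(p)]

projects = [
    {"name": "grail", "summary": "the holy grail"},
    {"name": "FooBar", "summary": "a grail-flavored tool"},
    {"name": "spam", "summary": "unrelated"},
]
print(search_projects(projects, name="grail"))
print(search_projects(projects, operator="and", name="foo", summary="grail"))
```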
+
+Getting metadata information
+-----------------------------
+
+XML-RPC is the preferred way to retrieve metadata information from indexes.
+It's really simple to do so::
+
+ >>> client = xmlrpc.Client()
+ >>> client.get_metadata("FooBar", "1.1")
+ <ReleaseInfo FooBar 1.1>
+
+Assuming we already have a :class:`distutils2.index.ReleaseInfo` object defined,
+it's possible to pass it to the xmlrpc client to retrieve and complete its
+metadata::
+
+ >>> foobar11 = ReleaseInfo("FooBar", "1.1")
+ >>> client = xmlrpc.Client()
+ >>> returned_release = client.get_metadata(release=foobar11)
+ >>> returned_release
+ <ReleaseInfo FooBar 1.1>
+
+Get all the releases of a project
+---------------------------------
+
+To retrieve all the releases for a project, you can build them using
+`get_releases`::
+
+ >>> client = xmlrpc.Client()
+ >>> client.get_releases("FooBar")
+ [<ReleaseInfo FooBar 0.9>, <ReleaseInfo FooBar 1.0>, <ReleaseInfo 1.1>]
+
+Get information about distributions
+------------------------------------
+
+Indexes have information about projects, releases **and** distributions.
+If you're not familiar with those, please refer to the documentation of
+:mod:`distutils2.index.dist`.
+
+It's possible to retrieve information about distributions, e.g. "what are the
+existing distributions for this release? How do I retrieve them?"::
+
+ >>> client = xmlrpc.Client()
+ >>> release = client.get_distributions("FooBar", "1.1")
+ >>> release.dists
+ {'sdist': <FooBar 1.1 sdist>, 'bdist': <FooBar 1.1 bdist>}
+
+As you see, this does not return a list of distributions, but a release,
+because a release can be used like a list of distributions.
+
+Lazy loading of information from projects, releases and distributions
+----------------------------------------------------------------------
+
+.. note:: The lazy loading feature is not currently available !
+
+Since the :mod:`distutils2.index.dist` classes support "lazy" loading of
+information, you can use it while retrieving information via XML-RPC.
+
+For instance, it's possible to get all the releases for a project, and to
+access the metadata of each release directly, without calling
+:class:`distutils2.index.xmlrpc.Client` yourself (the calls will be made, but
+they are invisible to you)::
+
+ >>> client = xmlrpc.Client()
+ >>> releases = client.get_releases("FooBar")
+ >>> releases.get_release("1.1").metadata
+ <Metadata for FooBar 1.1>
+
+Refer to the :mod:`distutils2.index.dist` documentation for more information
+about lazy loading of attributes.
diff --git a/src/distutils2/pypi/__init__.py b/src/distutils2/index/__init__.py
rename from src/distutils2/pypi/__init__.py
rename to src/distutils2/index/__init__.py
--- a/src/distutils2/pypi/__init__.py
+++ b/src/distutils2/index/__init__.py
@@ -1,8 +1,11 @@
-"""distutils2.pypi
+"""Package containing ways to interact with Index APIs.
-Package containing ways to interact with the PyPI APIs.
-"""
+"""
__all__ = ['simple',
+ 'xmlrpc',
'dist',
-]
+ 'errors',
+ 'mirrors',]
+
+from dist import ReleaseInfo, ReleasesList, DistInfo
diff --git a/src/distutils2/index/base.py b/src/distutils2/index/base.py
new file mode 100644
--- /dev/null
+++ b/src/distutils2/index/base.py
@@ -0,0 +1,55 @@
+from distutils2.version import VersionPredicate
+from distutils2.index.dist import ReleasesList
+
+
+class BaseClient(object):
+ """Base class containing common methods for the index crawlers/clients"""
+
+ def __init__(self, prefer_final, prefer_source):
+ self._prefer_final = prefer_final
+ self._prefer_source = prefer_source
+ self._index = self
+
+ def _get_version_predicate(self, requirements):
+ """Return a VersionPredicate object, from a string or an already
+ existing object.
+ """
+ if isinstance(requirements, str):
+ requirements = VersionPredicate(requirements)
+ return requirements
+
+ def _get_prefer_final(self, prefer_final=None):
+ """Return the prefer_final internal parameter or the specified one if
+ provided"""
+ if prefer_final is not None:
+ return prefer_final
+ else:
+ return self._prefer_final
+
+ def _get_prefer_source(self, prefer_source=None):
+ """Return the prefer_source internal parameter or the specified one if
+ provided"""
+ if prefer_source is not None:
+ return prefer_source
+ else:
+ return self._prefer_source
+
+ def _get_project(self, project_name):
+ """Return an project instance, create it if necessary"""
+ return self._projects.setdefault(project_name.lower(),
+ ReleasesList(project_name, index=self._index))
+
+ def download_distribution(self, requirements, temp_path=None,
+ prefer_source=None, prefer_final=None):
+ """Download a distribution from the last release according to the
+ requirements.
+
+ If temp_path is provided, download to this path; otherwise, create a
+ temporary location for the download and return it.
+ """
+ prefer_final = self._get_prefer_final(prefer_final)
+ prefer_source = self._get_prefer_source(prefer_source)
+ release = self.get_release(requirements, prefer_final)
+ if release:
+ dist = release.get_distribution(prefer_source=prefer_source)
+ return dist.download(temp_path)
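The argument-or-default accessors of `BaseClient` can be sketched in isolation. This is a minimal standalone sketch, not the distutils2 class itself (the class name below is a hypothetical stand-in); note that comparing against `None` keeps an explicit `False` argument from being ignored:

```python
class BaseClientSketch(object):
    """Standalone sketch of the argument-or-default accessors used by
    distutils2.index.base.BaseClient (simplified, hypothetical name)."""

    def __init__(self, prefer_final=False, prefer_source=True):
        self._prefer_final = prefer_final
        self._prefer_source = prefer_source

    def _get_prefer_final(self, prefer_final=None):
        # An explicit argument (True or False) overrides the instance
        # default; testing "is not None" lets callers pass False.
        if prefer_final is not None:
            return prefer_final
        return self._prefer_final

client = BaseClientSketch(prefer_final=True)
default = client._get_prefer_final()        # falls back to the instance default
override = client._get_prefer_final(False)  # explicit argument wins
```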
diff --git a/src/distutils2/pypi/dist.py b/src/distutils2/index/dist.py
rename from src/distutils2/pypi/dist.py
rename to src/distutils2/index/dist.py
--- a/src/distutils2/pypi/dist.py
+++ b/src/distutils2/index/dist.py
@@ -1,186 +1,187 @@
-"""distutils2.pypi.dist
+"""distutils2.index.dist
-Provides the PyPIDistribution class thats represents a distribution retrieved
-on PyPI.
+Provides useful classes to represent releases and distributions retrieved
+from indexes.
+
+A project can have several releases (= versions) and each release can have
+several distributions (sdist, bdist).
+
+The release contains the metadata-related information (see PEP 345), and the
+distributions contain the download-related information.
+
"""
+import mimetypes
import re
+import tarfile
+import tempfile
+import urllib
import urlparse
-import urllib
-import tempfile
-from operator import attrgetter
+import zipfile
try:
import hashlib
except ImportError:
from distutils2._backport import hashlib
+from distutils2.errors import IrrationalVersionError
+from distutils2.index.errors import (HashDoesNotMatch, UnsupportedHashName,
+ CantParseArchiveName)
from distutils2.version import suggest_normalized_version, NormalizedVersion
-from distutils2.pypi.errors import HashDoesNotMatch, UnsupportedHashName
+from distutils2.metadata import DistributionMetadata
+from distutils2.util import untar_file, unzip_file, splitext
+
+__all__ = ['ReleaseInfo', 'DistInfo', 'ReleasesList', 'get_infos_from_url']
EXTENSIONS = ".tar.gz .tar.bz2 .tar .zip .tgz .egg".split()
MD5_HASH = re.compile(r'^.*#md5=([a-f0-9]+)$')
+DIST_TYPES = ['bdist', 'sdist']
-class PyPIDistribution(object):
- """Represents a distribution retrieved from PyPI.
+class IndexReference(object):
+ def set_index(self, index=None):
+ self._index = index
- This is a simple container for various attributes as name, version,
- downloaded_location, url etc.
- The PyPIDistribution class is used by the pypi.*Index class to return
- information about distributions.
+class ReleaseInfo(IndexReference):
+ """Represent a release of a project (a project with a specific version).
+ The release contains the metadata related to this specific
+ version, and is also a container for distribution-related information.
+
+ See the DistInfo class for more information about distributions.
"""
- @classmethod
- def from_url(cls, url, probable_dist_name=None, is_external=True):
- """Build a Distribution from a url archive (egg or zip or tgz).
-
- :param url: complete url of the distribution
- :param probable_dist_name: A probable name of the distribution.
- :param is_external: Tell if the url commes from an index or from
- an external URL.
+ def __init__(self, name, version, metadata=None, hidden=False,
+ index=None, **kwargs):
"""
- # if the url contains a md5 hash, get it.
- md5_hash = None
- match = MD5_HASH.match(url)
- if match is not None:
- md5_hash = match.group(1)
- # remove the hash
- url = url.replace("#md5=%s" % md5_hash, "")
-
- # parse the archive name to find dist name and version
- archive_name = urlparse.urlparse(url)[2].split('/')[-1]
- extension_matched = False
- # remove the extension from the name
- for ext in EXTENSIONS:
- if archive_name.endswith(ext):
- archive_name = archive_name[:-len(ext)]
- extension_matched = True
-
- name, version = split_archive_name(archive_name)
- if extension_matched is True:
- return PyPIDistribution(name, version, url=url, url_hashname="md5",
- url_hashval=md5_hash,
- url_is_external=is_external)
-
- def __init__(self, name, version, type=None, url=None, url_hashname=None,
- url_hashval=None, url_is_external=True):
- """Create a new instance of PyPIDistribution.
-
:param name: the name of the distribution
:param version: the version of the distribution
- :param type: the type of the dist (eg. source, bin-*, etc.)
- :param url: URL where we found this distribution
- :param url_hashname: the name of the hash we want to use. Refer to the
- hashlib.new documentation for more information.
- :param url_hashval: the hash value.
- :param url_is_external: we need to know if the provided url comes from an
- index browsing, or from an external resource.
+ :param metadata: the metadata fields of the release.
+ :type metadata: dict
+ :param kwargs: optional arguments for a new distribution.
+ """
+ self.set_index(index)
+ self.name = name
+ self._version = None
+ self.version = version
+ if metadata:
+ self._metadata = DistributionMetadata(mapping=metadata)
+ else:
+ self._metadata = None
+ self._dists = {}
+ self.hidden = hidden
- """
- self.name = name
- self.version = NormalizedVersion(version)
- self.type = type
- # set the downloaded path to None by default. The goal here
- # is to not download distributions multiple times
- self.downloaded_location = None
- # We store urls in dict, because we need to have a bit more informations
- # than the simple URL. It will be used later to find the good url to
- # use.
- # We have two _url* attributes: _url and _urls. _urls contains a list of
- # dict for the different urls, and _url contains the choosen url, in
- # order to dont make the selection process multiple times.
- self._urls = []
- self._url = None
- self.add_url(url, url_hashname, url_hashval, url_is_external)
+ if 'dist_type' in kwargs:
+ dist_type = kwargs.pop('dist_type')
+ self.add_distribution(dist_type, **kwargs)
- def add_url(self, url, hashname=None, hashval=None, is_external=True):
- """Add a new url to the list of urls"""
- if hashname is not None:
- try:
- hashlib.new(hashname)
- except ValueError:
- raise UnsupportedHashName(hashname)
+ def set_version(self, version):
+ try:
+ self._version = NormalizedVersion(version)
+ except IrrationalVersionError:
+ suggestion = suggest_normalized_version(version)
+ if suggestion:
+ self.version = suggestion
+ else:
+ raise IrrationalVersionError(version)
- self._urls.append({
- 'url': url,
- 'hashname': hashname,
- 'hashval': hashval,
- 'is_external': is_external,
- })
- # reset the url selection process
- self._url = None
+ def get_version(self):
+ return self._version
+
+ version = property(get_version, set_version)
@property
- def url(self):
- """Pick up the right url for the list of urls in self.urls"""
- # We return internal urls over externals.
- # If there is more than one internal or external, return the first
- # one.
- if self._url is None:
- if len(self._urls) > 1:
- internals_urls = [u for u in self._urls \
- if u['is_external'] == False]
- if len(internals_urls) >= 1:
- self._url = internals_urls[0]
- if self._url is None:
- self._url = self._urls[0]
- return self._url
-
- @property
- def is_source(self):
- """return if the distribution is a source one or not"""
- return self.type == 'source'
+ def metadata(self):
+ """If the metadata is not set, use the indexes to get it"""
+ if not self._metadata:
+ self._index.get_metadata(self.name, '%s' % self.version)
+ return self._metadata
@property
def is_final(self):
"""proxy to version.is_final"""
return self.version.is_final
+
+ @property
+ def dists(self):
+ if self._dists is None:
+ self._index.get_distributions(self.name, '%s' % self.version)
+ if self._dists is None:
+ self._dists = {}
+ return self._dists
- def download(self, path=None):
- """Download the distribution to a path, and return it.
+ def add_distribution(self, dist_type='sdist', python_version=None, **params):
+ """Add distribution information to this release.
+ If distribution information is already set for this distribution type,
+ add the given url paths to the distribution. This can be useful when
+ some of them fail to download.
- If the path is given in path, use this, otherwise, generates a new one
+ :param dist_type: the distribution type (eg. "sdist", "bdist", etc.)
+ :param params: the fields to be passed to the distribution object
+ (see the :class:DistInfo constructor).
"""
- if path is None:
- path = tempfile.mkdtemp()
+ if dist_type not in DIST_TYPES:
+ raise ValueError(dist_type)
+ if dist_type in self.dists:
+ self._dists[dist_type].add_url(**params)
+ else:
+ self._dists[dist_type] = DistInfo(self, dist_type,
+ index=self._index, **params)
+ if python_version:
+ self._dists[dist_type].python_version = python_version
- # if we do not have downloaded it yet, do it.
- if self.downloaded_location is None:
- url = self.url['url']
- archive_name = urlparse.urlparse(url)[2].split('/')[-1]
- filename, headers = urllib.urlretrieve(url,
- path + "/" + archive_name)
- self.downloaded_location = filename
- self._check_md5(filename)
- return self.downloaded_location
+ def get_distribution(self, dist_type=None, prefer_source=True):
+ """Return a distribution.
- def _check_md5(self, filename):
- """Check that the md5 checksum of the given file matches the one in
- url param"""
- hashname = self.url['hashname']
- expected_hashval = self.url['hashval']
- if not None in (expected_hashval, hashname):
- f = open(filename)
- hashval = hashlib.new(hashname)
- hashval.update(f.read())
- if hashval.hexdigest() != expected_hashval:
- raise HashDoesNotMatch("got %s instead of %s"
- % (hashval.hexdigest(), expected_hashval))
+ If dist_type is set, look up the distribution of that type, acting as
+ an alias of __getitem__.
- def __repr__(self):
- return "%s %s %s %s" \
- % (self.__class__.__name__, self.name, self.version,
- self.type or "")
+ If prefer_source is True, look for a source distribution first, and
+ fall back to any existing distribution.
+ """
+ if len(self.dists) == 0:
+ raise LookupError()
+ if dist_type:
+ return self[dist_type]
+ if prefer_source and "sdist" in self.dists:
+ dist = self["sdist"]
+ else:
+ dist = self.dists.values()[0]
+ return dist
+
+ def download(self, temp_path=None, prefer_source=True):
+ """Download the distribution, using the requirements.
+
+ If more than one distribution matches the requirements, use the latest
+ version.
+ Download the distribution into temp_path; if no temp_path is given,
+ create and return one.
+
+ Returns the complete absolute path to the downloaded archive.
+ """
+ return self.get_distribution(prefer_source=prefer_source)\
+ .download(path=temp_path)
+
+ def set_metadata(self, metadata):
+ if not self._metadata:
+ self._metadata = DistributionMetadata()
+ self._metadata.update(metadata)
+
+ def __getitem__(self, item):
+ """distributions are available using release["sdist"]"""
+ return self.dists[item]
def _check_is_comparable(self, other):
- if not isinstance(other, PyPIDistribution):
+ if not isinstance(other, ReleaseInfo):
raise TypeError("cannot compare %s and %s"
% (type(self).__name__, type(other).__name__))
elif self.name != other.name:
raise TypeError("cannot compare %s and %s"
% (self.name, other.name))
+ def __repr__(self):
+ return "<%s %s>" % (self.name, self.version)
+
def __eq__(self, other):
self._check_is_comparable(other)
return self.version == other.version
@@ -205,78 +206,311 @@
__hash__ = object.__hash__
-class PyPIDistributions(list):
- """A container of PyPIDistribution objects.
+class DistInfo(IndexReference):
+ """Represents a distribution retrieved from an index (sdist, bdist, ...)
+ """
- Contains methods and facilities to sort and filter distributions.
+ def __init__(self, release, dist_type=None, url=None, hashname=None,
+ hashval=None, is_external=True, python_version=None,
+ index=None):
+ """Create a new instance of DistInfo.
+
+ :param release: the release this distribution belongs to.
+ :param dist_type: the type of the dist (eg. source, bin-*, etc.)
+ :param url: URL where we found this distribution
+ :param hashname: the name of the hash we want to use. Refer to the
+ hashlib.new documentation for more information.
+ :param hashval: the hash value.
+ :param is_external: tell whether the provided url comes from index
+ browsing or from an external resource.
+
+ """
+ self.set_index(index)
+ self.release = release
+ self.dist_type = dist_type
+ self.python_version = python_version
+ self._unpacked_dir = None
+ # set the downloaded path to None by default. The goal here
+ # is to not download distributions multiple times
+ self.downloaded_location = None
+ # We store urls in a dict, because we need a bit more information
+ # than just the URL; it is used later to pick the right url.
+ # We have two _url* attributes: _url and urls. urls contains a list
+ # of dicts for the different urls, and _url contains the chosen url,
+ # so the selection process does not run multiple times.
+ self.urls = []
+ self._url = None
+ self.add_url(url, hashname, hashval, is_external)
+
+ def add_url(self, url, hashname=None, hashval=None, is_external=True):
+ """Add a new url to the list of urls"""
+ if hashname is not None:
+ try:
+ hashlib.new(hashname)
+ except ValueError:
+ raise UnsupportedHashName(hashname)
+ if url not in [u['url'] for u in self.urls]:
+ self.urls.append({
+ 'url': url,
+ 'hashname': hashname,
+ 'hashval': hashval,
+ 'is_external': is_external,
+ })
+ # reset the url selection process
+ self._url = None
+
+ @property
+ def url(self):
+ """Pick the right url from the list of urls in self.urls"""
+ # We return internal urls over externals.
+ # If there is more than one internal or external, return the first
+ # one.
+ if self._url is None:
+ if len(self.urls) > 1:
+ internals_urls = [u for u in self.urls \
+ if u['is_external'] == False]
+ if len(internals_urls) >= 1:
+ self._url = internals_urls[0]
+ if self._url is None:
+ self._url = self.urls[0]
+ return self._url
+
+ @property
+ def is_source(self):
+ """Return whether the distribution is a source distribution."""
+ return self.dist_type == 'sdist'
+
+ def download(self, path=None):
+ """Download the distribution to a path, and return it.
+
+ If a path is given, use it; otherwise, generate a new one.
+ Return the download location.
+ """
+ if path is None:
+ path = tempfile.mkdtemp()
+
+ # if we do not have downloaded it yet, do it.
+ if self.downloaded_location is None:
+ url = self.url['url']
+ archive_name = urlparse.urlparse(url)[2].split('/')[-1]
+ filename, headers = urllib.urlretrieve(url,
+ path + "/" + archive_name)
+ self.downloaded_location = filename
+ self._check_md5(filename)
+ return self.downloaded_location
+
+ def unpack(self, path=None):
+ """Unpack the distribution to the given path.
+
+ If no destination is given, create a temporary location.
+
+ Returns the location of the extracted files (root).
+ """
+ if not self._unpacked_dir:
+ if path is None:
+ path = tempfile.mkdtemp()
+
+ filename = self.download()
+ content_type = mimetypes.guess_type(filename)[0]
+
+ if (content_type == 'application/zip'
+ or filename.endswith('.zip')
+ or filename.endswith('.pybundle')
+ or zipfile.is_zipfile(filename)):
+ unzip_file(filename, path, flatten=not filename.endswith('.pybundle'))
+ elif (content_type == 'application/x-gzip'
+ or tarfile.is_tarfile(filename)
+ or splitext(filename)[1].lower() in ('.tar', '.tar.gz', '.tar.bz2', '.tgz', '.tbz')):
+ untar_file(filename, path)
+ self._unpacked_dir = path
+ return self._unpacked_dir
+
+ def _check_md5(self, filename):
+ """Check that the md5 checksum of the given file matches the one
+ declared in the url."""
+ hashname = self.url['hashname']
+ expected_hashval = self.url['hashval']
+ if not None in (expected_hashval, hashname):
+ f = open(filename, 'rb')
+ hashval = hashlib.new(hashname)
+ hashval.update(f.read())
+ if hashval.hexdigest() != expected_hashval:
+ raise HashDoesNotMatch("got %s instead of %s"
+ % (hashval.hexdigest(), expected_hashval))
+
+ def __repr__(self):
+ return "<%s %s %s>" % (
+ self.release.name, self.release.version, self.dist_type or "")
+
+
+class ReleasesList(IndexReference):
+ """A container of ReleaseInfo objects.
+
+ Provides useful methods and facilities to sort and filter releases.
"""
- def __init__(self, list=[]):
- # To disable the ability to pass lists on instanciation
- super(PyPIDistributions, self).__init__()
- for item in list:
- self.append(item)
+ def __init__(self, name, releases=None, contains_hidden=False, index=None):
+ self.set_index(index)
+ self._releases = []
+ self.name = name
+ self.contains_hidden = contains_hidden
+ if releases:
+ self.add_releases(releases)
+
+ @property
+ def releases(self):
+ if not self._releases:
+ self.fetch_releases()
+ return self._releases
+
+ def fetch_releases(self):
+ self._index.get_releases(self.name)
+ return self.releases
def filter(self, predicate):
- """Filter the distributions and return a subset of distributions that
- match the given predicate
+ """Filter and return a subset of releases matching the given predicate.
"""
- return PyPIDistributions(
- [dist for dist in self if dist.name == predicate.name and
- predicate.match(dist.version)])
+ return ReleasesList(self.name, [release for release in self.releases
+ if predicate.match(release.version)],
+ index=self._index)
- def get_last(self, predicate, prefer_source=None, prefer_final=None):
- """Return the most up to date version, that satisfy the given
- predicate
+ def get_last(self, predicate, prefer_final=None):
+ """Return the "last" release that satisfies the given predicate.
+
+ "Last" is defined by the version number of the releases; you can also
+ set the prefer_final parameter to True or False to change the ordering
+ of the results.
"""
- distributions = self.filter(predicate)
- distributions.sort_distributions(prefer_source, prefer_final, reverse=True)
- return distributions[0]
+ releases = self.filter(predicate)
+ releases.sort_releases(prefer_final, reverse=True)
+ return releases[0]
- def get_same_name_and_version(self):
- """Return lists of PyPIDistribution objects that refer to the same
- name and version number. This do not consider the type (source, binary,
- etc.)"""
- processed = []
- duplicates = []
- for dist in self:
- if (dist.name, dist.version) not in processed:
- processed.append((dist.name, dist.version))
- found_duplicates = [d for d in self if d.name == dist.name and
- d.version == dist.version]
- if len(found_duplicates) > 1:
- duplicates.append(found_duplicates)
- return duplicates
+ def add_releases(self, releases):
+ """Add releases to the release list.
- def append(self, o):
- """Append a new distribution to the list.
+ :param releases: a list of ReleaseInfo objects.
+ """
+ for r in releases:
+ self.add_release(release=r)
- If a distribution with the same name and version exists, just grab the
- URL informations and add a new new url for the existing one.
+ def add_release(self, version=None, dist_type='sdist', release=None,
+ **dist_args):
+ """Add a release to the list.
+
+ The release can be passed in the `release` parameter; in this case,
+ it will be crawled to extract the useful information if necessary.
+ Alternatively, the release information can be passed directly in the
+ `version` and `dist_type` arguments.
+
+ Other keyword arguments can be provided, and will be forwarded to the
+ distribution creation (eg. the arguments of the DistInfo constructor).
"""
- similar_dists = [d for d in self if d.name == o.name and
- d.version == o.version and d.type == o.type]
- if len(similar_dists) > 0:
- dist = similar_dists[0]
- dist.add_url(**o.url)
+ if release:
+ if release.name.lower() != self.name.lower():
+ raise ValueError("%s is not the same project as %s" %
+ (release.name, self.name))
+ version = '%s' % release.version
+
+ if version not in self.get_versions():
+ # append only if it does not already exist
+ self._releases.append(release)
+ for dist in release.dists.values():
+ for url in dist.urls:
+ self.add_release(version, dist.dist_type, **url)
else:
- super(PyPIDistributions, self).append(o)
+ matches = [r for r in self._releases if '%s' % r.version == version
+ and r.name == self.name]
+ if not matches:
+ release = ReleaseInfo(self.name, version, index=self._index)
+ self._releases.append(release)
+ else:
+ release = matches[0]
- def sort_distributions(self, prefer_source=True, prefer_final=False,
- reverse=True, *args, **kwargs):
- """order the results with the given properties"""
+ release.add_distribution(dist_type=dist_type, **dist_args)
+
+ def sort_releases(self, prefer_final=False, reverse=True, *args, **kwargs):
+ """Sort the results with the given properties.
+
+ The `prefer_final` argument can be used to specify whether final
+ distributions (eg. not dev, beta or alpha) should be preferred.
+
+ Results can be inverted by using `reverse`.
+
+ Any other parameter provided will be forwarded to the sorted call. You
+ cannot redefine the key argument of "sorted" here, as it is used
+ internally to sort the releases.
+ """
sort_by = []
if prefer_final:
sort_by.append("is_final")
sort_by.append("version")
- if prefer_source:
- sort_by.append("is_source")
-
- super(PyPIDistributions, self).sort(
+ self.releases.sort(
key=lambda i: [getattr(i, arg) for arg in sort_by],
reverse=reverse, *args, **kwargs)
+ def get_release(self, version):
+ """Return a release from its version.
+ """
+ matches = [r for r in self.releases if "%s" % r.version == version]
+ if len(matches) != 1:
+ raise KeyError(version)
+ return matches[0]
+
+ def get_versions(self):
+ """Return a list of the release versions contained"""
+ return ["%s" % r.version for r in self._releases]
+
+ def __getitem__(self, key):
+ return self.releases[key]
+
+ def __len__(self):
+ return len(self.releases)
+
+ def __repr__(self):
+ string = 'Project "%s"' % self.name
+ if self.get_versions():
+ string += ' versions: %s' % ', '.join(self.get_versions())
+ return '<%s>' % string
+
+
+def get_infos_from_url(url, probable_dist_name=None, is_external=True):
+ """Get useful information from a URL.
+
+ Return a dict of (name, version, url, hashtype, hash, is_external)
+
+ :param url: complete url of the distribution
+ :param probable_dist_name: A probable name of the project.
+ :param is_external: Tell if the url comes from an index or from
+ an external URL.
+ """
+ # if the url contains a md5 hash, get it.
+ md5_hash = None
+ match = MD5_HASH.match(url)
+ if match is not None:
+ md5_hash = match.group(1)
+ # remove the hash
+ url = url.replace("#md5=%s" % md5_hash, "")
+
+ # parse the archive name to find dist name and version
+ archive_name = urlparse.urlparse(url)[2].split('/')[-1]
+ extension_matched = False
+ # remove the extension from the name
+ for ext in EXTENSIONS:
+ if archive_name.endswith(ext):
+ archive_name = archive_name[:-len(ext)]
+ extension_matched = True
+
+ name, version = split_archive_name(archive_name)
+ if extension_matched is True:
+ return {'name': name,
+ 'version': version,
+ 'url': url,
+ 'hashname': "md5",
+ 'hashval': md5_hash,
+ 'is_external': is_external,
+ 'dist_type': 'sdist'}
+
def split_archive_name(archive_name, probable_name=None):
"""Split an archive name into two parts: name and version.
@@ -309,7 +543,7 @@
name, version = eager_split(archive_name)
version = suggest_normalized_version(version)
- if version != "" and name != "":
+ if version is not None and name != "":
return (name.lower(), version)
else:
raise CantParseArchiveName(archive_name)
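The URL-parsing helpers above (`get_infos_from_url` and `split_archive_name`) can be illustrated with a standalone sketch. `parse_archive_url` below is a hypothetical condensed version: it splits name and version on the last dash instead of using `split_archive_name` and `suggest_normalized_version`:

```python
import re

# Same constants as in distutils2.index.dist.
MD5_HASH = re.compile(r'^.*#md5=([a-f0-9]+)$')
EXTENSIONS = ".tar.gz .tar.bz2 .tar .zip .tgz .egg".split()

def parse_archive_url(url):
    """Hypothetical condensed version of get_infos_from_url."""
    # If the url carries an "#md5=..." fragment, extract and strip it.
    md5_hash = None
    match = MD5_HASH.match(url)
    if match is not None:
        md5_hash = match.group(1)
        url = url.replace("#md5=%s" % md5_hash, "")
    # The archive name is the last path component; drop a known extension.
    archive_name = url.split('/')[-1]
    for ext in EXTENSIONS:
        if archive_name.endswith(ext):
            archive_name = archive_name[:-len(ext)]
            break
    # Naive "name-version" split on the last dash (the real code is smarter).
    name, _, version = archive_name.rpartition('-')
    return {'name': name, 'version': version, 'url': url,
            'hashname': 'md5', 'hashval': md5_hash, 'dist_type': 'sdist'}

info = parse_archive_url(
    "http://example.org/packages/FooBar-1.1.tar.gz#md5=" + "a" * 32)
```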
diff --git a/src/distutils2/pypi/errors.py b/src/distutils2/index/errors.py
rename from src/distutils2/pypi/errors.py
rename to src/distutils2/index/errors.py
--- a/src/distutils2/pypi/errors.py
+++ b/src/distutils2/index/errors.py
@@ -5,19 +5,27 @@
from distutils2.errors import DistutilsError
-class PyPIError(DistutilsError):
- """The base class for errors of the pypi python package."""
+class IndexesError(DistutilsError):
+ """The base class for errors of the index python package."""
-class DistributionNotFound(PyPIError):
- """No distribution match the given requirements."""
+class ProjectNotFound(IndexesError):
+ """Project has not been found"""
-class CantParseArchiveName(PyPIError):
+class DistributionNotFound(IndexesError):
+ """The distribution has not been found"""
+
+
+class ReleaseNotFound(IndexesError):
+ """The release has not been found"""
+
+
+class CantParseArchiveName(IndexesError):
"""An archive name can't be parsed to find distribution name and version"""
-class DownloadError(PyPIError):
+class DownloadError(IndexesError):
"""An error has occurred while downloading"""
@@ -25,9 +33,13 @@
"""Compared hashes do not match"""
-class UnsupportedHashName(PyPIError):
+class UnsupportedHashName(IndexesError):
+ """An unsupported hashname has been used"""
-class UnableToDownload(PyPIError):
+class UnableToDownload(IndexesError):
"""All mirrors have been tried, without success"""
+
+
+class InvalidSearchField(IndexesError):
+ """An invalid search field has been used"""
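Since every exception above derives from `IndexesError`, client code can catch the whole family at once. A minimal standalone sketch (the `lookup` helper is hypothetical, and plain `Exception` stands in for `DistutilsError`):

```python
class IndexesError(Exception):
    """Stand-in base class (the real one derives from DistutilsError)."""

class ProjectNotFound(IndexesError):
    """Project has not been found"""

def lookup(name, known=("foobar",)):
    # Hypothetical helper: raise the specific error for unknown projects.
    if name.lower() not in known:
        raise ProjectNotFound(name)
    return name

try:
    lookup("does-not-exist")
except IndexesError as exc:
    # Catching the base class covers every index-related failure.
    caught = type(exc).__name__
```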
diff --git a/src/distutils2/index/mirrors.py b/src/distutils2/index/mirrors.py
new file mode 100644
--- /dev/null
+++ b/src/distutils2/index/mirrors.py
@@ -0,0 +1,52 @@
+"""Utilities related to the mirror infrastructure defined in PEP 381.
+See http://www.python.org/dev/peps/pep-0381/
+"""
+
+from string import ascii_lowercase
+import socket
+
+DEFAULT_MIRROR_URL = "last.pypi.python.org"
+
+def get_mirrors(hostname=None):
+ """Return the list of mirrors from the last record found on the DNS
+ entry::
+
+ >>> from distutils2.index.mirrors import get_mirrors
+ >>> get_mirrors()
+ ['a.pypi.python.org', 'b.pypi.python.org', 'c.pypi.python.org',
+ 'd.pypi.python.org']
+
+ """
+ if hostname is None:
+ hostname = DEFAULT_MIRROR_URL
+
+ # return the last mirror registered on PyPI.
+ try:
+ hostname = socket.gethostbyname_ex(hostname)[0]
+ except socket.gaierror:
+ return []
+ end_letter = hostname.split(".", 1)
+
+ # determine the list from the last one.
+ return ["%s.%s" % (s, end_letter[1]) for s in string_range(end_letter[0])]
+
+def string_range(last):
+ """Compute the range of strings between "a" and last.
+
+ This works for simple "a to z" lists, but also for "a to zz" lists.
+ """
+ for k in range(len(last)):
+ for x in product(ascii_lowercase, repeat=k+1):
+ result = ''.join(x)
+ yield result
+ if result == last:
+ return
+
+def product(*args, **kwds):
+ pools = map(tuple, args) * kwds.get('repeat', 1)
+ result = [[]]
+ for pool in pools:
+ result = [x+[y] for x in result for y in pool]
+ for prod in result:
+ yield tuple(prod)
+
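The `string_range` generator above enumerates mirror host prefixes 'a', 'b', …, 'z', 'aa', … up to and including the last registered one. A standalone copy, using `itertools.product` (for which the module above defines its own fallback for Python < 2.6):

```python
from string import ascii_lowercase
from itertools import product

def string_range(last):
    """Yield 'a', 'b', ... up to and including `last` ('z' rolls to 'aa')."""
    for k in range(len(last)):
        for x in product(ascii_lowercase, repeat=k + 1):
            result = ''.join(x)
            yield result
            if result == last:
                return

# With a last mirror of "c.pypi.python.org", three mirrors are enumerated.
letters = list(string_range("c"))
mirrors = ["%s.pypi.python.org" % s for s in letters]
```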
diff --git a/src/distutils2/pypi/simple.py b/src/distutils2/index/simple.py
rename from src/distutils2/pypi/simple.py
rename to src/distutils2/index/simple.py
--- a/src/distutils2/pypi/simple.py
+++ b/src/distutils2/index/simple.py
@@ -1,6 +1,6 @@
-"""pypi.simple
+"""index.simple
-Contains the class "SimpleIndex", a simple spider to find and retrieve
+Contains the class "SimpleIndexCrawler", a simple spider to find and retrieve
distributions on the Python Package Index, using its "simple" API,
available at http://pypi.python.org/simple/
"""
@@ -11,17 +11,23 @@
import sys
import urllib2
import urlparse
+import logging
+import os
-from distutils2.version import VersionPredicate
-from distutils2.pypi.dist import (PyPIDistribution, PyPIDistributions,
- EXTENSIONS)
-from distutils2.pypi.errors import (PyPIError, DistributionNotFound,
- DownloadError, UnableToDownload)
+from distutils2.index.base import BaseClient
+from distutils2.index.dist import (ReleasesList, EXTENSIONS,
+ get_infos_from_url, MD5_HASH)
+from distutils2.index.errors import (IndexesError, DownloadError,
+ UnableToDownload, CantParseArchiveName,
+ ReleaseNotFound, ProjectNotFound)
+from distutils2.index.mirrors import get_mirrors
+from distutils2.metadata import DistributionMetadata
from distutils2 import __version__ as __distutils2_version__
+__all__ = ['Crawler', 'DEFAULT_SIMPLE_INDEX_URL']
+
# -- Constants -----------------------------------------------
-PYPI_DEFAULT_INDEX_URL = "http://pypi.python.org/simple/"
-PYPI_DEFAULT_MIRROR_URL = "mirrors.pypi.python.org"
+DEFAULT_SIMPLE_INDEX_URL = "http://a.pypi.python.org/simple/"
DEFAULT_HOSTS = ("*",)
SOCKET_TIMEOUT = 15
USER_AGENT = "Python-urllib/%s distutils2/%s" % (
@@ -30,9 +36,6 @@
# -- Regexps -------------------------------------------------
EGG_FRAGMENT = re.compile(r'^egg=([-A-Za-z0-9_.]+)$')
HREF = re.compile("""href\\s*=\\s*['"]?([^'"> ]+)""", re.I)
-PYPI_MD5 = re.compile(
- '<a href="([^"#]+)">([^<]+)</a>\n\s+\\(<a (?:title="MD5 hash"\n\s+)'
- 'href="[^?]+\?:action=show_md5&digest=([0-9a-f]{32})">md5</a>\\)')
URL_SCHEME = re.compile('([-+.a-z0-9]{2,}):', re.I).match
# This pattern matches a character entity reference (a decimal numeric
@@ -58,10 +61,39 @@
return _socket_timeout
-class SimpleIndex(object):
- """Provides useful tools to request the Python Package Index simple API
+def with_mirror_support():
+ """Decorator that makes the mirroring support easier"""
+ def wrapper(func):
+ def wrapped(self, *args, **kwargs):
+ try:
+ return func(self, *args, **kwargs)
+ except DownloadError:
+ # if an error occurs, try with the next index_url
+ if self._mirrors_tries >= self._mirrors_max_tries:
+ try:
+ self._switch_to_next_mirror()
+ except KeyError:
+ raise UnableToDownload("Tried all mirrors")
+ else:
+ self._mirrors_tries += 1
+ self._projects.clear()
+ return wrapped(self, *args, **kwargs)
+ return wrapped
+ return wrapper
+
+
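`with_mirror_support` follows the common retry-decorator pattern: catch the failure, advance some state, and call the wrapped function again. A simplified standalone sketch (names like `with_retry` and the `IOError`-based failure are illustrative, not the distutils2 code):

```python
def with_retry(max_tries):
    """Retry-decorator sketch: re-invoke the call until max_tries is hit."""
    def wrapper(func):
        def wrapped(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except IOError:
                if self._tries >= max_tries:
                    raise  # out of retries, propagate the error
                self._tries += 1
                return wrapped(self, *args, **kwargs)
        return wrapped
    return wrapper

class FlakyClient(object):
    """Fails a configurable number of times before succeeding."""
    def __init__(self, fail_times):
        self._tries = 0
        self._fail_times = fail_times

    @with_retry(max_tries=3)
    def fetch(self):
        if self._fail_times > 0:
            self._fail_times -= 1
            raise IOError("simulated download error")
        return "ok"

client = FlakyClient(fail_times=2)
result = client.fetch()
```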
+class Crawler(BaseClient):
+ """Provides useful tools to request the Python Package Index simple API.
+
+ You can specify both mirrors and mirrors_url, but mirrors_url will only be
+ used if mirrors is set to None.
:param index_url: the url of the simple index to search on.
+ :param prefer_final: if the version is not mentioned, and the last
+ version is not a "final" one (alpha, beta, etc.),
+ pick up the last final version.
+ :param prefer_source: if the distribution type is not mentioned, pick up
+ the source one if available.
:param follow_externals: tell if following external links is needed or
not. Default is False.
:param hosts: a list of hosts allowed to be processed while using
@@ -69,38 +101,33 @@
hosts.
- :param prefer_source: if there is binary and source distributions, the
- source prevails.
- :param prefer_final: if the version is not mentioned, and the last
- version is not a "final" one (alpha, beta, etc.),
- pick up the last final version.
:param mirrors_url: the url to look on for DNS records giving mirror
adresses.
- :param mirrors: a list of mirrors to check out if problems
- occurs while working with the one given in "url"
+ :param mirrors: a list of mirrors (see PEP 381).
:param timeout: time in seconds after which a url is considered to have timed out.
+ :param mirrors_max_tries: number of times to try requesting information
+ on a mirror before switching to the next one.
"""
- def __init__(self, index_url=PYPI_DEFAULT_INDEX_URL, hosts=DEFAULT_HOSTS,
- follow_externals=False, prefer_source=True,
- prefer_final=False, mirrors_url=PYPI_DEFAULT_MIRROR_URL,
- mirrors=None, timeout=SOCKET_TIMEOUT):
+ def __init__(self, index_url=DEFAULT_SIMPLE_INDEX_URL, prefer_final=False,
+ prefer_source=True, hosts=DEFAULT_HOSTS,
+ follow_externals=False, mirrors_url=None, mirrors=None,
+ timeout=SOCKET_TIMEOUT, mirrors_max_tries=0):
+ super(Crawler, self).__init__(prefer_final, prefer_source)
self.follow_externals = follow_externals
+ # mirroring attributes.
if not index_url.endswith("/"):
index_url += "/"
- self._index_urls = [index_url]
# if no mirrors are defined, use the method described in PEP 381.
if mirrors is None:
- try:
- mirrors = socket.gethostbyname_ex(mirrors_url)[-1]
- except socket.gaierror:
- mirrors = []
- self._index_urls.extend(mirrors)
- self._current_index_url = 0
+ mirrors = get_mirrors(mirrors_url)
+ self._mirrors = set(mirrors)
+ self._mirrors_used = set()
+ self.index_url = index_url
+ self._mirrors_max_tries = mirrors_max_tries
+ self._mirrors_tries = 0
self._timeout = timeout
- self._prefer_source = prefer_source
- self._prefer_final = prefer_final
# create a regexp to match all given hosts
self._allowed_hosts = re.compile('|'.join(map(translate, hosts))).match
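The allowed-hosts check above collapses every host pattern into a single compiled regex. A minimal standalone sketch of the same idea, assuming `translate` is the glob-to-regex converter from `fnmatch` (distutils2 ships its own version in `util.py`):

```python
import fnmatch
import re

def make_host_matcher(hosts):
    # Join each glob-translated pattern into one alternation and keep
    # only the bound match method, as the crawler does.
    return re.compile('|'.join(map(fnmatch.translate, hosts))).match

is_allowed = make_host_matcher(["*.python.org", "example.com"])
```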
@@ -109,96 +136,84 @@
# scanning them multiple times (e.g. if multiple pages point
# to one)
self._processed_urls = []
- self._distributions = {}
+ self._projects = {}
- def find(self, requirements, prefer_source=None, prefer_final=None):
- """Browse the PyPI to find distributions that fullfil the given
- requirements.
+ @with_mirror_support()
+ def search_projects(self, name=None, **kwargs):
+ """Search the index for projects containing the given name.
- :param requirements: A project name and it's distribution, using
- version specifiers, as described in PEP345.
- :type requirements: You can pass either a version.VersionPredicate
- or a string.
- :param prefer_source: if there is binary and source distributions, the
- source prevails.
- :param prefer_final: if the version is not mentioned, and the last
- version is not a "final" one (alpha, beta, etc.),
- pick up the last final version.
+ Return a list of matching projects.
"""
- requirements = self._get_version_predicate(requirements)
- if prefer_source is None:
- prefer_source = self._prefer_source
- if prefer_final is None:
- prefer_final = self._prefer_final
+ index = self._open_url(self.index_url)
+ projectname = re.compile("""<a[^>]*>(.?[^<]*%s.?[^<]*)</a>""" % name,
+ flags=re.I)
+ matching_projects = []
+ for match in projectname.finditer(index.read()):
+ project_name = match.group(1)
+ matching_projects.append(self._get_project(project_name))
+ return matching_projects
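The scraping in `search_projects` can be exercised against a static page. This sketch uses a simplified pattern (with `re.escape` added as a hardening assumption not in the original) and returns the matched link texts instead of project objects:

```python
import re

INDEX_HTML = """
<a class="test" href="yeah">FooBar-bar</a>
<a class="test" href="yeah">Baz</a>
<a class="test" href="yeah">Foo</a>
"""

def search_index(content, name):
    # Case-insensitive scan of every <a> tag whose text contains `name`.
    pattern = re.compile(r"<a[^>]*>([^<]*%s[^<]*)</a>" % re.escape(name),
                         re.I)
    return [match.group(1) for match in pattern.finditer(content)]
```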
- # process the index for this project
- self._process_pypi_page(requirements.name)
-
- # filter with requirements and return the results
- if requirements.name in self._distributions:
- dists = self._distributions[requirements.name].filter(requirements)
- dists.sort_distributions(prefer_source=prefer_source,
- prefer_final=prefer_final)
- else:
- dists = []
-
- return dists
-
- def get(self, requirements, *args, **kwargs):
- """Browse the PyPI index to find distributions that fullfil the
- given requirements, and return the most recent one.
-
- You can specify prefer_final and prefer_source arguments here.
- If not, the default one will be used.
+ def get_releases(self, requirements, prefer_final=None,
+ force_update=False):
+ """Search for releases and return a ReleaseList object containing
+ the results.
"""
predicate = self._get_version_predicate(requirements)
- dists = self.find(predicate, *args, **kwargs)
+ if predicate.name.lower() in self._projects and not force_update:
+ return self._projects.get(predicate.name.lower())
+ prefer_final = self._get_prefer_final(prefer_final)
+ self._process_index_page(predicate.name)
- if len(dists) == 0:
- raise DistributionNotFound(requirements)
+ if predicate.name.lower() not in self._projects:
+ raise ProjectNotFound()
- return dists.get_last(predicate)
+ releases = self._projects.get(predicate.name.lower())
+ releases.sort_releases(prefer_final=prefer_final)
+ return releases
- def download(self, requirements, temp_path=None, *args, **kwargs):
- """Download the distribution, using the requirements.
+ def get_release(self, requirements, prefer_final=None):
+ """Return only one release that fulfill the given requirements"""
+ predicate = self._get_version_predicate(requirements)
+ release = self.get_releases(predicate, prefer_final)\
+ .get_last(predicate)
+ if not release:
+ raise ReleaseNotFound("No release matches the given criterias")
+ return release
- If more than one distribution match the requirements, use the last
- version.
- Download the distribution, and put it in the temp_path. If no temp_path
- is given, creates and return one.
+ def get_distributions(self, project_name, version):
+ """Return the distributions found on the index for the specific given
+ release"""
+ # as the default behavior of get_release is to return a release
+ # containing the distributions, just alias it.
+ return self.get_release("%s (%s)" % (project_name, version))
- Returns the complete absolute path to the downloaded archive.
+ def get_metadata(self, project_name, version):
+ """Return the metadatas from the simple index.
- :param requirements: The same as the find attribute of `find`.
-
- You can specify prefer_final and prefer_source arguments here.
- If not, the default one will be used.
+ Currently, this downloads one archive, extracts it and reads its PKG-INFO file.
"""
- return self.get(requirements, *args, **kwargs)\
- .download(path=temp_path)
-
- def _get_version_predicate(self, requirements):
- """Return a VersionPredicate object, from a string or an already
- existing object.
- """
- if isinstance(requirements, str):
- requirements = VersionPredicate(requirements)
- return requirements
-
- @property
- def index_url(self):
- return self._index_urls[self._current_index_url]
+ release = self.get_distributions(project_name, version)
+ if not release._metadata:
+ location = release.get_distribution().unpack()
+ pkg_info = os.path.join(location, 'PKG-INFO')
+ release._metadata = DistributionMetadata(pkg_info)
+ return release
def _switch_to_next_mirror(self):
"""Switch to the next mirror (eg. point self.index_url to the next
- url.
+ mirror url.
+
+ Raise a KeyError if all mirrors have been tried.
"""
- # Internally, iter over the _index_url iterable, if we have read all
- # of the available indexes, raise an exception.
- if self._current_index_url < len(self._index_urls):
- self._current_index_url = self._current_index_url + 1
- else:
- raise UnableToDownload("All mirrors fails")
+ self._mirrors_used.add(self.index_url)
+ index_url = self._mirrors.pop()
+ if not ("http://" or "https://" or "file://") in index_url:
+ index_url = "http://%s" % index_url
+
+ if not index_url.endswith("/simple"):
+ index_url = "%s/simple/" % index_url
+
+ self.index_url = index_url
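The cleanup `_switch_to_next_mirror` applies to a freshly popped mirror can be factored into one helper; this is a standalone sketch of the intended normalization (the function name is illustrative, and the original's `("http://" or ...) in url` test is a known precedence pitfall):

```python
def normalize_mirror_url(url):
    # Bare mirror hosts from DNS records lack a scheme; default to http.
    if not url.startswith(("http://", "https://", "file://")):
        url = "http://%s" % url
    # Mirrors serve the simple API under /simple/.
    if not url.endswith("/"):
        url += "/"
    if not url.endswith("/simple/"):
        url = "%ssimple/" % url
    return url
```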
def _is_browsable(self, url):
"""Tell if the given URL can be browsed or not.
@@ -228,18 +243,34 @@
return True
return False
- def _register_dist(self, dist):
- """Register a distribution as a part of fetched distributions for
- SimpleIndex.
+ def _register_release(self, release=None, release_info={}):
+ """Register a new release.
- Return the PyPIDistributions object for the specified project name
+ Either a release object or a dict of release info can be provided; the
+ preferred (i.e. quicker) way is the dict.
+
+ Return the list of existing releases for the given project.
"""
- # Internally, check if a entry exists with the project name, if not,
- # create a new one, and if exists, add the dist to the pool.
- if not dist.name in self._distributions:
- self._distributions[dist.name] = PyPIDistributions()
- self._distributions[dist.name].append(dist)
- return self._distributions[dist.name]
+ # Check if the project already has a list of releases (referring to
+ # the project name). If not, create a new release list.
+ # Then, add the release to the list.
+ if release:
+ name = release.name
+ else:
+ name = release_info['name']
+ if not name.lower() in self._projects:
+ self._projects[name.lower()] = ReleasesList(name,
+ index=self._index)
+
+ if release:
+ self._projects[name.lower()].add_release(release=release)
+ else:
+ name = release_info.pop('name')
+ version = release_info.pop('version')
+ dist_type = release_info.pop('dist_type')
+ self._projects[name.lower()].add_release(version, dist_type,
+ **release_info)
+ return self._projects[name.lower()]
def _process_url(self, url, project_name=None, follow_links=True):
"""Process an url and search for distributions packages.
@@ -264,9 +295,14 @@
if self._is_distribution(link) or is_download:
self._processed_urls.append(link)
# it's a distribution, so create a dist object
- dist = PyPIDistribution.from_url(link, project_name,
- is_external=not self.index_url in url)
- self._register_dist(dist)
+ try:
+ infos = get_infos_from_url(link, project_name,
+ is_external=not self.index_url in url)
+ except CantParseArchiveName, e:
+ logging.warning("version has not been parsed: %s"
+ % e)
+ else:
+ self._register_release(release_info=infos)
else:
if self._is_browsable(link) and follow_links:
self._process_url(link, project_name,
@@ -280,6 +316,9 @@
else:
return self._default_link_matcher
+ def _get_full_url(self, url, base_url):
+ return urlparse.urljoin(base_url, self._htmldecode(url))
+
def _simple_link_matcher(self, content, base_url):
"""Yield all links with a rel="download" or rel="homepage".
@@ -287,41 +326,39 @@
If follow_externals is set to False, don't yield the external
urls.
"""
+ for match in HREF.finditer(content):
+ url = self._get_full_url(match.group(1), base_url)
+ if MD5_HASH.match(url):
+ yield (url, True)
+
for match in REL.finditer(content):
+ # search for rel links.
tag, rel = match.groups()
rels = map(str.strip, rel.lower().split(','))
if 'homepage' in rels or 'download' in rels:
for match in HREF.finditer(tag):
- url = urlparse.urljoin(base_url,
- self._htmldecode(match.group(1)))
+ url = self._get_full_url(match.group(1), base_url)
if 'download' in rels or self._is_browsable(url):
# yield a list of (url, is_download)
- yield (urlparse.urljoin(base_url, url),
- 'download' in rels)
+ yield (url, 'download' in rels)
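The rel-link scan above relies on the module's `REL` and `HREF` constants, which are not shown in this hunk; the stand-in patterns below are assumptions that reproduce the same behavior on simple anchors:

```python
import re

# Stand-ins for distutils2's REL and HREF constants (assumed shapes).
REL = re.compile(r'<([^>]*\srel\s*=\s*[\'"]?([^\'">]+)[^>]*)>', re.I)
HREF = re.compile(r'href\s*=\s*[\'"]?([^\'"> ]+)', re.I)

def find_rel_links(content):
    # Yield (url, is_download) pairs for rel="download"/"homepage" anchors.
    for match in REL.finditer(content):
        tag, rel = match.groups()
        rels = [part.strip() for part in rel.lower().split(',')]
        if 'homepage' in rels or 'download' in rels:
            for href in HREF.finditer(tag):
                yield (href.group(1), 'download' in rels)

page = '<a rel="download" href="http://example.org/foo-1.0.tar.gz">src</a>'
links = list(find_rel_links(page))
```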
def _default_link_matcher(self, content, base_url):
"""Yield all links found on the page.
"""
for match in HREF.finditer(content):
- url = urlparse.urljoin(base_url, self._htmldecode(match.group(1)))
+ url = self._get_full_url(match.group(1), base_url)
if self._is_browsable(url):
yield (url, False)
- def _process_pypi_page(self, name):
+ @with_mirror_support()
+ def _process_index_page(self, name):
"""Find and process a PyPI page for the given project name.
:param name: the name of the project to find the page
"""
- try:
- # Browse and index the content of the given PyPI page.
- url = self.index_url + name + "/"
- self._process_url(url, name)
- except DownloadError:
- # if an error occurs, try with the next index_url
- # (provided by the mirrors)
- self._switch_to_next_mirror()
- self._distributions.clear()
- self._process_pypi_page(name)
+ # Browse and index the content of the given PyPI page.
+ url = self.index_url + name + "/"
+ self._process_url(url, name)
@socket_timeout()
def _open_url(self, url):
@@ -329,44 +366,35 @@
files support.
"""
+ scheme, netloc, path, params, query, frag = urlparse.urlparse(url)
+
+ # authentication stuff
+ if scheme in ('http', 'https'):
+ auth, host = urllib2.splituser(netloc)
+ else:
+ auth = None
+
+ # add index.html automatically for filesystem paths
+ if scheme == 'file':
+ if url.endswith('/'):
+ url += "index.html"
+
+ # add authorization headers if auth is provided
+ if auth:
+ auth = "Basic " + \
+ urllib2.unquote(auth).encode('base64').strip()
+ new_url = urlparse.urlunparse((
+ scheme, host, path, params, query, frag))
+ request = urllib2.Request(new_url)
+ request.add_header("Authorization", auth)
+ else:
+ request = urllib2.Request(url)
+ request.add_header('User-Agent', USER_AGENT)
try:
- scheme, netloc, path, params, query, frag = urlparse.urlparse(url)
-
- if scheme in ('http', 'https'):
- auth, host = urllib2.splituser(netloc)
- else:
- auth = None
-
- # add index.html automatically for filesystem paths
- if scheme == 'file':
- if url.endswith('/'):
- url += "index.html"
-
- if auth:
- auth = "Basic " + \
- urllib2.unquote(auth).encode('base64').strip()
- new_url = urlparse.urlunparse((
- scheme, host, path, params, query, frag))
- request = urllib2.Request(new_url)
- request.add_header("Authorization", auth)
- else:
- request = urllib2.Request(url)
- request.add_header('User-Agent', USER_AGENT)
fp = urllib2.urlopen(request)
-
- if auth:
- # Put authentication info back into request URL if same host,
- # so that links found on the page will work
- s2, h2, path2, param2, query2, frag2 = \
- urlparse.urlparse(fp.url)
- if s2 == scheme and h2 == host:
- fp.url = urlparse.urlunparse(
- (s2, netloc, path2, param2, query2, frag2))
-
- return fp
except (ValueError, httplib.InvalidURL), v:
msg = ' '.join([str(arg) for arg in v.args])
- raise PyPIError('%s %s' % (url, msg))
+ raise IndexesError('%s %s' % (url, msg))
except urllib2.HTTPError, v:
return v
except urllib2.URLError, v:
@@ -376,6 +404,18 @@
'The server might be down, %s' % (url, v.line))
except httplib.HTTPException, v:
raise DownloadError("Download error for %s: %s" % (url, v))
+ except socket.timeout:
+ raise DownloadError("The server timeouted")
+
+ if auth:
+ # Put authentication info back into request URL if same host,
+ # so that links found on the page will work
+ s2, h2, path2, param2, query2, frag2 = \
+ urlparse.urlparse(fp.url)
+ if s2 == scheme and h2 == host:
+ fp.url = urlparse.urlunparse(
+ (s2, netloc, path2, param2, query2, frag2))
+ return fp
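The authentication branch above splits `user:password` out of the netloc and base64-encodes it. A standalone sketch of that header construction, using the `urllib.parse`/`base64` spellings of the Python 2 helpers in the diff:

```python
import base64
from urllib.parse import urlsplit

def basic_auth_header(url):
    # Pull user:password out of the URL's netloc, as _open_url does
    # with urllib2.splituser, and build the Authorization header value.
    parts = urlsplit(url)
    if parts.username is None:
        return None
    creds = "%s:%s" % (parts.username, parts.password or "")
    return "Basic " + base64.b64encode(creds.encode("utf-8")).decode("ascii")

header = basic_auth_header("http://user:secret@index.example/simple/")
```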
def _decode_entity(self, match):
what = match.group(1)
diff --git a/src/distutils2/index/wrapper.py b/src/distutils2/index/wrapper.py
new file mode 100644
--- /dev/null
+++ b/src/distutils2/index/wrapper.py
@@ -0,0 +1,93 @@
+import xmlrpc
+import simple
+
+_WRAPPER_MAPPINGS = {'get_release': 'simple',
+ 'get_releases': 'simple',
+ 'search_projects': 'simple',
+ 'get_metadata': 'xmlrpc',
+ 'get_distributions': 'simple'}
+
+_WRAPPER_INDEXES = {'xmlrpc': xmlrpc.Client,
+ 'simple': simple.Crawler}
+
+def switch_index_if_fails(func, wrapper):
+ """Decorator that switch of index (for instance from xmlrpc to simple)
+ if the first mirror return an empty list or raises an exception.
+ """
+ def decorator(*args, **kwargs):
+ retry = True
+ exception = None
+ methods = [func]
+ for f in wrapper._indexes.values():
+ if f != func.im_self and hasattr(f, func.__name__):
+ methods.append(getattr(f, func.__name__))
+ for method in methods:
+ try:
+ response = method(*args, **kwargs)
+ retry = False
+ except Exception, e:
+ exception = e
+ if not retry:
+ break
+ if retry and exception:
+ raise exception
+ else:
+ return response
+ return decorator
+
+
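The decorator above collects the same-named method from every wrapped index and tries each in order. A self-contained sketch of that fallback pattern (names are illustrative, not distutils2 API):

```python
def with_fallback(primary, fallbacks):
    # Try primary first, then each fallback; re-raise the last
    # exception only if every candidate failed.
    def wrapper(*args, **kwargs):
        last_exc = None
        for method in [primary] + list(fallbacks):
            try:
                return method(*args, **kwargs)
            except Exception as exc:
                last_exc = exc
        raise last_exc
    return wrapper

def flaky_index(name):
    raise RuntimeError("index down")

def stable_index(name):
    return [name]

search = with_fallback(flaky_index, [stable_index])
```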
+class ClientWrapper(object):
+ """Wrapper around simple and xmlrpc clients,
+
+ Choose the best implementation to use depending on the needs, using the
+ given mappings.
+ If one of the indexes returns an error, try the other indexes.
+
+ :param default_index: tell which index to rely on by default.
+ :param index_classes: a dict of name:class to use as indexes.
+ :param indexes: a dict of name:index already instantiated
+ :param mappings: the mappings to use for this wrapper
+ """
+
+ def __init__(self, default_index='simple', index_classes=_WRAPPER_INDEXES,
+ indexes={}, mappings=_WRAPPER_MAPPINGS):
+ self._projects = {}
+ self._mappings = mappings
+ self._indexes = indexes
+ self._default_index = default_index
+
+ # instantiate the classes and set their _project attribute to the one
+ # of the wrapper.
+ for name, cls in index_classes.items():
+ obj = self._indexes.setdefault(name, cls())
+ obj._projects = self._projects
+ obj._index = self
+
+ def __getattr__(self, method_name):
+ """When asking for methods of the wrapper, return the implementation of
+ the wrapped classes, depending the mapping.
+
+ Decorate the methods to switch of implementation if an error occurs
+ """
+ real_method = None
+ if method_name in _WRAPPER_MAPPINGS:
+ obj = self._indexes[_WRAPPER_MAPPINGS[method_name]]
+ real_method = getattr(obj, method_name)
+ else:
+ # the method is not defined in the mappings, so first try to get
+ # it via the default index, then fall back to the others if needed.
+ try:
+ real_method = getattr(self._indexes[self._default_index],
+ method_name)
+ except AttributeError:
+ other_indexes = [i for i in self._indexes
+ if i != self._default_index]
+ for index in other_indexes:
+ real_method = getattr(self._indexes[index], method_name, None)
+ if real_method:
+ break
+ if real_method:
+ return switch_index_if_fails(real_method, self)
+ else:
+ raise AttributeError("No index have attribute '%s'" % method_name)
+
diff --git a/src/distutils2/index/xmlrpc.py b/src/distutils2/index/xmlrpc.py
new file mode 100644
--- /dev/null
+++ b/src/distutils2/index/xmlrpc.py
@@ -0,0 +1,175 @@
+import logging
+import xmlrpclib
+
+from distutils2.errors import IrrationalVersionError
+from distutils2.index.base import BaseClient
+from distutils2.index.errors import ProjectNotFound, InvalidSearchField
+from distutils2.index.dist import ReleaseInfo
+
+__all__ = ['Client', 'DEFAULT_XMLRPC_INDEX_URL']
+
+DEFAULT_XMLRPC_INDEX_URL = 'http://python.org/pypi'
+
+_SEARCH_FIELDS = ['name', 'version', 'author', 'author_email', 'maintainer',
+ 'maintainer_email', 'home_page', 'license', 'summary',
+ 'description', 'keywords', 'platform', 'download_url']
+
+
+class Client(BaseClient):
+ """Client to query indexes using XML-RPC method calls.
+
+ If no server_url is specified, use the default PyPI XML-RPC URL,
+ defined in the DEFAULT_XMLRPC_INDEX_URL constant::
+
+ >>> client = Client()
+ >>> client.server_url == DEFAULT_XMLRPC_INDEX_URL
+ True
+
+ >>> client = XMLRPCClient("http://someurl/")
+ >>> client.server_url
+ 'http://someurl/'
+ """
+
+ def __init__(self, server_url=DEFAULT_XMLRPC_INDEX_URL, prefer_final=False,
+ prefer_source=True):
+ super(Client, self).__init__(prefer_final, prefer_source)
+ self.server_url = server_url
+ self._projects = {}
+
+ def get_release(self, requirements, prefer_final=False):
+ """Return a release with all complete metadata and distribution
+ related informations.
+ """
+ prefer_final = self._get_prefer_final(prefer_final)
+ predicate = self._get_version_predicate(requirements)
+ releases = self.get_releases(predicate.name)
+ release = releases.get_last(predicate, prefer_final)
+ self.get_metadata(release.name, "%s" % release.version)
+ self.get_distributions(release.name, "%s" % release.version)
+ return release
+
+ def get_releases(self, requirements, prefer_final=None, show_hidden=True,
+ force_update=False):
+ """Return the list of existing releases for a specific project.
+
+ Cache the results from one call to another.
+
+ If show_hidden is True, return the hidden releases too.
+ If force_update is True, reprocess the index to update the
+ information (i.e. make a new XML-RPC call).
+ ::
+
+ >>> client = Client()
+ >>> client.get_releases('Foo')
+ ['1.1', '1.2', '1.3']
+
+ If no such project exists, raise a ProjectNotFound exception::
+
+ >>> client.get_releases('UnexistingProject')
+ ProjectNotFound: UnexistingProject
+
+ """
+ def get_versions(project_name, show_hidden):
+ return self.proxy.package_releases(project_name, show_hidden)
+
+ predicate = self._get_version_predicate(requirements)
+ prefer_final = self._get_prefer_final(prefer_final)
+ project_name = predicate.name
+ if not force_update and (project_name.lower() in self._projects):
+ project = self._projects[project_name.lower()]
+ if not project.contains_hidden and show_hidden:
+ # hidden releases are requested but the existing list of
+ # releases does not contain them
+ all_versions = get_versions(project_name, show_hidden)
+ existing_versions = project.get_versions()
+ hidden_versions = list(set(all_versions) -
+ set(existing_versions))
+ for version in hidden_versions:
+ project.add_release(release=ReleaseInfo(project_name,
+ version, index=self._index))
+ else:
+ versions = get_versions(project_name, show_hidden)
+ if not versions:
+ raise ProjectNotFound(project_name)
+ project = self._get_project(project_name)
+ project.add_releases([ReleaseInfo(project_name, version,
+ index=self._index)
+ for version in versions])
+ project = project.filter(predicate)
+ project.sort_releases(prefer_final)
+ return project
+
+
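The hidden-release handling in `get_releases` boils down to a set difference between what the index reports and what is already cached; as a sketch:

```python
def missing_versions(all_versions, known_versions):
    # Versions the XML-RPC index reports but the local project list
    # does not yet hold; these get added as ReleaseInfo objects.
    return sorted(set(all_versions) - set(known_versions))

missing = missing_versions(['1.0', '1.1', '1.2'], ['1.0', '1.2'])
```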
+ def get_distributions(self, project_name, version):
+ """Grab informations about distributions from XML-RPC.
+
+ Return a ReleaseInfo object, with distribution-related information
+ filled in.
+ """
+ url_infos = self.proxy.release_urls(project_name, version)
+ project = self._get_project(project_name)
+ if version not in project.get_versions():
+ project.add_release(release=ReleaseInfo(project_name, version,
+ index=self._index))
+ release = project.get_release(version)
+ for info in url_infos:
+ packagetype = info['packagetype']
+ dist_infos = {'url': info['url'],
+ 'hashval': info['md5_digest'],
+ 'hashname': 'md5',
+ 'is_external': False,
+ 'python_version': info['python_version']}
+ release.add_distribution(packagetype, **dist_infos)
+ return release
+
+ def get_metadata(self, project_name, version):
+ """Retreive project metadatas.
+
+ Return a ReleaseInfo object, with metadata informations filled in.
+ """
+ metadata = self.proxy.release_data(project_name, version)
+ project = self._get_project(project_name)
+ if version not in project.get_versions():
+ project.add_release(release=ReleaseInfo(project_name, version,
+ index=self._index))
+ release = project.get_release(version)
+ release.set_metadata(metadata)
+ return release
+
+ def search_projects(self, name=None, operator="or", **kwargs):
+ """Find using the keys provided in kwargs.
+
+ You can set operator to "and" or "or".
+ """
+ for key in kwargs:
+ if key not in _SEARCH_FIELDS:
+ raise InvalidSearchField(key)
+ if name:
+ kwargs["name"] = name
+ projects = self.proxy.search(kwargs, operator)
+ for p in projects:
+ project = self._get_project(p['name'])
+ try:
+ project.add_release(release=ReleaseInfo(p['name'],
+ p['version'], metadata={'summary': p['summary']},
+ index=self._index))
+ except IrrationalVersionError, e:
+ logging.warn("Irrational version error found: %s" % e)
+
+ return [self._projects[p['name'].lower()] for p in projects]
+
+ @property
+ def proxy(self):
+ """Property used to return the XMLRPC server proxy.
+
+ If no server proxy is defined yet, creates a new one::
+
+ >>> client = Client()
+ >>> client.proxy
+ <ServerProxy for python.org/pypi>
+
+ """
+ if not hasattr(self, '_server_proxy'):
+ self._server_proxy = xmlrpclib.ServerProxy(self.server_url)
+
+ return self._server_proxy
diff --git a/src/distutils2/metadata.py b/src/distutils2/metadata.py
--- a/src/distutils2/metadata.py
+++ b/src/distutils2/metadata.py
@@ -183,9 +183,12 @@
"""The metadata of a release.
Supports versions 1.0, 1.1 and 1.2 (auto-detected).
+
+ If the mapping argument is given, all its key/value pairs will be
+ passed to the "update" method, building the metadata from the dict.
"""
def __init__(self, path=None, platform_dependent=False,
- execution_context=None, fileobj=None):
+ execution_context=None, fileobj=None, mapping=None):
self._fields = {}
self.version = None
self.docutils_support = _HAS_DOCUTILS
@@ -195,6 +198,8 @@
elif fileobj is not None:
self.read_file(fileobj)
self.execution_context = execution_context
+ if mapping:
+ self.update(mapping)
def _set_best_version(self):
self.version = _best_version(self._fields)
@@ -322,6 +327,38 @@
for value in values:
self._write_field(fileobject, field, value)
+ def update(self, other=None, **kwargs):
+ """Set metadata values from the given mapping
+
+ Convert the keys to Metadata fields. Given keys that don't match a
+ metadata argument will not be used.
+
+ If overwrite is set to False, just add metadata values that are
+ actually not defined.
+
+ If there is existing values in conflict with the dictionary ones, the
+ new values prevails.
+
+ Empty values (e.g. None and []) are not setted this way.
+ """
+ def _set(key, value):
+ if value not in ([], None) and key in _ATTR2FIELD:
+ self.set(self._convert_name(key), value)
+
+ if other is None:
+ pass
+ elif hasattr(other, 'iteritems'): # iteritems saves memory and lookups
+ for k, v in other.iteritems():
+ _set(k, v)
+ elif hasattr(other, 'keys'):
+ for k in other.keys():
+ _set(k, other[k])
+ else:
+ for k, v in other:
+ _set(k, v)
+ if kwargs:
+ self.update(kwargs)
+
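The filtering `update` applies can be shown on plain dicts; this sketch mirrors the `_set` helper's rules (unknown keys and empty values are dropped), with `known` standing in for the `_ATTR2FIELD` mapping:

```python
def merge_metadata(fields, mapping, known):
    # Copy only keys that map to a known metadata field and whose
    # values are non-empty, letting new values overwrite old ones.
    for key, value in mapping.items():
        if value not in ([], None) and key in known:
            fields[key] = value
    return fields

fields = {'name': 'Foo'}
merge_metadata(fields, {'name': 'Bar', 'version': None, 'junk': 'x'},
               known={'name', 'version'})
```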
def set(self, name, value):
"""Control then set a metadata field."""
name = self._convert_name(name)
diff --git a/src/distutils2/tests/pypi_server.py b/src/distutils2/tests/pypi_server.py
--- a/src/distutils2/tests/pypi_server.py
+++ b/src/distutils2/tests/pypi_server.py
@@ -5,17 +5,28 @@
before any use.
"""
+import os
import Queue
+import SocketServer
+import select
+import socket
import threading
+
from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler
-import os.path
-import select
+from SimpleXMLRPCServer import SimpleXMLRPCServer
from distutils2.tests.support import unittest
PYPI_DEFAULT_STATIC_PATH = os.path.dirname(os.path.abspath(__file__)) + "/pypiserver"
+def use_xmlrpc_server(*server_args, **server_kwargs):
+ server_kwargs['serve_xmlrpc'] = True
+ return use_pypi_server(*server_args, **server_kwargs)
+
+def use_http_server(*server_args, **server_kwargs):
+ server_kwargs['serve_xmlrpc'] = False
+ return use_pypi_server(*server_args, **server_kwargs)
+
def use_pypi_server(*server_args, **server_kwargs):
"""Decorator to make use of the PyPIServer for test methods,
just when needed, and not for the entire duration of the testcase.
@@ -50,38 +61,58 @@
"""
def __init__(self, test_static_path=None,
- static_filesystem_paths=["default"], static_uri_paths=["simple"]):
+ static_filesystem_paths=["default"],
+ static_uri_paths=["simple"], serve_xmlrpc=False) :
"""Initialize the server.
+
+ Default behavior is to start the HTTP server. You can start the
+ xmlrpc server instead by setting serve_xmlrpc to True. Caution: only
+ one server will be started.
static_uri_paths and static_base_path are parameters used to provide,
respectively, the http paths to serve statically, and where to find the
matching files on the filesystem.
"""
+ # launch the server in a new dedicated thread, so tests are not
+ # blocked.
threading.Thread.__init__(self)
self._run = True
- self.httpd = HTTPServer(('', 0), PyPIRequestHandler)
- self.httpd.RequestHandlerClass.log_request = lambda *_: None
- self.httpd.RequestHandlerClass.pypi_server = self
- self.address = (self.httpd.server_name, self.httpd.server_port)
- self.request_queue = Queue.Queue()
- self._requests = []
- self.default_response_status = 200
- self.default_response_headers = [('Content-type', 'text/plain')]
- self.default_response_data = "hello"
-
- # initialize static paths / filesystems
- self.static_uri_paths = static_uri_paths
- if test_static_path is not None:
- static_filesystem_paths.append(test_static_path)
- self.static_filesystem_paths = [PYPI_DEFAULT_STATIC_PATH + "/" + path
- for path in static_filesystem_paths]
+ self._serve_xmlrpc = serve_xmlrpc
+
+ if not self._serve_xmlrpc:
+ self.server = HTTPServer(('', 0), PyPIRequestHandler)
+ self.server.RequestHandlerClass.pypi_server = self
+
+ self.request_queue = Queue.Queue()
+ self._requests = []
+ self.default_response_status = 200
+ self.default_response_headers = [('Content-type', 'text/plain')]
+ self.default_response_data = "hello"
+
+ # initialize static paths / filesystems
+ self.static_uri_paths = static_uri_paths
+ if test_static_path is not None:
+ static_filesystem_paths.append(test_static_path)
+ self.static_filesystem_paths = [PYPI_DEFAULT_STATIC_PATH + "/" + path
+ for path in static_filesystem_paths]
+ else:
+ # xmlrpc server
+ self.server = PyPIXMLRPCServer(('', 0))
+ self.xmlrpc = XMLRPCMockIndex()
+ # register the xmlrpc methods
+ self.server.register_introspection_functions()
+ self.server.register_instance(self.xmlrpc)
+
+ self.address = (self.server.server_name, self.server.server_port)
+ # silence request logging to avoid unwanted output.
+ self.server.RequestHandlerClass.log_request = lambda *_: None
def run(self):
# loop because we can't stop it otherwise, for python < 2.6
while self._run:
- r, w, e = select.select([self.httpd], [], [], 0.5)
+ r, w, e = select.select([self.server], [], [], 0.5)
if r:
- self.httpd.handle_request()
+ self.server.handle_request()
def stop(self):
"""self shutdown is not supported for python < 2.6"""
@@ -191,3 +222,180 @@
self.send_header(header, value)
self.end_headers()
self.wfile.write(data)
+
+class PyPIXMLRPCServer(SimpleXMLRPCServer):
+ def server_bind(self):
+ """Override server_bind to store the server name."""
+ SocketServer.TCPServer.server_bind(self)
+ host, port = self.socket.getsockname()[:2]
+ self.server_name = socket.getfqdn(host)
+ self.server_port = port
+
+class MockDist(object):
+ """Fake distribution, used in the Mock PyPI Server"""
+ def __init__(self, name, version="1.0", hidden=False, url="http://url/",
+ type="sdist", filename="", size=10000,
+ digest="123456", downloads=7, has_sig=False,
+ python_version="source", comment="comment",
+ author="John Doe", author_email="john at doe.name",
+ maintainer="Main Tayner", maintainer_email="maintainer_mail",
+ project_url="http://project_url/", homepage="http://homepage/",
+ keywords="", platform="UNKNOWN", classifiers=[], licence="",
+ description="Description", summary="Summary", stable_version="",
+ ordering="", documentation_id="", code_kwalitee_id="",
+ installability_id="", obsoletes=[], obsoletes_dist=[],
+ provides=[], provides_dist=[], requires=[], requires_dist=[],
+ requires_external=[], requires_python=""):
+
+ # basic fields
+ self.name = name
+ self.version = version
+ self.hidden = hidden
+
+ # URL infos
+ self.url = url
+ self.digest = digest
+ self.downloads = downloads
+ self.has_sig = has_sig
+ self.python_version = python_version
+ self.comment = comment
+ self.type = type
+
+ # metadata
+ self.author = author
+ self.author_email = author_email
+ self.maintainer = maintainer
+ self.maintainer_email = maintainer_email
+ self.project_url = project_url
+ self.homepage = homepage
+ self.keywords = keywords
+ self.platform = platform
+ self.classifiers = classifiers
+ self.licence = licence
+ self.description = description
+ self.summary = summary
+ self.stable_version = stable_version
+ self.ordering = ordering
+ self.cheesecake_documentation_id = documentation_id
+ self.cheesecake_code_kwalitee_id = code_kwalitee_id
+ self.cheesecake_installability_id = installability_id
+
+ self.obsoletes = obsoletes
+ self.obsoletes_dist = obsoletes_dist
+ self.provides = provides
+ self.provides_dist = provides_dist
+ self.requires = requires
+ self.requires_dist = requires_dist
+ self.requires_external = requires_external
+ self.requires_python = requires_python
+
+ def url_infos(self):
+ return {
+ 'url': self.url,
+ 'packagetype': self.type,
+ 'filename': 'filename.tar.gz',
+ 'size': '6000',
+ 'md5_digest': self.digest,
+ 'downloads': self.downloads,
+ 'has_sig': self.has_sig,
+ 'python_version': self.python_version,
+ 'comment_text': self.comment,
+ }
+
+ def metadata(self):
+ return {
+ 'maintainer': self.maintainer,
+ 'project_url': [self.project_url],
+ 'maintainer_email': self.maintainer_email,
+ 'cheesecake_code_kwalitee_id': self.cheesecake_code_kwalitee_id,
+ 'keywords': self.keywords,
+ 'obsoletes_dist': self.obsoletes_dist,
+ 'requires_external': self.requires_external,
+ 'author': self.author,
+ 'author_email': self.author_email,
+ 'download_url': self.url,
+ 'platform': self.platform,
+ 'version': self.version,
+ 'obsoletes': self.obsoletes,
+ 'provides': self.provides,
+ 'cheesecake_documentation_id': self.cheesecake_documentation_id,
+ '_pypi_hidden': self.hidden,
+ 'description': self.description,
+ '_pypi_ordering': 19,
+ 'requires_dist': self.requires_dist,
+ 'requires_python': self.requires_python,
+ 'classifiers': [],
+ 'name': self.name,
+ 'licence': self.licence,
+ 'summary': self.summary,
+ 'home_page': self.homepage,
+ 'stable_version': self.stable_version,
+ 'provides_dist': self.provides_dist,
+ 'requires': self.requires,
+ 'cheesecake_installability_id': self.cheesecake_installability_id,
+ }
+
+ def search_result(self):
+ return {
+ '_pypi_ordering': 0,
+ 'version': self.version,
+ 'name': self.name,
+ 'summary': self.summary,
+ }
+
+class XMLRPCMockIndex(object):
+ """Mock XMLRPC server"""
+
+ def __init__(self, dists=[]):
+ self._dists = dists
+
+ def add_distributions(self, dists):
+ for dist in dists:
+ self._dists.append(MockDist(**dist))
+
+ def set_distributions(self, dists):
+ self._dists = []
+ self.add_distributions(dists)
+
+ def set_search_result(self, result):
+ """set a predefined search result"""
+ self._search_result = result
+
+ def _get_search_results(self):
+ results = []
+ for name in self._search_result:
+ found_dist = [d for d in self._dists if d.name == name]
+ if found_dist:
+ results.append(found_dist[0])
+ else:
+ dist = MockDist(name)
+ results.append(dist)
+ self._dists.append(dist)
+ return [r.search_result() for r in results]
+
+ def list_package(self):
+ return [d.name for d in self._dists]
+
+ def package_releases(self, package_name, show_hidden=False):
+ if show_hidden:
+ # return all
+ return [d.version for d in self._dists if d.name == package_name]
+ else:
+ # return only un-hidden
+ return [d.version for d in self._dists if d.name == package_name
+ and not d.hidden]
+
+ def release_urls(self, package_name, version):
+ return [d.url_infos() for d in self._dists
+ if d.name == package_name and d.version == version]
+
+ def release_data(self, package_name, version):
+ release = [d for d in self._dists
+ if d.name == package_name and d.version == version]
+ if release:
+ return release[0].metadata()
+ else:
+ return {}
+
+ def search(self, spec, operator="and"):
+ return self._get_search_results()
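The mock index above filters hidden releases in `package_releases`. A minimal standalone sketch of that filtering logic (the `MiniDist` and `MiniIndex` names are hypothetical stand-ins, not the distutils2 classes):

```python
# Sketch of the hidden-release filtering used by the XML-RPC mock above.
# MiniDist/MiniIndex are illustrative stand-ins for MockDist/XMLRPCMockIndex.

class MiniDist(object):
    def __init__(self, name, version, hidden=False):
        self.name = name
        self.version = version
        self.hidden = hidden

class MiniIndex(object):
    def __init__(self, dists=None):
        # fresh list per instance, avoiding the mutable-default pitfall
        self._dists = list(dists) if dists else []

    def package_releases(self, name, show_hidden=False):
        # hidden releases are skipped unless explicitly requested
        return [d.version for d in self._dists
                if d.name == name and (show_hidden or not d.hidden)]

index = MiniIndex([MiniDist('FooBar', '1.0'),
                   MiniDist('FooBar', '1.1', hidden=True)])
print(index.package_releases('FooBar'))                    # ['1.0']
print(index.package_releases('FooBar', show_hidden=True))  # ['1.0', '1.1']
```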
diff --git a/src/distutils2/tests/pypiserver/project_list/simple/index.html b/src/distutils2/tests/pypiserver/project_list/simple/index.html
new file mode 100644
--- /dev/null
+++ b/src/distutils2/tests/pypiserver/project_list/simple/index.html
@@ -0,0 +1,5 @@
+<a class="test" href="yeah">FooBar-bar</a>
+<a class="test" href="yeah">Foobar-baz</a>
+<a class="test" href="yeah">Baz-FooBar</a>
+<a class="test" href="yeah">Baz</a>
+<a class="test" href="yeah">Foo</a>
diff --git a/src/distutils2/tests/test_pypi_dist.py b/src/distutils2/tests/test_index_dist.py
rename from src/distutils2/tests/test_pypi_dist.py
rename to src/distutils2/tests/test_index_dist.py
--- a/src/distutils2/tests/test_pypi_dist.py
+++ b/src/distutils2/tests/test_index_dist.py
@@ -1,72 +1,93 @@
-"""Tests for the distutils2.pypi.dist module."""
+"""Tests for the distutils2.index.dist module."""
+
+import os
from distutils2.tests.pypi_server import use_pypi_server
from distutils2.tests import run_unittest
from distutils2.tests.support import unittest, TempdirManager
from distutils2.version import VersionPredicate
-from distutils2.pypi.errors import HashDoesNotMatch, UnsupportedHashName
-from distutils2.pypi.dist import (PyPIDistribution as Dist,
- PyPIDistributions as Dists,
- split_archive_name)
+from distutils2.index.errors import HashDoesNotMatch, UnsupportedHashName
+from distutils2.index.dist import (ReleaseInfo, ReleasesList, DistInfo,
+ split_archive_name, get_infos_from_url)
-class TestPyPIDistribution(TempdirManager,
- unittest.TestCase):
- """Tests the pypi.dist.PyPIDistribution class"""
+def Dist(*args, **kwargs):
+ # DistInfo takes a release as its first parameter; avoid this in tests.
+ return DistInfo(None, *args, **kwargs)
+
+
+class TestReleaseInfo(unittest.TestCase):
def test_instantiation(self):
- # Test the Distribution class provides us the good attributes when
+ # Test that the ReleaseInfo class provides the expected attributes when
# given on construction
- dist = Dist("FooBar", "1.1")
- self.assertEqual("FooBar", dist.name)
- self.assertEqual("1.1", "%s" % dist.version)
+ release = ReleaseInfo("FooBar", "1.1")
+ self.assertEqual("FooBar", release.name)
+ self.assertEqual("1.1", "%s" % release.version)
- def test_create_from_url(self):
- # Test that the Distribution object can be built from a single URL
+ def test_add_dist(self):
+ # empty distribution type should assume "sdist"
+ release = ReleaseInfo("FooBar", "1.1")
+ release.add_distribution(url="http://example.org/")
+ # should not fail
+ release['sdist']
+
+ def test_get_unknown_distribution(self):
+ # should raise a KeyError
+ pass
+
+ def test_get_infos_from_url(self):
+ # Test that the URLs are parsed the right way
url_list = {
'FooBar-1.1.0.tar.gz': {
'name': 'foobar', # lowercase the name
- 'version': '1.1',
+ 'version': '1.1.0',
},
'Foo-Bar-1.1.0.zip': {
'name': 'foo-bar', # keep the dash
- 'version': '1.1',
+ 'version': '1.1.0',
},
'foobar-1.1b2.tar.gz#md5=123123123123123': {
'name': 'foobar',
'version': '1.1b2',
- 'url': {
- 'url': 'http://test.tld/foobar-1.1b2.tar.gz', # no hash
- 'hashval': '123123123123123',
- 'hashname': 'md5',
- }
+ 'url': 'http://example.org/foobar-1.1b2.tar.gz', # no hash
+ 'hashval': '123123123123123',
+ 'hashname': 'md5',
},
'foobar-1.1-rc2.tar.gz': { # use suggested name
'name': 'foobar',
'version': '1.1c2',
- 'url': {
- 'url': 'http://test.tld/foobar-1.1-rc2.tar.gz',
- }
+ 'url': 'http://example.org/foobar-1.1-rc2.tar.gz',
}
}
for url, attributes in url_list.items():
- dist = Dist.from_url("http://test.tld/" + url)
- for attribute, value in attributes.items():
- if isinstance(value, dict):
- mylist = getattr(dist, attribute)
- for val in value.keys():
- self.assertEqual(value[val], mylist[val])
+ # for each url
+ infos = get_infos_from_url("http://example.org/" + url)
+ for attribute, expected in attributes.items():
+ got = infos.get(attribute)
+ if attribute == "version":
+ self.assertEqual("%s" % got, expected)
else:
- if attribute == "version":
- self.assertEqual(str(getattr(dist, "version")), value)
- else:
- self.assertEqual(getattr(dist, attribute), value)
+ self.assertEqual(got, expected)
+
+ def test_split_archive_name(self):
+ # Test we can split the archive names
+ names = {
+ 'foo-bar-baz-1.0-rc2': ('foo-bar-baz', '1.0c2'),
+ 'foo-bar-baz-1.0': ('foo-bar-baz', '1.0'),
+ 'foobarbaz-1.0': ('foobarbaz', '1.0'),
+ }
+ for name, results in names.items():
+ self.assertEqual(results, split_archive_name(name))
+
+
+class TestDistInfo(TempdirManager, unittest.TestCase):
def test_get_url(self):
# Test that the url property works well
- d = Dist("foobar", "1.1", url="test_url")
+ d = Dist(url="test_url")
self.assertDictEqual(d.url, {
"url": "test_url",
"is_external": True,
@@ -83,13 +104,13 @@
"hashname": None,
"hashval": None,
})
- self.assertEqual(2, len(d._urls))
+ self.assertEqual(2, len(d.urls))
- def test_comparaison(self):
- # Test that we can compare PyPIDistributions
- foo1 = Dist("foo", "1.0")
- foo2 = Dist("foo", "2.0")
- bar = Dist("bar", "2.0")
+ def test_comparison(self):
+ # Test that we can compare ReleaseInfo objects
+ foo1 = ReleaseInfo("foo", "1.0")
+ foo2 = ReleaseInfo("foo", "2.0")
+ bar = ReleaseInfo("bar", "2.0")
# assert we use the version to compare
self.assertTrue(foo1 < foo2)
self.assertFalse(foo1 > foo2)
@@ -98,34 +119,23 @@
# assert we can't compare dists with different names
self.assertRaises(TypeError, foo1.__eq__, bar)
- def test_split_archive_name(self):
- # Test we can split the archive names
- names = {
- 'foo-bar-baz-1.0-rc2': ('foo-bar-baz', '1.0c2'),
- 'foo-bar-baz-1.0': ('foo-bar-baz', '1.0'),
- 'foobarbaz-1.0': ('foobarbaz', '1.0'),
- }
- for name, results in names.items():
- self.assertEqual(results, split_archive_name(name))
-
@use_pypi_server("downloads_with_md5")
def test_download(self, server):
# Download is possible, and the md5 is checked if given
url = "%s/simple/foobar/foobar-0.1.tar.gz" % server.full_address
# check md5 if given
- dist = Dist("FooBar", "0.1", url=url, url_hashname="md5",
- url_hashval="d41d8cd98f00b204e9800998ecf8427e")
+ dist = Dist(url=url, hashname="md5",
+ hashval="d41d8cd98f00b204e9800998ecf8427e")
dist.download(self.mkdtemp())
# a wrong md5 fails
- dist2 = Dist("FooBar", "0.1", url=url,
- url_hashname="md5", url_hashval="wrongmd5")
+ dist2 = Dist(url=url, hashname="md5", hashval="wrongmd5")
self.assertRaises(HashDoesNotMatch, dist2.download, self.mkdtemp())
# we can omit the md5 hash
- dist3 = Dist("FooBar", "0.1", url=url)
+ dist3 = Dist(url=url)
dist3.download(self.mkdtemp())
# and specify a temporary location
@@ -134,106 +144,104 @@
dist3.download(path=path1)
# and for a new one
path2_base = self.mkdtemp()
- dist4 = Dist("FooBar", "0.1", url=url)
+ dist4 = Dist(url=url)
path2 = dist4.download(path=path2_base)
self.assertTrue(path2_base in path2)
def test_hashname(self):
# Invalid hashnames raise an exception on assignment
- Dist("FooBar", "0.1", url_hashname="md5", url_hashval="value")
+ Dist(hashname="md5", hashval="value")
- self.assertRaises(UnsupportedHashName, Dist, "FooBar", "0.1",
- url_hashname="invalid_hashname", url_hashval="value")
+ self.assertRaises(UnsupportedHashName, Dist,
+ hashname="invalid_hashname",
+ hashval="value")
-class TestPyPIDistributions(unittest.TestCase):
+class TestReleasesList(unittest.TestCase):
def test_filter(self):
# Test we filter the distributions the right way, using version
# predicate match method
- dists = Dists((
- Dist("FooBar", "1.1"),
- Dist("FooBar", "1.1.1"),
- Dist("FooBar", "1.2"),
- Dist("FooBar", "1.2.1"),
+ releases = ReleasesList('FooBar', (
+ ReleaseInfo("FooBar", "1.1"),
+ ReleaseInfo("FooBar", "1.1.1"),
+ ReleaseInfo("FooBar", "1.2"),
+ ReleaseInfo("FooBar", "1.2.1"),
))
- filtered = dists.filter(VersionPredicate("FooBar (<1.2)"))
- self.assertNotIn(dists[2], filtered)
- self.assertNotIn(dists[3], filtered)
- self.assertIn(dists[0], filtered)
- self.assertIn(dists[1], filtered)
+ filtered = releases.filter(VersionPredicate("FooBar (<1.2)"))
+ self.assertNotIn(releases[2], filtered)
+ self.assertNotIn(releases[3], filtered)
+ self.assertIn(releases[0], filtered)
+ self.assertIn(releases[1], filtered)
def test_append(self):
# When adding a new item to the list, the behavior is to test if
- # a distribution with the same name and version number already exists,
- # and if so, to add url informations to the existing PyPIDistribution
+ # a release with the same name and version number already exists,
+ # and if so, to add a new distribution for it. If the distribution type
+ # is already defined too, add URL information to the existing DistInfo
# object.
- # If no object matches, just add "normally" the object to the list.
- dists = Dists([
- Dist("FooBar", "1.1", url="external_url", type="source"),
+ releases = ReleasesList("FooBar", [
+ ReleaseInfo("FooBar", "1.1", url="external_url",
+ dist_type="sdist"),
])
- self.assertEqual(1, len(dists))
- dists.append(Dist("FooBar", "1.1", url="internal_url",
- url_is_external=False, type="source"))
- self.assertEqual(1, len(dists))
- self.assertEqual(2, len(dists[0]._urls))
+ self.assertEqual(1, len(releases))
+ releases.add_release(release=ReleaseInfo("FooBar", "1.1",
+ url="internal_url",
+ is_external=False,
+ dist_type="sdist"))
+ self.assertEqual(1, len(releases))
+ self.assertEqual(2, len(releases[0]['sdist'].urls))
- dists.append(Dist("Foobar", "1.1.1", type="source"))
- self.assertEqual(2, len(dists))
+ releases.add_release(release=ReleaseInfo("FooBar", "1.1.1",
+ dist_type="sdist"))
+ self.assertEqual(2, len(releases))
# when adding a distribution with a different type, a new distribution
# has to be added.
- dists.append(Dist("Foobar", "1.1.1", type="binary"))
- self.assertEqual(3, len(dists))
+ releases.add_release(release=ReleaseInfo("FooBar", "1.1.1",
+ dist_type="bdist"))
+ self.assertEqual(2, len(releases))
+ self.assertEqual(2, len(releases[1].dists))
def test_prefer_final(self):
# Can order the distributions using prefer_final
- fb10 = Dist("FooBar", "1.0") # final distribution
- fb11a = Dist("FooBar", "1.1a1") # alpha
- fb12a = Dist("FooBar", "1.2a1") # alpha
- fb12b = Dist("FooBar", "1.2b1") # beta
- dists = Dists([fb10, fb11a, fb12a, fb12b])
+ fb10 = ReleaseInfo("FooBar", "1.0") # final distribution
+ fb11a = ReleaseInfo("FooBar", "1.1a1") # alpha
+ fb12a = ReleaseInfo("FooBar", "1.2a1") # alpha
+ fb12b = ReleaseInfo("FooBar", "1.2b1") # beta
+ dists = ReleasesList("FooBar", [fb10, fb11a, fb12a, fb12b])
- dists.sort_distributions(prefer_final=True)
+ dists.sort_releases(prefer_final=True)
self.assertEqual(fb10, dists[0])
- dists.sort_distributions(prefer_final=False)
+ dists.sort_releases(prefer_final=False)
self.assertEqual(fb12b, dists[0])
- def test_prefer_source(self):
- # Ordering support prefer_source
- fb_source = Dist("FooBar", "1.0", type="source")
- fb_binary = Dist("FooBar", "1.0", type="binary")
- fb2_binary = Dist("FooBar", "2.0", type="binary")
- dists = Dists([fb_binary, fb_source])
-
- dists.sort_distributions(prefer_source=True)
- self.assertEqual(fb_source, dists[0])
-
- dists.sort_distributions(prefer_source=False)
- self.assertEqual(fb_binary, dists[0])
-
- dists.append(fb2_binary)
- dists.sort_distributions(prefer_source=True)
- self.assertEqual(fb2_binary, dists[0])
-
- def test_get_same_name_and_version(self):
- # PyPIDistributions can return a list of "duplicates"
- fb_source = Dist("FooBar", "1.0", type="source")
- fb_binary = Dist("FooBar", "1.0", type="binary")
- fb2_binary = Dist("FooBar", "2.0", type="binary")
- dists = Dists([fb_binary, fb_source, fb2_binary])
- duplicates = dists.get_same_name_and_version()
- self.assertTrue(1, len(duplicates))
- self.assertIn(fb_source, duplicates[0])
+# def test_prefer_source(self):
+# # Ordering support prefer_source
+# fb_source = Dist("FooBar", "1.0", type="source")
+# fb_binary = Dist("FooBar", "1.0", type="binary")
+# fb2_binary = Dist("FooBar", "2.0", type="binary")
+# dists = ReleasesList([fb_binary, fb_source])
+#
+# dists.sort_distributions(prefer_source=True)
+# self.assertEqual(fb_source, dists[0])
+#
+# dists.sort_distributions(prefer_source=False)
+# self.assertEqual(fb_binary, dists[0])
+#
+# dists.append(fb2_binary)
+# dists.sort_distributions(prefer_source=True)
+# self.assertEqual(fb2_binary, dists[0])
def test_suite():
suite = unittest.TestSuite()
- suite.addTest(unittest.makeSuite(TestPyPIDistribution))
- suite.addTest(unittest.makeSuite(TestPyPIDistributions))
+ suite.addTest(unittest.makeSuite(TestDistInfo))
+ suite.addTest(unittest.makeSuite(TestReleaseInfo))
+ suite.addTest(unittest.makeSuite(TestReleasesList))
return suite
if __name__ == '__main__':
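The `test_filter` case above keeps only releases whose version satisfies a predicate. A hedged sketch of that behaviour, with a simple tuple comparison standing in for distutils2's `VersionPredicate` (`SimpleRelease` and `filter_releases` are illustrative names, not the real API):

```python
# Illustrative sketch of the filtering exercised by test_filter:
# keep only the releases whose version a predicate accepts.

class SimpleRelease(object):
    def __init__(self, name, version):
        self.name = name
        self.version = version

def filter_releases(releases, predicate):
    """Return the releases whose version the predicate accepts."""
    return [r for r in releases if predicate(r.version)]

releases = [SimpleRelease('FooBar', v)
            for v in ('1.1', '1.1.1', '1.2', '1.2.1')]

# crude stand-in for VersionPredicate("FooBar (<1.2)")
older_than_1_2 = lambda v: tuple(int(p) for p in v.split('.')) < (1, 2)

kept = [r.version for r in filter_releases(releases, older_than_1_2)]
print(kept)  # ['1.1', '1.1.1']
```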
diff --git a/src/distutils2/tests/test_pypi_simple.py b/src/distutils2/tests/test_index_simple.py
rename from src/distutils2/tests/test_pypi_simple.py
rename to src/distutils2/tests/test_index_simple.py
--- a/src/distutils2/tests/test_pypi_simple.py
+++ b/src/distutils2/tests/test_index_simple.py
@@ -3,35 +3,34 @@
"""
import sys
import os
-import shutil
import urllib2
-from distutils2.pypi import simple
-from distutils2.tests import support, run_unittest
+from distutils2.index.simple import Crawler
+from distutils2.tests import support
from distutils2.tests.support import unittest
from distutils2.tests.pypi_server import (use_pypi_server, PyPIServer,
PYPI_DEFAULT_STATIC_PATH)
-class PyPISimpleTestCase(support.TempdirManager,
- unittest.TestCase):
+class SimpleCrawlerTestCase(support.TempdirManager, unittest.TestCase):
- def _get_simple_index(self, server, base_url="/simple/", hosts=None,
+ def _get_simple_crawler(self, server, base_url="/simple/", hosts=None,
*args, **kwargs):
- """Build and return a SimpleSimpleIndex instance, with the test server
+ """Build and return a SimpleIndex instance, with the test server
urls
"""
if hosts is None:
hosts = (server.full_address.strip("http://"),)
kwargs['hosts'] = hosts
- return simple.SimpleIndex(server.full_address + base_url, *args,
+ return Crawler(server.full_address + base_url, *args,
**kwargs)
- def test_bad_urls(self):
- index = simple.SimpleIndex()
+ @use_pypi_server()
+ def test_bad_urls(self, server):
+ crawler = Crawler()
url = 'http://127.0.0.1:0/nonesuch/test_simple'
try:
- v = index._open_url(url)
+ v = crawler._open_url(url)
except Exception, v:
self.assertTrue(url in str(v))
else:
@@ -40,10 +39,10 @@
# issue 16
# easy_install inquant.contentmirror.plone breaks because of a typo
# in its home URL
- index = simple.SimpleIndex(hosts=('www.example.com',))
+ crawler = Crawler(hosts=('example.org',))
url = 'url:%20https://svn.plone.org/svn/collective/inquant.contentmirror.plone/trunk'
try:
- v = index._open_url(url)
+ v = crawler._open_url(url)
except Exception, v:
self.assertTrue(url in str(v))
else:
@@ -55,10 +54,10 @@
old_urlopen = urllib2.urlopen
urllib2.urlopen = _urlopen
- url = 'http://example.com'
+ url = 'http://example.org'
try:
try:
- v = index._open_url(url)
+ v = crawler._open_url(url)
except Exception, v:
self.assertTrue('line' in str(v))
else:
@@ -69,91 +68,91 @@
# issue 20
url = 'http://http://svn.pythonpaste.org/Paste/wphp/trunk'
try:
- index._open_url(url)
+ crawler._open_url(url)
except Exception, v:
self.assertTrue('nonnumeric port' in str(v))
# issue #160
if sys.version_info[0] == 2 and sys.version_info[1] == 7:
# this should not fail
- url = 'http://example.com'
+ url = server.full_address
page = ('<a href="http://www.famfamfam.com]('
'http://www.famfamfam.com/">')
- index._process_url(url, page)
+ crawler._process_url(url, page)
@use_pypi_server("test_found_links")
def test_found_links(self, server):
- # Browse the index, asking for a specified distribution version
+ # Browse the index, asking for a specified release version
# The PyPI index contains links for version 1.0, 1.1, 2.0 and 2.0.1
- index = self._get_simple_index(server)
- last_distribution = index.get("foobar")
+ crawler = self._get_simple_crawler(server)
+ last_release = crawler.get_release("foobar")
# we have scanned the index page
self.assertIn(server.full_address + "/simple/foobar/",
- index._processed_urls)
+ crawler._processed_urls)
- # we have found 4 distributions in this page
- self.assertEqual(len(index._distributions["foobar"]), 4)
+ # we have found 4 releases in this page
+ self.assertEqual(len(crawler._projects["foobar"]), 4)
# and returned the most recent one
- self.assertEqual("%s" % last_distribution.version, '2.0.1')
+ self.assertEqual("%s" % last_release.version, '2.0.1')
def test_is_browsable(self):
- index = simple.SimpleIndex(follow_externals=False)
- self.assertTrue(index._is_browsable(index.index_url + "test"))
+ crawler = Crawler(follow_externals=False)
+ self.assertTrue(crawler._is_browsable(crawler.index_url + "test"))
# Now, when following externals, we can have a list of hosts to trust.
# and don't follow other external links than the one described here.
- index = simple.SimpleIndex(hosts=["pypi.python.org", "test.org"],
+ crawler = Crawler(hosts=["pypi.python.org", "example.org"],
follow_externals=True)
good_urls = (
"http://pypi.python.org/foo/bar",
"http://pypi.python.org/simple/foobar",
- "http://test.org",
- "http://test.org/",
- "http://test.org/simple/",
+ "http://example.org",
+ "http://example.org/",
+ "http://example.org/simple/",
)
bad_urls = (
"http://python.org",
- "http://test.tld",
+ "http://example.tld",
)
for url in good_urls:
- self.assertTrue(index._is_browsable(url))
+ self.assertTrue(crawler._is_browsable(url))
for url in bad_urls:
- self.assertFalse(index._is_browsable(url))
+ self.assertFalse(crawler._is_browsable(url))
# allow all hosts
- index = simple.SimpleIndex(follow_externals=True, hosts=("*",))
- self.assertTrue(index._is_browsable("http://an-external.link/path"))
- self.assertTrue(index._is_browsable("pypi.test.tld/a/path"))
+ crawler = Crawler(follow_externals=True, hosts=("*",))
+ self.assertTrue(crawler._is_browsable("http://an-external.link/path"))
+ self.assertTrue(crawler._is_browsable("pypi.example.org/a/path"))
# specify a list of hosts we want to allow
- index = simple.SimpleIndex(follow_externals=True,
- hosts=("*.test.tld",))
- self.assertFalse(index._is_browsable("http://an-external.link/path"))
- self.assertTrue(index._is_browsable("http://pypi.test.tld/a/path"))
+ crawler = Crawler(follow_externals=True,
+ hosts=("*.example.org",))
+ self.assertFalse(crawler._is_browsable("http://an-external.link/path"))
+ self.assertTrue(crawler._is_browsable("http://pypi.example.org/a/path"))
@use_pypi_server("with_externals")
- def test_restrict_hosts(self, server):
+ def test_follow_externals(self, server):
# Include external pages
# Try to request the package index, which contains links to "externals"
# resources. They have to be scanned too.
- index = self._get_simple_index(server, follow_externals=True)
- index.get("foobar")
+ crawler = self._get_simple_crawler(server, follow_externals=True)
+ crawler.get_release("foobar")
self.assertIn(server.full_address + "/external/external.html",
- index._processed_urls)
+ crawler._processed_urls)
@use_pypi_server("with_real_externals")
def test_restrict_hosts(self, server):
# Only using a list of allowed hosts is possible
# Test that telling the simple PyPI client not to retrieve
# external pages works
- index = self._get_simple_index(server, follow_externals=False)
- index.get("foobar")
+ crawler = self._get_simple_crawler(server, follow_externals=False)
+ crawler.get_release("foobar")
self.assertNotIn(server.full_address + "/external/external.html",
- index._processed_urls)
+ crawler._processed_urls)
@use_pypi_server(static_filesystem_paths=["with_externals"],
static_uri_paths=["simple", "external"])
@@ -167,23 +166,26 @@
# - someone manually coindexes this link (with the md5 in the url) onto
# an external page accessible from the package page.
# - someone reuploads the package (with a different md5)
- # - while easy_installing, an MD5 error occurs because the external link
- # is used
+ # - while easy_installing, an MD5 error occurs because the external
+ # link is used
# -> The index should use the link from pypi, not the external one.
# start an index server
index_url = server.full_address + '/simple/'
# scan a test index
- index = simple.SimpleIndex(index_url, follow_externals=True)
- dists = index.find("foobar")
+ crawler = Crawler(index_url, follow_externals=True)
+ releases = crawler.get_releases("foobar")
server.stop()
# we have only one link, because links are compared without md5
- self.assertEqual(len(dists), 1)
+ self.assertEqual(1, len(releases))
+ self.assertEqual(1, len(releases[0].dists))
# the link should be from the index
- self.assertEqual('12345678901234567', dists[0].url['hashval'])
- self.assertEqual('md5', dists[0].url['hashname'])
+ self.assertEqual(2, len(releases[0].dists['sdist'].urls))
+ self.assertEqual('12345678901234567',
+ releases[0].dists['sdist'].url['hashval'])
+ self.assertEqual('md5', releases[0].dists['sdist'].url['hashname'])
@use_pypi_server(static_filesystem_paths=["with_norel_links"],
static_uri_paths=["simple", "external"])
@@ -193,22 +195,22 @@
# to not be processed by the package index, while processing "pages".
# process the pages
- index = self._get_simple_index(server, follow_externals=True)
- index.find("foobar")
+ crawler = self._get_simple_crawler(server, follow_externals=True)
+ crawler.get_releases("foobar")
# now it should have processed only pages with links rel="download"
# and rel="homepage"
self.assertIn("%s/simple/foobar/" % server.full_address,
- index._processed_urls) # it's the simple index page
+ crawler._processed_urls) # it's the simple index page
self.assertIn("%s/external/homepage.html" % server.full_address,
- index._processed_urls) # the external homepage is rel="homepage"
+ crawler._processed_urls) # the external homepage is rel="homepage"
self.assertNotIn("%s/external/nonrel.html" % server.full_address,
- index._processed_urls) # this link contains no rel=*
+ crawler._processed_urls) # this link contains no rel=*
self.assertNotIn("%s/unrelated-0.2.tar.gz" % server.full_address,
- index._processed_urls) # linked from simple index (no rel)
+ crawler._processed_urls) # linked from simple index (no rel)
self.assertIn("%s/foobar-0.1.tar.gz" % server.full_address,
- index._processed_urls) # linked from simple index (rel)
+ crawler._processed_urls) # linked from simple index (rel)
self.assertIn("%s/foobar-2.0.tar.gz" % server.full_address,
- index._processed_urls) # linked from external homepage (rel)
+ crawler._processed_urls) # linked from external homepage (rel)
def test_uses_mirrors(self):
# When the main repository seems down, try using the given mirrors
@@ -218,18 +220,18 @@
try:
# create the index using both servers
- index = simple.SimpleIndex(server.full_address + "/simple/",
+ crawler = Crawler(server.full_address + "/simple/",
hosts=('*',), timeout=1, # set the timeout to 1s for the tests
- mirrors=[mirror.full_address + "/simple/",])
+ mirrors=[mirror.full_address])
# this should not raise a timeout
- self.assertEqual(4, len(index.find("foo")))
+ self.assertEqual(4, len(crawler.get_releases("foo")))
finally:
mirror.stop()
def test_simple_link_matcher(self):
# Test that the simple link matcher yields the right links
- index = simple.SimpleIndex(follow_externals=False)
+ crawler = Crawler(follow_externals=False)
# Here, we define:
# 1. one link that must be followed, cause it's a download one
@@ -237,27 +239,33 @@
# returns false for it.
# 3. one link that must be followed cause it's a homepage that is
# browsable
- self.assertTrue(index._is_browsable("%stest" % index.index_url))
- self.assertFalse(index._is_browsable("http://dl-link2"))
+ # 4. one link that must be followed, because it contains an md5 hash
+ self.assertTrue(crawler._is_browsable("%stest" % crawler.index_url))
+ self.assertFalse(crawler._is_browsable("http://dl-link2"))
content = """
<a href="http://dl-link1" rel="download">download_link1</a>
<a href="http://dl-link2" rel="homepage">homepage_link1</a>
- <a href="%stest" rel="homepage">homepage_link2</a>
- """ % index.index_url
+ <a href="%(index_url)stest" rel="homepage">homepage_link2</a>
+ <a href="%(index_url)stest/foobar-1.tar.gz#md5=abcdef>download_link2</a>
+ """ % {'index_url': crawler.index_url }
# Test that the simple link matcher yields the right links.
- generator = index._simple_link_matcher(content, index.index_url)
+ generator = crawler._simple_link_matcher(content, crawler.index_url)
+ self.assertEqual(('%stest/foobar-1.tar.gz#md5=abcdef' % crawler.index_url,
+ True), generator.next())
self.assertEqual(('http://dl-link1', True), generator.next())
- self.assertEqual(('%stest' % index.index_url, False),
+ self.assertEqual(('%stest' % crawler.index_url, False),
generator.next())
self.assertRaises(StopIteration, generator.next)
- # Follow the external links is possible
- index.follow_externals = True
- generator = index._simple_link_matcher(content, index.index_url)
+ # Following the external links is possible (e.g. homepages)
+ crawler.follow_externals = True
+ generator = crawler._simple_link_matcher(content, crawler.index_url)
+ self.assertEqual(('%stest/foobar-1.tar.gz#md5=abcdef' % crawler.index_url,
+ True), generator.next())
self.assertEqual(('http://dl-link1', True), generator.next())
self.assertEqual(('http://dl-link2', False), generator.next())
- self.assertEqual(('%stest' % index.index_url, False),
+ self.assertEqual(('%stest' % crawler.index_url, False),
generator.next())
self.assertRaises(StopIteration, generator.next)
@@ -265,12 +273,44 @@
# Test that we can browse local files
index_path = os.sep.join(["file://" + PYPI_DEFAULT_STATIC_PATH,
"test_found_links", "simple"])
- index = simple.SimpleIndex(index_path)
- dists = index.find("foobar")
+ crawler = Crawler(index_path)
+ dists = crawler.get_releases("foobar")
self.assertEqual(4, len(dists))
+ def test_get_link_matcher(self):
+ crawler = Crawler("http://example.org")
+ self.assertEqual('_simple_link_matcher', crawler._get_link_matcher(
+ "http://example.org/some/file").__name__)
+ self.assertEqual('_default_link_matcher', crawler._get_link_matcher(
+ "http://other-url").__name__)
+
+ def test_default_link_matcher(self):
+ crawler = Crawler("http://example.org", mirrors=[])
+ crawler.follow_externals = True
+ crawler._is_browsable = lambda *args: True
+ base_url = "http://example.org/some/file/"
+ content = """
+<a href="../homepage" rel="homepage">link</a>
+<a href="../download" rel="download">link2</a>
+<a href="../simpleurl">link2</a>
+ """
+ found_links = dict(crawler._default_link_matcher(content,
+ base_url)).keys()
+ self.assertIn('http://example.org/some/homepage', found_links)
+ self.assertIn('http://example.org/some/simpleurl', found_links)
+ self.assertIn('http://example.org/some/download', found_links)
+
+ @use_pypi_server("project_list")
+ def test_search_projects(self, server):
+ # we can search the index for projects by name;
+ # the case used does not matter here
+ crawler = self._get_simple_crawler(server)
+ projects = [p.name for p in crawler.search_projects("Foobar")]
+ self.assertListEqual(['FooBar-bar', 'Foobar-baz', 'Baz-FooBar'],
+ projects)
+
def test_suite():
- return unittest.makeSuite(PyPISimpleTestCase)
+ return unittest.makeSuite(SimpleCrawlerTestCase)
if __name__ == '__main__':
unittest.main(defaultTest="test_suite")
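The `with_norel_links` test above relies on the crawler following only anchors with `rel="download"` or `rel="homepage"`. A rough sketch of that rel-based filtering (the regex is illustrative, not the distutils2 implementation):

```python
import re

# Rough sketch of rel-based link filtering: only anchors whose rel
# attribute is "download" or "homepage" are considered followable.
LINK_RE = re.compile(r'<a\s+href="(?P<href>[^"]+)"\s+rel="(?P<rel>[^"]+)"')

def followable_links(html):
    for match in LINK_RE.finditer(html):
        if match.group('rel') in ('download', 'homepage'):
            yield match.group('href')

page = '''
<a href="http://dl-link1" rel="download">download_link1</a>
<a href="http://dl-link2" rel="homepage">homepage_link1</a>
<a href="http://nonrel.html">ignored, no rel attribute</a>
'''
print(list(followable_links(page)))
# ['http://dl-link1', 'http://dl-link2']
```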
diff --git a/src/distutils2/tests/test_index_xmlrpc.py b/src/distutils2/tests/test_index_xmlrpc.py
new file mode 100644
--- /dev/null
+++ b/src/distutils2/tests/test_index_xmlrpc.py
@@ -0,0 +1,92 @@
+"""Tests for the distutils2.index.xmlrpc module."""
+
+from distutils2.tests.pypi_server import use_xmlrpc_server
+from distutils2.tests import run_unittest
+from distutils2.tests.support import unittest
+from distutils2.index.xmlrpc import Client, InvalidSearchField, ProjectNotFound
+
+
+class TestXMLRPCClient(unittest.TestCase):
+ def _get_client(self, server, *args, **kwargs):
+ return Client(server.full_address, *args, **kwargs)
+
+ @use_xmlrpc_server()
+ def test_search_projects(self, server):
+ client = self._get_client(server)
+ server.xmlrpc.set_search_result(['FooBar', 'Foo', 'FooFoo'])
+ results = [r.name for r in client.search_projects(name='Foo')]
+ self.assertEqual(3, len(results))
+ self.assertIn('FooBar', results)
+ self.assertIn('Foo', results)
+ self.assertIn('FooFoo', results)
+
+ def test_search_projects_bad_fields(self):
+ client = Client()
+ self.assertRaises(InvalidSearchField, client.search_projects,
+ invalid="test")
+
+ @use_xmlrpc_server()
+ def test_get_releases(self, server):
+ client = self._get_client(server)
+ server.xmlrpc.set_distributions([
+ {'name': 'FooBar', 'version': '1.1'},
+ {'name': 'FooBar', 'version': '1.2', 'url': 'http://some/url/'},
+ {'name': 'FooBar', 'version': '1.3', 'url': 'http://other/url/'},
+ ])
+
+ # use a lambda here to avoid a useless mock call
+ server.xmlrpc.list_releases = lambda *a, **k: ['1.1', '1.2', '1.3']
+
+ releases = client.get_releases('FooBar (<=1.2)')
+ # don't call release_data and release_url; just return name and version.
+ self.assertEqual(2, len(releases))
+ versions = releases.get_versions()
+ self.assertIn('1.1', versions)
+ self.assertIn('1.2', versions)
+ self.assertNotIn('1.3', versions)
+
+ self.assertRaises(ProjectNotFound, client.get_releases, 'Foo')
+
+ @use_xmlrpc_server()
+ def test_get_distributions(self, server):
+ client = self._get_client(server)
+ server.xmlrpc.set_distributions([
+ {'name':'FooBar', 'version': '1.1', 'url':
+ 'http://example.org/foobar-1.1-sdist.tar.gz',
+ 'digest': '1234567', 'type': 'sdist', 'python_version':'source'},
+ {'name':'FooBar', 'version': '1.1', 'url':
+ 'http://example.org/foobar-1.1-bdist.tar.gz',
+ 'digest': '8912345', 'type': 'bdist'},
+ ])
+
+ releases = client.get_releases('FooBar', '1.1')
+ client.get_distributions('FooBar', '1.1')
+ release = releases.get_release('1.1')
+ self.assertEqual('http://example.org/foobar-1.1-sdist.tar.gz',
+ release['sdist'].url['url'])
+ self.assertEqual('http://example.org/foobar-1.1-bdist.tar.gz',
+ release['bdist'].url['url'])
+ self.assertEqual(release['sdist'].python_version, 'source')
+
+ @use_xmlrpc_server()
+ def test_get_metadata(self, server):
+ client = self._get_client(server)
+ server.xmlrpc.set_distributions([
+ {'name':'FooBar',
+ 'version': '1.1',
+ 'keywords': '',
+ 'obsoletes_dist': ['FooFoo'],
+ 'requires_external': ['Foo'],
+ }])
+ release = client.get_metadata('FooBar', '1.1')
+ self.assertEqual(['Foo'], release.metadata['requires_external'])
+ self.assertEqual(['FooFoo'], release.metadata['obsoletes_dist'])
+
+
+def test_suite():
+ suite = unittest.TestSuite()
+ suite.addTest(unittest.makeSuite(TestXMLRPCClient))
+ return suite
+
+if __name__ == '__main__':
+ run_unittest(test_suite())
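`test_get_releases` above passes a spec like `'FooBar (<=1.2)'`, which the client must split into a project name and a version constraint before querying the server. A simplified sketch of that split (the regex is a hypothetical stand-in for distutils2's `VersionPredicate` parsing):

```python
import re

# Simplified split of "Name (op version)" requirement strings,
# standing in for VersionPredicate parsing; not the distutils2 code.
SPEC_RE = re.compile(
    r'^(?P<name>[\w.-]+)\s*'
    r'(?:\((?P<op><=|>=|<|>|==)?\s*(?P<ver>[\w.]+)\))?$')

def parse_spec(spec):
    """Return (name, operator, version); operator/version may be None."""
    match = SPEC_RE.match(spec)
    return match.group('name'), match.group('op'), match.group('ver')

print(parse_spec('FooBar (<=1.2)'))  # ('FooBar', '<=', '1.2')
print(parse_spec('FooBar'))          # ('FooBar', None, None)
```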
diff --git a/src/distutils2/tests/test_version.py b/src/distutils2/tests/test_version.py
--- a/src/distutils2/tests/test_version.py
+++ b/src/distutils2/tests/test_version.py
@@ -188,6 +188,13 @@
# XXX need to silence the micro version in this case
#assert not VersionPredicate('Ho (<3.0,!=2.6)').match('2.6.3')
+ def test_predicate_name(self):
+ # Test that names are parsed the right way
+
+ self.assertEqual('Hey', VersionPredicate('Hey (<1.1)').name)
+ self.assertEqual('Foo-Bar', VersionPredicate('Foo-Bar (1.1)').name)
+ self.assertEqual('Foo Bar', VersionPredicate('Foo Bar (1.1)').name)
+
def test_is_final(self):
# VersionPredicate knows if a distribution is a final one or not.
final_versions = ('1.0', '1.0.post456')
diff --git a/src/distutils2/util.py b/src/distutils2/util.py
--- a/src/distutils2/util.py
+++ b/src/distutils2/util.py
@@ -5,10 +5,14 @@
__revision__ = "$Id: util.py 77761 2010-01-26 22:46:15Z tarek.ziade $"
+import os
+import posixpath
+import re
+import string
import sys
-import os
-import string
-import re
+import shutil
+import tarfile
+import zipfile
from copy import copy
from fnmatch import fnmatchcase
from ConfigParser import RawConfigParser
@@ -699,6 +703,119 @@
self.options, self.explicit)
+def splitext(path):
+ """Like os.path.splitext, but take off .tar too"""
+ base, ext = posixpath.splitext(path)
+ if base.lower().endswith('.tar'):
+ ext = base[-4:] + ext
+ base = base[:-4]
+ return base, ext
+
+
+def unzip_file(filename, location, flatten=True):
+ """Unzip the file (zip file located at filename) to the destination
+ location"""
+ if not os.path.exists(location):
+ os.makedirs(location)
+ zipfp = open(filename, 'rb')
+ try:
+ zip = zipfile.ZipFile(zipfp)
+ leading = has_leading_dir(zip.namelist()) and flatten
+ for name in zip.namelist():
+ data = zip.read(name)
+ fn = name
+ if leading:
+ fn = split_leading_dir(name)[1]
+ fn = os.path.join(location, fn)
+ dir = os.path.dirname(fn)
+ if not os.path.exists(dir):
+ os.makedirs(dir)
+ if fn.endswith('/') or fn.endswith('\\'):
+ # A directory
+ if not os.path.exists(fn):
+ os.makedirs(fn)
+ else:
+ fp = open(fn, 'wb')
+ try:
+ fp.write(data)
+ finally:
+ fp.close()
+ finally:
+ zipfp.close()
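The `flatten` behaviour of `unzip_file` strips a single leading directory when every member of the archive shares one, so `pkg-1.0/setup.py` lands at `setup.py` under the destination. A minimal illustration of that flattening step, using an in-memory zip (Python 3 syntax; the leading-directory split is hard-coded here rather than going through `split_leading_dir`):

```python
import io
import os
import tempfile
import zipfile

# Build a small in-memory zip whose members share one leading directory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('pkg-1.0/setup.py', 'print("hi")\n')
    zf.writestr('pkg-1.0/README', 'docs\n')

# Extract it flattened: drop the common "pkg-1.0/" component,
# mirroring what unzip_file does when flatten=True.
dest = tempfile.mkdtemp()
with zipfile.ZipFile(buf) as zf:
    for name in zf.namelist():
        fn = name.split('/', 1)[1]          # strip leading dir
        path = os.path.join(dest, fn)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'wb') as fp:
            fp.write(zf.read(name))

print(sorted(os.listdir(dest)))   # ['README', 'setup.py']
```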
+
+
+def untar_file(filename, location):
+ """Untar the file (tar file located at filename) to the destination
+ location
+ """
+ if not os.path.exists(location):
+ os.makedirs(location)
+ if filename.lower().endswith('.gz') or filename.lower().endswith('.tgz'):
+ mode = 'r:gz'
+ elif (filename.lower().endswith('.bz2')
+ or filename.lower().endswith('.tbz')):
+ mode = 'r:bz2'
+ elif filename.lower().endswith('.tar'):
+ mode = 'r'
+ else:
+ mode = 'r:*'
+ tar = tarfile.open(filename, mode)
+ try:
+ leading = has_leading_dir([member.name for member in tar.getmembers()])
+ for member in tar.getmembers():
+ fn = member.name
+ if leading:
+ fn = split_leading_dir(fn)[1]
+ path = os.path.join(location, fn)
+ if member.isdir():
+ if not os.path.exists(path):
+ os.makedirs(path)
+ else:
+ try:
+ fp = tar.extractfile(member)
+ except (KeyError, AttributeError):
+ # Some corrupt tar files seem to produce this
+ # (specifically bad symlinks)
+ continue
+ if not os.path.exists(os.path.dirname(path)):
+ os.makedirs(os.path.dirname(path))
+ destfp = open(path, 'wb')
+ try:
+ shutil.copyfileobj(fp, destfp)
+ finally:
+ destfp.close()
+ fp.close()
+ finally:
+ tar.close()
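`untar_file` picks the `tarfile` mode from the filename suffix, falling back to transparent detection (`'r:*'`) for anything unrecognised. The mapping in isolation (a sketch of the same chain of suffix checks, Python 3 syntax):

```python
import tarfile  # modes below are standard tarfile read modes

def tar_mode(filename):
    # mirror untar_file's extension-to-mode mapping
    fn = filename.lower()
    if fn.endswith('.gz') or fn.endswith('.tgz'):
        return 'r:gz'
    if fn.endswith('.bz2') or fn.endswith('.tbz'):
        return 'r:bz2'
    if fn.endswith('.tar'):
        return 'r'
    return 'r:*'   # let tarfile auto-detect the compression

print(tar_mode('dist.tar.gz'))   # r:gz
print(tar_mode('dist.tbz'))      # r:bz2
print(tar_mode('dist.tar'))      # r
print(tar_mode('dist.unknown'))  # r:*
```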
+
+
+def has_leading_dir(paths):
+ """Returns true if all the paths have the same leading path name
+ (i.e., everything is in one subdirectory in an archive)"""
+ common_prefix = None
+ for path in paths:
+ prefix, rest = split_leading_dir(path)
+ if not prefix:
+ return False
+ elif common_prefix is None:
+ common_prefix = prefix
+ elif prefix != common_prefix:
+ return False
+ return True
+
+
+def split_leading_dir(path):
+ path = str(path)
+ path = path.lstrip('/').lstrip('\\')
+ if '/' in path and (('\\' in path and path.find('/') < path.find('\\'))
+ or '\\' not in path):
+ return path.split('/', 1)
+ elif '\\' in path:
+ return path.split('\\', 1)
+ else:
+ return path, ''
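Together, `split_leading_dir` and `has_leading_dir` decide whether an archive keeps everything under a single top-level directory, which is what makes the flattening in `unzip_file`/`untar_file` safe. The two helpers are pure functions, so their behaviour is easy to check standalone (same logic as the patch, Python 3 syntax):

```python
def split_leading_dir(path):
    # first path component vs. the rest (handles / and \ separators)
    path = str(path).lstrip('/').lstrip('\\')
    if '/' in path and (('\\' in path and path.find('/') < path.find('\\'))
                        or '\\' not in path):
        return path.split('/', 1)
    elif '\\' in path:
        return path.split('\\', 1)
    return path, ''

def has_leading_dir(paths):
    # True when every archive member lives under the same top directory
    common_prefix = None
    for path in paths:
        prefix, rest = split_leading_dir(path)
        if not prefix:
            return False
        elif common_prefix is None:
            common_prefix = prefix
        elif prefix != common_prefix:
            return False
    return True

print(split_leading_dir('pkg-1.0/setup.py'))      # ['pkg-1.0', 'setup.py']
print(has_leading_dir(['pkg/a', 'pkg/b/c']))      # True
print(has_leading_dir(['pkg/a', 'other/b']))      # False
```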
+
+
def spawn(cmd, search_path=1, verbose=0, dry_run=0, env=None):
"""Run another program specified as a command list 'cmd' in a new process.
diff --git a/src/distutils2/version.py b/src/distutils2/version.py
--- a/src/distutils2/version.py
+++ b/src/distutils2/version.py
@@ -321,7 +321,7 @@
return None
-_PREDICATE = re.compile(r"(?i)^\s*([a-z_]\w*(?:\.[a-z_]\w*)*)(.*)")
+_PREDICATE = re.compile(r"(?i)^\s*([a-z_][\sa-zA-Z_-]*(?:\.[a-z_]\w*)*)(.*)")
_VERSIONS = re.compile(r"^\s*\((.*)\)\s*$")
_PLAIN_VERSIONS = re.compile(r"^\s*(.*)\s*$")
_SPLIT_CMP = re.compile(r"^\s*(<=|>=|<|>|!=|==)\s*([^\s,]+)\s*$")
@@ -354,7 +354,8 @@
if match is None:
raise ValueError('Bad predicate "%s"' % predicate)
- self.name, predicates = match.groups()
+ name, predicates = match.groups()
+ self.name = name.strip()
predicates = predicates.strip()
predicates = _VERSIONS.match(predicates)
if predicates is not None:
--
Repository URL: http://hg.python.org/distutils2