[Distutils] use of '_' in package name causing version parsing issue?
Baiju M
mbaiju at zeomega.com
Wed Mar 10 05:20:24 CET 2010
On Wed, Mar 10, 2010 at 3:54 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 03:03 PM 3/9/2010 -0600, Brad Allen wrote:
>>
>> Today I was informed of an issue in which buildout (with the latest
>> setuptools) is not resolving version numbers properly, causing the
>> wrong package to be selected in some cases. The cause identified was
>> having '_' in the package name.
>
> I suspect there is a miscommunication or misunderstanding somewhere. It is
> perfectly acceptable to have a '_' in a package name or project name. This:
>
>> | >>> a="jiva_interface-2.3.6-py2.6.egg"
>> | >>> b="jiva_interface-2.3.8-py2.6.egg"
>> | >>> pkg_resources.parse_version(a)
>
> Is the wrong API to use to parse an egg filename, as parse_version() is for
> parsing a version that's already extracted from a filename. This is the
> right API for extracting a version from a filename:
>
> >>> pkg_resources.Distribution.from_filename(a).version
> '2.3.6'
> >>> pkg_resources.Distribution.from_filename(b).version
> '2.3.8'
> >>> pkg_resources.Distribution.from_filename(c).version
> '0.1.1'
> >>> pkg_resources.Distribution.from_filename(d).version
> '0.1.2'
>
> And here's the correct one for extracting the parsed version from a
> filename:
>
> >>> pkg_resources.Distribution.from_filename(a).parsed_version
> ('00000002', '00000003', '00000006', '*final')
> >>> pkg_resources.Distribution.from_filename(b).parsed_version
> ('00000002', '00000003', '00000008', '*final')
> >>> pkg_resources.Distribution.from_filename(c).parsed_version
> ('00000000', '00000001', '00000001', '*final')
> >>> pkg_resources.Distribution.from_filename(d).parsed_version
> ('00000000', '00000001', '00000002', '*final')
>
> As you can see, these APIs work just fine, so the example given is a red
> herring, unless Buildout is using the APIs incorrectly (which I really doubt
> it is).
>
> Usually, the situation where people run into trouble with unusual package
> names or filenames is when they produce a source distribution manually, or
> by using something other than distutils/setuptools (that has different
> filename escaping rules), or when they manually rename a file before
> uploading, and expect it to still work the same.
>
> It would be a good idea for you to check which of these things (if any) is
> taking place, and provide details of the specific problem, with steps to
> reproduce it, since the example given probably has nothing to do with it.
I spent some time with the Buildout and setuptools code to identify the
issue. I will try to explain my findings.
1. Buildout relies on the pkg_resources.Requirement.parse function to
get the "project_name", like this:

    pkg_resources.Requirement.parse('jiva_interface').project_name

I can see from the code of the `Requirement` class that the `__init__`
method is undocumented and that its docstring says to use the `parse`
function instead. Does this mean that we should not use the attributes
of an instance of the `Requirement` class? This is very important, as
the `parse` function returns an instance of the `Requirement` class.
So, if it is acceptable to use the "project_name" attribute, then
Buildout can rely on it, right?
Here is the beginning of the `Requirement` class:

    class Requirement:
        def __init__(self, project_name, specs, extras):
            """DO NOT CALL THIS UNDOCUMENTED METHOD; use Requirement.parse()!"""
2. This is the code that sets "project_name" in the same `__init__` method:

    self.unsafe_name, project_name = project_name, safe_name(project_name)
    self.project_name, self.key = project_name, project_name.lower()
I looked at the "safe_name" function:

    def safe_name(name):
        """Convert an arbitrary string to a standard distribution name

        Any runs of non-alphanumeric/. characters are replaced with a
        single '-'.
        """
        return re.sub('[^A-Za-z0-9.]+', '-', name)
According to this code, this will be the result:

    >>> pkg_resources.safe_name('jiva_interface')
    'jiva-interface'

And:

    >>> pkg_resources.Requirement.parse('jiva_interface').project_name
    'jiva-interface'
Is this behavior correct?

If you think what setuptools is doing is fine, we will change the
Buildout code to use the "safe_name" function wherever it gets the
"project_name" directly.
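
A minimal sketch of the kind of comparison such a change might involve. It assumes project names should be compared only after the normalization quoted above. Here `safe_name` is a local copy of the setuptools function, so the snippet runs without setuptools installed. `project_key` and `same_project` are hypothetical helpers for illustration, not existing Buildout or setuptools code.

```python
import re


def safe_name(name):
    """Local copy of the setuptools function quoted above."""
    return re.sub('[^A-Za-z0-9.]+', '-', name)


def project_key(name):
    # Hypothetical helper: mirrors how Requirement.__init__ derives `key`
    # (safe_name first, then lower-casing).
    return safe_name(name).lower()


def same_project(a, b):
    # Hypothetical helper: two spellings name the same project when their
    # normalized keys match.
    return project_key(a) == project_key(b)


print(safe_name('jiva_interface'))                        # jiva-interface
print(same_project('jiva_interface', 'Jiva-Interface'))   # True
print(same_project('jiva_interface', 'jiva.interface'))   # False ('.' is kept)
```

With a comparison like this, 'jiva_interface' and 'jiva-interface' would be treated as the same project, whichever spelling appears in a buildout configuration.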
Regards,
Baiju M