[Distutils] use of '_' in package name causing version parsing issue?

Baiju M mbaiju at zeomega.com
Wed Mar 10 05:20:24 CET 2010


On Wed, Mar 10, 2010 at 3:54 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 03:03 PM 3/9/2010 -0600, Brad Allen wrote:
>>
>> Today I was informed of an issue in which buildout (with the latest
>> setuptools) is not resolving version numbers properly, causing the
>> wrong package to be selected in some cases. The cause identified was
>> having '_' in the package name.
>
> I suspect there is a miscommunication or misunderstanding somewhere.  It is
> perfectly acceptable to have a '_' in a package name or project name.  This:
>
>> | >>> a="jiva_interface-2.3.6-py2.6.egg"
>> | >>> b="jiva_interface-2.3.8-py2.6.egg"
>> | >>> pkg_resources.parse_version(a)
>
> Is the wrong API to use to parse an egg filename, as parse_version() is for
> parsing a version that's already extracted from a filename.  This is the
> right API for extracting a version from a filename:
>
>>>> pkg_resources.Distribution.from_filename(a).version
> '2.3.6'
>>>> pkg_resources.Distribution.from_filename(b).version
> '2.3.8'
>>>> pkg_resources.Distribution.from_filename(c).version
> '0.1.1'
>>>> pkg_resources.Distribution.from_filename(d).version
> '0.1.2'
>
> And here's the correct one for extracting the parsed version from a
> filename:
>
>>>> pkg_resources.Distribution.from_filename(a).parsed_version
> ('00000002', '00000003', '00000006', '*final')
>>>> pkg_resources.Distribution.from_filename(b).parsed_version
> ('00000002', '00000003', '00000008', '*final')
>>>> pkg_resources.Distribution.from_filename(c).parsed_version
> ('00000000', '00000001', '00000001', '*final')
>>>> pkg_resources.Distribution.from_filename(d).parsed_version
> ('00000000', '00000001', '00000002', '*final')
>
> As you can see, these APIs work just fine, so the example given is a red
> herring, unless Buildout is using the APIs incorrectly (which I really doubt
> it is).
>
> Usually, the situation where people run into trouble with unusual package
> names or filenames is when they produce a source distribution manually, or
> by using something other than distutils/setuptools (that has different
> filename escaping rules), or when they manually rename a file before
> uploading, and expect it to still work the same.
>
> It would be a good idea for you to check which of these things (if any) is
> taking place, and provide details of the specific problem, with steps to
> reproduce it, since the example given probably has nothing to do with it.

I spent some time with the Buildout and setuptools code to identify the issue.
I will try to explain my findings.

1. Buildout is relying on pkg_resources.Requirement.parse function to
    get the "project_name" like this:

     pkg_resources.Requirement.parse('jiva_interface').project_name

    I can see from the code of the `Requirement` class that calling the
    `__init__` method directly is discouraged and that the `parse`
    function is recommended instead.  Does this mean that we should not
    use the attributes of an instance of the `Requirement` class?  This
    is very important, as the `parse` function returns an instance of
    the `Requirement` class.

   So, if it is acceptable to use the "project_name" attribute, then
   Buildout can rely on it, right?

   Here is the beginning of the `Requirement` class:

    class Requirement:
        def __init__(self, project_name, specs, extras):
            """DO NOT CALL THIS UNDOCUMENTED METHOD; use Requirement.parse()!"""

2. This is the code which gets the "project_name" in the same `__init__` method:

        self.unsafe_name, project_name = project_name, safe_name(project_name)
        self.project_name, self.key = project_name, project_name.lower()

    I looked at the "safe_name" function:

    def safe_name(name):
        """Convert an arbitrary string to a standard distribution name

        Any runs of non-alphanumeric/. characters are replaced with a
        single '-'.
        """
        return re.sub('[^A-Za-z0-9.]+', '-', name)

   According to this code, this will be the result:

     pkg_resources.safe_name('jiva_interface')
     'jiva-interface'

   And:

     pkg_resources.Requirement.parse('jiva_interface').project_name
     'jiva-interface'

    Is this behavior correct?

    If you think what setuptools is doing is fine, we will make changes
    in the Buildout code to use the "safe_name" function wherever it
    directly gets the "project_name".
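
    To make the proposed change concrete, here is a minimal sketch of
    comparing project names only after normalization.  The `safe_name`
    body is copied from the setuptools code quoted above; the helper
    `same_project` is hypothetical and is not existing Buildout or
    setuptools code.

```python
import re


def safe_name(name):
    """Replace each run of non-alphanumeric/. characters with a single '-'.

    Copied from the setuptools code quoted above so the sketch runs
    without importing pkg_resources.
    """
    return re.sub('[^A-Za-z0-9.]+', '-', name)


def same_project(name_a, name_b):
    """Hypothetical helper: compare two project names after normalization.

    Lower-casing mirrors how Requirement.__init__ derives `key` from
    `project_name`.
    """
    return safe_name(name_a).lower() == safe_name(name_b).lower()


if __name__ == '__main__':
    print(safe_name('jiva_interface'))
    print(same_project('jiva_interface', 'jiva-interface'))
```

    With this, 'jiva_interface' and 'jiva-interface' would be treated
    as the same project, while a name containing '.' would stay
    distinct, because '.' is not replaced.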

Regards,
Baiju M

