[Distutils] People want CPAN :-)

David Cournapeau david at ar.media.kyoto-u.ac.jp
Sat Nov 7 08:49:12 CET 2009


Hi Guido,

Guido van Rossum wrote:
> On Fri, Nov 6, 2009 at 2:52 PM, David Lyon <david.lyon at preisshare.net> wrote:
>   
>> So the packages on CPAN are typically of a higher quality, simply
>> because they've been machine checked. I like that.
>>     
>
> Speaking purely on hearsay, I don't believe that. In fact, I've heard
> plenty of laments about the complete lack of quality control on CPAN.
>   

I cannot speak for CPAN, as I have never used it. But CRAN (which is
CPAN for R) works much better than PyPI today in practice. I am not sure
what exactly makes it work better, but it has the following properties,
both technical and more 'social':
    - R is a niche language, and targets mostly scientists. It is a
smaller, more focused community, so they can push solutions more easily.
    - There is extensive documentation on how to develop R extensions
(you can download a 130-page PDF).
    - R packages are much more constrained: there is a standard source
organization, which makes for a more consistent experience.
    - There are regular checks of the packages (all the packages are
checked daily on a build farm running Fedora and Debian). There is also
a machine to check Windows.

http://cran.r-project.org/web/checks/check_summary.html
http://cran.r-project.org/bin/windows/contrib/checkSummaryWin.html

I am obviously quite excited by Snakebite's potential here.

Concerning distutils, I think it is important to improve it, but I think
it is inherently flawed for serious and repeatable packaging. I have
written a quite extensive article on it from my point of view as a
numpy/scipy core developer and release manager
(http://cournape.wordpress.com/2009/04/01/python-packaging-a-few-observations-cabal-for-a-solution/).
I won't rehearse it here, but basically:
    - distutils is too complex for simple packages, and too inflexible
for complex ones. Adding new features to distutils is a painful
experience. Even autotools, with its mix of 100,000 lines of
autogenerated shell code, perl and m4, is more pleasant.
    - Most simple packages could be "buildable" from a purely declarative
description. This is important IMHO because it means they are simple to
package by OS vendors, and you can more easily automate building and
testing.
    - It is hard to interact with other build/distribution tools, which
is sometimes needed.
    - distutils is too flexible for some things where it should not
be (like specifying the content of sdist tarballs), and that makes it
very difficult to repeat things in different environments.
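
To make the declarative point above a bit more concrete, here is a
minimal sketch. It assumes a hypothetical INI-style description: the
[package] section, its keys and the package.cfg file name are all
invented for illustration and do not correspond to any existing tool.
The idea is only that both the metadata and the sdist file list can be
read from static data, without executing any code from the package.

```python
import configparser

# Hypothetical declarative description. The section and key names are
# assumptions made for this sketch, not an existing format.
DESCRIPTION = """
[package]
name = example
version = 0.1
modules =
    example.core
    example.utils
"""


def load_description(text):
    """Parse the static description into a plain dict (runs no package code)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["package"]
    return {
        "name": section["name"],
        "version": section["version"],
        # A multi-line value is split on whitespace into module names.
        "modules": section["modules"].split(),
    }


def sdist_files(meta):
    """Derive the source-tarball file list from the description alone."""
    files = [module.replace(".", "/") + ".py" for module in meta["modules"]]
    # Sorting gives the same list on every machine for the same description.
    return sorted(files + ["package.cfg"])


meta = load_description(DESCRIPTION)
print(meta["name"], meta["version"])
print(sdist_files(meta))
```

Because the file list is a pure function of the description, an OS
packager or a build farm could compute it without importing or running
anything from the package, which is what would make automated building
and testing easier.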

Contrary to other people, I don't think a successor to distutils is that
hard to develop, especially since good designs in other languages
already exist. It would take time, especially to get a transition story
right, but I think it would be worthwhile.

cheers,

David

