[Distutils] Dynamic linking between Python modules (was: Beyond wheels 1.0: helping downstream, FHS and more)

Chris Barker chris.barker at noaa.gov
Tue May 19 23:11:05 CEST 2015


On Tue, May 19, 2015 at 10:58 AM, Paul Moore <p.f.moore at gmail.com> wrote:


> Sure. Doesn't have to be the same way, but the user experience has to
> be the same.


absolutely.

> > But maybe that's not going to cut it -- in a way, we are headed there now,
> > with a contingent of people porting pypi packages to conda. So far it's
> > various subsets of the scientific community, but if we could get a few
> > web developers to join in...
>
> Unless project owners switch to providing conda packages, isn't there
> always going to be a lag? If a new version of lxml comes out, how long
> must I wait for "the conda folks" to release a package for it?
>

who knows? -- but it is currently a light lift to update a conda package to
a new version, once the original is built -- and we've got handy scripts
and CI systems that will push an updated version of the binaries as soon as
an updated version of the build script is pushed.

It's a short step to automate looking for new versions on PyPI and
automatically updating the conda packages -- though there would need to
be hand-intervention whenever an update broke the build script...
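
A rough sketch of the version check that kind of automation would need, in
Python. Everything here is hypothetical: the function names are made up, and
it assumes plain dotted numeric versions (anything else, such as alphas or
betas, is skipped so a human can look at it). Where the list of PyPI versions
comes from, and the rebuild itself, are left out.

```python
import re


def parse_version(text):
    """Turn a simple dotted version such as '3.4.4' into a tuple of ints.

    Assumption: plain numeric release versions only. Trailing zeros are
    dropped so that '3.4' and '3.4.0' compare as equal.
    """
    if not re.fullmatch(r"\d+(\.\d+)*", text):
        raise ValueError("unsupported version string: %r" % text)
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def newest_release(versions):
    """Return the highest plain-numeric version in `versions`, or None."""
    candidates = []
    for version in versions:
        try:
            candidates.append((parse_version(version), version))
        except ValueError:
            continue  # skip alphas, betas, dev builds, etc.
    return max(candidates)[1] if candidates else None


def recipe_is_stale(recipe_version, pypi_versions):
    """True if the index lists a release newer than the one in the recipe."""
    latest = newest_release(pypi_versions)
    if latest is None:
        return False
    return parse_version(latest) > parse_version(recipe_version)
```

A stale recipe would then get its version bumped and be pushed to CI; a
failed build is the point where someone has to step in by hand.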

Of course, the ideal is for package maintainers to push conda packages
themselves -- which is why having more than one system to support is
unfortunate.

On the other hand, there is one plus side -- if the package maintainer
doesn't push to PyPI, it's easier for a third party to take on that role --
see pychecker, or, for that matter, numpy and scipy -- on PyPI, but without
binaries for Windows. But you can get them on binstar (or Anaconda, or...)

> > hmm -- that's the interesting technical question -- conda works at a higher
> > level than pip -- it CAN manage python itself -- I'm not sure it HAS to,
> > but that's how it is usually used, and the idea is to provide a complete
> > environment, which does include python itself.
>
> Yes. But I don't want to use Anaconda Python, Same reason - how long
> do I wait for the new release of Python to be available in Anaconda?
> There's currently no Python 3.5 alpha for example...


you can grab and build the latest Python3.5 inside a conda environment just
as well. Or are you using python.org builds for alpha versions, too?

Oh, and as a conda environment sits at a higher level than python, it's
actually easier to set up an environment specifically for a particular
version of python.

And anyone could put up a conda package of Python3.5 Alpha as well --- once
the build script is written, it's pretty easy. But again -- the
more-than-one-way-to-do-it problem.

> If conda/binstar is good enough to replace pip/PyPI, there's no reason
> for pip/PyPI to still exist. So in effect binstar *becomes* PyPI.
>

yup.


> There's an element of evangelisation going on here - you're
> (effectively) asking what it'd take to persuade me to use conda in
> place of pip. I'm playing hard to get, a little, because I see no
> specific benefits to me in using conda, so I don't see why I should
> accept any loss at all, in the absence of a benefit to justify it.
>

I take no position here -- I'm playing around with ideas as to how we can
move the community toward a better future -- I'm not trying to advocate any
particular solution, but trying to figure out what solution we may want to
pursue -- quite specifically which solution I'm going to put my personal
energy toward.

We may want to look back at a thread on this list where Travis Oliphant
talks about why he built conda, etc. (I can't find it now -- maybe someone
with better google-fu than me can. I think it was a thread on this list,
probably about a year ago)

or read his Blog Post:

http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html

One of the key points is that when they started building conda -- pip+wheel
were not mature, and the core folks behind them didn't want to support
what was needed (dynamic libs, etc) -- and still don't.

> My biggest worry is that at some point, "if you want numpy/scipy, you
> should use conda" becomes an explicit benefit of conda,


That is EXACTLY what the explicit benefit of conda is. I think we'll get
binary wheels for numpy and scipy up on PyPI before too long, but the rest
of the more complex stuff is not going to be there.


> and pip/PyPI
> users get abandoned by the scientific community.


They kind of already have -- it's been a long time, and a lot of work by
only a couple of folks, to try to get binary wheels up on PyPI for Windows
and OS-X.


> If that happens, I'd
> rather see the community rally behind conda than see a split. But I
> hope that's not the way things end up going.


we'll see. But look at Travis' post -- pip+wheel simply does not support
the full needs of the full scientific user. If we want a "one ring to rule
them all", then it'll have to be conda -- or something a lot like it.

On the other hand, I think pip+wheel+PyPI (or maybe just the community
around it) can be extended a bit to at least support all the truly python
focused stuff -- which I think would be pretty worthwhile in itself.

> This is the key point. The decision was made to "bless" pip as the
> official Python package manager. Should we revisit that decision?


I'm not sure I want to be the one to bring that up ;-)



> If
> not, then how do we ensure that pip (and the surrounding
> infrastructure) handles the needs of the *whole* Python community? If
> the authors of scientific extensions for Python abandon pip for conda,
> then pip isn't supporting that part of the community properly. But
> conversely, if the scientific community doesn't look to address their
> issues within the pip/wheel infrastructure, how can we do anything to
> avoid a rift?
>

well -- I think the problem is that while SOME of the needs of the scientific
community can be addressed within the pip-wheel infrastructure, they
can't all be addressed there. And (I wish I could find that thread), I'm
pretty sure Travis said that before he started conda, he talked to the PyPA
folks (before it was called that), and was told that he'd be best off going
off and building something new -- pip+wheel were just getting started, and
were not going to support what he needed -- certainly not anytime soon.


> I'd like to think so. The goal of pip is to be the baseline Python
> package manager. We're expecting Linux distributions to build their
> system packages via wheel, why can't Anaconda? Part of the problem
> here, to my mind, is that it's *very* hard for the outsider to
> separate out (Ana)conda-as-a-platform versus conda-as-a-tool, versus
> conda-as-a-distribution-format.
>

absolutely.

When you say "build their system packages via wheel" -- what does that
mean? and why wheel, rather than, say, pip + setuptools?

you can put whatever you want in a conda build script -- the current
standard practice baseline for python packages is:

$PYTHON setup.py install

it's that simple. And I don't know if this is what it actually does, but
essentially the conda package is all the stuff that that script added to
the environment. So if changing that invocation to use pip would get us
some better meta data or what have you, then by all means, we should change
that standard of practice.
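
To illustrate the "everything the script added to the environment" idea, here
is a minimal sketch in Python. It is not conda's actual implementation, just
the before/after comparison described above; `list_files` and
`files_added_by` are made-up names, and `build_step` stands in for whatever
the build script runs.

```python
import os


def list_files(prefix):
    """Return the set of file paths under `prefix`, relative to it."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(prefix):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, prefix))
    return found


def files_added_by(prefix, build_step):
    """Run `build_step()` and return the files it added under `prefix`.

    `build_step` is a hypothetical callable standing in for the build
    script, e.g. something that runs `$PYTHON setup.py install` against
    the environment rooted at `prefix`.
    """
    before = list_files(prefix)
    build_step()
    after = list_files(prefix)
    return sorted(after - before)
```

Under that assumption, the returned list is what would end up in the package,
regardless of whether the build step used setup.py or pip to do the install.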

(but it seems like an odd thing to have to use the package manager to build
the package correctly -- shouldn't that be distutils' or setuptools' job?)

> >   * Toss out your current setup
> >   * Install Anaconda (or miniconda)
> >   * Switch from virtualenv to conda environments
> >   * re-install all your dependencies
>
> Yeah, that's hopeless. And worse still is the possibility that a pure
> Python user might have to do that just to gain access to a particular
> package.
>

exactly.

> I keep saying this, and I ought to ask - is there *any* likelihood
> that a package would formally abandon any attempt to provide binary
> distributions for Windows (or OSX, or whatever) except in conda
> format?


absolutely -- probably not the core major packages -- but I think that's
already the case with a number of more domain specific packages (including
my stuff -- OK, I have maybe three users outside my organization now, but
still..)

And numpy and scipy don't yet have binary wheels for Windows up -- though
that is being worked on.

> So PyPI users will be told "install from source yourself or
> switch to conda". If there's no intention for that to ever happen,


Sadly, we are already there for minor packages, at least. Oh wait, not so
minor -- the geospatial stack is not well supported on PyPI. I don't think
there are pynetcdf or pyhdf binaries, etc...

On the other hand, some domain-specific stuff is being supported, like
scikit-learn, for instance:

http://scikit-learn.org/stable/install.html#install-official-release

> lot of the "conda vs pip" discussion is less relevant. At the moment,
> there are projects like numpy that don't distribute Windows wheels on
> PyPI, but Christoph Gohlke has most of them available,


yes, but in a form that is not redistributable on PyPI...


> and in general
> such projects seem to be aiming to move to wheels. So there aren't any
> practical cases of "conda-only" packages.


I'm not sure about "conda-only", but not pip-installable is all too common.


> > And for even that to work, we need a way for everything installable by pip
> > to be installable within that conda environment -- which we could probably
> > achieve.
>
> See above - without some serious resource on the conda side, that
> "everything" is unlikely.


conda can provide a full, pretty standard, python environment -- why do you
think "everything" is unlikely?

> > I do think that we could push
> > pip+pypi+wheel a little further to better support at least the
> > python-centric stuff -- i.e. third party libs, which would get us a lot
> > farther.
>
> Agreed. Understanding the *actual* problem here is important though
> (see my other post about clarifying why dynamic linking is so
> important).
>

yes -- I don't know that that's answered yet -- but the third party
dependency problem is real -- whether it is addressed by supporting dynamic
linking, or by making it easier to find, build, distribute compatible
static libs for package maintainers to use is still up in the air.

> > And again, it's not just the scipy stack -- there is stuff like image
> > manipulation packages, etc, that could be better handled.
>
> Well, (for example) Pillow provides wheels with no issue, so I'm not
> sure what you're thinking of here?


the Pillow folks have figured it out -- and are doing the work -- but it
took years, and we had a lot of pain building binaries of PIL for OS-X
during that time.

I think dynamic libs would be a good thing for packages like PIL, but maybe
static is fine (I presume they are doing static now...)

> > And the geospatial packages are a mess, too - is that "scientific"? -- I
> > don't know, but it's the "new hotness" in web development.
>
> I can't really comment on that one, as I've never used them. Does that
> make me untrendy? :-)
>

Absolutely!

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov