[TriZPUG] {conda, spyder} vs {canopy, canopy IDE} for scientific python

Chris Calloway cbc at unc.edu
Sun Sep 14 21:10:32 CEST 2014


On 9/12/2014 12:06 PM, Tom Roche wrote:
> Could users (preferably of both) sound off on the relative merits of each set as development environments for scientific python? I don't know much about canopy or its editor/GUI/IDE, and I don't regularly use spyder, but was recently asked by someone who knew even less. My impression was, the main differences are that
>
> 1. Conda supports both python 2 and python 3, while canopy (the {environment, package manager}) currently supports only python 2.
>
> 2. Canopy (the {environment, package manager}) gives one all the Enthought goodness, conda does not.
>
> 3. Conda is completely free (though it has payware "add-ons"), canopy is payware. Both have free academic licenses.
>
> 4. The canopy IDE's console is ipython, while spyder's console allows other shells.
>
> Aside from that, my impression (on linux) is, both the environments and IDEs are roughly comparable, as are the organizations backing them (Continuum Analytics and Enthought, though the latter is larger and more established). Am I missing anything?

Thanks for asking. This is a question I struggle with a lot. I have 
used Canopy to teach two week-long classes and Anaconda to teach two 
week-long classes. I abandoned both and returned to python.org CPython 
due to several factors that probably don't matter to you, as they were 
of concern only to my particular use case of teaching Python.

1) Correct, Anaconda supports Python 2 and 3. Canopy is Python 2. 
Anaconda is really a package manager. Installing Python 3 with Anaconda 
is like installing any other package. Note that Python 3 does not 
support the full range of packages available from Anaconda. And most 
people using Anaconda are using Python 2. In fact, just having come from 
the SciPy conference, I can report that Python 3 uptake in the scientific 
Python world is practically nil. Only the developers of the major 
scientific tools (IPython, numpy, scipy, matplotlib, Pandas, 
scikit-learn, SymPy, Numba, and Bokeh) have interest in Python 3 and 
that's mostly just to be able to say, OK, we have that, just in case one 
of the ankle-biters from the Python 3 children's crusade comes knocking. 
However, almost all the domain tools made with those things are Python 
2. The GIS-oriented tools, frequently used with scientific Python tools, 
also have little Python 3 interest. The scientific Python community is 
focused on getting science done. Moving targets are not their thing. 
Reproducibility, collaboration, reliability, and usability are their things.

https://support.enthought.com/entries/22119894-Preparing-for-Python-3-Now
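
To illustrate the point that Python 3 is installed like any other 
package, here is a minimal sketch that builds the "conda create" 
commands for one Python 2 and one Python 3 environment. The environment 
names and version numbers are made up for illustration, and the sketch 
only prints the commands unless you ask it to run them, which assumes 
conda is on your PATH.

```python
import subprocess


def conda_create_command(env_name, python_version):
    """Build the argv list for creating a conda environment with a given Python."""
    # "python=<version>" is an ordinary package spec, the same form used
    # for any other package handed to conda.
    return ["conda", "create", "--yes", "--name", env_name,
            "python={}".format(python_version)]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(command))
    if not dry_run:
        subprocess.check_call(command)


# Hypothetical environment names, one per Python major version.
py2 = conda_create_command("science-py2", "2.7")
py3 = conda_create_command("science-py3", "3.4")
run(py2)
run(py3)
```

Calling run(py3, dry_run=False) would actually create the environment.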

2) Canopy gives all the Enthought goodness. Anaconda gives all the 
Continuum Analytics goodness. The free Continuum Analytics pile of 
goodness is about three times as big as the Enthought free pile. 
Continuum Analytics does not withhold any of the open source packages. 
Enthought does. See next answer for more detail.

https://www.enthought.com/products/canopy/package-index/

http://docs.continuum.io/anaconda/pkg-docs.html

3) Anaconda is free in total. It does have payware tools that could be 
added. But they are Continuum Analytics products. Canopy has some free 
packages and some pay-to-play packages. The academic version of Canopy 
only contains the free packages.

4) The Canopy IDE is IPython with an editor. Both are Qt-based. It is 
pretty bare-bones vanilla. One of the Anaconda packages is Spyder. 
Spyder supports Python, IPython, and a special "internal" Spyder 
interpreter. All are Qt-based.

IDE is kind of a misnomer in contemporary times and in Python. IDE used 
to refer to a program with a GUI designer, editor, class viewer, and 
debugger. These days things that pass for IDEs omit the GUI designer, 
which is now a separate program for each GUI toolkit, like Qt Designer. 
Introspection and code completion in IPython kind of make class viewers 
obsolete. And a debugger is built right into Python. You can find 
editors like Sublime outside of IDEs that often have IDE-like features 
such as version control integration and are superior to most IDEs. IDEs 
just kind of get in the way, however nice PyCharm and Wingware are. I 
like them, but they do get in the way. An IDE serves to separate you 
from collaborating as easily with other people who don't use your 
specific IDE, and scientific Python is all about collaboration, for 
which the community has agreed on IPython Notebook for reproducibility.
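
As a rough illustration of the introspection and built-in debugger 
mentioned above, here is a small sketch using only the standard library. 
The Sample class is hypothetical, standing in for whatever class you 
would otherwise open in a class viewer.

```python
import inspect


class Sample(object):
    """Hypothetical class standing in for something from a scientific package."""

    def __init__(self, values):
        self.values = list(values)

    def mean(self):
        """Return the arithmetic mean of the stored values."""
        return sum(self.values) / float(len(self.values))


def public_methods(cls):
    """List the public callables on a class, roughly what a class viewer shows."""
    return sorted(name for name, member in inspect.getmembers(cls, callable)
                  if not name.startswith("_"))


print(public_methods(Sample))
print(inspect.getdoc(Sample.mean))

# The debugger ships with the standard library. Uncomment the next line to
# stop here and inspect state interactively:
# import pdb; pdb.set_trace()
```

In IPython the same information is a "Sample?" or a tab completion away, 
and "python -m pdb yourscript.py" runs a whole script under the debugger.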

So a lot of this conversation has to do with Enthought or Continuum. And 
I feel bad and sorry for Enthought. They pioneered the whole scientific 
Python distribution scenario. But a pile of their brain trust split off 
to form Continuum and forge the next generation of tools (Numba and 
Bokeh, which out-perform Julia) under the umbrella of the NumFocus 
Foundation. NumFocus is vendor neutral. So Canopy plays in the NumFocus 
arena where Enthought is a member. But development of Numba and Bokeh 
(GPU-parallelization-enabled replacements for numpy and matplotlib) is 
where the NumFocus focus is and what Continuum develops.

I feel bad for Enthought because Anaconda blows Canopy away. Anaconda is 
focused on being a package manager, an alternative to pip, setuptools, 
and virtualenv all rolled into one tool, conda, that actually works well 
with external binary dependencies. This works through a mechanism called 
binstar.org, which is kind of like GitHub for Python packages:

https://binstar.org/

I can write "conda recipes" which will make a "conda package" for a 
variety of platforms and upload the binaries to my "channel" on binstar. 
You can "subscribe" to my channel and "conda install" my packages. Right 
now I work on a distributed team that uses binstar to manage the install 
of our main tools that contain over 60 external binary dependencies. 
It's much more powerful than Python Wheels. A service called binstarbin 
is also coming soon where I will upload my conda recipes and binstar.org 
will run background tasks in the cloud to generate binaries for an array 
of platforms from my conda recipes and deposit the resulting binaries in 
my binstar channel. If you are working with government agencies that 
mandate Windows, this is now the way to go.
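
From the consumer's side, the channel workflow described above might 
look like the following sketch. It only builds the command lines; the 
channel name "some-user" and package name "some-package" are 
placeholders, not real binstar entries, and running the commands assumes 
conda is on your PATH.

```python
def conda_subscribe_command(channel):
    """Build the command that adds a channel to the user's conda configuration."""
    return ["conda", "config", "--add", "channels", channel]


def conda_install_command(package, channel=None):
    """Build a conda install command, optionally naming a one-off channel."""
    command = ["conda", "install", "--yes"]
    if channel is not None:
        # --channel searches the named channel for this install only.
        command += ["--channel", channel]
    command.append(package)
    return command


# Placeholder names for illustration only.
subscribe = conda_subscribe_command("some-user")
install = conda_install_command("some-package", channel="some-user")
print(" ".join(subscribe))
print(" ".join(install))
```

Once the channel is in your configuration, a plain "conda install 
some-package" is enough and the --channel option can be dropped.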

I feel bad for Enthought because they've been outmaneuvered by former 
employees in the package management game. And best-of-breed Python 
package management is the most intense game in scientific Python these 
days. As we were told at SciPy by Nick Coghlan, "Don't fight the 
redistributors." If the redistributors are doing a better job than the 
Python community at packaging Python packages, so says the BDFL-delegate 
of the Python Packaging Authority, then use the redistributor that fits 
you (of course, he also works for a redistributor, Red Hat, so he can say 
that). And the SciPy community as a whole has settled on Anaconda as a 
redistributor, for better or worse. I say for better or worse, because 
this does cut the SciPy community off from the Python web development 
community, which is still very much based on pip/setuptools/virtualenv 
and domain tools like buildout and hashdist (note to self, get new 
TriPython member Matt McCormick from Kitware to present on hashdist).

So yeah, Enthought is a much bigger company because they've been running 
for a longer time. I went to a party at Enthought HQ on the 20th floor 
of the Bank of America tower in Austin during SciPy. They take up the 
whole floor. It's a serious business. They have tons of admin and 
marketing staff. They still have a bunch of smart developers and 
scientists working for them. Their CEO, Eric Jones, is awesome and got 
both his MS and PhD in EE from Duke. They are still very much community 
driven and are the organizers of the SciPy conference, where their 
competitor gets all the attention. They are good guys.

Continuum is much smaller and leaner with mostly scientist/developers 
and billionaire investors. I can't even tell what their revenue model is.

However, maybe one good thing has come out of this split. Enthought has 
refocused. Their bread and butter is still consulting for the oil and 
gas industry because of their awesome geological and hydrological modeling 
tools. But they have a renewed emphasis on training. Their previous 
training focus was classroom based and high dollar. Their new training 
focus is on the flipped classroom where you watch short videos and then 
work through exercises. This is a superior training model. If you have a 
.edu email address, you can partake of this new training model for free. 
If you are outside of academia, it is a per-seat-per-annum charge to 
take advantage of the training. But the new fee is much more reasonable 
than the old classroom training rate (hundreds of dollars instead of 
thousands of dollars per person). See:

https://training.enthought.com/courses

-- 
Sincerely,

Chris Calloway, Applications Analyst
UNC Renaissance Computing Institute
100 Europa Drive, Suite 540, Chapel Hill, NC 27517
(919) 599-3530

