[SciPy-Dev] Deprecate stats.glm?

Nathaniel Smith njs at pobox.com
Thu Jun 3 12:16:22 EDT 2010


On Thu, Jun 3, 2010 at 8:53 AM,  <josef.pktd at gmail.com> wrote:
> GLM as in general linear model not generalized. (It's the worst
> conflicting acronym in stats).

Sure, and let's not even talk about generalized least squares
(unrelated to both!).

But the general linear model is basically identical to a simple linear
model, both in interface and implementation. There's no reason to have
a separate function for it; one should just accept a matrix for the
"y" variable in the OLS code. But *generalized* linear models are
different in both interface and implementation, and are almost as much
of a stats workhorse as standard linear models. So every book I've ever
seen uses the abbreviation "glm" to refer to the generalized version.
(Also, this is what R calls the function ;-).)
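
To make the "just accept a matrix for y" point concrete, here is a
minimal sketch using numpy.linalg.lstsq (not the scipy.stats code
itself; the data and variable names are made up for illustration). It
fits all response columns in one call and checks that this matches
fitting each column separately:

```python
import numpy as np

# Synthetic data, purely illustrative: n observations, p regressors
# (including an intercept column), k response variables.
rng = np.random.default_rng(0)
n, p, k = 50, 3, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
B_true = rng.normal(size=(p, k))
Y = X @ B_true + 0.1 * rng.normal(size=(n, k))

# "General linear model": pass the whole response matrix to least squares.
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Ordinary linear model, one response column at a time.
B_cols = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(k)]
)

# The two approaches give the same coefficient matrix.
same = np.allclose(B_joint, B_cols)
```

The coefficient estimates are the same either way, which is why a
separate entry point for the matrix-valued case buys nothing.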

The implementation of dummy coding is kind of useful, but this is the
wrong place and the wrong name...
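
For reference, dummy coding itself is only a few lines. A rough
sketch, assuming plain NumPy and a hypothetical helper name
(dummy_code is not an existing scipy function):

```python
import numpy as np

def dummy_code(labels):
    """Return (levels, indicator matrix) for a 1-D sequence of labels.

    Hypothetical helper for illustration: one 0/1 column per distinct
    level, in sorted order of the levels.
    """
    labels = np.asarray(labels)
    # codes[i] is the index of labels[i] within the sorted unique levels.
    levels, codes = np.unique(labels, return_inverse=True)
    indicators = (codes[:, None] == np.arange(len(levels))).astype(float)
    return levels, indicators

levels, D = dummy_code(["a", "b", "a", "c"])
```

Note that using every column of D together with an intercept makes the
design matrix rank deficient, so in practice one level is usually
dropped as the reference category.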

(Also, its least squares implementation calls inv -- the textbook
example of bad numerics!)
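
A small sketch of why this matters, assuming an ill-conditioned
polynomial design matrix chosen only for illustration. Forming
inv(X'X) squares the condition number of the problem, while
numpy.linalg.lstsq works on X directly:

```python
import numpy as np

# Ill-conditioned design: monomials 1, t, ..., t^9 on [0, 1].
t = np.linspace(0.0, 1.0, 50)
X = np.vander(t, 10, increasing=True)
beta_true = np.ones(X.shape[1])
y = X @ beta_true  # noise-free, so the exact answer is known

# Textbook normal equations with an explicit inverse.
beta_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)

# SVD-based least squares on X itself.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Largest coefficient error for each approach.
err_inv = np.max(np.abs(beta_inv - beta_true))
err_lstsq = np.max(np.abs(beta_lstsq - beta_true))
print("cond(X)    =", np.linalg.cond(X))
print("cond(X'X)  =", np.linalg.cond(X.T @ X))
print("error, inv   :", err_inv)
print("error, lstsq :", err_lstsq)
```

On a well-conditioned problem the two agree closely; the difference
shows up once the regressors are nearly collinear.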

...Okay, you know all that anyway, the question is what to do with it.
If the problem were just that it needed a better implementation and
some new features added, then maybe we would keep it and let it be
improved incrementally. But the interface is just wrong, so we'll be
removing it sooner or later, and it might as well be sooner, rather
than prolong the agony.

-- Nathaniel


