[SciPy-Dev] distributions.py

Sat Sep 15 05:47:07 EDT 2012

On Fri, Sep 14, 2012 at 11:01 PM, <josef.pktd at gmail.com> wrote:

> On Fri, Sep 14, 2012 at 4:49 PM, Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
> >
> >
> > On Fri, Sep 14, 2012 at 12:48 AM, <josef.pktd at gmail.com> wrote:
> >>
> >> On Thu, Sep 13, 2012 at 5:21 PM, nicky van foreest <vanforeest at gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > Now that I understand github (Thanks to Ralf for his explanations in
> >> > Dutch) and got some simple stuff out of the way in distributions.py I
> >> > would like to tackle a somewhat harder issue. The function argsreduce
> >> > is, as far as I can see, too generic. I did some tests to see whether
> >> > its most generic output, as described by its docstring, is actually
> >> > swallowed by the callers of argsreduce, but this appears not to be the
> >> > case.
> >>
> >> being generic is not a disadvantage (per se) if it's fast
> >>
> >>
> >> https://github.com/scipy/scipy/commit/4abdc10487d453b56f761598e8e013816b01a665
> >> (and being a one-liner is not a disadvantage either)
> >>
> >> Josef
> >>
> >> >
> >> > My motivation to simplify the code in distributions.py (and clean it
> >> > up) is partly based on making it simpler to understand for myself, but
> >> > also for others. Since github makes code browsing a much nicer
> >> > experience, perhaps more people will take a look at what's under the
> >> > hood. But then the code should also be accessible and clean. Are there
> >> > any reasons not to pursue this path, and focus on more important
> >> > problems of the stats library?
> >
> >
> > Not sure that argsreduce is the best place to start (see Josef's reply), but
> > there should be things that can be done to make the code easier to read. For
> > example, this code is used in ~10 methods of rv_continuous:
> >
> >         loc,scale=map(kwds.get,['loc','scale'])
> >         args, loc, scale = self._fix_loc_scale(args, loc, scale)
> >         x,loc,scale = map(asarray,(x,loc,scale))
> >         args = tuple(map(asarray,args))
> >
> > Some refactoring may be in order. The same is true of the rest of the
> > implementation of many of those methods. Some are exactly the same except
> > for calls to the corresponding underscored method (example: logsf() and
> > logcdf() are identical except for calls to _logsf() and _logcdf(), and one
> > nonsensical multiplication).
>
> however, when comparing across the methods pdf, cdf, sf, ppf (not the
> log versions), there are small differences in how bounds are handled,
> and the details can be tricky.
>

Right, and the way it's written, it's very hard to figure out those
differences. It would help if the common parts were refactored out, making
the differences visible, and if comments were added to explain the
differences.
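
As a rough sketch of what factoring out the common part could look like,
here are the four repeated lines collected into one helper. The names
prepare_args and fix_loc_scale_stub are made up for illustration, and the
stub only fills in defaults, which is an assumption and much simpler than
what the real _fix_loc_scale does:

```python
import numpy as np


def fix_loc_scale_stub(args, loc, scale):
    # Simplified stand-in for rv_continuous._fix_loc_scale (assumption):
    # it only fills in default values and leaves args untouched.
    if loc is None:
        loc = 0.0
    if scale is None:
        scale = 1.0
    return args, loc, scale


def prepare_args(x, args, kwds, fix_loc_scale=fix_loc_scale_stub):
    """Hypothetical helper holding the argument handling shared by the
    public methods, so that each method only keeps what is specific to it."""
    # Pull loc and scale out of the keyword arguments (None if absent).
    loc, scale = map(kwds.get, ['loc', 'scale'])
    # Resolve defaults via the supplied function.
    args, loc, scale = fix_loc_scale(args, loc, scale)
    # Convert everything to arrays so the later masking code can broadcast.
    x, loc, scale = map(np.asarray, (x, loc, scale))
    args = tuple(map(np.asarray, args))
    return x, args, loc, scale


x, args, loc, scale = prepare_args([0.5, 1.5], (2.0,), {'loc': 1.0})
```

With something like this in place, each public method would start with a
single call, and what remains in its body would be the bounds handling and
the call to the underscored method, which is exactly where the differences
are.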

Ralf