[SciPy-Dev] distributions.py

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Sep 14 17:01:35 EDT 2012


On Fri, Sep 14, 2012 at 4:49 PM, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
>
> On Fri, Sep 14, 2012 at 12:48 AM, <josef.pktd at gmail.com> wrote:
>>
>> On Thu, Sep 13, 2012 at 5:21 PM, nicky van foreest <vanforeest at gmail.com>
>> wrote:
>> > Hi,
>> >
>> > Now that I understand github (Thanks to Ralf for his explanations in
>> > Dutch) and got some simple stuff out of the way in distributions.py I
>> > would like to tackle a somewhat harder issue. The function argsreduce
>> > is, as far as I can see, too generic. I did some tests to see whether
>> > its most generic output, as described by its docstring, is actually
>> > swallowed by the callers of argsreduce, but this appears not to be the
>> > case.
>>
>> being generic is not a disadvantage (per se) if it's fast
>>
>> https://github.com/scipy/scipy/commit/4abdc10487d453b56f761598e8e013816b01a665
>> (and being a one-liner is not a disadvantage either)
>>
>> Josef
>>
>> >
>> > My motivation to simplify the code in distributions.py (and clean it
>> > up) is partly to make it simpler to understand for myself, but also
>> > for others. Since github makes code browsing a much nicer experience,
>> > perhaps more people will take a look at what's under the hood. But
>> > then the code should also be accessible and clean. Are there any
>> > reasons not to pursue this path, and to focus instead on more
>> > important problems of the stats library?
>
>
> Not sure that argsreduce is the best place to start (see Josef's reply), but
> there should be things that can be done to make the code easier to read. For
> example, this code is used in ~10 methods of rv_continuous:
>
>         loc,scale=map(kwds.get,['loc','scale'])
>         args, loc, scale = self._fix_loc_scale(args, loc, scale)
>         x,loc,scale = map(asarray,(x,loc,scale))
>         args = tuple(map(asarray,args))
>
> Some refactoring may be in order. The same is true of the rest of the
> implementation of many of those methods. Some are exactly the same except
> for calls to the corresponding underscored method (example: logsf() and
> logcdf() are identical except for calls to _logsf() and _logcdf(), and one
> nonsensical multiplication).

However, when comparing across the methods pdf, cdf, sf, and ppf (not
with their log versions), there are small differences in how bounds are
handled, and the details can be tricky.
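
A minimal sketch of the kind of refactoring discussed above, on plain
floats rather than arrays. It is not the code in distributions.py: the
class ToyExpon and the helper _evaluate are hypothetical names, and the
handling of an invalid scale (returning nan) is an assumption. The
shared loc/scale handling lives in one helper, while the values returned
outside the support are passed in per method, since that is where the
methods differ.

```python
import math


class ToyExpon:
    """Toy exponential distribution on scalars (hypothetical, not rv_continuous)."""

    a, b = 0.0, math.inf  # support of the standardized variable

    # Private methods work on the standardized variable y = (x - loc) / scale.
    def _cdf(self, y):
        return 1.0 - math.exp(-y)

    def _sf(self, y):
        return math.exp(-y)

    def _logcdf(self, y):
        return math.log1p(-math.exp(-y))

    def _logsf(self, y):
        return -y

    def _evaluate(self, private, x, below, above, **kwds):
        """Hypothetical shared body: unpack loc/scale, standardize, apply bounds."""
        loc = kwds.get('loc', 0.0)
        scale = kwds.get('scale', 1.0)
        if not scale > 0:
            return math.nan  # assumption: invalid scale yields nan
        y = (x - loc) / scale
        if y <= self.a:
            return below  # value at or below the lower bound differs per method
        if y >= self.b:
            return above  # value at or above the upper bound differs per method
        return private(y)

    # Public methods differ only in the private method and the bound values.
    def cdf(self, x, **kwds):
        return self._evaluate(self._cdf, x, 0.0, 1.0, **kwds)

    def sf(self, x, **kwds):
        return self._evaluate(self._sf, x, 1.0, 0.0, **kwds)

    def logcdf(self, x, **kwds):
        return self._evaluate(self._logcdf, x, -math.inf, 0.0, **kwds)

    def logsf(self, x, **kwds):
        return self._evaluate(self._logsf, x, 0.0, -math.inf, **kwds)
```

With this layout, ToyExpon().logcdf(x, loc=1.0) and
ToyExpon().logsf(x, loc=1.0) go through the same code path; only the
private method and the two out-of-support values change. A method such
as ppf, whose argument is a probability rather than a point of the
support, would need its own bounds check and does not fit this helper
as written.
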

 Josef

>
> Ralf