[Numpy-discussion] making "low" optional in numpy.randint

Wed Feb 17 15:04:37 EST 2016

On Wed, Feb 17, 2016 at 2:20 PM, <josef.pktd at gmail.com> wrote:

>
>
> On Wed, Feb 17, 2016 at 2:09 PM, G Young <gfyoung17 at gmail.com> wrote:
>
>> Yes, you are correct in explaining my intentions.  However, as I also
>> mentioned in the PR discussion, I did not quite understand how your wrapper
>> idea would make things any more comprehensive at the cost of additional
>> overhead and complexity.  What do you mean by making the functions
>> "consistent" (i.e. outline the behavior *exactly* depending on the
>> inputs)?  As I've explained before, and I will state it again, the
>> different behavior for the high=None and low != None case is due to
>> backwards compatibility.
>>
>
>
> One problem is that if there is only one positional argument, then I can
> still figure out that it might have different meanings.
> If there are two keywords, then I would assume standard python argument
> interpretation applies.
>
> If I want to save on typing, then I think it should be for a more
> "standard" case. (I also never sample all real numbers, at least not
> uniformly.)
>

One more thing I don't like:

So far all distributions are "theoretical" distributions where the
distribution depends on the provided shape, location and scale parameters.
There is a limitation in how they are represented as numbers/dtype and what
range is possible. However, that is not relevant for most use cases.

In this case you are promoting `dtype` from a memory or storage parameter
to an actual shape (or loc and scale) parameter.
That's "weird", and even more so if this would be the default behavior.

There is no proper uniform distribution on all integers. So, this forces
users to think about implementation details like dtype, when I just want
a random sample of a probability distribution.
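
To make the point about dtype concrete, here is a minimal sketch of how the
implicit sampling range would change with the dtype alone. The helper name
`dtype_bounds` is hypothetical and only for illustration; `np.iinfo` is the
existing NumPy call the proposal refers to.

```python
import numpy as np


def dtype_bounds(dtype):
    """Return the (min, max) representable integers for an integer dtype.

    Hypothetical helper: these are the values that the proposal would use
    as `lowbnd` and `highbnd` when `low` and `high` are omitted.
    """
    info = np.iinfo(dtype)
    return info.min, info.max


# The "distribution" being sampled would differ for each storage type.
for dtype in (np.int8, np.int16, np.int32):
    print(dtype.__name__, dtype_bounds(dtype))
```

So the same call with no bounds would describe a different distribution for
each dtype, which is what makes dtype behave like a loc/scale parameter here.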

Josef

>
> Josef
>
>
>
>>
>> On Wed, Feb 17, 2016 at 6:52 PM, Joseph Fox-Rabinovitz <
>> jfoxrabinovitz at gmail.com> wrote:
>>
>>> On Wed, Feb 17, 2016 at 1:37 PM,  <josef.pktd at gmail.com> wrote:
>>> >
>>> >
>>> > On Wed, Feb 17, 2016 at 10:01 AM, G Young <gfyoung17 at gmail.com> wrote:
>>> >>
>>> >> Hello all,
>>> >>
>>> >> I have a PR open here that makes "low" an optional parameter in
>>> >> numpy.randint and introduces new behavior into the API as follows:
>>> >>
>>> >> 1) `low == None` and `high == None`
>>> >>
>>> >> Numbers are generated over the range `[lowbnd, highbnd)`, where
>>> >> `lowbnd = np.iinfo(dtype).min`, and `highbnd = np.iinfo(dtype).max`,
>>> >> where `dtype` is the provided integral type.
>>> >>
>>> >> 2) `low != None` and `high == None`
>>> >>
>>> >> If `low >= 0`, numbers are *still* generated over the range `[0,
>>> >> low)`, but if `low < 0`, numbers are generated over the range `[low,
>>> >> highbnd)`, where `highbnd` is defined as above.
>>> >>
>>> >> 3) `low == None` and `high != None`
>>> >>
>>> >> Numbers are generated over the range `[lowbnd, high)`, where `lowbnd`
>>> >> is defined as above.
>>> >
>>> >
>>> > My impression (*) is that this will be confusing, and uses a default
>>> > that I never ever needed.
>>> >
>>> > Maybe a better way would be to use low=-np.inf and high=np.inf, where
>>> > inf would be interpreted as the smallest and largest representable
>>> > number. And leave the defaults unchanged.
>>> >
>>> > (*) I didn't try to understand how it works for various cases.
>>> >
>>> > Josef
>>> >
>>>
>>> As I mentioned on the PR discussion, the thing that bothers me is the
>>> inconsistency between the new and the old functionality, specifically
>>> in #2. If `high` is None, the behavior is completely different depending on
>>> the value of `low`. Using `np.inf` instead of `None` may fix that,
>>> although I think that the author's idea was to avoid having to type
>>> the bounds in the `None`/`+/-np.inf` cases. I think that a better
>>> option is to have a separate wrapper to `randint` that implements this
>>> behavior in a consistent manner and leaves the current function
>>> consistent as well.
>>>
>>>     -Joe
>>>
>>>
>>> >
>>> >
>>> >>
>>> >>
>>> >> The primary motivation was the second case, as it is more convenient
>>> >> to specify a 'dtype' by itself when generating such numbers in a
>>> >> similar vein to numpy.empty, except with initialized values.
>>> >>
>>> >> Looking forward to your feedback!
>>> >>
>>> >> Greg
>>> >>
>>> >> _______________________________________________
>>> >> NumPy-Discussion mailing list
>>> >> NumPy-Discussion at scipy.org
>>> >> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> >>
>>> >
>>> >
>>> >
>>>
>>
>>
>>
>>
>
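
For reference, the three cases from the original proposal quoted above can be
sketched as a separate wrapper, along the lines Joe suggested. This is only an
illustration under stated assumptions, not the code in the PR: the name
`randint_proposed` and the `np.int64` default are hypothetical, and it relies
on `np.random.randint` accepting a `dtype` argument (NumPy 1.11 or later).

```python
import numpy as np


def randint_proposed(low=None, high=None, size=None, dtype=np.int64):
    """Hypothetical wrapper mirroring the three proposed cases.

    The function name and the default dtype are assumptions made for this
    sketch; the existing randint function is left untouched.
    """
    info = np.iinfo(dtype)
    lowbnd, highbnd = info.min, info.max

    if low is None and high is None:
        # Case 1: sample over [lowbnd, highbnd).
        lo, hi = lowbnd, highbnd
    elif high is None:
        # Case 2: the backwards-compatible branch, which depends on the
        # sign of `low`.
        if low >= 0:
            lo, hi = 0, low
        else:
            lo, hi = low, highbnd
    elif low is None:
        # Case 3: sample over [lowbnd, high).
        lo, hi = lowbnd, high
    else:
        # Both bounds given: ordinary randint behavior.
        lo, hi = low, high

    return np.random.randint(lo, hi, size=size, dtype=dtype)


# Example calls, one per case (outputs are random, so none are shown).
a = randint_proposed(size=5, dtype=np.int8)           # [-128, 127)
b = randint_proposed(10, size=5)                      # [0, 10)
c = randint_proposed(-10, size=5, dtype=np.int8)      # [-10, 127)
d = randint_proposed(high=0, size=5, dtype=np.int8)   # [-128, 0)
```

Written out this way, the inconsistency discussed in the thread is visible in
the case 2 branch: a single positional argument is the upper bound when it is
non-negative and the lower bound when it is negative.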