From rhodri at kynesim.co.uk Thu Nov 1 08:50:51 2018 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 1 Nov 2018 12:50:51 +0000 Subject: [Python-ideas] Allow Context Managers to Support Suspended Execution In-Reply-To: References: Message-ID: On 01/11/2018 02:52, David Allemang wrote: > I do not think there is currently a good way for Context Managers to > support suspended execution, as in await or yield. Both of these > instructions cause the interpreter to leave the with block, yet no > indication of this (temporary) exit or subsequent re-entrance is given > to the context manager. If the intent of a Context Manager is to say > "no matter how this block is entered or exited, the context will be > correctly maintained", then this needs to be possible. I think you're going to have to justify this a bit more. From my point of view, yielding does not leave the with block in any meaningful sense. Indeed I'd be quite hacked off with a file context manager that was so inefficient as to close the file on yielding a line, only to have to re-open and seek when it got control back. -- Rhodri James *-* Kynesim Ltd From cspealma at redhat.com Thu Nov 1 09:39:54 2018 From: cspealma at redhat.com (Calvin Spealman) Date: Thu, 1 Nov 2018 09:39:54 -0400 Subject: [Python-ideas] Allow Context Managers to Support Suspended Execution In-Reply-To: References: Message-ID: I'm very curious about the idea, but can't come up with any use cases based just one your explanation. Maybe you could give some examples where this would be useful? In particular, what are some cases that are really hard to handle now and how would those cases be improved like this? On Wed, Oct 31, 2018 at 10:53 PM David Allemang wrote: > I do not think there is currently a good way for Context Managers to > support suspended execution, as in await or yield. Both of these > instructions cause the interpreter to leave the with block, yet no > indication of this (temporary) exit or subsequent re-entrance is given > to the context manager. If the intent of a Context Manager is to say > "no matter how this block is entered or exited, the context will be > correctly maintained", then this needs to be possible. > > I would propose magic methods __suspend__ and __resume__ as companions > to the existing __enter__ and __exit__ methods (and their async > variants). __suspend__, if present, would be called upon suspending > execution on an await or yield statement, and __resume__, if present, > would be called when execution is resumed. If __suspend__ or > __resume__ are not present then nothing should be done, so that the > behavior of existing context managers is preserved. > > Here is an example demonstrating the issue with await: > https://gist.github.com/allemangD/bba8dc2d059310623f752ebf65bb6cdc > and one with yield: > https://gist.github.com/allemangD/f2534f16d3a0c642c2cdc02c544e854f > > The context manager used is clearly not thread-safe, and I'm not > actually sure how to approach a thread-safe implementation with the > proposed __suspend__ and __resume__ - but I don't believe that > introducing these new methods would create any issues that aren't > already present with __enter__ and __exit__. > > It's worth noting that the context manager used in those examples is, > essentially, identical contextlib's redirect_stdout and decimal's > localcontext managers. Any context manager such as these which modify > global state or the behavior of global functions would benefit from > this. 
It may also make sense to, for example, have the __suspend__ > method on file objects flush buffers without closing the file, similar > to their current __exit__ behavior, but I'm unsure what impact this > would have on performance. > > It is important, though, that yield and await not use __enter__ or > __exit__, as not all context-managers are reusable. I'm unsure what > the best term would be to describe this type of context, as the > documentation for contextlib already gives a different definition for > "reentrant" - I would then call them "suspendable" contexts. It would > make sense to have an @suspendable decorator, probably in contextlib, > to indicate that a context manager can use __enter__ and __exit__ > methods rather than __suspend__ and __resume__. All it would need to > do is define __suspend__ to call __enter__() and __resume__ to call > __exit__(None, None, None). > > It is also important, since __suspend__ and __resume__ would be called > after a context is entered but before it is exited, that __suspend__ > not accept any parameters and that __resume__ not use its return > value. __suspend__ could not be triggered by an exception, only by a > yield or await, and __resume__ could not have its return value named > with as. > > Thanks, > > David > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Nov 1 11:05:37 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Nov 2018 08:05:37 -0700 Subject: [Python-ideas] Allow Context Managers to Support Suspended Execution In-Reply-To: References: Message-ID: Check out the decimal example here: https://www.python.org/dev/peps/pep-0568/ (PEP 568 is deferred, but PEP 567 is implemented in Python 3.7). Those Contexts aren't context managers, but still there's some thought put into swapping contexts out at the boundaries of generators. On Wed, Oct 31, 2018 at 7:54 PM David Allemang wrote: > I do not think there is currently a good way for Context Managers to > support suspended execution, as in await or yield. Both of these > instructions cause the interpreter to leave the with block, yet no > indication of this (temporary) exit or subsequent re-entrance is given > to the context manager. If the intent of a Context Manager is to say > "no matter how this block is entered or exited, the context will be > correctly maintained", then this needs to be possible. > > I would propose magic methods __suspend__ and __resume__ as companions > to the existing __enter__ and __exit__ methods (and their async > variants). __suspend__, if present, would be called upon suspending > execution on an await or yield statement, and __resume__, if present, > would be called when execution is resumed. If __suspend__ or > __resume__ are not present then nothing should be done, so that the > behavior of existing context managers is preserved. 
> > Here is an example demonstrating the issue with await: > https://gist.github.com/allemangD/bba8dc2d059310623f752ebf65bb6cdc > and one with yield: > https://gist.github.com/allemangD/f2534f16d3a0c642c2cdc02c544e854f > > The context manager used is clearly not thread-safe, and I'm not > actually sure how to approach a thread-safe implementation with the > proposed __suspend__ and __resume__ - but I don't believe that > introducing these new methods would create any issues that aren't > already present with __enter__ and __exit__. > > It's worth noting that the context manager used in those examples is, > essentially, identical contextlib's redirect_stdout and decimal's > localcontext managers. Any context manager such as these which modify > global state or the behavior of global functions would benefit from > this. It may also make sense to, for example, have the __suspend__ > method on file objects flush buffers without closing the file, similar > to their current __exit__ behavior, but I'm unsure what impact this > would have on performance. > > It is important, though, that yield and await not use __enter__ or > __exit__, as not all context-managers are reusable. I'm unsure what > the best term would be to describe this type of context, as the > documentation for contextlib already gives a different definition for > "reentrant" - I would then call them "suspendable" contexts. It would > make sense to have an @suspendable decorator, probably in contextlib, > to indicate that a context manager can use __enter__ and __exit__ > methods rather than __suspend__ and __resume__. All it would need to > do is define __suspend__ to call __enter__() and __resume__ to call > __exit__(None, None, None). > > It is also important, since __suspend__ and __resume__ would be called > after a context is entered but before it is exited, that __suspend__ > not accept any parameters and that __resume__ not use its return > value. __suspend__ could not be triggered by an exception, only by a > yield or await, and __resume__ could not have its return value named > with as. > > Thanks, > > David > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Thu Nov 1 11:40:44 2018 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 1 Nov 2018 11:40:44 -0400 Subject: [Python-ideas] Allow Context Managers to Support Suspended Execution In-Reply-To: References: Message-ID: Yep, PEP 567 addresses this for coroutines, so David's first example is covered; here's a link to the fixed version: [1] The proposal to add __suspend__ and __resume__ is very similar to PEP 521 which was withdrawn. PEP 568 (which needs to be properly updated) is the way to go if we want to address this issue for generators. [1] https://gist.github.com/allemangD/bba8dc2d059310623f752ebf65bb6cdc#gistcomment-2748803 Yury On Thu, Nov 1, 2018 at 11:06 AM Guido van Rossum wrote: > > Check out the decimal example here: https://www.python.org/dev/peps/pep-0568/ (PEP 568 is deferred, but PEP 567 is implemented in Python 3.7). > > Those Contexts aren't context managers, but still there's some thought put into swapping contexts out at the boundaries of generators. 
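To make that concrete, here is a minimal sketch using PEP 567's
contextvars (Python 3.7+; the variable and the values are made up):

    import asyncio
    import contextvars

    # Each asyncio task runs in a copy of the current context, so a value
    # set in one task stays local to that task across suspension points.
    precision = contextvars.ContextVar("precision", default=2)

    async def worker(n):
        precision.set(n)        # visible only in this task's context
        await asyncio.sleep(0)  # suspension point; other tasks run here
        return precision.get()  # still n

    async def main():
        print(await asyncio.gather(*(worker(n) for n in range(3))))

    asyncio.run(main())  # prints [0, 1, 2]
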
> > On Wed, Oct 31, 2018 at 7:54 PM David Allemang wrote: >> >> I do not think there is currently a good way for Context Managers to >> support suspended execution, as in await or yield. Both of these >> instructions cause the interpreter to leave the with block, yet no >> indication of this (temporary) exit or subsequent re-entrance is given >> to the context manager. If the intent of a Context Manager is to say >> "no matter how this block is entered or exited, the context will be >> correctly maintained", then this needs to be possible. >> >> I would propose magic methods __suspend__ and __resume__ as companions >> to the existing __enter__ and __exit__ methods (and their async >> variants). __suspend__, if present, would be called upon suspending >> execution on an await or yield statement, and __resume__, if present, >> would be called when execution is resumed. If __suspend__ or >> __resume__ are not present then nothing should be done, so that the >> behavior of existing context managers is preserved. >> >> Here is an example demonstrating the issue with await: >> https://gist.github.com/allemangD/bba8dc2d059310623f752ebf65bb6cdc >> and one with yield: >> https://gist.github.com/allemangD/f2534f16d3a0c642c2cdc02c544e854f >> >> The context manager used is clearly not thread-safe, and I'm not >> actually sure how to approach a thread-safe implementation with the >> proposed __suspend__ and __resume__ - but I don't believe that >> introducing these new methods would create any issues that aren't >> already present with __enter__ and __exit__. >> >> It's worth noting that the context manager used in those examples is, >> essentially, identical contextlib's redirect_stdout and decimal's >> localcontext managers. Any context manager such as these which modify >> global state or the behavior of global functions would benefit from >> this. It may also make sense to, for example, have the __suspend__ >> method on file objects flush buffers without closing the file, similar >> to their current __exit__ behavior, but I'm unsure what impact this >> would have on performance. >> >> It is important, though, that yield and await not use __enter__ or >> __exit__, as not all context-managers are reusable. I'm unsure what >> the best term would be to describe this type of context, as the >> documentation for contextlib already gives a different definition for >> "reentrant" - I would then call them "suspendable" contexts. It would >> make sense to have an @suspendable decorator, probably in contextlib, >> to indicate that a context manager can use __enter__ and __exit__ >> methods rather than __suspend__ and __resume__. All it would need to >> do is define __suspend__ to call __enter__() and __resume__ to call >> __exit__(None, None, None). >> >> It is also important, since __suspend__ and __resume__ would be called >> after a context is entered but before it is exited, that __suspend__ >> not accept any parameters and that __resume__ not use its return >> value. __suspend__ could not be triggered by an exception, only by a >> yield or await, and __resume__ could not have its return value named >> with as. 
>> >> Thanks, >> >> David >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -- Yury From allemang.d at gmail.com Thu Nov 1 13:11:38 2018 From: allemang.d at gmail.com (David Allemang) Date: Thu, 1 Nov 2018 13:11:38 -0400 Subject: [Python-ideas] Allow Context Managers to Support Suspended Execution In-Reply-To: References: Message-ID: Yes, so PEP 512 is exactly what I was suggesting. My apologies for not finding it before sending this. So, then, PEP 567 solves the issue for coroutines and PEP 568 would solve it for generators as well? On Thu, Nov 1, 2018, 11:40 AM Yury Selivanov Yep, PEP 567 addresses this for coroutines, so David's first example > is covered; here's a link to the fixed version: [1] > > The proposal to add __suspend__ and __resume__ is very similar to PEP > 521 which was withdrawn. PEP 568 (which needs to be properly updated) > is the way to go if we want to address this issue for generators. > > [1] > https://gist.github.com/allemangD/bba8dc2d059310623f752ebf65bb6cdc#gistcomment-2748803 > > Yury > On Thu, Nov 1, 2018 at 11:06 AM Guido van Rossum wrote: > > > > Check out the decimal example here: > https://www.python.org/dev/peps/pep-0568/ (PEP 568 is deferred, but PEP > 567 is implemented in Python 3.7). > > > > Those Contexts aren't context managers, but still there's some thought > put into swapping contexts out at the boundaries of generators. > > > > On Wed, Oct 31, 2018 at 7:54 PM David Allemang > wrote: > >> > >> I do not think there is currently a good way for Context Managers to > >> support suspended execution, as in await or yield. Both of these > >> instructions cause the interpreter to leave the with block, yet no > >> indication of this (temporary) exit or subsequent re-entrance is given > >> to the context manager. If the intent of a Context Manager is to say > >> "no matter how this block is entered or exited, the context will be > >> correctly maintained", then this needs to be possible. > >> > >> I would propose magic methods __suspend__ and __resume__ as companions > >> to the existing __enter__ and __exit__ methods (and their async > >> variants). __suspend__, if present, would be called upon suspending > >> execution on an await or yield statement, and __resume__, if present, > >> would be called when execution is resumed. If __suspend__ or > >> __resume__ are not present then nothing should be done, so that the > >> behavior of existing context managers is preserved. > >> > >> Here is an example demonstrating the issue with await: > >> https://gist.github.com/allemangD/bba8dc2d059310623f752ebf65bb6cdc > >> and one with yield: > >> https://gist.github.com/allemangD/f2534f16d3a0c642c2cdc02c544e854f > >> > >> The context manager used is clearly not thread-safe, and I'm not > >> actually sure how to approach a thread-safe implementation with the > >> proposed __suspend__ and __resume__ - but I don't believe that > >> introducing these new methods would create any issues that aren't > >> already present with __enter__ and __exit__. 
> >>
> >> It's worth noting that the context manager used in those examples is,
> >> essentially, identical to contextlib's redirect_stdout and decimal's
> >> localcontext managers. Any context manager such as these which modify
> >> global state or the behavior of global functions would benefit from
> >> this. It may also make sense to, for example, have the __suspend__
> >> method on file objects flush buffers without closing the file, similar
> >> to their current __exit__ behavior, but I'm unsure what impact this
> >> would have on performance.
> >>
> >> It is important, though, that yield and await not use __enter__ or
> >> __exit__, as not all context-managers are reusable. I'm unsure what
> >> the best term would be to describe this type of context, as the
> >> documentation for contextlib already gives a different definition for
> >> "reentrant" - I would then call them "suspendable" contexts. It would
> >> make sense to have an @suspendable decorator, probably in contextlib,
> >> to indicate that a context manager can use __enter__ and __exit__
> >> methods rather than __suspend__ and __resume__. All it would need to
> >> do is define __suspend__ to call __enter__() and __resume__ to call
> >> __exit__(None, None, None).
> >>
> >> It is also important, since __suspend__ and __resume__ would be called
> >> after a context is entered but before it is exited, that __suspend__
> >> not accept any parameters and that __resume__ not use its return
> >> value. __suspend__ could not be triggered by an exception, only by a
> >> yield or await, and __resume__ could not have its return value named
> >> with as.
> >>
> >> Thanks,
> >>
> >> David
> >> _______________________________________________
> >> Python-ideas mailing list
> >> Python-ideas at python.org
> >> https://mail.python.org/mailman/listinfo/python-ideas
> >> Code of Conduct: http://python.org/psf/codeofconduct/
> >
> >
> > --
> > --Guido van Rossum (python.org/~guido)
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
>
>
> --
> Yury

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Thu Nov 1 15:36:19 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Nov 2018 21:36:19 +0200
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <20181031120710.0824d82d@fsol>
References: <20181031120710.0824d82d@fsol>
Message-ID: 

31.10.18 13:07, Antoine Pitrou wrote:
> l.pop(default=...) has the potential to be multi-thread-safe, while
> your alternatives haven't.

The multi-thread-safe alternative is:

    try:
        value = l.pop()
    except IndexError:
        value = default

From storchaka at gmail.com  Thu Nov 1 15:45:26 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 1 Nov 2018 21:45:26 +0200
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <20181031120813.69b9a23b@fsol>
References: <20181031120813.69b9a23b@fsol>
Message-ID: 

31.10.18 13:08, Antoine Pitrou wrote:
> +1 from me.  dict.pop() already has an optional default.  This is a
> straight-forward improvement to the API and no Python programmer will
> be surprised.

list.pop() corresponds to two dict methods. With an argument it
corresponds to dict.pop(). But there are differences: dict.pop() called
repeatedly with the same key will raise an error (or return the default),
while list.pop() will likely return another item.
Without an argument it corresponds to dict.popitem(), which doesn't have
an optional default.

From ethan at stoneleaf.us  Thu Nov 1 20:12:19 2018
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 1 Nov 2018 17:12:19 -0700
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
Message-ID: <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us>

On 10/31/2018 02:29 PM, Chris Angelico wrote:

> Exactly how a team of core devs can make unified
> decisions is a little up in the air at the moment

I wouldn't worry too much about it.  I don't think we have ever made
entirely unified decisions.

--
~Ethan~

From rosuav at gmail.com  Thu Nov 1 20:15:32 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 2 Nov 2018 11:15:32 +1100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us>
References: <20181031010851.GC3817@ando.pearwood.info>
 <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us>
Message-ID: 

On Fri, Nov 2, 2018 at 11:12 AM Ethan Furman wrote:
>
> On 10/31/2018 02:29 PM, Chris Angelico wrote:
>
> > Exactly how a team of core devs can make unified
> > decisions is a little up in the air at the moment
>
> I wouldn't worry too much about it.  I don't think we have ever made
> entirely unified decisions.
>

LOL, there is that. But somehow, a single decision has to be made:
merge or don't merge? And getting a group of people to the point of
making a single decision is the bit that's up in the air.

ChrisA

From robertve92 at gmail.com  Thu Nov 1 20:18:26 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Fri, 2 Nov 2018 01:18:26 +0100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
 <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us>
Message-ID: 

Just an English vocabulary question: what do you mean by "being in the air
at the moment"? Like, is that a subject that a lot of people in here like
to talk about?

Yes, to merge or not to merge, but people can UpVote/DownVote can't they? :D

On Fri, Nov 2, 2018 at 01:15, Chris Angelico wrote:

> On Fri, Nov 2, 2018 at 11:12 AM Ethan Furman wrote:
> >
> > On 10/31/2018 02:29 PM, Chris Angelico wrote:
> >
> > > Exactly how a team of core devs can make unified
> > > decisions is a little up in the air at the moment
> >
> > I wouldn't worry too much about it.  I don't think we have ever made
> > entirely unified decisions.
> >
>
> LOL, there is that. But somehow, a single decision has to be made:
> merge or don't merge? And getting a group of people to the point of
> making a single decision is the bit that's up in the air.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Thu Nov 1 20:22:12 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 2 Nov 2018 11:22:12 +1100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
 <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us>
Message-ID: 

On Fri, Nov 2, 2018 at 11:19 AM Robert Vanden Eynde wrote:
>
> Just an English vocabulary question: what do you mean by "being in the air at the moment"?
> Like, is that a subject that a lot of people in here like to talk about?
"Up in the air" means uncertain, subject to change. https://idioms.thefreedictionary.com/up+in+the+air In this case, the governance model for the Python language is being discussed. > Yes, to merge or not to merge, but people can UpVote/DownVote can't they ? :D Upvotes and downvotes don't mean anything. So, sure! It's like upvoting or downvoting one of your country's laws... nobody's stopping you (at least, I hope you live in a country where you're allowed to express your likes and dislikes), but it doesn't change anything unless you're one of the handful of people who actually make the decision. ChrisA From robertve92 at gmail.com Thu Nov 1 20:26:06 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Fri, 2 Nov 2018 01:26:06 +0100 Subject: [Python-ideas] Add "default" kwarg to list.pop() In-Reply-To: References: <20181031010851.GC3817@ando.pearwood.info> <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us> Message-ID: > > In this case, the governance model for the Python language is being > discussed. > This was the info I was missing, where is it discussed ? Not only on this list I assume ^^ > Upvotes and downvotes don't mean anything. [...] > Yes, that's why random people wouldn't vote. But like, voting between like the 10 core devs where they all have the same importance, that does help for choosing "to merge or not to merge", isn't it ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Nov 1 20:28:19 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 2 Nov 2018 11:28:19 +1100 Subject: [Python-ideas] Add "default" kwarg to list.pop() In-Reply-To: References: <20181031010851.GC3817@ando.pearwood.info> <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us> Message-ID: On Fri, Nov 2, 2018 at 11:26 AM Robert Vanden Eynde wrote: >> >> In this case, the governance model for the Python language is being discussed. > > > This was the info I was missing, where is it discussed ? Not only on this list I assume ^^ There are a number of PEPs in the 8000s that would be worth reading. >> Upvotes and downvotes don't mean anything. [...] > > > Yes, that's why random people wouldn't vote. > But like, voting between like the 10 core devs where they all have the same importance, > that does help for choosing "to merge or not to merge", isn't it ? That is just one of the possible options - that decisions are made by vote. ChrisA From robertve92 at gmail.com Thu Nov 1 20:30:38 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Fri, 2 Nov 2018 01:30:38 +0100 Subject: [Python-ideas] Add "default" kwarg to list.pop() In-Reply-To: References: <20181031010851.GC3817@ando.pearwood.info> <0f52b26d-9e9f-601c-3301-5ba7daadb19c@stoneleaf.us> Message-ID: > > There are a number of PEPs in the 8000s that would be worth reading. > Will read that *? l'occaz*, closing the disgression now ^^ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashafer01 at gmail.com Thu Nov 1 21:06:54 2018 From: ashafer01 at gmail.com (Alex Shafer) Date: Thu, 1 Nov 2018 19:06:54 -0600 Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon Message-ID: I'd like to propose an addition to `dict` but I'm not necessarily proposing what's written here as the API. 
When I initially saw the need for this myself, I hastily wrote it as:

    def setdefault_call(a_dict, key, default_func):
        try:
            return a_dict[key]
        except KeyError:
            default = default_func()
            a_dict[key] = default
            return default

If it's not clear, the purpose is to eliminate the overhead of creating
an empty list or similar in situations like this:

    d = {}
    for i in range(1000000):  # some large loop
        l = d.setdefault(somekey, [])
        l.append(somevalue)

    # instead...

    for i in range(1000000):
        l = d.setdefault_call(somekey, list)
        l.append(somevalue)

One potential drawback I see to the concept is that I think there will be
a need to explicitly say "no arguments can get passed into this call".
Otherwise users may defeat the purpose with constructions like this:

    d.setdefault_call("foo", list, ["default value"])

I'd mainly like feedback on this concept overall, and if it's liked,
perhaps an API discussion to follow. Thanks!

PS

Other APIs I've considered for this are a new keyword argument to the
existing `setdefault()`, or perhaps more radically for Python, a new
keyword argument to the `dict()` constructor that would get called as an
implicit default for `setdefault()` and perhaps used in other scenarios
(essentially defining a type for dict values).

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Thu Nov 1 21:12:45 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 2 Nov 2018 12:12:45 +1100
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: 
References: 
Message-ID: 

On Fri, Nov 2, 2018 at 12:07 PM Alex Shafer wrote:
> Other APIs I've considered for this are a new keyword argument to the
existing `setdefault()`, or perhaps more radically for Python, a new
keyword argument to the `dict()` constructor that would get called as an
implicit default for `setdefault()` and perhaps used in other scenarios
(essentially defining a type for dict values).
>

The time machine has been put to good use here. Are you aware of
__missing__ and collections.defaultdict? You just create a defaultdict
with a callable (very common to use a class like "list"), and any time
you try to use something that's missing, it'll call that to generate a
value.

    from collections import defaultdict
    d = defaultdict(list)
    for category, item in some_stuff:
        d[category].append(item)

Easy way to group things into their categories.

ChrisA

From amit.mixie at gmail.com  Thu Nov 1 21:13:39 2018
From: amit.mixie at gmail.com (Amit Green)
Date: Thu, 1 Nov 2018 21:13:39 -0400
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: 
References: 
Message-ID: 

I use this a lot in my code. Since `setdefault_call` does not exist, here
is how I do it:

    d = {}

    lookup_d = d.get
    provide_d = d.setdefault

    for i in range(1000000):  # some large loop
        l = (lookup_d(somekey)) or (provide_d(somekey, []))
        l.append(somevalue)

I am not arguing for or against `.setdefault_call` -- I'm just providing
information: I use the referenced behavior hundreds of times in my code.

My solution of using `lookup_d(...) or provide_d(...)` is obviously
inefficient in that it has to do two dictionary lookups (in the case that
the `lookup_d` fails). A `setdefault_call` would be more efficient, though
having to create a lambda function might offset this efficiency.

So the key issue is readability, not efficiency.
On Thu, Nov 1, 2018 at 9:07 PM Alex Shafer wrote: > I'd like to propose an addition to `dict` but I'm not necessarily > proposing what's written here as the API. When I initially saw the need for > this myself, I hastily wrote it as: > > def setdefault_call(a_dict, key, default_func): > try: > return a_dict[key] > except KeyError: > default = default_func() > a_dict[key] = default > return default > > If its not clear, the purpose is to eliminate the overhead of creating an > empty list or similar in situations like this: > > d = {} > for i in range(1000000): # some large loop > l = d.setdefault(somekey, []) > l.append(somevalue) > > # instead... > > for i in range(1000000): > l = d.setdefault_call(somekey, list) > l.append(somevalue) > > One potential drawback I see to the concept is that I think there will be > a need to explicitly say "no arguments can get passed into this call". > Otherwise users may defeat the purpose with constructions like this: > > d.setdefault_call("foo", list, ["default value"]) > > I'd mainly like feedback on this concept overall, and if its liked, > perhaps an API discussion to follow. Thanks! > > PS > > Other APIs I've considered for this are a new keyword argument to the > existing `setdefault()`, or perhaps more radically for Python, a new > keyword argument to the `dict()` constructor that would get called as an > implicit default for `setdefault()` and perhaps used in other scenarios > (essentially defining a type for dict values). > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From prometheus235 at gmail.com Thu Nov 1 21:57:15 2018 From: prometheus235 at gmail.com (Nick Timkovich) Date: Thu, 1 Nov 2018 20:57:15 -0500 Subject: [Python-ideas] Add "default" kwarg to list.pop() In-Reply-To: References: <20181031120813.69b9a23b@fsol> Message-ID: Does it make sense to draw some sort of parallel between next(myiterator, default="whatever") and mylist.pop(default="whatever")? They exhaust the iterator/list then start emitting the default argument (if provided). On Thu, Nov 1, 2018 at 2:46 PM Serhiy Storchaka wrote: > 31.10.18 13:08, Antoine Pitrou ????: > > +1 from me. dict.pop() already has an optional default. This is a > > straight-forward improvement to the API and no Python programmer will > > be surprised. > > list.pop() corresponds two dict methods. With argument it corresponds > dict.pop(). But there are differences: dict.pop() called repeatedly with > the same key will raise an error (or return the default), while > list.pop() will likely return other item. Without argument it > corresponds dict.popitem() which doesn't have an optional default. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashafer01 at gmail.com Thu Nov 1 22:07:31 2018 From: ashafer01 at gmail.com (Alex Shafer) Date: Thu, 1 Nov 2018 20:07:31 -0600 Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon In-Reply-To: References: Message-ID: I had actually managed to miss collections.defaultdict! 
I'd like to instead propose that a reference to that be added to the dict.setdefault docs. I can't imagine I'm the only one that has missed this. Date: Fri, 2 Nov 2018 12:12:45 +1100 > From: Chris Angelico > To: python-ideas > Subject: Re: [Python-ideas] dict.setdefault_call(), or API variations > thereupon > Message-ID: > < > CAPTjJmqg_qtK3OfR+4VAaaNa7JXjHjHLpnx6EfEZX5n4tttqCQ at mail.gmail.com> > Content-Type: text/plain; charset="UTF-8" > > On Fri, Nov 2, 2018 at 12:07 PM Alex Shafer wrote: > > Other APIs I've considered for this are a new keyword argument to the > existing `setdefault()`, or perhaps more radically for Python, a new > keyword argument to the `dict()` constructor that would get called as an > implicit default for `setdefault()` and perhaps used in other scenarios > (essentially defining a type for dict values). > > > > The time machine has been put to good use here. Are you aware of > __missing__ and collections.defaultdict? You just create a defaultdict > with a callable (very common to use a class like "list"), and any time > you try to use something that's missing, it'll call that to generate a > value. > > from collections import defaultdict > d = defaultdict(list) > for category, item in some_stuff: > d[category].append(item) > > Easy way to group things into their categories. > > ChrisA > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertve92 at gmail.com Thu Nov 1 22:08:44 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Fri, 2 Nov 2018 03:08:44 +0100 Subject: [Python-ideas] Add "default" kwarg to list.pop() In-Reply-To: References: <20181031120813.69b9a23b@fsol> Message-ID: > > Does it make sense to draw some sort of parallel between next(myiterator, > default="whatever") and mylist.pop(default="whatever")? They exhaust the > iterator/list then start emitting the default argument (if provided). > Yep that's what I just did in my previous mail. """ I think the same way about set.pop, list.pop. About .index I agree adding default= would make sense but that's not exactly the same thing as the others. """ Being picky: "TypeError: next() takes no keyword arguments", that's next(myierator, "whatever") ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Nov 1 22:20:04 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Nov 2018 19:20:04 -0700 Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon In-Reply-To: References: Message-ID: The two are less connected than you seem to think. On Thu, Nov 1, 2018 at 7:08 PM Alex Shafer wrote: > I had actually managed to miss collections.defaultdict! > > I'd like to instead propose that a reference to that be added to the > dict.setdefault docs. I can't imagine I'm the only one that has missed this. 
> > > Date: Fri, 2 Nov 2018 12:12:45 +1100 >> From: Chris Angelico >> To: python-ideas >> Subject: Re: [Python-ideas] dict.setdefault_call(), or API variations >> thereupon >> Message-ID: >> < >> CAPTjJmqg_qtK3OfR+4VAaaNa7JXjHjHLpnx6EfEZX5n4tttqCQ at mail.gmail.com> >> Content-Type: text/plain; charset="UTF-8" > > >> >> On Fri, Nov 2, 2018 at 12:07 PM Alex Shafer wrote: >> > Other APIs I've considered for this are a new keyword argument to the >> existing `setdefault()`, or perhaps more radically for Python, a new >> keyword argument to the `dict()` constructor that would get called as an >> implicit default for `setdefault()` and perhaps used in other scenarios >> (essentially defining a type for dict values). >> > >> >> The time machine has been put to good use here. Are you aware of >> __missing__ and collections.defaultdict? You just create a defaultdict >> with a callable (very common to use a class like "list"), and any time >> you try to use something that's missing, it'll call that to generate a >> value. >> >> from collections import defaultdict >> d = defaultdict(list) >> for category, item in some_stuff: >> d[category].append(item) >> >> Easy way to group things into their categories. >> >> ChrisA >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido (mobile) -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertve92 at gmail.com Thu Nov 1 22:23:05 2018 From: robertve92 at gmail.com (Robert Vanden Eynde) Date: Fri, 2 Nov 2018 03:23:05 +0100 Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon In-Reply-To: References: Message-ID: > > The two are less connected than you seem to think. > Really ? What's the use mainstream use cases for setdefault ? I was often in the case of Alex. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Nov 1 22:49:57 2018 From: guido at python.org (Guido van Rossum) Date: Thu, 1 Nov 2018 19:49:57 -0700 Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon In-Reply-To: References: Message-ID: On Thu, Nov 1, 2018 at 7:23 PM Robert Vanden Eynde wrote: > The two are less connected than you seem to think. >> > > Really ? What's the use mainstream use cases for setdefault ? > I was often in the case of Alex. > Well, defaultdict configures a default when an instance is created, while setdefault() is used when inserting a value. A major issue IMO with defaultdict is that if you try to *read* a non-existing key it will be inserted. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashafer01 at gmail.com Thu Nov 1 22:58:28 2018 From: ashafer01 at gmail.com (Alex Shafer) Date: Thu, 1 Nov 2018 20:58:28 -0600 Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon In-Reply-To: References: Message-ID: So it actually sounds like having a dict method for performing write operations with a factory function would be a semantic improvement. On Thu, Nov 1, 2018 at 8:50 PM Guido van Rossum wrote: > On Thu, Nov 1, 2018 at 7:23 PM Robert Vanden Eynde > wrote: > >> The two are less connected than you seem to think. >>> >> >> Really ? What's the use mainstream use cases for setdefault ? >> I was often in the case of Alex. 
>>
>
> Well, defaultdict configures a default when an instance is created, while
> setdefault() is used when inserting a value.
>
> A major issue IMO with defaultdict is that if you try to *read* a
> non-existing key it will be inserted.
>
> --
> --Guido van Rossum (python.org/~guido)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Thu Nov 1 23:34:11 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 2 Nov 2018 14:34:11 +1100
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: 
References: 
Message-ID: <20181102033409.GI3817@ando.pearwood.info>

On Thu, Nov 01, 2018 at 08:58:28PM -0600, Alex Shafer wrote:
> So it actually sounds like having a dict method for performing write
> operations with a factory function would be a semantic improvement.

As Chris pointed out, that's what __missing__ does.

py> class MyDict(dict):
...     def __missing__(self, key):
...         return "something"
...
py> d = MyDict(a=1, b=2)
py> d['z']
'something'
py> d
{'a': 1, 'b': 2}

If you want the key to be inserted, do so in the __missing__ method.
Is there something missing (pun not intended) from this existing
functionality?

The only improvement I'd like to see is to remove the need to subclass,
so we could do this:

py> d = {'a': 1}  # Plain ol' regular dict, not a subclass.
py> d.__missing__ = lambda self, key: "something"
Traceback (most recent call last):
  File "", line 1, in
AttributeError: 'dict' object has no attribute '__missing__'

but as you can see, that doesn't work. We'd need to either give every
dict a full __dict__ instance namespace, or a __missing__ slot. Given
how rare it is to use __missing__ I suspect the cost is not worth it.

The bottom line is, if I understand your proposal, the functionality
already exists. All you need do is subclass dict and give it a
__missing__ method which does what you want.

--
Steve

From songofacandy at gmail.com  Thu Nov 1 23:58:37 2018
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 2 Nov 2018 12:58:37 +0900
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031120813.69b9a23b@fsol>
Message-ID: 

On Fri, Nov 2, 2018 at 4:45 AM Serhiy Storchaka wrote:
>
> 31.10.18 13:08, Antoine Pitrou wrote:
> > +1 from me.  dict.pop() already has an optional default.  This is a
> > straight-forward improvement to the API and no Python programmer will
> > be surprised.
>
> list.pop() corresponds to two dict methods. With an argument it
> corresponds to dict.pop(). But there are differences: dict.pop() called
> repeatedly with the same key will raise an error (or return the default),
> while list.pop() will likely return another item. Without an argument it
> corresponds to dict.popitem(), which doesn't have an optional default.
>

I think there is one more important difference between dict and list:
dict has .get(key[, default]), but list doesn't have it.

If we add only `list.pop([default])`, it is tempting to use it even when
the item doesn't need to be removed. An unnecessary destructive change is
bad: it reduces code readability, and it may create hard-to-find bugs.

If this proposal adds `list.get([index[, default]])` too, I am still -0.
I don't know how often it would be useful.
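For concreteness, the non-destructive variant under discussion would
behave like this pure-Python sketch (`list_get` is a hypothetical
stand-in for the proposed method):

    def list_get(l, index, default=None):
        # hypothetical list.get(): return l[index] if it exists, else default
        try:
            return l[index]
        except IndexError:
            return default

    list_get([10, 20], 1)   # -> 20
    list_get([10, 20], 5)   # -> None, and the list is left unchanged
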
Regards,
--
INADA Naoki

From storchaka at gmail.com  Fri Nov 2 04:48:22 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 2 Nov 2018 10:48:22 +0200
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
Message-ID: 

31.10.18 21:23, Robert Vanden Eynde wrote:
> Should I write a PEP even though I know it's going to be rejected
> because the mailing list was not really into it ?

It is better not to do this. PEP 572 was initially written with the
intention to be rejected.

From steve at pearwood.info  Fri Nov 2 06:26:35 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 2 Nov 2018 21:26:35 +1100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031120710.0824d82d@fsol>
Message-ID: <20181102102635.GK3817@ando.pearwood.info>

On Thu, Nov 01, 2018 at 09:36:19PM +0200, Serhiy Storchaka wrote:
> 31.10.18 13:07, Antoine Pitrou wrote:
> >l.pop(default=...) has the potential to be multi-thread-safe, while
> >your alternatives haven't.
>
> The multi-thread-safe alternative is:
>
> try:
>     value = l.pop()
> except IndexError:
>     value = default

That's not an expression, so there are limits to where and when you can
use it. What we need is a helper function that wraps that, called "pop".
And since this seems to be a recurring request going back nearly 20
years now:

https://mail.python.org/pipermail/python-dev/1999-July/000550.html
https://stackoverflow.com/questions/31216428/python-pop-from-empty-list

as is the more general get(list, index, default=None) helper:

https://stackoverflow.com/questions/2574636/getting-a-default-value-on-index-out-of-range-in-python
https://stackoverflow.com/questions/2492087/how-to-get-the-nth-element-of-a-python-list-or-a-default-if-not-available
https://stackoverflow.com/questions/5125619/why-doesnt-list-have-safe-get-method-like-dictionary
https://stackoverflow.com/questions/17721748/default-value-for-out-of-bounds-list-index

we could save people from having to re-invent the wheel over and over
again by adding them to a new module called
"things_that_should_be_list_methods_but_arent.py" *wink*

--
Steve

From steve at pearwood.info  Fri Nov 2 06:59:15 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 2 Nov 2018 21:59:15 +1100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
Message-ID: <20181102105912.GL3817@ando.pearwood.info>

On Fri, Nov 02, 2018 at 10:48:22AM +0200, Serhiy Storchaka wrote:
> 31.10.18 21:23, Robert Vanden Eynde wrote:

> >Should I write a PEP even though I know it's going to be rejected
> >because the mailing list was not really into it ?

I disagree that "the mailing list was not really into it".

So far, I count 12 people who responded to the original post by
Giampaolo. By my count, I see:

* five people in favour;
* three people against, or see no need for it;
* four people I can't tell if they are for or against,
  (possibly neutral?) [1]

I know that adding features isn't decided by majority vote, but it
seems clear to me that there is a substantial set of Python users,
perhaps a majority, who would find this feature useful and more obvious
than the alternatives.

[Serhiy]
> It is better not to do this. PEP 572 was initially written with the
> intention to be rejected.

Sounds like an excellent reason to write a PEP :-)

There are some issues that ought to be addressed:

- The status quo is easy to get wrong:

  # I've written this. More than once.
  L.pop(idx) if idx < len(L) else default

is wrong if there is any chance of idx being negative.

- The more common case of popping from the front of the list is not
thread-safe:

  L.pop() if L else default

- This clever trick is probably thread-safe (I think...) but it is
wasteful and inefficient:

  (L or [default]).pop()

and it isn't obvious how to adapt it efficiently if you need to pop
from an arbitrary index. I came up with this:

  (L[idx:idx+1] or [default]).pop()

but it is doubly wrong.

- The obvious thread-safe EAFP idiom is a try...except statement, so it
needs to be wrapped in a helper function to use it in expressions. That
adds more overhead.

The proposed .get(idx, default=x) and .pop(idx, default=x) signatures
ought to be obvious and unsurprising to any moderately experienced
Python programmer. These aren't complicated APIs.

On the other hand:

- I'm not volunteering to do the work (I don't know enough C to write
a patch). Unless somebody has a patch, we can't expect the core devs
who aren't interested in this feature to write it.

(Hence, status quo wins a stalemate.)

[1] "What makes a man turn neutral? Lust for gold? Power? Or were you
just born with a heart full of neutrality?" -- Captain Zapp Brannigan

--
Steve

From boxed at killingar.net  Fri Nov 2 07:39:15 2018
From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=)
Date: Fri, 2 Nov 2018 12:39:15 +0100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <20181102105912.GL3817@ando.pearwood.info>
References: <20181031010851.GC3817@ando.pearwood.info>
 <20181102105912.GL3817@ando.pearwood.info>
Message-ID: <30307018-58AD-4823-9338-85A748F6E9CE@killingar.net>

> So far, I count 12 people who responded to the original post by
> Giampaolo. By my count, I see:
>
> * five people in favour;
> * three people against, or see no need for it;
> * four people I can't tell if they are for or against,
>   (possibly neutral?) [1]

For the little it's worth I'm +1 too. This seems like an obvious little
improvement.

/ Anders

From 2QdxY4RzWzUUiLuE at potatochowder.com  Fri Nov 2 08:38:04 2018
From: 2QdxY4RzWzUUiLuE at potatochowder.com (Dan Sommers)
Date: Fri, 2 Nov 2018 08:38:04 -0400
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <30307018-58AD-4823-9338-85A748F6E9CE@killingar.net>
References: <20181031010851.GC3817@ando.pearwood.info>
 <20181102105912.GL3817@ando.pearwood.info>
 <30307018-58AD-4823-9338-85A748F6E9CE@killingar.net>
Message-ID: <1292a03c-16c8-4276-8052-e441b60fe622@potatochowder.com>

On 11/2/18 7:39 AM, Anders Hovmöller wrote:

>> So far, I count 12 people who responded to the original post by
>> Giampaolo. By my count, I see:
>>
>> * five people in favour;
>> * three people against, or see no need for it;
>> * four people I can't tell if they are for or against,
>>   (possibly neutral?) [1]
>
> For the little it's worth I'm +1 too. This seems like an obvious little
> improvement.

I'm having a hard time seeing a real use case. Giampaolo's original
post contains this link:

https://github.com/giampaolo/psutil/blob/d8b05151e65f9348aff9b58da977abd8cacb2127/psutil/_pslinux.py#L1068

Yuck (from an aesthetics standpoint, not a functional standpoint). :-)

There's an impedance mismatch between the data, which is structured and
has changed apparently arbitrarily between Linux releases, and the
return value of string.split, which is an ordered collection. This code
effectively hides that mismatch and yields Python tuples, which
represent structured data.
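Schematically, the guards in that code have this shape (the index and
names here are illustrative, not the actual psutil fields):

    fields = line.split()
    # a positional field that may be absent, depending on the release
    inactive = int(fields[4]) if len(fields) > 4 else None
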
I can certainly see the desire for a simpler solution (for some definition of simpler), but how would adding a default parameter to list.pop make this code any simpler? Dan From philip.martin2007 at gmail.com Fri Nov 2 12:19:52 2018 From: philip.martin2007 at gmail.com (Philip Martin) Date: Fri, 2 Nov 2018 11:19:52 -0500 Subject: [Python-ideas] Serialization of CSV vs. JSON Message-ID: Is there any reason why date, datetime, and UUID objects automatically serialize to default strings using the csv module, but json.dumps throws an error as a default? i.e. import csv import json import io from datetime import date stream = io.StringIO() writer = csv.writer(stream) writer.writerow([date(2018, 11, 2)]) # versus json.dumps(date(2018, 11, 2)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cspealma at redhat.com Fri Nov 2 12:28:18 2018 From: cspealma at redhat.com (Calvin Spealman) Date: Fri, 2 Nov 2018 12:28:18 -0400 Subject: [Python-ideas] Serialization of CSV vs. JSON In-Reply-To: References: Message-ID: First, this list is not appropriate. You should ask such a question in python-list. Second, JSON is a specific serialization format that explicitly rejects datetime objects in *all* the languages with JSON libraries. You can only use date objects in JSON if you control or understand both serialization and deserialization ends and have an agreed representation. On Fri, Nov 2, 2018 at 12:20 PM Philip Martin wrote: > Is there any reason why date, datetime, and UUID objects automatically > serialize to default strings using the csv module, but json.dumps throws an > error as a default? i.e. > > import csv > import json > import io > from datetime import date > > stream = io.StringIO() > writer = csv.writer(stream) > writer.writerow([date(2018, 11, 2)]) > # versus > json.dumps(date(2018, 11, 2)) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Fri Nov 2 12:31:33 2018 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 2 Nov 2018 17:31:33 +0100 Subject: [Python-ideas] Serialization of CSV vs. JSON In-Reply-To: References: Message-ID: Serialization of those data types is not defined in the JSON standard: https://www.json.org/ so you have to extend the parser/serializers to support them. On 02.11.2018 17:19, Philip Martin wrote: > Is there any reason why date, datetime, and UUID objects automatically > serialize to default strings using the csv module, but json.dumps throws > an error as a default? i.e. > > import csv > import json > import io > from datetime import date > > stream = io.StringIO() > writer = csv.writer(stream) > writer.writerow([date(2018, 11, 2)]) > # versus > json.dumps(date(2018, 11, 2)) > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Nov 02 2018) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... 
http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
                      http://www.malemburg.com/

From chris.barker at noaa.gov  Fri Nov 2 12:52:24 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 2 Nov 2018 09:52:24 -0700
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: <20181102033409.GI3817@ando.pearwood.info>
References: <20181102033409.GI3817@ando.pearwood.info>
Message-ID: 

On Thu, Nov 1, 2018 at 8:34 PM, Steven D'Aprano wrote:

> The bottom line is, if I understand your proposal, the functionality
> already exists. All you need do is subclass dict and give it a
> __missing__ method which does what you want.

or subclass dict and give it a "setdefault_call" method :-)

But as I think Guido was pointing out, the real difference here is that
DefaultDict, or any other subclass, is specifying what the default
callable is for the entire dict, rather than at time of use.

Personally, I'm pretty sure I've only used one default for any given
dict, but I can imagine there are use cases for having different defaults
for the same dict depending on context.

As for the OP's justification:

"""
If it's not clear, the purpose is to eliminate the overhead of creating
an empty list or similar in situations like this:

d = {}
for i in range(1000000): # some large loop
    l = d.setdefault(somekey, [])
    l.append(somevalue)

# instead...

for i in range(1000000):
    l = d.setdefault_call(somekey, list)
    l.append(somevalue)
"""

I presume the point is that in the first case, somekey might often be
the same, and setdefault requires creating an actual empty list even if
the key is already there, whereas case 2 will only create the empty list
if the key is not there.

doing some timing with defaultdict:

In [19]: def setdefault():
    ...:     d = {}
    ...:     somekey = 5
    ...:     for i in range(1000000): # some large loop
    ...:         l = d.setdefault(somekey, [])
    ...:         l.append(i)
    ...:     return d

In [20]: def default_dict():
    ...:     d = defaultdict(list)
    ...:     somekey = 5
    ...:     for i in range(1000000): # some large loop
    ...:         l = d[somekey]
    ...:         l.append(i)
    ...:     return d

In [21]: % timeit setdefault()
185 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [22]: % timeit default_dict()
128 ms ± 1.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

so yeah, it's a little more performant, and I suppose if you were using
a more expensive constructor, it would make a lot more difference. But
then, how much is it likely to matter in real use cases -- this was 1
million calls for one key and you got a roughly 45% speed up -- is that
common?

So it seems this would give us slightly better performance than
.setdefault() for the use cases where you are using more than one
default for a given dict.

BTW: +1 for a mention of defaultdict in the dict.setdefault docs -- you
can't do everything with defaultdict that you can with setdefault, but
it is a very common use case.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

-------------- next part --------------
An HTML attachment was scrubbed...
From chris.barker at noaa.gov  Fri Nov  2 13:16:24 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 2 Nov 2018 10:16:24 -0700
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <20181102105912.GL3817@ando.pearwood.info> References: <20181031010851.GC3817@ando.pearwood.info> <20181102105912.GL3817@ando.pearwood.info> Message-ID:

On Fri, Nov 2, 2018 at 3:59 AM, Steven D'Aprano wrote:

> - I'm not volunteering to do the work (I don't know enough C to write
>   a patch). Unless somebody has a patch, we can't expect the core devs
>   who aren't interested in this feature to write it.
>
> (Hence, status quo wins a stalemate.)

Well, it would be frustrating to have a feature accepted but not
implemented, but the steps are separate. And it wouldn't have to be a
core dev that implements it -- anyone with the C chops (not me :-) )
could do it.

As an example, a good chunk of PEP 485 was implemented by someone else (I
wrote the first prototype, but it was not complete), who I'm pretty sure
is not a core dev. A core dev has to actually merge it, of course, and
that is a bottleneck, but not a show stopper.

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov  Fri Nov  2 13:26:58 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 2 Nov 2018 10:26:58 -0700
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg wrote:

> Serialization of those data types is not defined in the JSON standard:
>
>     https://www.json.org/

That being said, ISO 8601 is a standard for datetime stamps, and a de
facto one for JSON. So building datetime encoding into Python's json
encoder would be pretty useful.

(I would not have any automatic decoding though -- an ISO 8601 string
would still be just a string in JSON.)

Could we have a "pedantic" mode for "fully standard conforming" JSON, and
then add some extensions to the standard?

As another example, I would find it very handy if the json decoder would
respect comments in JSON (I know that they are explicitly not part of the
standard), but they are used in other applications, particularly when
JSON is used as a configuration language.

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
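Until such an extension exists, one common workaround for commented JSON
config files is to strip whole-line comments before parsing. A sketch
under that assumption -- the helper name is illustrative, and it only
handles comments that occupy a line by themselves:

    import json
    import re

    # Remove lines whose first non-space characters are "//".  JSON
    # strings cannot span lines, so such a line can never be part of a
    # string value; "//" appearing inside a value is left alone.
    _COMMENT_LINE = re.compile(r'^\s*//.*$', re.MULTILINE)

    def loads_with_comments(text):
        return json.loads(_COMMENT_LINE.sub('', text))

    config = '''
    {
        // how many workers to start
        "workers": 4
    }
    '''
    assert loads_with_comments(config) == {'workers': 4}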
There seems to be a wide gap between reaching for a tool like pandas,
where maybe too much auto-magical parsing and guessing happens, and
wrapping the functionality around the csv module, IMO. I was curious to
see if anyone else had similar opinions, and if so, whether conversation
around what extended functionality would be most fruitful?

On Fri, Nov 2, 2018 at 11:28 AM Calvin Spealman wrote:

> First, this list is not appropriate. You should ask such a question in
> python-list. [...]

From turnbull.stephen.fw at u.tsukuba.ac.jp  Fri Nov  2 13:49:00 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sat, 3 Nov 2018 02:49:00 +0900
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: References: Message-ID: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp>

Andre Delfino writes:

 > Frequently, while globbing, one needs to work with multiple
 > extensions. I'd like to propose for fnmatch.filter to handle a tuple
 > of patterns (while preserving the single str argument functionality,
 > a la str.endswith), as a first step for glob.(i)glob to accept
 > multiple patterns as well.

This is one of those famous 3-line functions, though:

    import fnmatch
    def multifilter(names, *patterns):
        result = []
        for p in patterns:
            result.extend(fnmatch.filter(names, p))
        return result

It's a 3-line function in 5 lines, OK, but still.

If you're going to improve the glob module, why not use bash or zsh
extended globbing ('**', '{a,b}') as the model?  This is more powerful,
and already familiar to many users.

From rosuav at gmail.com  Fri Nov  2 14:00:44 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 Nov 2018 05:00:44 +1100
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> Message-ID:

On Sat, Nov 3, 2018 at 4:49 AM Stephen J. Turnbull wrote:
>
> This is one of those famous 3-line functions, though:
>
>     import fnmatch
>     def multifilter(names, *patterns):
>         result = []
>         for p in patterns:
>             result.extend(fnmatch.filter(names, p))
>         return result
>
> It's a 3-line function in 5 lines, OK, but still.

And like many "hey it's this easy" demonstrations, that isn't quite
identical, as a single file can match multiple patterns (but shouldn't
be in the result multiple times). Whether that's an important distinction
or not remains to be seen, but I do know of situations where this would
have bitten me.

ChrisA
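For what it's worth, the duplicate problem Chris raises can be handled
while still reporting names in a stable order. A sketch -- it relies on
dicts preserving insertion order (guaranteed from Python 3.7), and the
function name is just illustrative:

    import fnmatch

    def multifilter(names, *patterns):
        # dict.fromkeys de-duplicates while keeping first-seen order,
        # so a file matching several patterns is reported only once.
        return list(dict.fromkeys(
            name
            for pattern in patterns
            for name in fnmatch.filter(names, pattern)))

    names = ['a.py', 'b.txt', 'a.py.txt']
    assert multifilter(names, '*.py*', '*.txt') == ['a.py', 'a.py.txt', 'b.txt']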
From boxed at killingar.net  Fri Nov  2 14:27:04 2018
From: boxed at killingar.net (Anders Hovmöller)
Date: Fri, 2 Nov 2018 19:27:04 +0100
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: References: <20181102033409.GI3817@ando.pearwood.info> Message-ID: <87E85E43-A6E8-4FFA-8534-122E4F134C65@killingar.net>

Just a little improvement: you don't need the l local variable, you can
just call append:

    d.setdefault(foo, []).append(bar)

And correspondingly:

    d[foo].append(bar)

> On 2 Nov 2018, at 17:52, Chris Barker via Python-ideas wrote:
>
> But as I think Guido was pointing out, the real difference here is that
> defaultdict, or any other subclass, is specifying what the default
> callable is for the entire dict, rather than at time of use. [...]
From mike at selik.org  Fri Nov  2 14:41:40 2018
From: mike at selik.org (Michael Selik)
Date: Fri, 2 Nov 2018 11:41:40 -0700
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

On Fri, Nov 2, 2018 at 10:31 AM Philip Martin wrote:

> [Why don't] csv writer and DictWriter provide ...
> serialization/deserialization hooks?

Do you have a specific use-case in mind? My intuition is that
comprehensions provide sufficient functionality such that changing the
csv module interface is unnecessary. Unlike JSON, CSV files are easy to
read in a streaming/iterator fashion, so the module doesn't need to
provide a way to intercept values during a holistic encode/decode.

From wes.turner at gmail.com  Fri Nov  2 15:17:25 2018
From: wes.turner at gmail.com (Wes Turner)
Date: Fri, 2 Nov 2018 15:17:25 -0400
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

JSON-LD supports datetimes (as e.g. ISO 8601 xsd:datetimes)
https://www.w3.org/TR/json-ld11/#typed-values

jsonpickle (Python, JS, ...) supports datetimes, numpy arrays, pandas
dataframes
https://github.com/jsonpickle/jsonpickle

JSON5 supports comments in JSON.
https://github.com/json5/json5/issues/3

...

Some form of schema is necessary to avoid having to try parsing every
string value as a datetime (and to specify precision: "2018" is not the
same as "2018 00:00:01").

On Friday, November 2, 2018, Chris Barker via Python-ideas
<python-ideas at python.org> wrote:

> That being said, ISO 8601 is a standard for datetime stamps, and a de
> facto one for JSON. So building datetime encoding into Python's json
> encoder would be pretty useful. [...]
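One way to make such a schema explicit is to name the keys that hold
timestamps, so nothing else is ever guessed at. A sketch -- the key names
and helper are illustrative only:

    import json
    from datetime import datetime, timezone

    # Only these keys are decoded as POSIX timestamps.
    DATETIME_KEYS = {'created_at', 'updated_at'}

    def decode_with_schema(pairs):
        return {key: datetime.fromtimestamp(value, tz=timezone.utc)
                     if key in DATETIME_KEYS else value
                for key, value in pairs}

    doc = '{"name": "example", "created_at": 1541167200.5}'
    record = json.loads(doc, object_pairs_hook=decode_with_schema)
    assert record['created_at'].year == 2018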
From steve at pearwood.info  Fri Nov  2 20:05:24 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 Nov 2018 11:05:24 +1100
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: References: <20181102033409.GI3817@ando.pearwood.info> Message-ID: <20181103000524.GP3817@ando.pearwood.info>

On Fri, Nov 02, 2018 at 09:52:24AM -0700, Chris Barker wrote:
> On Thu, Nov 1, 2018 at 8:34 PM, Steven D'Aprano wrote:
>
> > The bottom line is, if I understand your proposal, the functionality
> > already exists. All you need do is subclass dict and give it a
> > __missing__ method which does what you want.
>
> or subclass dict and give it a "setdefault_call" method :-)

Well sure, if we're making up our own methods and calling them anything
we like :-)

The status quo (as I see it):

dict.setdefault:
  - takes an explicit, but eagerly evaluated, default value;

dict.__missing__:
  - requires subclassing to make it work;
  - passes the missing key to the method, so the method can decide
    what value to return;

defaultdict:
  - takes a zero-argument factory function which is unconditionally
    called when the key is missing.

Did I miss any?

What we don't have is a version of setdefault where the default is
evaluated only on need. That would be a great use-case for Call-By-Name
semantics and thunks, if Python supported such :-)

(That's just a half-baked thought, not a concrete proposal.)

> But as I think Guido was pointing out, the real difference here is that
> defaultdict, or any other subclass, is specifying what the default
> callable is for the entire dict, rather than at time of use.

As you show below, a default callable for the dict is precisely the
use-case the OP gives:

    l = d.setdefault_call(somekey, list)

would be equivalent to defaultdict(list) and l = d[somekey]. (I think.
Have I missed something?)

Nevertheless, Guido's point is reasonable -- if it comes up in practice
often enough to care.

[...]

> As for the OP's justification:
>
> """
> If it's not clear, the purpose is to eliminate the overhead of creating
> an empty list or similar in situations like this:
>
>     d = {}
>     for i in range(1000000):  # some large loop
>         l = d.setdefault(somekey, [])
>         l.append(somevalue)
>
>     # instead...
>
>     for i in range(1000000):
>         l = d.setdefault_call(somekey, list)
>         l.append(somevalue)
> """

Are we sure that the overhead is significantly more than the cost of the
name lookup of "list" and the expense of calling it? You do demonstrate a
speed difference with defaultdict (thanks for doing the timing tests) but
the situation isn't precisely comparable to the proposed method, since
you aren't looking up the name "list" each time through the outer loop.

Could construction of the empty list be optimized more? That might reduce
the benefit even further (at least for the given case, but not for the
general case of an arbitrarily expensive default).

We keep coming up against the issue of *eager evaluation* versus *delayed
evaluation*, and I can't help feeling that rather than solving this
problem on an ad-hoc basis each time it comes up, maybe we really do need
a way to tell the interpreter "delay evaluating this expression until
needed". Then we could use it anywhere it was important, without having
to create a plethora of special-case setdefault_call() methods and the
like.

-- 
Steve
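The missing case -- a call-site default evaluated only on need -- can at
least be approximated today by passing a zero-argument callable
explicitly. A sketch of such a helper (the name setdefault_lazy is
illustrative, not a proposal):

    def setdefault_lazy(d, key, thunk):
        # Like dict.setdefault, but the default is a callable that
        # only runs when the key is actually missing.
        try:
            return d[key]
        except KeyError:
            value = d[key] = thunk()
            return value

    d = {'a': 1}
    setdefault_lazy(d, 'a', lambda: 1 / 0)   # thunk is never evaluated
    setdefault_lazy(d, 'b', list).append(2)  # thunk called exactly once
    assert d == {'a': 1, 'b': [2]}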
From boxed at killingar.net  Fri Nov  2 20:15:04 2018
From: boxed at killingar.net (Anders Hovmöller)
Date: Sat, 3 Nov 2018 01:15:04 +0100
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: <20181103000524.GP3817@ando.pearwood.info> References: <20181102033409.GI3817@ando.pearwood.info> <20181103000524.GP3817@ando.pearwood.info> Message-ID: <98AF148B-86A0-4C76-9BC8-0E07F5DCDC6B@killingar.net>

> defaultdict:
>   - takes a zero-argument factory function which is unconditionally
>     called when the key is missing.
>
> Did I miss any?
>
> What we don't have is a version of setdefault where the default is
> evaluated only on need. That would be a great use-case for Call-By-Name
> semantics and thunks, if Python supported such :-)

Could you explain what the difference is between defaultdict's "factory
which is unconditionally called when the key is missing" and "the default
is evaluated only on need"? To me it seems that "unconditionally called
when the key is missing" is a complex way of saying "called only when
needed". I must be missing some nuance here.

/ Anders

From mike at selik.org  Fri Nov  2 21:34:20 2018
From: mike at selik.org (Michael Selik)
Date: Fri, 2 Nov 2018 18:34:20 -0700
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: <98AF148B-86A0-4C76-9BC8-0E07F5DCDC6B@killingar.net> References: <20181102033409.GI3817@ando.pearwood.info> <20181103000524.GP3817@ando.pearwood.info> <98AF148B-86A0-4C76-9BC8-0E07F5DCDC6B@killingar.net> Message-ID:

On Fri, Nov 2, 2018 at 5:25 PM Anders Hovmöller wrote:

> Could you explain what the difference is between defaultdict's "factory
> which is unconditionally called when the key is missing" and "the
> default is evaluated only on need"?

The distinction was the motivation for this thread: setdefault requires a
constructed default instance as an argument, regardless of whether the
key is missing, whereas defaultdict's factory is only called if
necessary. If the key is present in a defaultdict, no default is
constructed.

From steve at pearwood.info  Fri Nov  2 22:49:11 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 Nov 2018 13:49:11 +1100
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: <98AF148B-86A0-4C76-9BC8-0E07F5DCDC6B@killingar.net> References: <20181102033409.GI3817@ando.pearwood.info> <20181103000524.GP3817@ando.pearwood.info> <98AF148B-86A0-4C76-9BC8-0E07F5DCDC6B@killingar.net> Message-ID: <20181103024911.GQ3817@ando.pearwood.info>

On Sat, Nov 03, 2018 at 01:15:04AM +0100, Anders Hovmöller wrote:

> Could you explain what the difference is between defaultdict's "factory
> which is unconditionally called when the key is missing" and "the
> default is evaluated only on need"? [...] I must be missing some
> nuance here.
Consider the use-case where you want to pass a different default value to
the dict each time:

    d.setdefault(key, expensive_function(1, 2, 3))
    d.setdefault(key, expensive_function(4, 8, 16))
    d.setdefault(key, expensive_function(10, 100, 1000))

The expensive function is eagerly evaluated each time you call setdefault,
whether the result is needed or not.

defaultdict won't help, because your factory function takes no arguments:
there's no way to supply arguments for the factory. __missing__ won't
help, because it only receives the key, not arbitrary arguments.

We can of course subclass dict and give it a method with the semantics we
want:

    d.my_setdefault(key, expensive_function, args=(1, 2, 3), kw={})

but it would be nicer and more expressive if we could tell the interpreter
"don't evaluate expensive_function(...) unless you really need it".

Other languages have this -- I believe it is called "Call By Need" or
"Call By Name", depending on the precise details of how it works. I call
it delayed evaluation, and Python already has it, but only in certain
special syntactic forms:

    spam and ...
    spam or ...
    ... if condition else ...

There are others: e.g. the body of functions, including lambda. But
functions are kinda heavyweight to make and build and call.

-- 
Steve

From daveshawley at gmail.com  Sat Nov  3 09:01:44 2018
From: daveshawley at gmail.com (David Shawley)
Date: Sat, 3 Nov 2018 09:01:44 -0400
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

On Nov 2, 2018, at 12:28 PM, Calvin Spealman wrote:

> Second, JSON is a specific serialization format that explicitly rejects
> datetime objects in *all* the languages with JSON libraries. You can
> only use date objects in JSON if you control or understand both
> serialization and deserialization ends and have an agreed
> representation.

I would hardly say that it "rejects datetime objects in *all*
languages..."

Most JavaScript implementations do handle dates, which is a bit telling
for me. For example, the Mozilla reference calls out Date as explicitly
supported [1]. I also ran it through the JavaScript console and repl.it
to make sure that it wasn't a doc glitch [2].

Go also supports serialization of date/times, as shown in this repl.it
session [3]. As does Rust, though Rust doesn't use ISO-8601 [4].

That being said, I'm +1 on adding support for serializing datetime.date
and datetime.datetime *but* I'm -1 on automatically deserializing
anything that looks like an ISO-8601 in json.load*. The asymmetry is the
only thing that kept me from bringing this up previously.

What about implementing this as a protocol? The JavaScript implementation
of JSON.stringify looks for a method named toJSON() when it encounters a
non-primitive type and uses the result for serialization. This would be a
pretty easy lift in json.JSONEncoder.default:

    class JSONEncoder(object):
        def default(self, o):
            if hasattr(o, 'to_json'):
                return o.to_json(self)
            raise TypeError(f'Object of type {o.__class__.__name__} '
                            f'is not JSON serializable')

I would recommend passing the JSONEncoder instance in to ``to_json()`` as
I did in the snippet. This makes serialization much easier for classes
since they do not have to assume a particular set of JSON serialization
options.

Is this something that is PEP-worthy or is a PR with a simple flag to
enable the functionality in JSON encoder enough?

- cheers, dave.

[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify
[2]: https://repl.it/@dave_shawley/OffensiveParallelResource
[3]: https://repl.it/@dave_shawley/EvenSunnyForce
[4]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=73de1454da4ac56900cde37edb0d6c8f
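From the caller's side, the protocol David sketches might look like the
following; Invoice and ProtocolEncoder are illustrative names, not part
of the proposal:

    import json
    from datetime import date

    class Invoice:
        def __init__(self, number, due):
            self.number = number
            self.due = due

        def to_json(self, encoder):
            # Reduce to JSON primitives; receiving the encoder means the
            # object need not assume any serialization options.
            return {'number': self.number, 'due': self.due.isoformat()}

    class ProtocolEncoder(json.JSONEncoder):
        def default(self, o):
            if hasattr(o, 'to_json'):
                return o.to_json(self)
            return super().default(o)

    print(json.dumps(Invoice(42, date(2018, 11, 3)), cls=ProtocolEncoder))
    # -> {"number": 42, "due": "2018-11-03"}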
From rosuav at gmail.com  Sat Nov  3 09:29:32 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 4 Nov 2018 00:29:32 +1100
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

On Sun, Nov 4, 2018 at 12:02 AM David Shawley wrote:
>
> I would hardly say that it "rejects datetime objects in *all*
> languages..."
>
> Most JavaScript implementations do handle dates, which is a bit telling
> for me. For example, the Mozilla reference calls out Date as explicitly
> supported [1]. [...]

I think we need to clarify an important distinction here. JSON, as a
format, does *not* support date/time objects in any way. But JavaScript's
JSON.stringify() function is happy to accept them, and will represent
them as strings.

If the suggestion here is to have json.dumps(datetime.date(2018,11,4))
to return an encoded string, either by natively supporting it, or by
having a protocol which the date object implements, that's fine and
reasonable; but json.loads(s) won't return that date object. So, yes, it
would be asymmetric. I personally don't have a problem with this (though
I also don't have any strong use-cases). Custom encoders and decoders
could do this, with or without symmetry. What would it be like to add a
couple to the json module that can handle these extra types?

ChrisA

From storchaka at gmail.com  Sat Nov  3 09:46:38 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 3 Nov 2018 15:46:38 +0200
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

02.11.18 19:26, Chris Barker via Python-ideas writes:
> On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg wrote:
>
>> Serialization of those data types is not defined in the JSON standard:
>>
>>     https://www.json.org/
>
> That being said, ISO 8601 is a standard for datetime stamps, and a de
> facto one for JSON

It is not the only standard. Another common representation is the POSIX
timestamp. And, for a date without time, the Julian day.

From daveshawley at gmail.com  Sat Nov  3 10:00:42 2018
From: daveshawley at gmail.com (David Shawley)
Date: Sat, 3 Nov 2018 10:00:42 -0400
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

On Nov 3, 2018, at 9:29 AM, Chris Angelico wrote:

> I think we need to clarify an important distinction here. JSON, as a
> format, does *not* support date/time objects in any way. But
> JavaScript's JSON.stringify() function is happy to accept them, and
> will represent them as strings.

Very good point. The JSON document type only supports object literals,
numbers, strings, and Boolean literals. My suggestion was specifically to
provide an extensible mechanism for encoding arbitrary objects into the
supported primitives.
> If the suggestion here is to have json.dumps(datetime.date(2018,11,4)) > to return an encoded string, either by natively supporting it, or by > having a protocol which the date object implements, that's fine and > reasonable; but json.loads(s) won't return that date object. So, yes, > it would be asymmetric. I personally don't have a problem with this > (though I also don't have any strong use-cases). Custom encoders and > decoders could do this, with or without symmetry. What would it be > like to add a couple to the json module that can handle these extra > types? Completely agreed here. I've seen many attempts to support "round trip" encode/decode in JSON libraries and it really doesn't work well unless you go down the path of type hinting. I believe that MongoDB uses something akin to hinting when it handles dates. Something like the following representation if I recall correctly. { "now": { "$type": "JSONDate", "value": "2018-11-03T09:52:20-0400" } } During deserialization they recognize the hint and instantiate the object instead of parsing it. This is interesting but pretty awful for interoperability since there isn't a standard that I'm aware of. I'm certainly not proposing that but I did want to mention it for completeness. I'll try to put together a PR/branch that adds protocol support in JSON encoder and to datetime, date, and uuid as well. It will give us something to point at and discuss. - cheers, dave. -- Mathematics provides a framework for dealing precisely with notions of "what is". Computation provides a framework for dealing precisely with notions of "how to". SICP Preface -------------- next part -------------- An HTML attachment was scrubbed... URL: From wes.turner at gmail.com Sat Nov 3 10:16:53 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sat, 3 Nov 2018 10:16:53 -0400 Subject: [Python-ideas] Serialization of CSV vs. JSON In-Reply-To: References: Message-ID: jsondate, for example, supports both .load[s]() and .dump[s](); but only for UTC datetimes https://github.com/rconradharris/jsondate/blob/master/jsondate/__init__.py UTC is only sometimes a fair assumption; otherwise it's dangerous to assume that timezone-naieve [ISO8601] strings represent UTC-0 datetimes. In that respect - aside from readability - arbitrary-precision POSIX timestamps are less error-prone. On Saturday, November 3, 2018, David Shawley wrote: > On Nov 3, 2018, at 9:29 AM, Chris Angelico wrote: > > > I think we need to clarify an important distinction here. JSON, as a > format, does *not* support date/time objects in any way. But > JavaScript's JSON.stringify() function is happy to accept them, and > will represent them as strings. > > > Very good point. The JSON document type only supports object literals, > numbers, strings, and Boolean literals. My suggestion was specifically to > provide an extensible mechanism for encoding arbitrary objects into the > supported primitives. > > If the suggestion here is to have json.dumps(datetime.date(2018,11,4)) > to return an encoded string, either by natively supporting it, or by > having a protocol which the date object implements, that's fine and > reasonable; but json.loads(s) won't return that date object. So, yes, > it would be asymmetric. I personally don't have a problem with this > (though I also don't have any strong use-cases). Custom encoders and > decoders could do this, with or without symmetry. What would it be > like to add a couple to the json module that can handle these extra > types? > > > Completely agreed here. 
I've seen many attempts to support "round trip" encode/decode in JSON
libraries and it really doesn't work well unless you go down the path of
type hinting. I believe that MongoDB uses something akin to hinting when
it handles dates. Something like the following representation, if I
recall correctly:

    {
        "now": {
            "$type": "JSONDate",
            "value": "2018-11-03T09:52:20-0400"
        }
    }

During deserialization they recognize the hint and instantiate the object
instead of parsing it. This is interesting but pretty awful for
interoperability since there isn't a standard that I'm aware of. I'm
certainly not proposing that but I did want to mention it for
completeness.

I'll try to put together a PR/branch that adds protocol support in the
JSON encoder and to datetime, date, and uuid as well. It will give us
something to point at and discuss.

- cheers, dave.
-- 
Mathematics provides a framework for dealing precisely with notions of
"what is". Computation provides a framework for dealing precisely with
notions of "how to".  SICP Preface

From wes.turner at gmail.com  Sat Nov  3 10:16:53 2018
From: wes.turner at gmail.com (Wes Turner)
Date: Sat, 3 Nov 2018 10:16:53 -0400
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

jsondate, for example, supports both .load[s]() and .dump[s](); but only
for UTC datetimes:
https://github.com/rconradharris/jsondate/blob/master/jsondate/__init__.py

UTC is only sometimes a fair assumption; otherwise it's dangerous to
assume that timezone-naive [ISO 8601] strings represent UTC-0 datetimes.
In that respect - aside from readability - arbitrary-precision POSIX
timestamps are less error-prone.

On Saturday, November 3, 2018, David Shawley wrote:

> Completely agreed here. I've seen many attempts to support "round trip"
> encode/decode in JSON libraries and it really doesn't work well unless
> you go down the path of type hinting. [...]

From turnbull.stephen.fw at u.tsukuba.ac.jp  Sat Nov  3 13:29:55 2018
From: turnbull.stephen.fw at u.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sun, 4 Nov 2018 02:29:55 +0900
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> Message-ID: <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp>

Chris Angelico writes:

 > And like many "hey it's this easy" demonstrations, that isn't quite
 > identical, as a single file can match multiple patterns

Sure.  I would have written it with set.union() on general principles
except I forgot how to say "union", didn't feel like looking it up, and
wanted to keep the def as close to 3 lines as I could without being
obfuscated (see below).  I wonder how many people would fall into the
trap I did.  (I don't consider myself a great programmer, but maybe
that's all the more reason for this?  Not-so-great minds think alike? :-)

I was really more interested in the second question, though.  Why invent
yet another interface when we already have one that is well-known and
more powerful?

P.S.  I can't resist.  This is horrible, but:

    def multifilter(names, *patterns):
        return list(set().union(*[fnmatch.filter(names, p) for p in patterns]))

Who even needs a function? ;-)

From mertz at gnosis.cx  Sat Nov  3 13:45:10 2018
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 3 Nov 2018 13:45:10 -0400
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp> References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp> Message-ID:

On Sat, Nov 3, 2018, 1:30 PM Stephen J.
Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:

> P.S.  I can't resist.  This is horrible, but:
>
>     def multifilter(names, *patterns):
>         return list(set().union(*[fnmatch.filter(names, p) for p in patterns]))

Yes, that is a horrible spelling for:

    {fnmatch.filter(names, p) for p in patterns}

;-)

From python at mrabarnett.plus.com  Sat Nov  3 15:02:19 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 3 Nov 2018 19:02:19 +0000
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp> Message-ID: <5bc59e6b-119a-6419-1243-2626b224280e@mrabarnett.plus.com>

On 2018-11-03 17:45, David Mertz wrote:
> On Sat, Nov 3, 2018, 1:30 PM Stephen J. Turnbull wrote:
>
>> P.S.  I can't resist.  This is horrible, but:
>>
>>     def multifilter(names, *patterns):
>>         return list(set().union(*[fnmatch.filter(names, p) for p in patterns]))
>
> Yes, that is a horrible spelling for:
>
>     {fnmatch.filter(names, p) for p in patterns}
>
> ;-)

But it has the advantage that it works. :-)

From rosuav at gmail.com  Sat Nov  3 15:15:21 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 4 Nov 2018 06:15:21 +1100
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp> References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp> Message-ID:

On Sun, Nov 4, 2018 at 4:29 AM Stephen J. Turnbull wrote:
>
> Sure.  I would have written it with set.union() on general principles
> except I forgot how to say "union", didn't feel like looking it up, and
> wanted to keep the def as close to 3 lines as I could without being
> obfuscated (see below).  I wonder how many people would fall into the
> trap I did.  (I don't consider myself a great programmer, but maybe
> that's all the more reason for this?  Not-so-great minds think
> alike? :-)

A very fair point; and still supporting the notion that "it's a 3-line
function" doesn't instantly silence the need. TBH, it's the moments when
we AREN'T great programmers that we need the language to help us out. Why
is it that we love strong rules and tight exceptions? Because they tell
us when we've done something stupid, and help us to fix that bug with a
minimum of fuss :)

> I was really more interested in the second question, though.  Why
> invent yet another interface when we already have one that is
> well-known and more powerful?
That kind of globbing might also solve the use-cases, but I'm worried
about backward compatibility. Creating more glob-special characters could
potentially change the meaning of globs that are already in use. I don't
personally glob files with braces in their names, but someone somewhere
is doing it (and I do have a bunch of files with UUIDs in their names,
mainly in Wine directories); adding a feature like that might break code,
or alternatively, would have to be fnmatch_with_braces(). In contrast,
accepting a tuple of strings can't possibly break any working code that
uses individual strings.

> P.S.  I can't resist.  This is horrible, but:
>
>     def multifilter(names, *patterns):
>         return list(set().union(*[fnmatch.filter(names, p) for p in patterns]))
>
> Who even needs a function? ;-)

.... wow. I do want to make one small change to it, though: instead of
list() at the end of the chain, I'd use sorted(). You're throwing away
the original order of file names, so it'd look tidier to return them in
order, rather than in whichever order iterating over the set gives them.

Also, I am a very very bad person for suggesting an 'improvement' to a
function of that nature. That is... a piece of art. Modern art, the sort
where you go "This is incomprehensible therefore it is beautiful". :)

ChrisA

From rosuav at gmail.com  Sat Nov  3 15:18:43 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 4 Nov 2018 06:18:43 +1100
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID:

On Sun, Nov 4, 2018 at 1:00 AM David Shawley wrote:

> Very good point. The JSON document type only supports object literals,
> numbers, strings, and Boolean literals. My suggestion was specifically
> to provide an extensible mechanism for encoding arbitrary objects into
> the supported primitives.

Okay, so to clarify: We currently have a mechanism for custom encoders
and decoders, which you have to specify as you're thinking about
encoding. But you're proposing having the core json.dumps() allow objects
to customize their own representation.

Sounds like a plan, and not even all that complex a plan.

ChrisA

From mertz at gnosis.cx  Sat Nov  3 15:29:26 2018
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 3 Nov 2018 15:29:26 -0400
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: <5bc59e6b-119a-6419-1243-2626b224280e@mrabarnett.plus.com> References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> <23517.56083.203909.511942@turnbull.sk.tsukuba.ac.jp> <5bc59e6b-119a-6419-1243-2626b224280e@mrabarnett.plus.com> Message-ID:

On Sat, Nov 3, 2018 at 3:03 PM MRAB wrote:

>> Yes, that is a horrible spelling for:
>>
>>     {fnmatch.filter(names, p) for p in patterns}
>
> But it has the advantage that it works. :-)

Indeed! Excellent point :-). I definitely should not post untested code
from my tablet. This is still slightly less horrible, but I recognize
it's starting to border on horrible:

    {n for p in patterns for n in fnmatch.filter(names, p)}

This seems worse:

    set(chain(*(fnmatch.filter(names, p) for p in patterns)))

-- 
Keeping medicines from the bloodstreams of the sick; food from the
bellies of the hungry; books from the hands of the uneducated; technology
from the underdeveloped; and putting advocates of freedom in prisons.
Intellectual property is to the 21st century what the slave trade was to
the 16th.
From python.gem at gmail.com  Sat Nov  3 16:54:39 2018
From: python.gem at gmail.com (Joy Diamond)
Date: Sat, 3 Nov 2018 16:54:39 -0400
Subject: [Python-ideas] Are we supposed to be able to have our own class dictionary in python 3?
Message-ID:

Team,

Are we supposed to be able to have our own class dictionary in python 3?

If we currently cannot -- do we want to be able to?

That we can have our own class dictionary in python 3 is strongly implied
in the following at https://www.python.org/dev/peps/pep-3115/ where it
says:

"""
    # The metaclass invocation
    def __new__(cls, name, bases, classdict):
        # Note that we replace the classdict with a regular
        # dict before passing it to the superclass, so that we
        # don't continue to record member names after the class
        # has been created.
        result = type.__new__(cls, name, bases, dict(classdict))
        result.member_names = classdict.member_names
        return result
"""

I don't understand this. As far as I can tell, no matter what class
dictionary you pass into `type.__new__` it creates a copy of it.

Am I missing something? Is this supposed to work? Is the documentation
wrong?

Thanks,

Joy Diamond.

Program that shows that the class dictionary created is not what we pass
in --- shows the actual symbol table is `dict`, not `SymbolTable`:

    class SymbolTable(dict):
        pass

    members = SymbolTable(a = 1)

    X = type('X', ((object,)), members)

    members['b'] = 2

    print('X.a: {}'.format(X.a))

    try:
        print('X.b: {}'.format(X.b))
    except AttributeError as e:
        print('X.b: does not exist')

    #
    # Get the actual symbol table of `X`, bypassing the mapping proxy.
    #
    X__symbol_table = __import__('gc').get_referents(X.__dict__)[0]

    print('The type of the actual symbol table of X is: {} with keys: {}'.format(
        type(X__symbol_table),
        X__symbol_table.keys()))

    # Prints out
    # X.a: 1
    # X.b: does not exist
    # The type of the actual symbol table of X is: <class 'dict'> with keys:
    # dict_keys(['a', '__module__', '__dict__', '__weakref__', '__doc__'])

From greg.ewing at canterbury.ac.nz  Sat Nov  3 17:43:57 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 04 Nov 2018 10:43:57 +1300
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: References: Message-ID: <5BDE169D.4040408@canterbury.ac.nz>

David Shawley wrote:

> I'm +1 on adding support for serializing datetime.date and
> datetime.datetime *but* I'm -1 on automatically deserializing anything
> that looks like an ISO-8601 in json.load*. The asymmetry is the only
> thing that kept me from bringing this up previously.

This asymmetry bothers me too. It makes me think that datetime handling
belongs at a different level of abstraction, something that knows about
the structure of the data being serialised or deserialised.

Java's JSON libraries have a mechanism where you can give it a class and
a lump of JSON and it will figure out from runtime type information what
to do. It seems like we should be able to do something similar using
type annotations.

-- 
Greg
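A rough sketch of what annotation-driven decoding could look like; Event
and load_typed are illustrative names, and this assumes Python 3.7 for
dataclasses and datetime.fromisoformat():

    import json
    from dataclasses import dataclass, fields
    from datetime import datetime

    @dataclass
    class Event:
        name: str
        when: datetime

    def load_typed(cls, text):
        # Use the class's field annotations to decide which strings
        # should be parsed as datetimes; plain strings stay strings.
        raw = json.loads(text)
        return cls(**{
            f.name: datetime.fromisoformat(raw[f.name])
                    if f.type is datetime else raw[f.name]
            for f in fields(cls)})

    event = load_typed(Event, '{"name": "release", "when": "2018-11-03T10:00:00"}')
    assert event.when.year == 2018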
From dfmoisset at gmail.com  Sat Nov  3 19:59:02 2018
From: dfmoisset at gmail.com (Daniel Moisset)
Date: Sat, 3 Nov 2018 23:59:02 +0000
Subject: [Python-ideas] Are we supposed to be able to have our own class dictionary in python 3?
In-Reply-To: References: Message-ID:

Sorry, should have replied to the list too

On Sat, 3 Nov 2018, 23:55 Daniel Moisset wrote:

> If I understood correctly what you want, it's possible with a
> metaclass. Check the __prepare__ method at
> https://docs.python.org/3/reference/datamodel.html#preparing-the-class-namespace
> and PEP 3115
>
> On Sat, 3 Nov 2018, 20:55 Joy Diamond wrote:
>
>> Team,
>>
>> Are we supposed to be able to have our own class dictionary in
>> python 3? [...]

From amit.mixie at gmail.com  Sat Nov  3 20:02:56 2018
From: amit.mixie at gmail.com (Amit Green)
Date: Sat, 3 Nov 2018 20:02:56 -0400
Subject: [Python-ideas] Are we supposed to be able to have our own class dictionary in python 3?
In-Reply-To: References: Message-ID:

Thanks Daniel,

I found my answer here (using your link):
https://docs.python.org/3/reference/datamodel.html#preparing-the-class-namespace

"""
When a new class is created by type.__new__, the object provided as the
namespace parameter is copied to a new ordered mapping and the original
object is discarded.
"""

Therefore the answer seems to be that
https://www.python.org/dev/peps/pep-3115/ needs to be updated & fixed.

Replace the following:

"""
    def __new__(cls, name, bases, classdict):
        # Note that we replace the classdict with a regular
        # dict before passing it to the superclass, so that we
        # don't continue to record member names after the class
        # has been created.
        result = type.__new__(cls, name, bases, dict(classdict))
        result.member_names = classdict.member_names
        return result
"""

With:

"""
    def __new__(cls, name, bases, classdict):
        result = type.__new__(cls, name, bases, classdict)
        result.member_names = classdict.member_names
        return result
"""

removing the incorrect comments & the copying of `classdict`.

I will go file a bug report to that effect.

Thanks,

Joy Diamond.

On Sat, Nov 3, 2018 at 7:55 PM Daniel Moisset wrote:

> If I understood correctly what you want, it's possible with a
> metaclass. Check the __prepare__ method at
> https://docs.python.org/3/reference/datamodel.html#preparing-the-class-namespace
> and PEP 3115 [...]

From steve at pearwood.info  Sat Nov  3 20:41:04 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 4 Nov 2018 11:41:04 +1100
Subject: [Python-ideas] Make fnmatch.filter accept a tuple of patterns
In-Reply-To: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> References: <23516.36364.586345.44292@turnbull.sk.tsukuba.ac.jp> Message-ID: <20181104004104.GT3817@ando.pearwood.info>

On Sat, Nov 03, 2018 at 02:49:00AM +0900, Stephen J.
Turnbull wrote:

> If you're going to improve the glob module, why not use bash or zsh
> extended globbing ('**', '{a,b}') as the model?  This is more powerful,
> and already familiar to many users.

I thought it did support extended globbing?

https://docs.python.org/3/library/glob.html#glob.glob

But brace expansion should be a thing. For backwards compatibility
reasons, we probably need a switch to turn it on, or a separate function
call, or maybe a deprecation period.

-- 
Steve

From dfmoisset at gmail.com  Sun Nov  4 06:43:39 2018
From: dfmoisset at gmail.com (Daniel Moisset)
Date: Sun, 4 Nov 2018 11:43:39 +0000
Subject: [Python-ideas] Are we supposed to be able to have our own class dictionary in python 3?
In-Reply-To: References: Message-ID:

I think the documentation is correct but you misinterpreted the intent of
that code. The code you're quoting, which is an example, is not about
ending up with a custom dict within the instance; the intent of the
author was just to capture the member_names list. So what it does is
customize the class dict in __prepare__(), but then in __new__ it
*intentionally* converts it to a regular dict after extracting the
member_names.

The goal of the example is ending up with instances with regular
attribute dicts but an extra member_names attribute, while I think that
you're looking to end up with a custom attribute dict (so in *your* case,
you do not need to do the copying).

On Sun, 4 Nov 2018 at 00:03, Amit Green wrote:

> Thanks Daniel,
>
> I found my answer here (using your link):
> https://docs.python.org/3/reference/datamodel.html#preparing-the-class-namespace
> [...]
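Putting Daniel's explanation together, a minimal sketch of the intended
pattern -- the class names are illustrative. The custom mapping lives
only while the class body executes; type.__new__ then copies it into the
plain dict that backs the class:

    class RecordingDict(dict):
        def __init__(self):
            super().__init__()
            self.member_names = []

        def __setitem__(self, key, value):
            if key not in self:
                self.member_names.append(key)
            super().__setitem__(key, value)

    class RecordingMeta(type):
        @classmethod
        def __prepare__(mcls, name, bases):
            # This mapping receives every assignment in the class body.
            return RecordingDict()

        def __new__(mcls, name, bases, classdict):
            result = super().__new__(mcls, name, bases, classdict)
            result.member_names = classdict.member_names
            return result

    class Example(metaclass=RecordingMeta):
        b = 2
        a = 1

    # __module__ and __qualname__ are recorded too, before the body runs.
    assert Example.member_names[-2:] == ['b', 'a']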
From chibicitiberiu at gmail.com  Sun Nov  4 07:20:31 2018
From: chibicitiberiu at gmail.com (Tiberiu Chibici)
Date: Sun, 4 Nov 2018 14:20:31 +0200
Subject: [Python-ideas] Proposal to add a key field to the bisect library
Message-ID:

Hi,

I would like to propose an improvement to the functions in the bisect
library, to add a 'key' parameter, similar to 'sorted' or other system
functions.

-- 
Chibici Tiberiu

From steve at pearwood.info  Sun Nov  4 08:19:49 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 5 Nov 2018 00:19:49 +1100
Subject: [Python-ideas] Proposal to add a key field to the bisect library
In-Reply-To: References: Message-ID: <20181104131948.GV3817@ando.pearwood.info>

On Sun, Nov 04, 2018 at 02:20:31PM +0200, Tiberiu Chibici wrote:
> Hi,
> I would like to propose an improvement to the functions in the bisect
> library, to add a 'key' parameter, similar to 'sorted' or other system
> functions.

Quoting the bug tracker:

    This request has come up repeatedly (and been rejected) in the past.
    See issues 2954, 3374, 1185383, 1462228, 1451588, 1619060.
-- 
Chibici Tiberiu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Sun Nov  4 08:19:49 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 5 Nov 2018 00:19:49 +1100
Subject: [Python-ideas] Proposal to add a key field to the bisect library
In-Reply-To: 
References: 
Message-ID: <20181104131948.GV3817@ando.pearwood.info>

On Sun, Nov 04, 2018 at 02:20:31PM +0200, Tiberiu Chibici wrote:
> Hi,
> I would like to propose an improvement to the functions in the bisect
> library, to add a 'key' parameter, similar to 'sorted' or other system
> functions.

Quoting the bug tracker:

    This request has come up repeatedly (and been rejected) in the past.
    See issues 2954, 3374, 1185383, 1462228, 1451588, 1619060.

https://bugs.python.org/issue4356

Unless you have something new to add, something people have missed, I
don't think this idea is going to go anywhere.

-- 
Steve

From steve at pearwood.info  Sun Nov  4 08:33:19 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 5 Nov 2018 00:33:19 +1100
Subject: [Python-ideas] Proposal to add a key field to the bisect library
In-Reply-To: <20181104131948.GV3817@ando.pearwood.info>
References: <20181104131948.GV3817@ando.pearwood.info>
Message-ID: <20181104133319.GW3817@ando.pearwood.info>

On Mon, Nov 05, 2018 at 12:19:49AM +1100, Steven D'Aprano wrote:
> On Sun, Nov 04, 2018 at 02:20:31PM +0200, Tiberiu Chibici wrote:
> > Hi,
> > I would like to propose an improvement to the functions in the bisect
> > library, to add a 'key' parameter, similar to 'sorted' or other system
> > functions.
>
> Quoting the bug tracker:
[...]
> https://bugs.python.org/issue4356

Actually, reading further along, it looks like there has been consensus
that bisect ought to get a key function. Guido said:

    "Bingo. That clinches it. We need to add key=."

also:

    "PS. It should also be added to heapq."

and Raymond said:

    "I'll add a key= variant for Python 3.6."

Obviously this didn't happen, but it might happen for 3.8.

-- 
Steve

From daveshawley at gmail.com  Sun Nov  4 08:49:08 2018
From: daveshawley at gmail.com (David Shawley)
Date: Sun, 4 Nov 2018 08:49:08 -0500
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: <5BDE169D.4040408@canterbury.ac.nz>
References: <5BDE169D.4040408@canterbury.ac.nz>
Message-ID: <1397FA69-0B7A-4BDF-B7C7-CBCA1A8D4F90@gmail.com>

On Nov 3, 2018, at 5:43 PM, Greg Ewing wrote:

> David Shawley wrote:
> > I'm +1 on adding support for serializing datetime.date and
> > datetime.datetime *but* I'm -1 on automatically deserializing anything
> > that looks like a ISO-8601 in json.load*. The asymmetry is the only
> > thing that kept me from bringing this up previously.
>
> This asymmetry bothers me too. It makes me think that datetime
> handling belongs at a different level of abstraction, something
> that knows about the structure of the data being serialised or
> deserialised.
>
> Java's JSON libraries have a mechanism where you can give it
> a class and a lump of JSON and it will figure out from runtime
> type information what to do. It seems like we should be able
> to do something similar using type annotations.

I was thinking about trying to do something similar to what golang has done
in their JSON support [1]. It is similar to what I would have done with JAXB
when I was still doing Java [2]. In both cases you have a type explicitly
bound to a JSON blob. The place to make this sort of change might be in the
JSONDecoder and JSONEncoder classes.

Personally, I would place this sort of serialization logic outside of the
Standard Library -- maybe following the pattern that the rust community
adopted on this very issue. In short, they separated serialization &
de-serialization into a free-standing library. The best discussion that I
have found is a reddit thread [3]. The new library that they built is called
serde [4] and the previous code is in their deprecated library section [5].

The difference between the two approaches is that golang simply annotates the
types similar to what I would expect to happen in the Python case. Then you
are required to pass a list of types into the deserializer so it knows which
types are candidates for deserialization. The rust and JAXB approaches
rely on type registration into the deserialization framework.
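To make the registration idea concrete, here is a rough sketch (this is not
a real API -- the `classes` argument and the `from_json` classmethod are
invented for illustration only):

    import datetime
    import json

    class RegisteredDecoder(json.JSONDecoder):
        def __init__(self, *, classes=(), **kwargs):
            self._classes = tuple(classes)
            super().__init__(object_hook=self._dispatch, **kwargs)

        def _dispatch(self, obj):
            # Offer each decoded dict to the registered classes in turn.
            for cls in self._classes:
                try:
                    return cls.from_json(obj)
                except (KeyError, TypeError, ValueError):
                    continue
            return obj

    class When:
        @classmethod
        def from_json(cls, obj):
            return datetime.datetime.fromisoformat(obj['when'])

    print(json.loads('{"when": "2018-11-04T08:49:08"}',
                     cls=RegisteredDecoder, classes=[When]))
    # 2018-11-04 08:49:08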
We could probably use type annotations to handle the asymmetry provided that
we change the JSONDecoder interface to accept a list of classes that are
candidates for deserialization or something similar. I would place this
outside of the Standard Library as a generalized serialization /
de-serialization framework since it feels embryonic to me. This could be a
new implementation for CSV and pickle as well.

Bringing the conversation back around, I'm going to continue adding a simple
JSON formatting protocol that is asymmetric since it does solve a need that
I and others have today. I'm not completely sure what the best way to move
this forward is. I have most of an implementation working based on a simple
protocol of one method. Should I:

1. Open a BPO and continue the discussion there once I have a working
   prototype?

2. Continue the discussion here?

3. Move the discussion to python-dev under a more appropriate subject?

cheers, dave.

[1]: https://golang.org/pkg/encoding/json/#Marshal
[2]: https://docs.oracle.com/javaee/6/tutorial/doc/gkknj.html#gmfnu
[3]: https://www.reddit.com/r/rust/comments/3v4ktz/differences_between_serde_and_rustc_serialize/
[4]: https://serde.rs
[5]: https://github.com/rust-lang-deprecated/rustc-serialize

-- 
"Syntactic sugar causes cancer of the semicolon" - Alan Perlis
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From boxed at killingar.net  Sun Nov  4 08:53:46 2018
From: boxed at killingar.net (Anders Hovmöller)
Date: Sun, 4 Nov 2018 14:53:46 +0100
Subject: [Python-ideas] Proposal to add a key field to the bisect library
In-Reply-To: <20181104131948.GV3817@ando.pearwood.info>
References: <20181104131948.GV3817@ando.pearwood.info>
Message-ID: <9C3CF048-7A3A-470A-BB70-45A1BA54697E@killingar.net>

That link has Guido and Raymond Hettinger ending up +1 and looking to either
add it or write a simple copy-pasteable recipe for the docs. I mean, that's
how I read it; did you read that and come to a different impression?

> On 4 Nov 2018, at 14:19, Steven D'Aprano wrote:
>
>> On Sun, Nov 04, 2018 at 02:20:31PM +0200, Tiberiu Chibici wrote:
>> Hi,
>> I would like to propose an improvement to the functions in the bisect
>> library, to add a 'key' parameter, similar to 'sorted' or other system
>> functions.
>
> Quoting the bug tracker:
>
>     This request has come up repeatedly (and been rejected) in the past.
>     See issues 2954, 3374, 1185383, 1462228, 1451588, 1619060.
>
> https://bugs.python.org/issue4356
>
> Unless you have something new to add, something people have missed, I
> don't think this idea is going to go anywhere.
>
> -- 
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From boxed at killingar.net  Sun Nov  4 08:54:21 2018
From: boxed at killingar.net (Anders Hovmöller)
Date: Sun, 4 Nov 2018 14:54:21 +0100
Subject: [Python-ideas] Proposal to add a key field to the bisect library
In-Reply-To: <20181104133319.GW3817@ando.pearwood.info>
References: <20181104131948.GV3817@ando.pearwood.info> <20181104133319.GW3817@ando.pearwood.info>
Message-ID: <83EC260D-9C44-47D2-8F85-F9C74813C827@killingar.net>

Oh heh.
Well there we go :) > On 4 Nov 2018, at 14:33, Steven D'Aprano wrote: > >> On Mon, Nov 05, 2018 at 12:19:49AM +1100, Steven D'Aprano wrote: >>> On Sun, Nov 04, 2018 at 02:20:31PM +0200, Tiberiu Chibici wrote: >>> Hi, >>> I would like to propose an improvement to the functions in the bisect >>> library, to add a 'key' parameter, similar to 'sorted' or other system >>> functions. >> >> Quoting the bug tracker: > [...] >> https://bugs.python.org/issue4356 > > Actually, reading further along, it looks like there has been concensus > that bisect ought to get a key function. Guido said: > > "Bingo. That clinches it. We need to add key=." > > also: > > "PS. It should also be added to heapq." > > and Raymond said: > > "I'll add a key= variant for Python 3.6." > > Obviously this didn't happen, but it might happen for 3.8. > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From wes.turner at gmail.com Sun Nov 4 16:09:26 2018 From: wes.turner at gmail.com (Wes Turner) Date: Sun, 4 Nov 2018 16:09:26 -0500 Subject: [Python-ideas] Serialization of CSV vs. JSON In-Reply-To: <1397FA69-0B7A-4BDF-B7C7-CBCA1A8D4F90@gmail.com> References: <5BDE169D.4040408@canterbury.ac.nz> <1397FA69-0B7A-4BDF-B7C7-CBCA1A8D4F90@gmail.com> Message-ID: Here's a JSONEncoder subclass with a default method that checks variable types in a defined sequence that includes datetime: https://gist.github.com/majgis/4200488 Passing an ordered map of (Type, fn) may or may not be any more readable than simply subclassing JSONEncoder and defining .default(). On Sunday, November 4, 2018, David Shawley wrote: > On Nov 3, 2018, at 5:43 PM, Greg Ewing > wrote: > > > David Shawley wrote: > > > I'm +1 on adding support for serializing datetime.date and > > > datetime.datetime *but* I'm -1 on automatically deserializing anything > that > > > looks like a ISO-8601 in json.load*. The asymmetry is the only thing > that > > > kept me from bringing this up previously. > > > > This asymmetry bothers me too. It makes me think that datetime > > handling belongs at a different level of abstraction, something > > that knows about the structure of the data being serialised or > > deserialised. > > > > Java's JSON libraries have a mechanism where you can give it > > a class and a lump of JSON and it will figure out from runtime > > type information what to do. It seems like we should be able > > to do something similar using type annotations. > > I was thinking about trying to do something similar to what golang has > done in > their JSON support [1]. It is similar to what I would have done with JAXB > when > I was still doing Java [2]. In both cases you have a type explicitly > bound to > a JSON blob. The place to make this sort of change might be in the > JSONDecoder and JSONEncoder classes. > > Personally, I would place this sort of serialization logic outside of the > Standard Library -- maybe following the pattern that the rust community > adopted on this very issue. In short, they separated serialization & > de-serialization into a free-standing library. The best discussion that I > have found is a reddit thread [3]. The new library that they built is > called > serde [4] and the previous code is in their deprecated library section [5]. 
> > The difference between the two approaches is that golang simply annotates
> > the types similar to what I would expect to happen in the Python case.
> > Then you are required to pass a list of types into the deserializer so it
> > knows which types are candidates for deserialization. The rust and JAXB
> > approaches rely on type registration into the deserialization framework.
> >
> > We could probably use type annotations to handle the asymmetry provided
> > that we change the JSONDecoder interface to accept a list of classes that
> > are candidates for deserialization or something similar. I would place
> > this outside of the Standard Library as a generalized serialization /
> > de-serialization framework since it feels embryonic to me. This could be
> > a new implementation for CSV and pickle as well.
> >
> > Bringing the conversation back around, I'm going to continue adding a
> > simple JSON formatting protocol that is asymmetric since it does solve a
> > need that I and others have today. I'm not completely sure what the best
> > way to move this forward is. I have most of an implementation working
> > based on a simple protocol of one method. Should I:
> >
> > 1. Open a BPO and continue the discussion there once I have a working
> >    prototype?
> >
> > 2. Continue the discussion here?
> >
> > 3. Move the discussion to python-dev under a more appropriate subject?
> >
> > cheers, dave.
> >
> > [1]: https://golang.org/pkg/encoding/json/#Marshal
> > [2]: https://docs.oracle.com/javaee/6/tutorial/doc/gkknj.html#gmfnu
> > [3]: https://www.reddit.com/r/rust/comments/3v4ktz/differences_between_serde_and_rustc_serialize/
> > [4]: https://serde.rs
> > [5]: https://github.com/rust-lang-deprecated/rustc-serialize
> >
> > -- 
> > "Syntactic sugar causes cancer of the semicolon" - Alan Perlis
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov  Mon Nov  5 19:11:30 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 5 Nov 2018 16:11:30 -0800
Subject: [Python-ideas] dict.setdefault_call(), or API variations thereupon
In-Reply-To: <20181103024911.GQ3817@ando.pearwood.info>
References: <20181102033409.GI3817@ando.pearwood.info> <20181103000524.GP3817@ando.pearwood.info> <98AF148B-86A0-4C76-9BC8-0E07F5DCDC6B@killingar.net> <20181103024911.GQ3817@ando.pearwood.info>
Message-ID: 

On Fri, Nov 2, 2018 at 7:49 PM, Steven D'Aprano wrote:

> Consider the use-case where you want to pass a different default value
> to the dict each time:

exactly - the "default" is per call, not the same for the whole dict.
though again, how common is this?

> d.setdefault(key, expensive_function(1, 2, 3))
> d.setdefault(key, expensive_function(4, 8, 16))
> d.setdefault(key, expensive_function(10, 100, 1000))

also -- aside from performance, if expensive_function() has side effects,
you may really not want to call it when you don't need to (not that that
would be well-designed code, but...)

and of course, you can always simply do:

    if key in d:
        val = d[key]
    else:
        val = expensive_function(4, 8, 16)
        d[key] = val

sure, it requires looking up the key twice, but doesn't call the function
unnecessarily. So it's a pretty small subset of cases, where this would be
needed.
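A dict subclass version of what's being discussed might look something like
this (a minimal sketch -- the method name matches the thread subject but is
otherwise made up):

    class LazyDict(dict):
        def setdefault_call(self, key, factory, *args, **kwargs):
            # Only call the factory when the key is actually missing.
            try:
                return self[key]
            except KeyError:
                value = self[key] = factory(*args, **kwargs)
                return value

so d.setdefault_call(key, expensive_function, 4, 8, 16) would do a single
lookup on a hit and never call expensive_function unnecessarily.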
> defaultdict won't help, because your factory function takes no
> arguments: there's no way to supply arguments for the factory.

maybe that's a feature defaultdict should have?

-CHB

> __missing__ won't help, because it only receives the key, not arbitrary
> arguments.
>
> We can of course subclass dict and give it a method with the semantics
> we want:
>
>     d.my_setdefault(key, expensive_function, args=(1, 2, 3), kw={})
>
> but it would be nicer and more expressive if we could tell the
> interpreter "don't evaluate expensive_function(...) unless you really
> need it".
>
> Other languages have this -- I believe it is called "Call By Need" or
> "Call By Name", depending on the precise details of how it works. I call
> it delayed evaluation, and Python already has it, but only in certain
> special syntactic forms:
>
>     spam and
>     spam or
>     if condition else
>
> There are others: e.g. the body of functions, including lambda. But
> functions are kinda heavyweight to make and build and call.
>
> -- 
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R  (206) 526-6959 voice
7600 Sand Point Way NE  (206) 526-6329 fax
Seattle, WA 98115  (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov  Mon Nov  5 19:17:59 2018
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 5 Nov 2018 16:17:59 -0800
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: 
References: 
Message-ID: 

On Fri, Nov 2, 2018 at 12:17 PM, Wes Turner wrote:

> JSON5 supports comments in JSON.
> https://github.com/json5/json5/issues/3

and other nifty things -- any plans to support JSON5 in the stdlib json
library? I think that would be great.

-CHB

> ... Some form of schema is necessary to avoid having to try parsing every
> string value as a date time (and to specify precision: "2018" is not the
> same as "2018 00:00:01")
>
> On Friday, November 2, 2018, Chris Barker via Python-ideas <
> python-ideas at python.org> wrote:
>
>> On Fri, Nov 2, 2018 at 9:31 AM, M.-A. Lemburg wrote:
>>
>>> Serialization of those data types is not defined in the JSON standard:
>>>
>>> https://www.json.org/
>>
>> That being said, ISO 8601 is a standard for datetime stamps, and a
>> defacto one for JSON
>>
>> So building encoding of datetime into Python's json encoder would be
>> pretty useful.
>>
>> (I would not have any automatic decoding though -- as an ISO8601 string
>> would still be just a string in JSON)
>>
>> Could we have a "pedantic" mode for "fully standard conforming" JSON, and
>> then add some extensions to the standard?
>>
>> As another example, I would find it very handy if the json decoder would
>> respect comments in JSON (I know that they are explicitly not part of the
>> standard), but they are used in other applications, particularly when JSON
>> is used as a configuration language.
>>
>> -CHB
>>
>> -- 
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R  (206) 526-6959 voice
>> 7600 Sand Point Way NE  (206) 526-6329 fax
>> Seattle, WA 98115  (206) 526-6317 main reception
>>
>> Chris.Barker at noaa.gov
>>

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R  (206) 526-6959 voice
7600 Sand Point Way NE  (206) 526-6329 fax
Seattle, WA 98115  (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Tue Nov  6 01:42:01 2018
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 6 Nov 2018 08:42:01 +0200
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: 
References: 
Message-ID: 

06.11.18 02:17, Chris Barker via Python-ideas wrote:
> and other nifty things -- any plans to support JSON5 in the stdlib json
> library? I think that would be great.

When it becomes a widely used official standard. There are a number of
general data exchange formats more popular than JSON5.

From daveshawley at gmail.com  Tue Nov  6 06:46:46 2018
From: daveshawley at gmail.com (David Shawley)
Date: Tue, 6 Nov 2018 06:46:46 -0500
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: 
References: <5BDE169D.4040408@canterbury.ac.nz> <1397FA69-0B7A-4BDF-B7C7-CBCA1A8D4F90@gmail.com>
Message-ID: <667737F3-6965-463E-923C-EEE84CDCF47D@gmail.com>

On Nov 4, 2018, at 12:43 PM, Michael Selik wrote:

> If you're making a module
>
> > On Sun, Nov 4, 2018, 5:49 AM David Shawley wrote:
> > Personally, I would place this sort of serialization logic outside of the
> > Standard Library -- maybe following the pattern that the rust community
> > adopted on this very issue. In short, they separated serialization &
> > de-serialization into a free-standing library.
>
> You don't need a bug on the tracker or discussion on -dev to share a module
> on PyPI or GitHub. When you've got something started, share a link in this
> thread.

I modified a branch of python/cpython to implement what I had in mind. [1]
The idea is to introduce a new protocol with a single method:

    self.jsonformat() -> object

    If this method exists, then json.encoder.JSONEncoder will call it
    to generate a JSON representation *instead* of calling *default*.
    This method must return a value that json.encoder.JSONEncoder can
    encode, or fail in the same manner as the *default* hook.
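In pure Python the behaviour can be approximated by overriding *default*
(the real patch hooks in before *default* is consulted); the `When` class
here is just a stand-in example to show how the protocol is meant to be
used:

    import datetime
    import json

    class ProtocolEncoder(json.JSONEncoder):
        def default(self, o):
            # Prefer the object's own jsonformat() hook when present.
            jsonformat = getattr(o, 'jsonformat', None)
            if jsonformat is not None:
                return jsonformat()
            return super().default(o)

    class When:
        def __init__(self, at):
            self.at = at

        def jsonformat(self):
            return self.at.isoformat()

    print(json.dumps({'ts': When(datetime.datetime(2018, 11, 6))},
                     cls=ProtocolEncoder))
    # {"ts": "2018-11-06T00:00:00"}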
The implementation wasn't too difficult once I learned a little more about
how Standard Library classes are implemented when C speedups are included.
There are a few things that I haven't done:

1. I didn't guard the functionality with a flag to the JSONEncoder
   initializer. This was an oversight but I would add one before doing a PR
   against python/cpython.

2. As discussed before this is an asymmetric proposal since there is no
   support for detecting and de-serializing in JSONDecoder.

That is what I had in mind. I'm not sure how we want to spell extension
methods like this one. I chose to not use a double-underscore method since
I view them as ``for use by the interpreter/language'' more so than for
Library-recognised methods. The name is the least of my worries.

Let me know if there is any reason that I shouldn't move forward with a bpo
and PR against python/cpython.

- cheers, dave.

[1]: https://github.com/dave-shawley/cpython/pull/2

-- 
"State and behavior. State and behavior. If it doesn't bundle state and
behavior in a sensible way, it should not be an object, and there should
not be a class that produces it." eevee
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abedillon at gmail.com  Tue Nov  6 14:03:54 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Tue, 6 Nov 2018 13:03:54 -0600
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
Message-ID: 

I don't understand the rationale behind PEP 463's rejection. Guido says, "I
disagree with the position that EAFP is better than LBYL, or "generally
recommended" by Python. (Where do you get that?..."; but it's been in the
official Python.org docs for a while and even provides a pretty good
justification for why EAFP is preferable to LBYL (aside from the language
calling EAFP "common", "clean", and "fast" that's notably absent from
LBYL's description):

"In a multi-threaded environment, the LBYL approach can risk introducing a
race condition between 'the looking' and 'the leaping'. For example, the
code

    if key in mapping:
        return mapping[key]

can fail if another thread removes *key* from *mapping* after the test, but
before the lookup. This issue can be solved with locks or by using the EAFP
approach."

Which brings me to the question: What happens when a PEP gets rejected? Is
it final? Is there a process for reviving a PEP?

I personally would love to have *both* more consistent methods on built-in
classes AND exception handling expressions. I think the colon (and maybe
'except' keyword) could be replaced with an exclamation point:

    value = lst[2] except IndexError! "No value"

or just:

    value = lst[2] IndexError! "No value"

if that appeases the people who dislike the over-use of colons. A full
exception list would have to be in parentheses which gets ugly, but would
also be (I would wager) a less common form:

    dirlist.append(os.getcwd() (AttributeError, OSError as e)! os.curdir)

That might need some work. I don't know if it's compatible w/ the compiler.
It may have to start with "try" or something, but it seems pretty close to
a workable solution.

On Wed, Oct 31, 2018 at 4:42 AM Chris Angelico wrote:

> On Wed, Oct 31, 2018 at 8:24 PM Nicolas Rolin wrote:
> >
> > As a user I always found a bit disurbing that dict pop method have a
> > default while list and set doesn't.
> > While it is way more computationally easy to check wether a list or a
> > set is empty that to check if a key is in a dict, it still create a
> > signature difference for no real reason (having a default to a built-in
> > in python is pretty standard).
> > It would be nice if every built-in/method of built-in type that returns
> > a value and raise in some case have access to a default instead of
> > raise, and not having to check the doc to see if it supports a default.
>
> https://www.python.org/dev/peps/pep-0463/ wants to say hi.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Tue Nov  6 15:00:29 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 7 Nov 2018 07:00:29 +1100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
Message-ID: 

On Wed, Nov 7, 2018 at 6:04 AM Abe Dillon wrote:
>
> Which brings me to the question: What happens when a PEP gets rejected?
> Is it final? Is there a process for reviving a PEP?

It remains as a permanent document. No, that isn't final; and the process
for reviving a PEP basically consists of answering the objections that led
to its rejection. There have been a few cases where a proposal lies dormant
for years before finally being accepted (such as the matrix multiplication
operator).

So if you want to do that, open a new thread, and specifically respond to
the issues in the PEP - anything named as a reason for rejection, and
anything else that you think ought to be improved.
ChrisA

From steve at pearwood.info  Tue Nov  6 18:16:21 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 7 Nov 2018 10:16:21 +1100
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: 
References: <20181031010851.GC3817@ando.pearwood.info>
Message-ID: <20181106231621.GC4071@ando.pearwood.info>

On Tue, Nov 06, 2018 at 01:03:54PM -0600, Abe Dillon wrote:

> I don't understand the rationale behind PEP 463's rejection. Guido says, "I
> disagree with the position that EAFP is better than LBYL, or "generally
> recommended" by Python. (Where do you get that?...";

I can't comment on Guido's question about "generally recommended", but as
for the first part, I agree: neither EAFP nor LBYL is "better", they are
both appropriate under different circumstances. Sometimes one is clearer
and more efficient than the other. The only time I would say that EAFP is
clearly better is when LBYL introduces "Time Of Check To Time Of Use" bugs.

> Which brings me to the question: What happens when a PEP gets rejected?
> Is it final? Is there a process for reviving a PEP?

Nothing is final-final. You can try opening a competing PEP, or take over
as champion of the existing PEP (assuming Chris is willing to step aside).
You ought to respond to the reasons given in the rejection.

It's probably a good idea to gauge the chances of success by asking on
Python-Ideas and Python-Dev first, to avoid the core devs saying "Oh give
it up, it's not going to happen!" after you've wasted time trying to
revise a rejected PEP.

[...]
> I think the colon (and maybe
> 'except' keyword) could be replaced with an exclamation point:
>
>     value = lst[2] except IndexError! "No value"
[...]
> if that appeases the people who dislike the over-use of colons.

And I think that this is precisely the sort of syntax that prompted Guido
to write many years ago that language design is not merely a
problem-solving exercise. Aesthetics are important. This is not just a
matter of finding an unused character or two and hammering it into the
language. That's how you get Perl, which is not a pretty language.

> A full exception list would have to be in parentheses which gets ugly,
> but would also be (I would wager) a less common form:
>
>     dirlist.append(os.getcwd() (AttributeError, OSError as e)! os.curdir)
>
> That might need some work. I don't know if it's compatible w/ the
> compiler. It may have to start with "try" or something, but it seems
> pretty close to a workable solution.

Seeing that syntax, the phrase that came to my mind was not so much "close
to workable" and more "kill it with fire!".

-- 
Steve

From eric at trueblade.com  Tue Nov  6 19:05:27 2018
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 6 Nov 2018 19:05:27 -0500
Subject: [Python-ideas] Serialization of CSV vs. JSON
In-Reply-To: <667737F3-6965-463E-923C-EEE84CDCF47D@gmail.com>
References: <5BDE169D.4040408@canterbury.ac.nz> <1397FA69-0B7A-4BDF-B7C7-CBCA1A8D4F90@gmail.com> <667737F3-6965-463E-923C-EEE84CDCF47D@gmail.com>
Message-ID: <006e4b9a-c2f5-ff43-88eb-cd706a4012bf@trueblade.com>

On 11/6/2018 6:46 AM, David Shawley wrote:
> On Nov 4, 2018, at 12:43 PM, Michael Selik wrote:
>
> > If you're making a module
> >
> > > On Sun, Nov 4, 2018, 5:49 AM David Shawley wrote:
> > > Personally, I would place this sort of serialization logic outside
> > > of the Standard Library -- maybe following the pattern that the rust
> > > community adopted on this very issue.
> > > In short, they separated serialization &
> > > de-serialization into a free-standing library.
> >
> > You don't need a bug on the tracker or discussion on -dev to share a
> > module on PyPI or GitHub. When you've got something started, share a
> > link in this thread.
>
> I modified a branch of python/cpython to implement what I had in mind. [1]
> The idea is to introduce a new protocol with a single method:
>
>     self.jsonformat() -> object
>
>     If this method exists, then json.encoder.JSONEncoder will call it
>     to generate a JSON representation *instead* of calling *default*.
>     This method must return a value that json.encoder.JSONEncoder can
>     encode, or fail in the same manner as the *default* hook.
>
> The implementation wasn't too difficult once I learned a little more about
> how Standard Library classes are implemented when C speedups are included.
> There are a few things that I haven't done:
>
> 1. I didn't guard the functionality with a flag to the JSONEncoder
>    initializer. This was an oversight but I would add one before doing a
>    PR against python/cpython.
>
> 2. As discussed before this is an asymmetric proposal since there is no
>    support for detecting and de-serializing in JSONDecoder.
>
> That is what I had in mind. I'm not sure how we want to spell extension
> methods like this one. I chose to not use a double-underscore method
> since I view them as ``for use by the interpreter/language'' more so than
> for Library-recognised methods. The name is the least of my worries.
>
> Let me know if there is any reason that I shouldn't move forward with
> a bpo and PR against python/cpython.

I wouldn't support putting this in the stdlib yet. We need to get
real-world experience first. Modifying existing objects with what's
basically a new protocol seems too heavyweight for a protocol that's not
all that commonly used.

How about implementing this with functools.singledispatch? It's designed
for exactly this sort of case: some base functionality, then per-type
specialization. It would be super-easy to whip up something with
datetime.date and datetime.datetime specializations.
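For instance, a sketch along those lines (nothing here is settled API; it
is just the singledispatch pattern applied to JSON formatting):

    import datetime
    import json
    from functools import singledispatch

    @singledispatch
    def jsonable(obj):
        # Base case: mirror the stdlib error for unsupported types.
        raise TypeError(f'Object of type {type(obj).__name__} '
                        f'is not JSON serializable')

    @jsonable.register(datetime.date)  # also covers datetime.datetime
    def _(obj):
        return obj.isoformat()

    print(json.dumps({'when': datetime.date(2018, 11, 6)}, default=jsonable))
    # {"when": "2018-11-06"}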
I have a long-term goal of moving parts of the stdlib to singledispatch
where it makes sense (say the next generation of pprint, for example).

I also think you should pass in a context object, and maybe have None
signify a default context, although I'll admit I haven't thought it
through yet. It will take some design iterations to get it right, once the
use cases are clear.

Eric

> - cheers, dave.
>
> [1]: https://github.com/dave-shawley/cpython/pull/2
>
> -- 
> /"State and behavior. State and behavior. If it doesn't bundle state
> and behavior in a sensible way, it should not be an object, and there
> should not be a class that produces it."/ eevee
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abedillon at gmail.com  Thu Nov  8 20:49:18 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Thu, 8 Nov 2018 19:49:18 -0600
Subject: [Python-ideas] Add "default" kwarg to list.pop()
In-Reply-To: <20181106231621.GC4071@ando.pearwood.info>
References: <20181031010851.GC3817@ando.pearwood.info> <20181106231621.GC4071@ando.pearwood.info>
Message-ID: 

> neither EAFP nor LBYL is "better", they are both appropriate under
> different circumstances. Sometimes one is clearer and more efficient
> than the other.

One of the reasons LBYL is sometimes cleaner than EAFP is because it has
more support from the language in the form of an expression, which is what
PEP 463 intends to change.

> The only time I would say that EAFP is clearly better is when LBYL
> introduces "Time Of Check To Time Of Use" bugs.

It also puts the intent of the logic up front instead of requiring the
reader to scroll through a preamble of edge-case checks to get to what the
code is actually trying to do.

> I think that this is precisely the sort of syntax that prompted
> Guido to write many years ago that language design is not merely a
> problem-solving exercise.

The sort of syntax that prompted that post was "precisely" multi-line
lambdas. Guido explained that he tried to throw people off the idea of
multi-line lambdas by posing it as an unsolvable puzzle (which people
promptly solved) when really he just thought the concept of multi-line
lambdas was flawed to begin with. I agree with him on that point. The
whole point of a lambda is that, in certain cases, they allow you to write
more expressive code by saying in-line exactly what you want to do. It
only works if that action is easily expressed in a line:

    button.onClick(lambda: print("Hello!"))

If it's a long and complicated bit of code, the expressiveness of lambda
is lost and it makes more sense to give it a name and write:

    button.onClick(doThatComplicatedThing)

I'm not trying to solve a puzzle that implements an anti-pattern (unless
you have some argument for why expressionized try-except would be an
anti-pattern).

> Aesthetics are important. This is not just a matter of finding an unused
> character or two and hammering it into the language.

Yeah. That's why I didn't just try to find an unused character and hammer
it into the language without paying any regard to aesthetics. I find

    value = lst[2] except IndexError! "No value"

to be pretty well in keeping w/ Python's aesthetics because raising an
exception pretty naturally fits with an exclamation point, but of course,
aesthetics are subjective. I know what Perl is BTW, and share your
distaste for it.

> Seeing that syntax, the phrase that came to my mind was not so much
> "close to workable" and more "kill it with fire!".

Funny, that's exactly how I felt about the None-aware operators, only; I
didn't reject the entire concept simply because I disliked the syntax
choice. I simply rejected the syntax choice because I disliked the syntax
choice...

On Tue, Nov 6, 2018 at 5:21 PM Steven D'Aprano wrote:

> On Tue, Nov 06, 2018 at 01:03:54PM -0600, Abe Dillon wrote:
>
> > I don't understand the rationale behind PEP 463's rejection. Guido
> > says, "I disagree with the position that EAFP is better than LBYL, or
> > "generally recommended" by Python. (Where do you get that?...";
>
> I can't comment on Guido's question about "generally recommended", but
> as for the first part, I agree: neither EAFP nor LBYL is "better", they
> are both appropriate under different circumstances.
> Sometimes one is
> clearer and more efficient than the other. The only time I would say
> that EAFP is clearly better is when LBYL introduces "Time Of Check To
> Time Of Use" bugs.
>
> > Which brings me to the question: What happens when a PEP gets rejected?
> > Is it final? Is there a process for reviving a PEP?
>
> Nothing is final-final. You can try opening a competing PEP, or take
> over as champion of the existing PEP (assuming Chris is willing to step
> aside). You ought to respond to the reasons given in the rejection.
>
> It's probably a good idea to gauge the chances of success by asking on
> Python-Ideas and Python-Dev first, to avoid the core devs saying "Oh
> give it up, it's not going to happen!" after you've wasted time trying
> to revise a rejected PEP.
>
> [...]
> > I think the colon (and maybe
> > 'except' keyword) could be replaced with an exclamation point:
> >
> >     value = lst[2] except IndexError! "No value"
> [...]
> > if that appeases the people who dislike the over-use of colons.
>
> And I think that this is precisely the sort of syntax that prompted
> Guido to write many years ago that language design is not merely a
> problem-solving exercise. Aesthetics are important. This is not just a
> matter of finding an unused character or two and hammering it into the
> language. That's how you get Perl, which is not a pretty language.
>
> > A full exception list would have to be in parentheses which gets ugly,
> > but would also be (I would wager) a less common form:
> >
> >     dirlist.append(os.getcwd() (AttributeError, OSError as e)! os.curdir)
> >
> > That might need some work. I don't know if it's compatible w/ the
> > compiler. It may have to start with "try" or something, but it seems
> > pretty close to a workable solution.
>
> Seeing that syntax, the phrase that came to my mind was not so much
> "close to workable" and more "kill it with fire!".
>
> -- 
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From danish.bluecheese at gmail.com  Fri Nov  9 17:54:50 2018
From: danish.bluecheese at gmail.com (danish bluecheese)
Date: Fri, 9 Nov 2018 14:54:50 -0800
Subject: [Python-ideas] Relative Imports
Message-ID: 

Hi all,

I'm tired of not being able to make relative imports freely. I'm now
trying to develop a module which enables any project to use relative
imports once it is loaded. Anybody interested?

Best,
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Fri Nov  9 18:16:35 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 10 Nov 2018 10:16:35 +1100
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: 
Message-ID: <20181109231635.GL4071@ando.pearwood.info>

On Fri, Nov 09, 2018 at 02:54:50PM -0800, danish bluecheese wrote:

> I'm tired of not being able to make relative imports freely.

Python has supported relative imports for a while now.

https://docs.python.org/3/tutorial/modules.html#intra-package-references

What do you mean?
-- 
Steve

From danish.bluecheese at gmail.com  Fri Nov  9 18:20:52 2018
From: danish.bluecheese at gmail.com (danish bluecheese)
Date: Fri, 9 Nov 2018 15:20:52 -0800
Subject: [Python-ideas] Relative Imports
In-Reply-To: <20181109231635.GL4071@ando.pearwood.info>
References: <20181109231635.GL4071@ando.pearwood.info>
Message-ID: 

It supports them, but whenever you get multiple folders there is no clean
solution: either there are some sys.path hacks, or you run things as
modules in some cases. These are not pleasant at all. I think we can come
up with something better. Interested?

On Fri, Nov 9, 2018 at 3:17 PM Steven D'Aprano wrote:

> On Fri, Nov 09, 2018 at 02:54:50PM -0800, danish bluecheese wrote:
>
> > I'm tired of not being able to make relative imports freely.
>
> Python has supported relative imports for a while now.
>
> https://docs.python.org/3/tutorial/modules.html#intra-package-references
>
> What do you mean?
>
> -- 
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Fri Nov  9 18:39:15 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 10 Nov 2018 10:39:15 +1100
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: <20181109231635.GL4071@ando.pearwood.info>
Message-ID: <20181109233915.GM4071@ando.pearwood.info>

On Fri, Nov 09, 2018 at 03:20:52PM -0800, danish bluecheese wrote:

> It supports them, but whenever you get multiple folders there is no clean
> solution.

What do you mean?

-- 
Steve

From danish.bluecheese at gmail.com  Fri Nov  9 18:51:46 2018
From: danish.bluecheese at gmail.com (danish bluecheese)
Date: Fri, 9 Nov 2018 15:51:46 -0800
Subject: [Python-ideas] Relative Imports
In-Reply-To: <20181109233915.GM4071@ando.pearwood.info>
References: <20181109231635.GL4071@ando.pearwood.info> <20181109233915.GM4071@ando.pearwood.info>
Message-ID: 

└── src
    ├── __init__.py
    ├── main.py
    └── test
        ├── __init__.py
        └── test_main.py
assume the structure above. To be able to use relative imports with such a
fundamental structure, either I can go for sys.path hacks or run it as a
module from one level further up. I do not like this :D I want to be able
to use relative imports freely as soon as I provide the correct relative
path.

Please let me know if it is not clear on any aspect. Thank you.

Regards.

On Fri, Nov 9, 2018 at 3:39 PM Steven D'Aprano wrote:

> On Fri, Nov 09, 2018 at 03:20:52PM -0800, danish bluecheese wrote:
>
> > It supports them, but whenever you get multiple folders there is no
> > clean solution.
>
> What do you mean?
>
> -- 
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Fri Nov  9 19:00:16 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 10 Nov 2018 11:00:16 +1100
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: <20181109231635.GL4071@ando.pearwood.info> <20181109233915.GM4071@ando.pearwood.info>
Message-ID: 

On Sat, Nov 10, 2018 at 10:52 AM danish bluecheese wrote:
>
> └── src
>     ├── __init__.py
>     ├── main.py
>     └── test
>         ├── __init__.py
>         └── test_main.py
>
> assume the structure above. To be able to use relative imports with such
> a fundamental structure, either I can go for sys.path hacks or run it as
> a module from one level further up.
> I do not like this :D I want to be able to use relative imports freely as
> soon as I provide the correct relative path.
>
> Please let me know if it is not clear on any aspect.

The main thing that's not clear here is what you're proposing. What is the
idea under discussion?

ChrisA

From steve at pearwood.info  Fri Nov  9 19:10:43 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 10 Nov 2018 11:10:43 +1100
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: <20181109231635.GL4071@ando.pearwood.info> <20181109233915.GM4071@ando.pearwood.info>
Message-ID: <20181110001043.GO4071@ando.pearwood.info>

On Fri, Nov 09, 2018 at 03:51:46PM -0800, danish bluecheese wrote:

> └── src
>     ├── __init__.py
>     ├── main.py
>     └── test
>         ├── __init__.py
>         └── test_main.py
>
> assume the structure above. To be able to use relative imports with such
> a fundamental structure, either I can go for sys.path hacks or run it as
> a module from one level further up.

I don't understand. From the top level of the package, running inside
either __init__ or main, you should be able to say:

    from . import test
    from .test import test_main

From the test subpackage, you should be able to say:

    from .. import main

to get the src/main module, or

    from . import test_main

to get the test/test_main module from the test/__init__ module.

(Disclaimer: I have not actually run the above code to check that it
works, beyond testing that it's not a SyntaxError.)
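To make the execution side concrete (module names taken from the tree
above; the main() function in src/main.py is assumed for the example):

    # src/test/test_main.py -- minimal sketch
    from ..main import main  # assumes src/main.py defines main()

    def test_main():
        assert main() is not None

Run from the directory containing src/, `python -m src.test.test_main`
gives the file a parent package, so the relative import resolves; running
`python src/test/test_main.py` directly does not, and the import fails.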
What *precisely* is the problem you are trying to solve, and your proposed
solution?

-- 
Steve

From ashafer01 at gmail.com  Fri Nov  9 19:22:57 2018
From: ashafer01 at gmail.com (Alex Shafer)
Date: Fri, 9 Nov 2018 17:22:57 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 144, Issue 24
In-Reply-To: 
References: 
Message-ID: 

I think this is about the limitation to . and .. possibly?

> From: Chris Angelico
> Subject: Re: [Python-ideas] Relative Imports
>
> The main thing that's not clear here is what you're proposing. What is
> the idea under discussion?
>
> ChrisA
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From danish.bluecheese at gmail.com  Fri Nov  9 19:32:47 2018
From: danish.bluecheese at gmail.com (danish bluecheese)
Date: Fri, 9 Nov 2018 16:32:47 -0800
Subject: [Python-ideas] Relative Imports
In-Reply-To: <20181110001043.GO4071@ando.pearwood.info>
References: <20181109231635.GL4071@ando.pearwood.info> <20181109233915.GM4071@ando.pearwood.info> <20181110001043.GO4071@ando.pearwood.info>
Message-ID: 

you are right on the lines you mentioned. Those are all working if I run it
as a module, which I do every time. This is somewhat unpleasant to me,
especially while developing something and trying to test it quickly. I just
want to be able to use the same relative imports and run a single file with
`python3 test_main.py`, for example. Running files as modules every time is
tiring. This is my problem. I could not come up with a concrete solution
idea yet; I am thinking on it. Open to suggestions.

Thank you all for your help!

On Fri, Nov 9, 2018 at 4:16 PM Steven D'Aprano wrote:

> On Fri, Nov 09, 2018 at 03:51:46PM -0800, danish bluecheese wrote:
>
> > └── src
> >     ├── __init__.py
> >     ├── main.py
> >     └── test
> >         ├── __init__.py
> >         └── test_main.py
> >
> > assume the structure above. To be able to use relative imports with
> > such a fundamental structure, either I can go for sys.path hacks or run
> > it as a module from one level further up.
>
> I don't understand. From the top level of the package, running inside
> either __init__ or main, you should be able to say:
>
>     from . import test
>     from .test import test_main
>
> From the test subpackage, you should be able to say:
>
>     from .. import main
>
> to get the src/main module, or
>
>     from . import test_main
>
> to get the test/test_main module from the test/__init__ module.
>
> (Disclaimer: I have not actually run the above code to check that it
> works, beyond testing that it's not a SyntaxError.)
>
> What *precisely* is the problem you are trying to solve, and your
> proposed solution?
>
> -- 
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jsbueno at python.org.br  Fri Nov  9 20:41:34 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Fri, 9 Nov 2018 23:41:34 -0200
Subject: [Python-ideas] Python octal escape character encoding "wats"
Message-ID: 

I just saw some document which reminded me how strings handle a backslash
followed by 3 octal digits. When a backslash is followed by 3 octal digits,
that means a character with the corresponding codepoint and all is well.

The "valid scenario":

    In [42]: "\777"
    Out[42]: 'ǿ'

The problem is when you have just two valid octal digits

    In [40]: "\778"
    Out[40]: '?8'

Which is ambiguous at least -- why is this not "\x07" "77" for example?
(0o77 actually corresponds to the "?" (63) character)

Or...when the first digit is not valid as octal - that is:

    In [41]: "\877"
    Out[41]: '\\877'

And then when the second digit is not valid octal:

    In [43]: "\797"
    Out[43]: '\x0797'

WAT?

So, between the possibly ambiguous scenario with two octal digits followed
by a non-octal digit, and the completely unexpected expansion to a
4-hexadecimal digit codepoint in the last case, what do you say of
deprecating any r"\[0-9]{1,3}" sequence that doesn't match full 3 octal
digits, and yield a syntax error for that from Python 3.9 (or 3.10) on?
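For reference, the three behaviours side by side (CPython 3.x; note that
"\877" also emits a DeprecationWarning for the invalid escape on 3.6+):

    print(list("\777"))  # ['ǿ']                  three octal digits: chr(0o777)
    print(list("\778"))  # ['?', '8']             two digits (chr(0o77)), then '8'
    print(list("\877"))  # ['\\', '8', '7', '7']  '8' is not octal, so no escape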
Best regards,

js
-><-

From rosuav at gmail.com  Fri Nov  9 20:56:07 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 10 Nov 2018 12:56:07 +1100
Subject: [Python-ideas] Python octal escape character encoding "wats"
In-Reply-To: 
References: 
Message-ID: 

On Sat, Nov 10, 2018 at 12:42 PM Joao S. O. Bueno wrote:
>
> The problem is when you have just two valid octal digits
>
>     In [40]: "\778"
>     Out[40]: '?8'
>
> Which is ambiguous at least -- why is this not "\x07" "77" for
> example? (0o77 actually corresponds to the "?" (63) character)

Not ambiguous. It takes as many valid octal digits as it can.

https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals

    \ooo ==> Character with octal value ooo
    Note 1: As in Standard C, up to three octal digits are accepted.

"Up to" means that one or two digits can also define a character. For
obvious reasons, it has to take digits greedily (otherwise "\777" would be
"\x07" followed by "77"), and it's not an error to have fewer digits.
Permitting a single digit means that "\0" means the NUL character, which
is often convenient.

> And then when the second digit is not valid octal:
>
>     In [43]: "\797"
>     Out[43]: '\x0797'
>
> WAT?

You may possibly be misinterpreting the last result. It's exactly the same
as the previous ones.

    >>> list("\797")
    ['\x07', '9', '7']

The octal escape grabs as many digits as it can, and when it finds a
character in the literal that isn't a valid octal digit (same whether it's
a '9' or a 'q'), it stops. The remaining characters have no special
meaning; this does not become four hex digits. A "\xNN" escape in Python
must be exactly two digits, no more and no less.

> what do you say of deprecating any r"\[0-9]{1,3}" sequence that doesn't
> match full 3 octal digits, and yield a syntax error for that from Python
> 3.9 (or 3.10) on?

Nope. Would break code for no good reason.

ChrisA

From jsbueno at python.org.br  Fri Nov  9 21:04:22 2018
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Sat, 10 Nov 2018 00:04:22 -0200
Subject: [Python-ideas] Python octal escape character encoding "wats"
In-Reply-To: 
References: 
Message-ID: 

On Fri, 9 Nov 2018 at 23:56, Chris Angelico wrote:

>     >>> list("\797")
>     ['\x07', '9', '7']
>
> The octal escape grabs as many digits as it can, and when it finds a
> character in the literal that isn't a valid octal digit (same whether
> it's a '9' or a 'q'), it stops. The remaining characters have no
> special meaning; this does not become four hex digits. A "\xNN" escape
> in Python must be exactly two digits, no more and no less.

Yes - I had just figured this out before going to sleep, and was coming
back to say that, although strange, this was no motive for breaking stuff
up.

Thank you for the lengthy reply!!

> On Sat, Nov 10, 2018 at 12:42 PM Joao S. O. Bueno wrote:
> >
> > I just saw some document which reminded me how strings handle a
> > backslash followed by 3 octal digits. When a backslash is followed by
> > 3 octal digits, that means a character with the corresponding
> > codepoint and all is well.
> > The "valid scenario":
> >
> >     In [42]: "\777"
> >     Out[42]: 'ǿ'
> >
> > The problem is when you have just two valid octal digits
> >
> >     In [40]: "\778"
> >     Out[40]: '?8'
> >
> > Which is ambiguous at least -- why is this not "\x07" "77" for
> > example? (0o77 actually corresponds to the "?" (63) character)
>
> Not ambiguous. It takes as many valid octal digits as it can.
>
> https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
>
>     \ooo ==> Character with octal value ooo
>     Note 1: As in Standard C, up to three octal digits are accepted.
>
> "Up to" means that one or two digits can also define a character. For
> obvious reasons, it has to take digits greedily (otherwise "\777"
> would be "\x07" followed by "77"), and it's not an error to have fewer
> digits. Permitting a single digit means that "\0" means the NUL
> character, which is often convenient.
>
> > And then when the second digit is not valid octal:
> >
> >     In [43]: "\797"
> >     Out[43]: '\x0797'
> >
> > WAT?
> >
> > So, between the possibly ambiguous scenario with two octal digits
> > followed by a non-octal digit, and the completely unexpected expansion
> > to a 4-hexadecimal digit codepoint in the last case
>
> You may possibly be misinterpreting the last result. It's exactly the
> same as the previous ones.
>
>     >>> list("\797")
>     ['\x07', '9', '7']
>
> The octal escape grabs as many digits as it can, and when it finds a
> character in the literal that isn't a valid octal digit (same whether
> it's a '9' or a 'q'), it stops. The remaining characters have no
> special meaning; this does not become four hex digits. A "\xNN" escape
> in Python must be exactly two digits, no more and no less.
>
> > what do you say
> > of deprecating any r"\[0-9]{1,3}" sequence that doesn't match full 3
> > octal digits, and yield a syntax error for that from Python 3.9 (or
> > 3.10) on?
>
> Nope. Would break code for no good reason.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From steve at pearwood.info  Fri Nov  9 23:19:09 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 10 Nov 2018 15:19:09 +1100
Subject: [Python-ideas] Python octal escape character encoding "wats"
In-Reply-To: 
References: 
Message-ID: <20181110041908.GP4071@ando.pearwood.info>

On Sat, Nov 10, 2018 at 12:56:07PM +1100, Chris Angelico wrote:

> Not ambiguous. It takes as many valid octal digits as it can.

What is the rationale for that? Hex escapes don't.

My guess is, "Because that's what C does". And C probably does it because
"Dennis Ritchie wanted to minimize the number of keypresses when he was
typing" :-)

> "Up to" means that one or two digits can also define a character. For
> obvious reasons, it has to take digits greedily (otherwise "\777"
> would be "\x07" followed by "77"), and it's not an error to have fewer
> digits.

In hindsight, I think we should have insisted that octal escapes must
always be three digits, just as hex escapes are always two. The status quo
has too much magical "Do What I Mean" in it for my liking:

    py> '\509\51'  # pair of brackets surrounding a nine
    '(9)'
    py> '\507\51'  # pair of brackets surrounding a seven
    'Ň)'

Dammit Python, that's not what I meant!

> > what do you say
> of deprecating any r"\[0-9]{1,3}" sequence that doesn't match full 3
> octal digits, and yield a syntax error for that from Python 3.9 (or
> 3.10) on?
> > Nope. Would break code for no good reason. There's a good reason: to make the behaviour more sensible and less confusing and have fewer "oops, that's not what I wanted" bugs. But we should have made that change for 3.0. Now, I agree: it would be breakage where the benefit doesn't outweigh the cost. Maybe in Python 5000. In the meantime, one or two digit octal escapes ought to be a linter warning. -- Steve From rosuav at gmail.com Fri Nov 9 23:39:36 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 10 Nov 2018 15:39:36 +1100 Subject: [Python-ideas] Python octal escape character encoding "wats" In-Reply-To: <20181110041908.GP4071@ando.pearwood.info> References: <20181110041908.GP4071@ando.pearwood.info> Message-ID: On Sat, Nov 10, 2018 at 3:19 PM Steven D'Aprano wrote: > > On Sat, Nov 10, 2018 at 12:56:07PM +1100, Chris Angelico wrote: > > > Not ambiguous. It takes as many valid octal digits as it can. > > What is the rationale for that? Hex escapes don't. Irrelevant to whether it's ambiguous or not. > > "Up to" means that one or two digits can also define a character. For > > obvious reasons, it has to take digits greedily (otherwise "\777" > > would be "\x07" followed by "77"), and it's not an error to have fewer > > digits. > > In hindsight, I think we should have insisted that octal escapes must > always be three digits, just as hex escapes are always two. The status > quo has too much magical "Do What I Mean" in it for my liking: > > py> '\509\51' # pair of brackets surrounding a nine > '(9)' > py> '\507\51' # pair of brackets surrounding a seven > 'G)' > > Dammit Python, that's not what I meant! How often do you actually do that with octal escapes, though? Ever had actual real-world situations where this comes up? I don't recall *ever* coming across a problem where sometimes I have an octal escape followed by a nine, and other times by a different digit. I also do not recall often wanting an octal escape followed by a digit, even without that confusion. > > > what do you say > > > of deprecating any r"\[0-9]{1,3}" sequence that don't match full 3 > > > octal digits, and yield a syntax error for that from Python 3.9 (or > > > 3.10) on? > > > > Nope. Would break code for no good reason. > > There's a good reason: to make the behaviour more sensible and less > confusing and have fewer "oops, that's not what I wanted" bugs. But we > should have made that change for 3.0. Now, I agree: it would be breakage > where the benefit doesn't outweigh the cost. We can debate whether it would be, in the abstract, better to mandate exactly three digits, or to allow fewer. But I think we're all agreed that it is nowhere _near_ enough of a problem to justify the breakage. I perhaps exaggerated slightly in saying "no" good reason, but certainly not enough to consider the change. > Maybe in Python 5000. > > In the meantime, one or two digit octal escapes ought to be a linter > warning. Maybe. Or just have the editor colour the octal escape differently; that way, the end of the colour will tell you if the language is misinterpreting your intentions. Either way, yeah, something that tooling can help with. 
ChrisA

From Richard at Damon-Family.org Sat Nov 10 08:08:59 2018
From: Richard at Damon-Family.org (Richard Damon)
Date: Sat, 10 Nov 2018 08:08:59 -0500
Subject: [Python-ideas] Python octal escape character encoding "wats"
In-Reply-To: <20181110041908.GP4071@ando.pearwood.info>
References: <20181110041908.GP4071@ando.pearwood.info>
Message-ID: 

On 11/9/18 11:19 PM, Steven D'Aprano wrote:
> On Sat, Nov 10, 2018 at 12:56:07PM +1100, Chris Angelico wrote:
>
>> Not ambiguous. It takes as many valid octal digits as it can.
> What is the rationale for that? Hex escapes don't.
>
> My guess is, "Because that's what C does". And C probably does it
> because "Dennis Ritchie wanted to minimize the number of keypresses when
> he was typing" :-)
>
>
>> "Up to" means that one or two digits can also define a character. For
>> obvious reasons, it has to take digits greedily (otherwise "\777"
>> would be "\x07" followed by "77"), and it's not an error to have fewer
>> digits.
> In hindsight, I think we should have insisted that octal escapes must
> always be three digits, just as hex escapes are always two. The status
> quo has too much magical "Do What I Mean" in it for my liking:
>
> py> '\509\51'  # pair of brackets surrounding a nine
> '(9)'
> py> '\507\51'  # pair of brackets surrounding a seven
> 'Ň)'
>
> Dammit Python, that's not what I meant!
>
Since the 'normal' usage for octal escapes in C (which came long before
hex escapes) was to input control characters, the most likely being \0,
and the next most likely \33 (Escape), and by far most being in the
range of \0 - \37, requiring 3 digits all the time would be very
inconvenient. You would never use the escape for a printable character
and interleave it with other printable characters.

Yes, if you are putting in codes for a string of arbitrary byte values
using escapes, then you would likely always use 3 digits for
readability, but then you don't have the ambiguity, as EVERY code is an
escape.

The one case where you might get the problem is if you had a control
character (like escape) followed by a digit between 0 and 7: you needed
to expand the escape to 3 digits. This was just one of the traps you
learned to live with (and terminal escape codes seemed to avoid that
issue by normally following the escape character with a non-digit
character.)
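To make that trap concrete (the reprs below are from a current Python 3,
but the greedy rule is the same as C's):

>>> "\33"      # two digits are enough for ESC
'\x1b'
>>> "\337"     # meant as ESC then '7', but all three digits get eaten
'ß'
>>> "\0337"    # pad the escape to three digits and you get ESC, then '7'
'\x1b7'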
-- 
Richard Damon

From erotemic at gmail.com Sat Nov 10 20:36:52 2018
From: erotemic at gmail.com (Jonathan Crall)
Date: Sat, 10 Nov 2018 20:36:52 -0500
Subject: [Python-ideas] Proposing additions to the standard library
Message-ID: 

I'm interested in proposing several additions to the Python standard
library, and I would like more information on the procedure for doing so.
Are all additions done via a PEP? If not, what is the procedure? If so,
I've read that the first step was to email this board and get feedback.

I have a library called `ubelt` that contains several tools that I think
might be worthy of adding to the standard library.

Here's my bullet point pitch:

- Python is batteries included. Ubelt contains extra batteries: its
functions are extra batteries.
- Most functions in ubelt are fast. All 222 tests take 7.33 seconds.
- Ubelt has 100% test coverage (sans `# nocover` locations).
- I'm only championing a subset of the functions in ubelt. There are
certainly functions in there that do not belong in the standard library.
- I have a Jupyter notebook that gives a demo of some select functions
(not necessarily the same as the ones proposed here):
https://github.com/Erotemic/ubelt/blob/master/docs/notebooks/Ubelt%20Demo.ipynb
- I do have documentation (mostly in docstrings) and in the docs folder,
but I've been having trouble auto-updating read-the-docs. Here is the
link anyway: https://ubelt.readthedocs.io/en/latest/

Here is a tentative list of interesting functions. Hopefully the names
are descriptive (if not, see docstrings:
https://github.com/Erotemic/ubelt); two of them are sketched below.

ub.cmd
ub.compressuser
ub.group_items
ub.dict_hist
ub.find_duplicates
ub.AutoDict
ub.import_module_from_path
ub.import_module_from_name
ub.modname_to_modpath,
ub.modpath_to_modname
ub.ProgIter
ub.ensuredir
ub.expandpath

almost everything in util_list:
allsame, argmax, argmin, argsort, argunique,
chunks, flatten, iter_window, take, unique

These functions might be worth modifying into dictionary methods:
ub.dict_subset
ub.dict_take
ub.map_vals
ub.map_keys

ub.Timerit
ub.Timer

Because I built the library, I tend to like all the functions. It's
difficult to decide if they are stdlib worthy, so there might be some
false positives / negatives. I'm on the fence about: CacheStamp, Cacher,
NoParam, argflag, argval, dzip, delete, hash_data, hash_file, memoize,
memoize_method, NiceRepr, augpath, userhome, ensure_app_cache_dir,
ensure_app_resource_dir, find_exe, find_path, get_app_cache_dir,
get_app_resource_dir, platform_cache_dir, platform_resource_dir,
CaptureStdout, codeblock, ensure_unicode, hzcat, indent, OrderedSet

It's my hope that some of these are actually useful. Let me know any of
the following: what you think, if there are any questions, if something
else needs to be done, or what the next steps are.
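For flavour, here is roughly what two of the names above boil down to
(paraphrased sketches, not the exact ubelt source):

from collections import defaultdict

def group_items(items, key):
    # gather items into a dict of lists, keyed by key(item)
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)

def dict_hist(items):
    # frequency histogram of the items
    hist = defaultdict(int)
    for item in items:
        hist[item] += 1
    return dict(hist)

print(group_items(['ham', 'jam', 'spam'], key=len))
# {3: ['ham', 'jam'], 4: ['spam']}
print(dict_hist('abracadabra'))
# {'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1}

Small, yes, but I find myself rewriting them in almost every project.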
There is no "utilities" or "toolbox" module in the std lib. [...] > Here is a tentative list of interesting functions. Hopefully the names are > descriptive (if not, see docstrings: https://github.com/Erotemic/ubelt) Sorry, some of these aren't descriptive enough, and if you're trying to make a pitch for these features, you ought to give at least a one-sentence explanation of them here in the email. You will lose half your audience as soon as you ask them to click through to a link, and even if they do, that risks splitting the discussion across two places. My advice is to collate the functions you want to add into groups of related functionality, find the class or module in the std lib where you think they belong, and begin a new thread for each group. E.g. "New dict methods", "New importlib functions". -- Steve From erotemic at gmail.com Sat Nov 10 21:56:18 2018 From: erotemic at gmail.com (Jonathan Crall) Date: Sat, 10 Nov 2018 21:56:18 -0500 Subject: [Python-ideas] Proposing additions to the standard library In-Reply-To: <20181111021400.GU4071@ando.pearwood.info> References: <20181111021400.GU4071@ando.pearwood.info> Message-ID: @Steve, this is just the sort of feedback I was looking for. Small and conservative additions make sense. I definitely think that some functions do fit into existing stdlib modules. For instance, AutoDict might go in collections. Sorry, some of these aren't descriptive enough, and if you're trying to > make a pitch for these features. ... My advice is to collate the functions you want to add into groups of > related functionality. Makes sense. I figured that my original list had too may entries to do that for, or else the email would explode. Separating each small group into its own thread will allow me to describe the specific function without writing a novel. Sometimes there's a good, useful function than doesn't get added because > there's no reasonable place to put it. For example, a "flatten" function > has been talked about since Python 1.x days, and we still don't have a > standard solution for it, because (1) it isn't clear *precisely* what it > should do, and (2) it isn't clear where it should go. The flatten example is good to know about. Is there a link to this discussion or a summary of it? I would think flatten could go in itertools, but clearly there must some reason why its not there. I imagine the duplication with it.chain.from_iter + "There should be one-- and preferably only one --obvious way to do it."? As for what it should do, I'm guessing the controversy was over flattening one level vs all levels? That makes sense and is good to know. I guess I won't pick `flatten` as one of my first functions to pick for a writeup. On a similar note, do you (or anyone else) have an intuition for which of these functions --- judging by name only (so you don't have to click any links) --- might be the least controversial? I'm not very good at judging controversy, which is one of the main reasons for this initial email. Maybe `expandpath` to os.path? Or perhaps start with ub.modname_to_modpath and ub.modpath_to_modname to importlib? Maybe some of the dict-methods? Perhaps I'm overestimating the clear usefulness of any of these functions to the stdlib? On Sat, Nov 10, 2018 at 9:14 PM Steven D'Aprano wrote: > On Sat, Nov 10, 2018 at 08:36:52PM -0500, Jonathan Crall wrote: > > I'm interested in proposing several additions to the Python standard > > library, and I would like more information on the procedure for doing so. > > Are all additions done via a PEP? 
> > Not necessarily. Small, obvious enhancements can go straight to the > bug tracker. The tricky part is deciding what is "obvious". > > Sometimes there's a good, useful function than doesn't get added because > there's no reasonable place to put it. For example, a "flatten" function > has been talked about since Python 1.x days, and we still don't have a > standard solution for it, because (1) it isn't clear *precisely* what it > should do, and (2) it isn't clear where it should go. > > Given that once something gets added to the std lib, it is hard to > remove it or even rename it, its better to be conservative about adding > things and leave it to third party libraries to cover the gaps. > > > > If not what is the procedure. If so, I've > > read that the first step was to email this board and get feedback. > > That's a good idea. If the enhancement request isn't both small and > obvious, or is the least bit controversial, you'll usually be sent back > here. > > > > I have a library called `ubelt` that contains several tools that I think > > might be worthy of adding to the standard library. > > Generally speaking, we don't typically add grab-bags of random utility > functions. There is no "utilities" or "toolbox" module in the std lib. > > > [...] > > Here is a tentative list of interesting functions. Hopefully the names > are > > descriptive (if not, see docstrings: https://github.com/Erotemic/ubelt) > > Sorry, some of these aren't descriptive enough, and if you're trying to > make a pitch for these features, you ought to give at least a > one-sentence explanation of them here in the email. You will lose half > your audience as soon as you ask them to click through to a link, and > even if they do, that risks splitting the discussion across two places. > > My advice is to collate the functions you want to add into groups of > related functionality, find the class or module in the std lib where you > think they belong, and begin a new thread for each group. E.g. "New dict > methods", "New importlib functions". > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- -Jon -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholasharrison222 at gmail.com Sun Nov 11 00:58:02 2018 From: nicholasharrison222 at gmail.com (Nicholas Harrison) Date: Sat, 10 Nov 2018 22:58:02 -0700 Subject: [Python-ideas] Range and slice syntax Message-ID: I'm aware that syntax for ranges and slices has been discussed a good amount over the years, but I wanted to float an idea out there to see if it hasn't been considered before. It's not really original. Rather, it's a combination of a couple parts of Python, and I find it fascinatingly-consistent with the rest of the language. This will look similar to PEP 204, but there are some important differences and clarifications. (start:stop:step) Meet a range/slice object. Parentheses are required. (Its syntax in this regard follows exactly the same rules as a generator expression.) I say both range and slice because it can be used in either role. On the one hand, it is iterable and functions exactly like range(start, stop, step) in those contexts. On the other, it can also be passed into list indexing exactly like slice(start, stop, step). This is a proposal that range and slice are really the same thing, just in different contexts. Why is it useful? 
I at least find its syntax to be simple, intuitive, and concise -- more so than the range(...) or slice(...) alternatives. It's quite obvious for an experienced Python user and just as simple to pick up as slice notation for a beginner (since it *is* slice notation). It condenses and clears up sometimes-cumbersome range expressions. A couple examples: sum(1:6) # instead of sum(range(1, 6)) list(1:6) for i in (1:6): print(i**2) (i**2 for i in (1:6)) It also makes forming reusable slices clearer and easier: my_slice = (:6:2) # instead of slice(None, 6, 2) my_list[my_slice] It has a couple of siblings that should be obvious (think list or set comprehension): [start:stop:step] # gives a list {start:stop:step} # gives a set This is similar to passing a range/slice object into the respective constructor: [1:6] # list(1:6) or [1, 2, 3, 4, 5] {1:6} # set(1:6) or {1, 2, 3, 4, 5} Note that the parentheses aren't needed when it is the only argument of a function call or is the only element within brackets or braces. It takes on its respective roles for these bracket and brace cases, just like comprehensions. This also gives rise to the normal slice syntax: my_list[1:6:2] # What is inside the brackets is a slice object. my_list[(1:6:2)] # Equivalent. The parentheses are valid but unnecessary. So here's the part that requires a little more thought. Any of the values may be omitted and in the slice context the behavior has no changes from what it already does: start and stop default to the beginning or end of the list depending on direction and the step defaults to 1. In the range context, we simply repeat these semantics, but noting that there is no longer a beginning or end of a list. Step defaults to 1 (just like range or slice). Start defaults to 0 when counting up and -1 when counting down (just like slice). If stop is omitted, the object will act like an itertools.count object, counting indefinitely. I have found infinite iteration to be a natural and oft-desired extension to a range object, but I can understand that some may want it to remain separate and pure within itertools. I also admit that the ability to form an infinite list with only three characters can be a scary thought (though we are all adults here, right? ;). Normally you have to take a couple extra keystrokes: from itertools import count list(count()) # rather than just [:] If that is the case, then raising an error when iter() is called on a range/slice object with no stop value could be another acceptable course of action. The syntax will still be left valid. And that's mainly it. Slice is iterable or range is "indexable" and the syntax can be used anywhere successive values are desired. If you want to know what it does or how to use it in some case, just think, "what would a slice object do?" or "what would a range object do?" or "how would I write a generator expression/list comprehension here?". Here are a few more examples: for i in (:5): # 5 elements 0 to 4, i.e. range(5) print(i**2) for i in (1:): # counts up from one for as long as you want, i.e. 
count(1) print(i**2) if i == 5: break it = iter(:) # a convenient usage for an infinite counter next(it) ' '.join(map(str, (:5:2))) # gives '0 2 4' [(:5), (5:10)] # list of range/slice objects [[:5], [5:10]] # list of lists [*(:5), *(5:10)] # uses unpacking to get flat list [*[:5], *[5:10]] # same unpacking to get flat list Otherwise you'd have to do: [list(range(5)), list(range(5, 10))] # list of lists [*range(5), *range(5, 10)] # flat list Tuples: tuple(1:6:2) # (1, 3, 5) *(1:6:2), # same I don't actually have experience developing the interpreter and underlying workings of Python, so I don't know how much of a change this requires. I thought it might be possible since the constructs already exist in the language. They just haven't been unified yet. I also realize that there are a few other use-cases that need to be ironed out. The syntax might also be too minimal in some cases to be obvious. One of the trickiest things may be what it will be called, since the current language has the two different terms. In the end it's just another range/slice idea, and the idea has probably already been proposed sometime in the past few decades, but what thoughts are there? - Nicholas -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Nov 11 01:00:59 2018 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 11 Nov 2018 17:00:59 +1100 Subject: [Python-ideas] Range and slice syntax In-Reply-To: References: Message-ID: On Sun, Nov 11, 2018 at 4:59 PM Nicholas Harrison wrote: > It has a couple of siblings that should be obvious (think list or set comprehension): > > [start:stop:step] # gives a list > {start:stop:step} # gives a set > Be careful of this last one. If you omit the step, it looks like this: {start:stop} which is a dictionary display. ChrisA From steve at pearwood.info Sun Nov 11 04:35:38 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 11 Nov 2018 20:35:38 +1100 Subject: [Python-ideas] Range and slice syntax In-Reply-To: References: Message-ID: <20181111093537.GW4071@ando.pearwood.info> On Sat, Nov 10, 2018 at 10:58:02PM -0700, Nicholas Harrison wrote: [...] > (start:stop:step) > > > Meet a range/slice object. Parentheses are required. (Its syntax in this > regard follows exactly the same rules as a generator expression.) I say > both range and slice because it can be used in either role. Ranges and slices are conceptually different things. Even if they have similar features, they are very different: - range is a lazy sequence, which produces its values on demand; - it supports (all? most?) of the Sequence ABC, including membership testing and len(); - but its values are intentionally limited to integers; - slice objects, on the other hand, are an abstraction referring to a context-sensitive sequence of abstract indices; - those indices can be anything you like: py> s = slice("Surprise!", range(1, 100, 3), slice(None)) py> s.start 'Surprise!' py> s.stop range(1, 100, 3) py> s.step slice(None, None, None) - they don't support membership testing, len() or other Sequence operations; - most importantly, because they are context-sensitive, we don't even know how many indexes are included in a slice until we know what we're slicing. That last item is why slice objects have an indices() method that takes a mandatory length parameter. If slices were limited to single integer indices, then there would be an argument that they are redundant and we could use range objects in their place; but they aren't. [...] > Why is it useful? 
I at least find its syntax to be simple, intuitive, and
> > concise -- more so than the range(...) or slice(...) alternatives.

Concise, I will grant, but *intuitive*?

I have never forgotten the first time I saw Python code, and after being
told over and over again how "intuitive" it was I was utterly confused
by these mysterious list[:] and list[1:] and similar expressions. I had
no idea what they were or what they were supposed to do. I didn't even
have a name I could put to them.

At least range(x) was something I could *name* and ask sensible
questions about. I didn't even have a name for this strange
square-bracket and colon syntax, and no context for understanding what
it did. There's surely few things in Python more cryptic than

mylist = mylist[:]

until you've learned what slicing does and how it operates.

Your proposal has the same disadvantages: it is cryptic punctuation that
is meaningless until the reader has learned what it means, without even
an obvious name they can refer to.

Don't get me wrong: slice syntax is great, *once you have learned it*.
But it is a million miles from intuitive. If this proposal is a winner,
it won't be because it will make Python easier for beginners.

[...]
> sum(1:6) # instead of sum(range(1, 6))

That looks like you tried to take a slice of a sequence called "sum" but
messed up the brackets, using round instead of square.

> list(1:6)

Same.

> for i in (1:6):

Looks like a tuple done wrong.

I think this is not an improvement, unless you're trying to minimize the
number of characters in an expression.

> It also makes forming reusable slices clearer and easier:
>
> my_slice = (:6:2) # instead of slice(None, 6, 2)

"Easier" in the sense of "fewer characters to type", but "clearer"? I
don't think so.

[...]
> So here's the part that requires a little more thought.

Are you saying that so far you haven't put any thought into this
proposal?

*wink*

(You don't have to answer this part, it was just my feeble attempt at
humour.)

-- 
Steve

From robertve92 at gmail.com Sun Nov 11 06:48:09 2018
From: robertve92 at gmail.com (Robert Vanden Eynde)
Date: Sun, 11 Nov 2018 12:48:09 +0100
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

I'm wondering how your examples would go with from funcoperators import
infix (https://pypi.org/project/funcoperators/)

sum(1:6) # instead of sum(range(1, 6))
>
>
sum(1 /exclusive/ 6)

list(1:6)
>
>
list(1 /exclusive/ 6)
set(1 /exclusive/ 6)

Note that you can pick another name.
Note that you can pick another function :

@infix
def inclusive(a, b):
    return range(a, b+1)

sum(1 /inclusive/ 6)

for i in (1:6):
>
> print(i**2)
>
>
for i in 1 /exclusive/ 6:
    print(i**2)

(i**2 for i in (1:6))
>
>
(i ** 2 for i in 1 /exclusive/ 6)

It also makes forming reusable slices clearer and easier:
>
> my_slice = (:6:2) # instead of slice(None, 6, 2)
> my_list[my_slice]
>
>
I don't have an exact equivalent here, I would create a function or
explicitly say slice(0, 6, 2)

This is similar to passing a range/slice object into the respective
> constructor:
>
>
> [1:6] # list(1:6) or [1, 2, 3, 4, 5]
> {1:6} # set(1:6) or {1, 2, 3, 4, 5}
>
>
As mentioned before {1:6} is a dict.

Here are a few more examples:
>
>
> for i in (:5): # 5 elements 0 to 4, i.e. range(5)
>
> print(i**2)
>
>
Everybody knows i in range(5).

> for i in (1:): # counts up from one for as long as you want, i.e.
> count(1)
>
>
Well, count(1) is nice and people can google it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hemflit at gmail.com Sun Nov 11 07:13:25 2018
From: hemflit at gmail.com (=?UTF-8?Q?Vladimir_Filipovi=C4=87?=)
Date: Sun, 11 Nov 2018 13:13:25 +0100
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

On Sun, Nov 11, 2018 at 6:59 AM Nicholas Harrison wrote:

> Any of the values may be omitted and in the slice context the behavior
has no changes from what it already does: start and stop default to the
beginning or end of the list depending on direction and the step defaults
to 1.

Just to point out, with slices it's a bit more complicated than that
currently.

The start, stop and step values each default to None.

When slice-indexing built-in and (all? probably, not sure)
standard-library types, None values for start and stop are interpreted
consistently with what you described as defaults.
A None value for step is interpreted as either 1 or -1, depending on
the comparison of start and stop, and accounting for None values in
either of them too.

------

In real life I've found a use for non-integer slice objects, and been
happy that Python allowed me to treat the slice as a purely syntactic
construct whose semantics (outside builtins) are not fixed.

My case was an interface to an external sparse time-series store, and
it was easy to make the objects indexable with [datetime1 : datetime2
: timedelta], with None's treated right etc.

(The primary intended use was in a REPL in a data-science context, so
if your first thought was a doubt about whether that syntax is neat or
abusive, please compare it to numpy or pandas idioms, not to
collection classes you use in server or application code.)

If this had not been syntactically possible, it would not have been a
great pain to have to work around it, but now it's existing code and I
can imagine other existing projects adapting the slice syntax to their
own needs. At first blush, it seems like your proposal would give
slices enough compulsory semantics to break some of such existing code
- maybe even numpy itself.

(FWIW, I've also occasionally had a need for non-integer ranges, and
chafed at having to implement or install them. I've also missed
hashable slices in real life, because functools.lru_cache.)

------

(Note I'm just a random person commenting on the mailing list, not
anybody with any authority or influence.)

I find this recurring idea of unifying slices and ranges seductive.
But it would take a lot more shaking-out to make sure the range
semantics can be vague-ified enough that they don't break non-integer
slice usage.

Also, I could imagine some disagreements about exactly how much
non-standard slice usage should be protected from breakage. Someone
could make the argument that _some_ objects as slice parameters are
just abuse and no sane person should have used them in the first
place. ("Really, slicing with [int : [[sys], ...] : __import__]? We
need to take care to not break THAT too?")

From apalala at gmail.com Sun Nov 11 11:34:16 2018
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Sun, 11 Nov 2018 16:34:16 +0000
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

On Sun, Nov 11, 2018 at 6:00 AM Chris Angelico wrote:

> Be careful of this last one. If you omit the step, it looks like this:
>
> {start:stop}
>
> which is a dictionary display.
>

The parentheses could always be required for this new syntax.
In [*1*]: {'a':1}
Out[*1*]: {'a': 1}

In [*2*]: {('a':1)}
  File "", line 1
    {('a':1)}
         ^
SyntaxError: invalid syntax

-- 
Juancarlo *Añez*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info Sun Nov 11 16:43:10 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 12 Nov 2018 08:43:10 +1100
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: <20181111214309.GY4071@ando.pearwood.info>

On Sun, Nov 11, 2018 at 04:34:16PM +0000, Juancarlo Añez wrote:
> On Sun, Nov 11, 2018 at 6:00 AM Chris Angelico wrote:
>
> > Be careful of this last one. If you omit the step, it looks like this:
> >
> > {start:stop}
> >
> > which is a dictionary display.
> >
>
> The parentheses could always be required for this new syntax.

Under the proposed syntax {(start:stop)} would be a set with a single
item, a slice object. If slice objects were hashable, that would be
legal.

The OP's proposal is for {start:stop} to be equivalent to

set(range(start, stop))

so they will be very different things.

-- 
Steve

From nicholasharrison222 at gmail.com Mon Nov 12 10:17:32 2018
From: nicholasharrison222 at gmail.com (Nicholas Harrison)
Date: Mon, 12 Nov 2018 08:17:32 -0700
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

That's a good point. It might be better to disallow the list and set
versions altogether. To get a list or set you would instead have to
explicitly unpack a range/slice object:

[*(:5)] # [:5] no longer allowed
{*(1:6)} # {1:6} is a dict

That would also solve the misstep of the three-character infinite list.

On Sat, Nov 10, 2018 at 11:00 PM Chris Angelico wrote:

> On Sun, Nov 11, 2018 at 4:59 PM Nicholas Harrison
> wrote:
> > It has a couple of siblings that should be obvious (think list or set
> comprehension):
> >
> > [start:stop:step] # gives a list
> > {start:stop:step} # gives a set
> >
>
> Be careful of this last one. If you omit the step, it looks like this:
>
> {start:stop}
>
> which is a dictionary display.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nicholasharrison222 at gmail.com Mon Nov 12 10:23:42 2018
From: nicholasharrison222 at gmail.com (Nicholas Harrison)
Date: Mon, 12 Nov 2018 08:23:42 -0700
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: <20181111093537.GW4071@ando.pearwood.info>
References: <20181111093537.GW4071@ando.pearwood.info>
Message-ID: 

Overall, I agree with you. It is more intuitive to an experienced Python
user, and not so helpful to beginners. It decreases the ability to read
out code like English sentences and makes it harder to know what to
search for online. So it boosts facility after you know the language,
but not when starting out.

On Sun, Nov 11, 2018 at 2:35 AM Steven D'Aprano wrote:

> On Sat, Nov 10, 2018 at 10:58:02PM -0700, Nicholas Harrison wrote:
>
> [...]
> > (start:stop:step)
> >
> >
> > Meet a range/slice object. Parentheses are required. (Its syntax in this
> > regard follows exactly the same rules as a generator expression.) I say
> > both range and slice because it can be used in either role.
>
> Ranges and slices are conceptually different things.
Even if they have > similar features, they are very different: > > - range is a lazy sequence, which produces its values on demand; > > - it supports (all? most?) of the Sequence ABC, including > membership testing and len(); > > - but its values are intentionally limited to integers; > > - slice objects, on the other hand, are an abstraction referring > to a context-sensitive sequence of abstract indices; > > - those indices can be anything you like: > > py> s = slice("Surprise!", range(1, 100, 3), slice(None)) > py> s.start > 'Surprise!' > py> s.stop > range(1, 100, 3) > py> s.step > slice(None, None, None) > > > - they don't support membership testing, len() or other Sequence > operations; > > - most importantly, because they are context-sensitive, we don't > even know how many indexes are included in a slice until we know > what we're slicing. > > That last item is why slice objects have an indices() method that takes > a mandatory length parameter. > > If slices were limited to single integer indices, then there would be an > argument that they are redundant and we could use range objects in their > place; but they aren't. > > > [...] > > Why is it useful? I at least find its syntax to be simple, intuitive, and > > concise -- more so than the range(...) or slice(...) alternatives. > > Concise, I will grant, but *intuitive*? > > I have never forgot the first time I saw Python code, and after being > told over and over again how "intuitive" it was I was utterly confused > by these mysterious list[:] and list[1:] and similar expressions. I had > no idea what they were or what they were supposed to do. I didn't even > have a name I could put to them. > > At least range(x) was something I could *name* and ask sensible > questions about. I didn't even have a name for this strange > square-bracket and colon syntax, and no context for understanding what > it did. There's surely few things in Python more cryptic than > > mylist = mylist[:] > > until you've learned what slicing does and how it operates. > > Your proposal has the same disadvantages: it is cryptic punctuation that > is meaningless until the reader has learned what it means, without even > an obvious name they can refer to. > > Don't get me wrong: slice syntax is great, *once you have learned it*. > But it is a million miles from intuitive. If this proposal is a winner, > it won't be because it will make Python easier for beginners. > > > > [...] > > sum(1:6) # instead of sum(range(1, 6)) > > That looks like you tried to take a slice of a sequence called "sum" but > messed up the brackets, using round instead of square. > > > > list(1:6) > > Same. > > > for i in (1:6): > > Looks like a tuple done wrong. > > > I think this is not an improvement, unless you're trying to minimize the > number of characters in an expression. > > > > It also makes forming reusable slices clearer and easier: > > > > my_slice = (:6:2) # instead of slice(None, 6, 2) > > "Easier" in the sense of "fewer characters to type", but "clearer"? I > don't think so. > > > [...] > > So here's the part that requires a little more thought. > > Are you saying that so far you haven't put any thought into this > proposal? > > *wink* > > (You don't have to answer this part, it was just my feeble attempt at > humour.) 
> > > > > -- > Steve > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholasharrison222 at gmail.com Mon Nov 12 10:25:22 2018 From: nicholasharrison222 at gmail.com (Nicholas Harrison) Date: Mon, 12 Nov 2018 08:25:22 -0700 Subject: [Python-ideas] Range and slice syntax In-Reply-To: References: Message-ID: Interesting. I haven't looked at that package before. It looks like it would work well for that. On Sun, Nov 11, 2018 at 4:48 AM Robert Vanden Eynde wrote: > I'm wondering how your examples would go with from funcoperators import > infix (https://pypi.org/project/funcoperators/) > > sum(1:6) # instead of sum(range(1, 6)) >> >> > sum(1 /exclusive/ 6) > > list(1:6) >> >> > list(1 /exclusive/ 6) > set(1 /exclusive/ 1) > > Note that you can pick another name. > Note that you can pick another function : > > @infix > def inclusive (a, b): > return range(a, b+1) > > sum(1 /inclusive/ 6) > > for i in (1:6): >> >> print(i**2) >> >> > for i in 1 /exclusive/ 6: > print(i**2) > > (i**2 for i in (1:6)) >> >> > (i ** 2 for i in 1 /exclusive/ 6) > > It also makes forming reusable slices clearer and easier: >> >> my_slice = (:6:2) # instead of slice(None, 6, 2) >> my_list[my_slice] >> >> > I don't have exact equivalent here, I would create a function or > explicitly say slice(0, 6, 2) > > This is similar to passing a range/slice object into the respective >> constructor: >> >> >> [1:6] # list(1:6) or [1, 2, 3, 4, 5] >> {1:6} # set(1:6) or {1, 2, 3, 4, 5} >> >> > As mentioned before {1:6} is a dict. > > Here are a few more examples: >> >> >> for i in (:5): # 5 elements 0 to 4, i.e. range(5) >> >> print(i**2) >> >> > Everybody knows i in range(5). > > >> for i in (1:): # counts up from one for as long as you want, i.e. >> count(1) >> >> > Well, count(1) is nice and people can google it. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholasharrison222 at gmail.com Mon Nov 12 10:42:33 2018 From: nicholasharrison222 at gmail.com (Nicholas Harrison) Date: Mon, 12 Nov 2018 08:42:33 -0700 Subject: [Python-ideas] Range and slice syntax In-Reply-To: References: Message-ID: That's true. I should clarify what I was thinking a bit more. Maybe it's better to say that the new syntax creates a slice object: (::) # this creates slice(None, None, None) It accepts any object into its arguments and they default to None when they are left off. This can be passed into list indexing and used as a slice. The new addition is that slice is now iterable: iter(slice(None, None, None)) # becomes valid Only when this is called (implicitly or explicitly) do checks for valid objects and bounds occur. From my experience using slices, this is how they work in that context too. my_slice = slice('what?') # slice(None, 'what?', None) my_list[my_slice] # TypeError: slice indices must be integers or None or have an __index__ method # similarly iter(my_slice) # TypeError: slice indices must be integers or None or have an __index__ method I still may not understand slices well enough though. On Sun, Nov 11, 2018 at 5:13 AM Vladimir Filipovi? 
wrote: > On Sun, Nov 11, 2018 at 6:59 AM Nicholas Harrison > wrote: > > > Any of the values may be omitted and in the slice context the behavior > has no changes from what it already does: start and stop default to the > beginning or end of the list depending on direction and the step defaults > to 1. > > Just to point out, with slices it's a bit more complicated than that > currently. > > The start, stop and step values each default to None. > > When slice-indexing built-in and (all? probably, not sure) > standard-library types, None values for start and stop are interpreted > consistently with what you described as defaults. > A None value for step is interpreted as either 1 or -1, depending on > the comparison of start and step, and accounting for None values in > either of them too. > > ------ > > In real life I've found a use for non-integer slice objects, and been > happy that Python allowed me to treat the slice as a purely syntactic > construct whose semantics (outside builtins) are not fixed. > > My case was an interface to an external sparse time-series store, and > it was easy to make the objects indexable with [datetime1 : datetime2 > : timedelta], with None's treated right etc. > > (The primary intended use was in a REPL in a data-science context, so > if your first thought was a doubt about whether that syntax is neat or > abusive, please compare it to numpy or pandas idioms, not to > collection classes you use in server or application code.) > > If this had not been syntactically possible, it would not have been a > great pain to have to work around it, but now it's existing code and I > can imagine other existing projects adapting the slice syntax to their > own needs. At first blush, it seems like your proposal would give > slices enough compulsory semantics to break some of such existing code > - maybe even numpy itself. > > (FWIW, I've also occasionally had a need for non-integer ranges, and > chafed at having to implement or install them. I've also missed > hashable slices in real life, because functools.lru_cache.) > > ------ > > (Note I'm just a random person commenting on the mailing list, not > anybody with any authority or influence.) > > I find this recurring idea of unifying slices and ranges seductive. > But it would take a lot more shaking-out to make sure the range > semantics can be vague-ified enough that they don't break non-integer > slice usage. > > Also, I could imagine some disagreements about exactly how much > non-standard slice usage should be protected from breakage. Someone > could make the argument that _some_ objects as slice parameters are > just abuse and no sane person should have used them in the first > place. ("Really, slicing with [int : [[sys], ...] : __import__]? We > need to take care to not break THAT too?") > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Mon Nov 12 11:23:21 2018 From: mertz at gnosis.cx (David Mertz) Date: Mon, 12 Nov 2018 11:23:21 -0500 Subject: [Python-ideas] Range and slice syntax In-Reply-To: References: Message-ID: I mostly like the abstraction being proposed, but the syntactical edge cases like `[::3]` (infinite list crashes) and {4:10} (a dict not a slice/range set) tip the balance against it for me. 
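To spell that collision out in a REPL (today's semantics):

>>> {4:10}           # already legal: a one-item dict, key 4, value 10
{4: 10}
>>> type({4:10})
<class 'dict'>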
Adding various stars and parens in various not-really-obvious places
just makes the proposal way too much special casing.

Moreover, we can get what we want without new syntax. Or at least the
bulk of it. Both Pandas and NumPy offer special accessors for slices
that look the way you'd like:

>>> import pandas as pd
>>> import numpy as np
>>> I = pd.IndexSlice
>>> J = np.s_
>>> I[4:10:3]
slice(4, 10, 3)
>>> J[4:10:3]
slice(4, 10, 3)

These are incredibly simple classes, but they are worth including
because many programmers will forget how to write their own.

You don't get your range-like behavior with those, but it's easy to
construct. I'm having a think-o. I think it should be possible to make a
RangeSlice class that will act like an enhanced version of pd.IndexSlice,
but my try was wrong. But simpler:

>>> import sys
>>> def R(sl):
...     return range(sl.start, sl.stop, sl.step or sys.maxsize)
...
>>> for i in R(I[4:10:3]):
...     print(i)
...
4
7

Someone should figure out how to make that simply `RS[4:10:3]` that will
act both ways. :-)

On Mon, Nov 12, 2018 at 10:44 AM Nicholas Harrison <
nicholasharrison222 at gmail.com> wrote:

> That's true. I should clarify what I was thinking a bit more. Maybe it's
> better to say that the new syntax creates a slice object:
>
> (::) # this creates slice(None, None, None)
>
> It accepts any object into its arguments and they default to None when
> they are left off. This can be passed into list indexing and used as a
> slice. The new addition is that slice is now iterable:
>
> iter(slice(None, None, None)) # becomes valid
>
> Only when this is called (implicitly or explicitly) do checks for valid
> objects and bounds occur. From my experience using slices, this is how they
> work in that context too.
>
> my_slice = slice('what?') # slice(None, 'what?', None)
>
> my_list[my_slice] # TypeError: slice indices must be integers or None or
> have an __index__ method
>
> # similarly
>
> iter(my_slice) # TypeError: slice indices must be integers or None or have
> an __index__ method
>
>
> I still may not understand slices well enough though.
>
> On Sun, Nov 11, 2018 at 5:13 AM Vladimir Filipović
> wrote:
>
>> On Sun, Nov 11, 2018 at 6:59 AM Nicholas Harrison
>> wrote:
>>
>> > Any of the values may be omitted and in the slice context the behavior
>> has no changes from what it already does: start and stop default to the
>> beginning or end of the list depending on direction and the step defaults
>> to 1.
>>
>> Just to point out, with slices it's a bit more complicated than that
>> currently.
>>
>> The start, stop and step values each default to None.
>>
>> When slice-indexing built-in and (all? probably, not sure)
>> standard-library types, None values for start and stop are interpreted
>> consistently with what you described as defaults.
>> A None value for step is interpreted as either 1 or -1, depending on
>> the comparison of start and stop, and accounting for None values in
>> either of them too.
>>
>> ------
>>
>> In real life I've found a use for non-integer slice objects, and been
>> happy that Python allowed me to treat the slice as a purely syntactic
>> construct whose semantics (outside builtins) are not fixed.
>>
>> My case was an interface to an external sparse time-series store, and
>> it was easy to make the objects indexable with [datetime1 : datetime2
>> : timedelta], with None's treated right etc.
>> >> (The primary intended use was in a REPL in a data-science context, so >> if your first thought was a doubt about whether that syntax is neat or >> abusive, please compare it to numpy or pandas idioms, not to >> collection classes you use in server or application code.) >> >> If this had not been syntactically possible, it would not have been a >> great pain to have to work around it, but now it's existing code and I >> can imagine other existing projects adapting the slice syntax to their >> own needs. At first blush, it seems like your proposal would give >> slices enough compulsory semantics to break some of such existing code >> - maybe even numpy itself. >> >> (FWIW, I've also occasionally had a need for non-integer ranges, and >> chafed at having to implement or install them. I've also missed >> hashable slices in real life, because functools.lru_cache.) >> >> ------ >> >> (Note I'm just a random person commenting on the mailing list, not >> anybody with any authority or influence.) >> >> I find this recurring idea of unifying slices and ranges seductive. >> But it would take a lot more shaking-out to make sure the range >> semantics can be vague-ified enough that they don't break non-integer >> slice usage. >> >> Also, I could imagine some disagreements about exactly how much >> non-standard slice usage should be protected from breakage. Someone >> could make the argument that _some_ objects as slice parameters are >> just abuse and no sane person should have used them in the first >> place. ("Really, slicing with [int : [[sys], ...] : __import__]? We >> need to take care to not break THAT too?") >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholasharrison222 at gmail.com Mon Nov 12 11:31:45 2018 From: nicholasharrison222 at gmail.com (Nicholas Harrison) Date: Mon, 12 Nov 2018 09:31:45 -0700 Subject: [Python-ideas] Range and slice syntax In-Reply-To: References: Message-ID: For sake of completeness, here is another possible problem I've found with it. I was afraid of making something context-dependent, and therefore breaking its consistency. Here is the use of slices in the current language that breaks my rules: my_array[:,2] # valid syntax, though I've typically only seen it used in numpy This is interpreted as a tuple of a slice object and an integer. But with the new syntax, the slice has to be surrounded by parentheses if not the sole element in brackets: my_array[(:),2] This is a sticking point and would destroy backwards compatibility. I realize that the context-dependence is due to the behavior of the current slice syntax. For example, my_array[(:,2)] is invalid even though it looks like a tuple of a slice object and an integer. 
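You can poke at the current behavior with a little probe class (runnable
today, no new syntax involved):

class Probe:
    def __getitem__(self, key):
        # echo back whatever object the subscript grammar built for us
        return key

p = Probe()
print(p[:, 2])   # (slice(None, None, None), 2)
print(p[1:6:2])  # slice(1, 6, 2)
# p[(:, 2)] is a SyntaxError: colon-slice notation only exists inside []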
So I'm actually OK with parentheses not being mandatory in a slicing role, since that is already correct syntax in the current language. This might be a reconciliation. That would mean parentheses are not required when it is the sole argument to a function or when it is used inside indexing. my_array[:,2] # fine my_indexing = (:), 2 # parentheses required my_array[((:), 2)] # same here, but no need to do this This actually makes building up index expressions easier and gives a possible solution to the recent slice literals discussion. Maybe this should be the only context when parentheses aren't required, so you know you're dealing with an object inside of a function call. sum((1:6)) Anyways, just some more thoughts. On Sat, Nov 10, 2018 at 10:58 PM Nicholas Harrison < nicholasharrison222 at gmail.com> wrote: > I'm aware that syntax for ranges and slices has been discussed a good > amount over the years, but I wanted to float an idea out there to see if it > hasn't been considered before. It's not really original. Rather, it's a > combination of a couple parts of Python, and I find it > fascinatingly-consistent with the rest of the language. This will look > similar to PEP 204, but there are some important differences and > clarifications. > > (start:stop:step) > > > Meet a range/slice object. Parentheses are required. (Its syntax in this > regard follows exactly the same rules as a generator expression.) I say > both range and slice because it can be used in either role. On the one > hand, it is iterable and functions exactly like range(start, stop, step) in > those contexts. On the other, it can also be passed into list indexing > exactly like slice(start, stop, step). This is a proposal that range and > slice are really the same thing, just in different contexts. > > Why is it useful? I at least find its syntax to be simple, intuitive, and > concise -- more so than the range(...) or slice(...) alternatives. It's > quite obvious for an experienced Python user and just as simple to pick up > as slice notation for a beginner (since it *is* slice notation). > > It condenses and clears up sometimes-cumbersome range expressions. A > couple examples: > > > sum(1:6) # instead of sum(range(1, 6)) > > list(1:6) > > > for i in (1:6): > > print(i**2) > > > (i**2 for i in (1:6)) > > > It also makes forming reusable slices clearer and easier: > > my_slice = (:6:2) # instead of slice(None, 6, 2) > my_list[my_slice] > > > It has a couple of siblings that should be obvious (think list or set > comprehension): > > [start:stop:step] # gives a list > {start:stop:step} # gives a set > > > This is similar to passing a range/slice object into the respective > constructor: > > > [1:6] # list(1:6) or [1, 2, 3, 4, 5] > {1:6} # set(1:6) or {1, 2, 3, 4, 5} > > > Note that the parentheses aren't needed when it is the only argument of a > function call or is the only element within brackets or braces. It takes on > its respective roles for these bracket and brace cases, just like > comprehensions. This also gives rise to the normal slice syntax: > > my_list[1:6:2] # What is inside the brackets is a slice object. > my_list[(1:6:2)] # Equivalent. The parentheses are valid but unnecessary. > > > So here's the part that requires a little more thought. Any of the values > may be omitted and in the slice context the behavior has no changes from > what it already does: start and stop default to the beginning or end of the > list depending on direction and the step defaults to 1. 
In the range > context, we simply repeat these semantics, but noting that there is no > longer a beginning or end of a list. > > Step defaults to 1 (just like range or slice). > Start defaults to 0 when counting up and -1 when counting down (just like > slice). > If stop is omitted, the object will act like an itertools.count object, > counting indefinitely. > > I have found infinite iteration to be a natural and oft-desired extension > to a range object, but I can understand that some may want it to remain > separate and pure within itertools. I also admit that the ability to form > an infinite list with only three characters can be a scary thought (though > we are all adults here, right? ;). Normally you have to take a couple extra > keystrokes: > > from itertools import count > list(count()) > # rather than just [:] > > > If that is the case, then raising an error when iter() is called on a > range/slice object with no stop value could be another acceptable course of > action. The syntax will still be left valid. > > And that's mainly it. Slice is iterable or range is "indexable" and the > syntax can be used anywhere successive values are desired. If you want to > know what it does or how to use it in some case, just think, "what would a > slice object do?" or "what would a range object do?" or "how would I write > a generator expression/list comprehension here?". > > Here are a few more examples: > > > for i in (:5): # 5 elements 0 to 4, i.e. range(5) > > print(i**2) > > > for i in (1:): # counts up from one for as long as you want, i.e. count(1) > > print(i**2) > > if i == 5: break > > it = iter(:) # a convenient usage for an infinite counter > > next(it) > > > ' '.join(map(str, (:5:2))) # gives '0 2 4' > > [(:5), (5:10)] # list of range/slice objects > [[:5], [5:10]] # list of lists > [*(:5), *(5:10)] # uses unpacking to get flat list > [*[:5], *[5:10]] # same unpacking to get flat list > > > Otherwise you'd have to do: > > [list(range(5)), list(range(5, 10))] # list of lists > [*range(5), *range(5, 10)] # flat list > > > Tuples: > > tuple(1:6:2) # (1, 3, 5) > *(1:6:2), # same > > > I don't actually have experience developing the interpreter and underlying > workings of Python, so I don't know how much of a change this requires. I > thought it might be possible since the constructs already exist in the > language. They just haven't been unified yet. I also realize that there are > a few other use-cases that need to be ironed out. The syntax might also be > too minimal in some cases to be obvious. One of the trickiest things may be > what it will be called, since the current language has the two different > terms. > > In the end it's just another range/slice idea, and the idea has probably > already been proposed sometime in the past few decades, but what thoughts > are there? > > - Nicholas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at selik.org Mon Nov 12 14:36:59 2018 From: mike at selik.org (Michael Selik) Date: Mon, 12 Nov 2018 11:36:59 -0800 Subject: [Python-ideas] Proposing additions to the standard library In-Reply-To: References: <20181111021400.GU4071@ando.pearwood.info> Message-ID: On Sat, Nov 10, 2018 at 6:56 PM Jonathan Crall wrote: > Sometimes there's a good, useful function than doesn't get added because >> there's no reasonable place to put it. 
For example, a "flatten" function >> has been talked about since Python 1.x days, and we still don't have a >> standard solution for it, because (1) it isn't clear *precisely* what it >> should do, and (2) it isn't clear where it should go. > > > The flatten example is good to know about. Is there a link to this > discussion or a summary of it? I would think flatten could go in itertools, > but clearly there must some reason why its not there. I imagine the > duplication with it.chain.from_iter + "There should be one-- and preferably > only one --obvious way to do it."? > https://docs.python.org/3/library/itertools.html#itertools-recipes There's an example of ``flatten`` in the itertools recipes. -------------- next part -------------- An HTML attachment was scrubbed... URL: From prometheus235 at gmail.com Mon Nov 12 14:50:18 2018 From: prometheus235 at gmail.com (Nick Timkovich) Date: Mon, 12 Nov 2018 19:50:18 +0000 Subject: [Python-ideas] Proposing additions to the standard library In-Reply-To: References: <20181111021400.GU4071@ando.pearwood.info> Message-ID: Not to derail the conversation, but I've always been curious why the itertools recipes are recipes and not ready-made goods (pre-baked?) that I can just consume. They're great examples to draw from, but that shouldn't preclude them from also being in the stdlib. On Mon, Nov 12, 2018 at 7:41 PM Michael Selik wrote: > > > On Sat, Nov 10, 2018 at 6:56 PM Jonathan Crall wrote: > >> Sometimes there's a good, useful function than doesn't get added because >>> there's no reasonable place to put it. For example, a "flatten" function >>> has been talked about since Python 1.x days, and we still don't have a >>> standard solution for it, because (1) it isn't clear *precisely* what it >>> should do, and (2) it isn't clear where it should go. >> >> >> The flatten example is good to know about. Is there a link to this >> discussion or a summary of it? I would think flatten could go in itertools, >> but clearly there must some reason why its not there. I imagine the >> duplication with it.chain.from_iter + "There should be one-- and preferably >> only one --obvious way to do it."? >> > > https://docs.python.org/3/library/itertools.html#itertools-recipes > There's an example of ``flatten`` in the itertools recipes. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericfahlgren at gmail.com Mon Nov 12 15:23:03 2018 From: ericfahlgren at gmail.com (Eric Fahlgren) Date: Mon, 12 Nov 2018 12:23:03 -0800 Subject: [Python-ideas] Proposing additions to the standard library In-Reply-To: References: <20181111021400.GU4071@ando.pearwood.info> Message-ID: My intuition has always been that the recipes, taking 'flatten' as an excellent example, solve problems in a specific way that is not generally considered to be the "right" way. For example, should 'flatten' perform one-level flattening or deep recursive flattening? Should it handle strings as single entities, or should it treat them as iterables? What about byte strings, should they be treated differently than strings or the same? I could go on, but you probably get the point... 
On Mon, Nov 12, 2018 at 11:50 AM Nick Timkovich wrote:

> Not to derail the conversation, but I've always been curious why the
> itertools recipes are recipes and not ready-made goods (pre-baked?) that
> I can just consume. They're great examples to draw from, but that
> shouldn't preclude them from also being in the stdlib.
>
> On Mon, Nov 12, 2018 at 7:41 PM Michael Selik wrote:
>
>> On Sat, Nov 10, 2018 at 6:56 PM Jonathan Crall wrote:
>>
>>>> Sometimes there's a good, useful function that doesn't get added
>>>> because there's no reasonable place to put it. For example, a
>>>> "flatten" function has been talked about since Python 1.x days, and
>>>> we still don't have a standard solution for it, because (1) it isn't
>>>> clear *precisely* what it should do, and (2) it isn't clear where it
>>>> should go.
>>>
>>> The flatten example is good to know about. Is there a link to this
>>> discussion or a summary of it? I would think flatten could go in
>>> itertools, but clearly there must be some reason why it's not there. I
>>> imagine the duplication with itertools.chain.from_iterable + "There
>>> should be one-- and preferably only one --obvious way to do it."?
>>
>> https://docs.python.org/3/library/itertools.html#itertools-recipes
>> There's an example of ``flatten`` in the itertools recipes.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Mon Nov 12 16:10:06 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 13 Nov 2018 08:10:06 +1100
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: <20181112211006.GD4071@ando.pearwood.info>

On Mon, Nov 12, 2018 at 11:23:21AM -0500, David Mertz wrote:

> >>> import pandas as pd
> >>> import numpy as np
> >>> I = pd.IndexSlice
> >>> J = np.s_
> >>> I[4:10:3]
> slice(4, 10, 3)

I'm not entirely sure that I like the idea of conflating slice
*constructor* with slice *usage*. Slice objects are objects, like any
other, and I'm not convinced that overloading slice syntax to create
slice objects is a good design. I'm pretty sure it would have confused
the hell out of me as a beginner to learn that mylist[1::2] took a slice
and [1::2] made a slice.

But at least numpy and pandas have the virtue of needing a prefix to
make it work.

> You don't get your range-like behavior with those, but it's easy to
> construct. I'm having a think-o. I think it should be possible to make a
> RangeSlice class that will act like an enhanced version of pd.IndexSlice,
> but my try was wrong.

Just because we can do something, doesn't mean we should.

-- 
Steve

From sf at fermigier.com  Tue Nov 13 02:04:54 2018
From: sf at fermigier.com (Stéfane Fermigier)
Date: Tue, 13 Nov 2018 08:04:54 +0100
Subject: [Python-ideas] Proposing additions to the standard library
In-Reply-To: 
References: 
Message-ID: 

Are you aware of https://boltons.readthedocs.io/ (whose motto is
"Functionality that should be in the standard library.") ?
Or similar endeavours such as:

- https://pypi.org/project/auxlib/
- https://pypi.org/project/omakase/
- (And probably many others on PyPI with similar descriptions such as "a
library of stuff I'm using / we're using at company X for all my / our
project(s)...")

Or the functional libraries listed here:
https://github.com/sfermigier/awesome-functional-python/blob/master/README.md#libraries

=> IMHO there is room for a "semi-standard" library, stuff that's not
included by default and has a release lifecycle more active than Python
itself, but that can be considered the standard by a large group of users.

Similar ideas can be found for instance in Java with Apache Commons (
https://commons.apache.org/ -> "an Apache project focused on all aspects
of reusable Java components."). One could argue, though, that the Java
standard library is much less developed than the Python standard library,
so it's much easier to justify the existence of Apache Commons than a
similar Python project.

There is also the question of the porosity between such a project and the
stdlib, which is the essence of the original question by the OP.

Another interesting issue is the granularity of such a project. I
sometimes, and somewhat foolishly, make libraries such as toolz or boltons
a dependency of my projects, for just one or two function calls from my
code.

Regards,

S.

On Sun, Nov 11, 2018 at 2:37 AM Jonathan Crall wrote:

> I'm interested in proposing several additions to the Python standard
> library, and I would like more information on the procedure for doing so.
> Are all additions done via a PEP? If not, what is the procedure? If so,
> I've read that the first step was to email this board and get feedback.
>
> I have a library called `ubelt` that contains several tools that I think
> might be worthy of adding to the standard library.
>
> Here's my bullet point pitch:
>
> - Python is batteries included. Ubelt contains extra batteries: its
> functions are the extra batteries.
> - Most functions in ubelt are fast. All 222 tests take 7.33 seconds.
> - Ubelt has 100% test coverage (sans `# nocover` locations).
> - I'm only championing a subset of the functions in ubelt. There are
> certainly functions in there that do not belong in the standard library.
> - I have a Jupyter notebook that gives a demo of some select functions
> (not necessarily the same as the ones proposed here):
> https://github.com/Erotemic/ubelt/blob/master/docs/notebooks/Ubelt%20Demo.ipynb
> - I do have documentation (mostly in docstrings) and in the docs folder,
> but I've been having trouble auto-updating read-the-docs. Here is the
> link anyway: https://ubelt.readthedocs.io/en/latest/
>
> Here is a tentative list of interesting functions. Hopefully the names
> are descriptive (if not, see docstrings: https://github.com/Erotemic/ubelt)
>
> ub.cmd
> ub.compressuser
> ub.group_items
> ub.dict_hist
> ub.find_duplicates
> ub.AutoDict
> ub.import_module_from_path
> ub.import_module_from_name
> ub.modname_to_modpath,
> ub.modpath_to_modname
> ub.ProgIter
> ub.ensuredir
> ub.expandpath
>
> almost everything in util_list:
>
> allsame, argmax, argmin, argsort, argunique,
> chunks, flatten, iter_window, take, unique
>
> These functions might be worth modifying into dictionary methods:
>
> ub.dict_subset
> ub.dict_take
> ub.map_vals
> ub.map_keys
> ub.Timerit
> ub.Timer
>
> Because I built the library, I tend to like all the functions.
> It's difficult to decide if they are stdlib worthy, so there might be
> some false positives / negatives.
>
> I'm on the fence about:
> CacheStamp, Cacher, NoParam, argflag, argval, dzip, delete, hash_data,
> hash_file, memoize, memoize_method, NiceRepr, augpath, userhome,
> ensure_app_cache_dir, ensure_app_resource_dir, find_exe, find_path,
> get_app_cache_dir, get_app_resource_dir, platform_cache_dir,
> platform_resource_dir, CaptureStdout, codeblock, ensure_unicode, hzcat,
> indent, OrderedSet
>
> It's my hope that some of these are actually useful. Let me know any of
> the following: what you think, if there are any questions, if something
> else needs to be done, or what the next steps are.
>
> --
> -Jon
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

-- 
Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier
- http://linkedin.com/in/sfermigier
Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/
Chairman, Free&OSS Group @ Systematic Cluster -
https://systematic-paris-region.org/fr/groupe-thematique-logiciel-libre/
Co-Chairman, National Council for Free & Open Source Software (CNLL) -
http://cnll.fr/
Founder & Organiser, PyParis & PyData Paris - http://pyparis.org/ &
http://pydata.fr/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hemflit at gmail.com  Tue Nov 13 12:49:03 2018
From: hemflit at gmail.com (Vladimir Filipović)
Date: Tue, 13 Nov 2018 18:49:03 +0100
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

On Mon, Nov 12, 2018 at 4:43 PM Nicholas Harrison wrote:

> Only when this is called (implicitly or explicitly) do checks for valid
> objects and bounds occur. From my experience using slices, this is how
> they work in that context too.

On reconsideration, I've found one more argument in favour of (at least
this aspect of?) the proposal: the slice.indices method, which takes a
sequence's length and returns an iterable (range) of all indices of such
a sequence that would be "selected" by the slice. Not sure if it's
supposed to be documented.

So there is definitely precedent for "though slices in general are
primarily a syntactic construct and new container-like classes can choose
any semantics for indexing with them, the semantics specifically in the
context of sequences have a bit of a privileged place in the language
with concrete expectations, including strictly integer (or None)
attributes".

From allemang.d at gmail.com  Tue Nov 13 13:13:12 2018
From: allemang.d at gmail.com (David Allemang)
Date: Tue, 13 Nov 2018 13:13:12 -0500
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

That is not what slice.indices does. Per help(slice.indices) -

"S.indices(len) -> (start, stop, stride)

"Assuming a sequence of length len, calculate the start and stop indices,
and the stride length of the extended slice described by S. Out of bounds
indices are clipped in a manner consistent with handling of normal slices.

Essentially, it returns (S.start, len, S.step), with start and stop
adjusted to prevent out-of-bounds indices.

On Tue, Nov 13, 2018, 12:50 PM Vladimir Filipović wrote:

> On Mon, Nov 12, 2018 at 4:43 PM Nicholas Harrison wrote:
> > Only when this is called (implicitly or explicitly) do checks for
> > valid objects and bounds occur.
> > From my experience using slices, this is how they work in that
> > context too.
>
> On reconsideration, I've found one more argument in favour of (at
> least this aspect of?) the proposal: the slice.indices method, which
> takes a sequence's length and returns an iterable (range) of all
> indices of such a sequence that would be "selected" by the slice. Not
> sure if it's supposed to be documented.
>
> So there is definitely precedent for "though slices in general are
> primarily a syntactic construct and new container-like classes can
> choose any semantics for indexing with them, the semantics
> specifically in the context of sequences have a bit of a privileged
> place in the language with concrete expectations, including strictly
> integer (or None) attributes".
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Tue Nov 13 13:18:54 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 14 Nov 2018 05:18:54 +1100
Subject: [Python-ideas] Range and slice syntax
In-Reply-To: 
References: 
Message-ID: 

On Wed, Nov 14, 2018 at 5:14 AM David Allemang wrote:
>
> That is not what slice.indices does. Per help(slice.indices) -
>
> "S.indices(len) -> (start, stop, stride)
>
> "Assuming a sequence of length len, calculate the start and stop
> indices, and the stride length of the extended slice described by S.
> Out of bounds indices are clipped in a manner consistent with handling
> of normal slices.
>
> Essentially, it returns (S.start, len, S.step), with start and stop
> adjusted to prevent out-of-bounds indices.

And to handle negative indexing.

>>> slice(1,-1).indices(100)
(1, 99, 1)

A range from 1 to -1 doesn't make sense (or rather, it's an empty range),
but a slice from 1 to -1 will exclude the first and last of any sequence.

ChrisA

From chris.barker at noaa.gov  Tue Nov 13 19:45:14 2018
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 13 Nov 2018 16:45:14 -0800
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: <20181109231635.GL4071@ando.pearwood.info>
	<20181109233915.GM4071@ando.pearwood.info>
	<20181110001043.GO4071@ando.pearwood.info>
Message-ID: 

> This is somewhat unpleasant to me, especially while developing something
> and trying to test it quickly. I just want to be able to use the same
> relative imports and run a single file with `python3 test_main.py` for
> example.

I had the same frustration when I first tried to use relative imports.

Then I discovered setuptools' develop mode (now pip editable install).

It is the right way to run code in packages under development.

-CHB

> Running files as modules every time is tiring. This is my problem. I
> could not come up with a concrete solution idea yet; I am thinking on
> it. Open to suggestions. Thank you all for your help!

On Fri, Nov 9, 2018 at 4:16 PM Steven D'Aprano wrote:
> On Fri, Nov 09, 2018 at 03:51:46PM -0800, danish bluecheese wrote:
> > └── src
> >     ├── __init__.py
> >     ├── main.py
> >     └── test
> >         ├── __init__.py
> >         └── test_main.py
> >
> > assume the structure above. To be able to use relative imports with
> > such a fundamental structure, either I can go for sys.path hacks or
> > run it as a module from one level further up.
>
> I don't understand.
> From the top level of the package, running inside either __init__ or
> main, you should be able to say:
>
> from . import test
> from .test import test_main
>
> From the test subpackage, you should be able to say:
>
> from .. import main
>
> to get the src/main module, or
>
> from . import test_main
>
> to get the test/test_main module from the test/__init__ module.
>
> (Disclaimer: I have not actually run the above code to check that it
> works, beyond testing that it's not a SyntaxError.)
>
> What *precisely* is the problem you are trying to solve, and your
> proposed solution?
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From njs at pobox.com  Tue Nov 13 20:21:44 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 13 Nov 2018 17:21:44 -0800
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: <20181109231635.GL4071@ando.pearwood.info>
	<20181109233915.GM4071@ando.pearwood.info>
	<20181110001043.GO4071@ando.pearwood.info>
Message-ID: 

On Fri, Nov 9, 2018 at 4:32 PM, danish bluecheese wrote:
> you are right on the lines you mentioned. Those are all working if I run
> it as a module, which I do every time.
> This is somewhat unpleasant to me, especially while developing something
> and trying to test it quickly.
> I just want to be able to use the same relative imports and run a single
> file with `python3 test_main.py` for example.
> Running files as modules every time is tiring. This is my problem.

Have you tried 'python3 -m test_main'? IIRC it should be effectively the
same as 'python3 test_main.py' but with working relative imports.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From ubershmekel at gmail.com  Wed Nov 14 00:34:33 2018
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Tue, 13 Nov 2018 21:34:33 -0800
Subject: [Python-ideas] Relative Imports
In-Reply-To: 
References: <20181109231635.GL4071@ando.pearwood.info>
	<20181109233915.GM4071@ando.pearwood.info>
	<20181110001043.GO4071@ando.pearwood.info>
Message-ID: 

On Tue, Nov 13, 2018 at 4:46 PM Chris Barker - NOAA Federal via
Python-ideas wrote:

> Then I discovered setuptools' develop mode (now pip editable install)
>
> It is the right way to run code in packages under development.
>
In multiple workplaces I found a folder with Python utility scripts that
users can just double-click. The need for installing causes problems with
handling different versions on one machine, and the need for
"__init__.py" files makes the folders less pretty. Sure - sometimes I
need to install stuff anyway - but that's just one "install.py" double
click away.

I would like to propose allowing importing of strings that would support
relative paths. For example in Danish's example:

# use this in `test_main.py`
import '../main.py' as main

Maybe the syntax can be improved, but to me this need has been aching
since I started using Python 12 years ago. I've used C, C++, and
Javascript, where the whole "how do I connect these two files that are a
folder apart" problem doesn't require googling for documentation on
packaging tools, magic filenames, constraints and gotchas.
The solution is always obvious because it works just like it works in every system - with a file-relative path. File-relative imports is probably highest on my Python wish list. I've drafted but not sent out a python-ideas email about it multiple times. I've seen a lot of "sys.path" hacking that would've been solved by file-relative-paths. Cheers and thanks, Yuval Greenfield -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Nov 14 01:15:02 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 14 Nov 2018 17:15:02 +1100 Subject: [Python-ideas] Relative Imports In-Reply-To: References: <20181109231635.GL4071@ando.pearwood.info> <20181109233915.GM4071@ando.pearwood.info> <20181110001043.GO4071@ando.pearwood.info> Message-ID: <20181114061502.GG4071@ando.pearwood.info> On Tue, Nov 13, 2018 at 09:34:33PM -0800, Yuval Greenfield wrote: > I would like to propose allowing importing of strings that would support > relative paths. For example in Danish's example: > > # use this in `test_main.py` > import '../main.py' as main How does that differ from existing syntax? from .. import main Off the top of my head, a few more questions that don't have obvious answers (at least not to me): What happens if main.py doesn't exist, but main.pyc does? What if you want to import from a sub-package, rather than a single-file module? What happens when Windows users use a backslash instead of a forward-slash? Does this syntax support arbitrary relative paths anywhere on the file system, or is it limited to only searching the current package? How does it interact with namespace packages? What happens if you call os.chdir() before calling this? Invariably people will want to write things like: path = '../spam.py' import path as spam (I know that's something I'd try.) What will happen there? If that is supported, invariably people will want to use pathlib.Path objects. Should that work? > Maybe the syntax can be improved, but to me this need has been aching since > I started using Python 12 years ago. I've used C, C++, and Javascript where > the whole "how do I connect these two files that are a folder apart" > problem doesn't require googling for documentation on packaging tools, > magic filenames, constraints and gotchas. The solution is always obvious > because it works just like it works in every system - with a file-relative > path. Beware of "obvious" solutions, because so often they lead to not so obvious problems. Like Javascript's "relative import hell": Quote: // what we want import reducer from 'reducer'; // what we don't want import reducer from '../../../reducer'; https://medium.com/@sherryhsu/how-to-change-relative-paths-to-absolute-paths-for-imports-32ba6cce18a5 And more here: https://goenning.net/2017/07/21/how-to-avoid-relative-path-hell-javascript-typescript-projects/ https://lostechies.com/derickbailey/2014/02/20/how-i-work-around-the-require-problem-in-nodejs/ It seems to me that in languages which support this file-relative import feature, people spend a lot of time either trying to avoid using it, or building tools to allow them to avoid using it. 
I don't know if that makes it better or worse than Python's solution for
relative imports :-)

-- 
Steve

From rosuav at gmail.com  Wed Nov 14 01:27:15 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 14 Nov 2018 17:27:15 +1100
Subject: [Python-ideas] Relative Imports
In-Reply-To: <20181114061502.GG4071@ando.pearwood.info>
References: <20181109231635.GL4071@ando.pearwood.info>
	<20181109233915.GM4071@ando.pearwood.info>
	<20181110001043.GO4071@ando.pearwood.info>
	<20181114061502.GG4071@ando.pearwood.info>
Message-ID: 

On Wed, Nov 14, 2018 at 5:15 PM Steven D'Aprano wrote:
> Beware of "obvious" solutions, because so often they lead to not so
> obvious problems. Like Javascript's "relative import hell":
>
> Quote:
>
> // what we want
> import reducer from 'reducer';
> // what we don't want
> import reducer from '../../../reducer';
>
> https://medium.com/@sherryhsu/how-to-change-relative-paths-to-absolute-paths-for-imports-32ba6cce18a5

Agreed. Having spent a lot of time with JavaScript students, I actually
am NOT a fan of directory-relative imports. They inevitably result in
equivalent code looking different, and subtly different code looking
identical. Consider:

// main.js
import User from './models/users';

// routers/users.js
import User from '../models/users';

That example is probably okay, because if you ever get it wrong, you get
an immediate error. But what about this?

// routers/index.js
import usersRouter from './users';

// models/stuff.js
import User from './users';

The exact same import does completely different things based on which
file it's in. That's dangerous, because it makes code subtly context
sensitive. I would much rather work with package-relative pathing, where
there is a known basis for *all* local imports, no matter what file the
import is actually happening in. In Python, that's best done by naming
the package again, so that's not quite ideal either, but it's better
than having to pile in the exact right number of "../" to make the
import work.

ChrisA

From erotemic at gmail.com  Wed Nov 14 19:31:49 2018
From: erotemic at gmail.com (Jonathan Crall)
Date: Wed, 14 Nov 2018 19:31:49 -0500
Subject: [Python-ideas] Proposing additions to the standard library
In-Reply-To: 
References: 
Message-ID: 

@Stéfane: Boltons looks really neat! I'll take a look. Some of my stuff
may fit better as a PR for this library. Also, I don't think it's foolish
to depend on a package for one function, given that (a) that function is
really useful or (b) the size of the dependency itself is small. Given my
initial impressions of boltons, I would guess that it doesn't have a
large download size or runtime impact. Although if this case keeps
reoccurring with a particular function, then perhaps that function might
improve the stdlib?

After really reviewing my stuff, I think a few functions I have would
make the stdlib better, but it's probably only a small few. I bet there
are things in boltons that would improve the stdlib as well, but I do
agree that "semi-standard" libraries (e.g. numpy / scipy /
what-I-hope-ubelt-to-be) might be a better place for these functions
that would otherwise cause costly clutter in the stdlib.

On Tue, Nov 13, 2018 at 2:05 AM Stéfane Fermigier wrote:

> Are you aware of https://boltons.readthedocs.io/ (whose motto is
> "Functionality that should be in the standard library.") ?
> Or similar endeavours such as:
>
> - https://pypi.org/project/auxlib/
> - https://pypi.org/project/omakase/
> - (And probably many others on PyPI with similar descriptions such as "a
> library of stuff I'm using / we're using at company X for all my / our
> project(s)...")
>
> Or the functional libraries listed here:
> https://github.com/sfermigier/awesome-functional-python/blob/master/README.md#libraries
>
> => IMHO there is room for a "semi-standard" library, stuff that's not
> included by default and has a release lifecycle more active than Python
> itself, but that can be considered the standard by a large group of users.
>
> Similar ideas can be found for instance in Java with Apache Commons (
> https://commons.apache.org/ -> "an Apache project focused on all aspects
> of reusable Java components."). One could argue, though, that the Java
> standard library is much less developed than the Python standard library,
> so it's much easier to justify the existence of Apache Commons than a
> similar Python project.
>
> There is also the question of the porosity between such a project and the
> stdlib, which is the essence of the original question by the OP.
>
> Another interesting issue is the granularity of such a project. I
> sometimes, and somewhat foolishly, make libraries such as toolz or
> boltons a dependency of my projects, for just one or two function calls
> from my code.
>
> Regards,
>
> S.
>
> On Sun, Nov 11, 2018 at 2:37 AM Jonathan Crall wrote:
>
>> I'm interested in proposing several additions to the Python standard
>> library, and I would like more information on the procedure for doing
>> so. Are all additions done via a PEP? If not, what is the procedure? If
>> so, I've read that the first step was to email this board and get
>> feedback.
>>
>> I have a library called `ubelt` that contains several tools that I
>> think might be worthy of adding to the standard library.
>>
>> Here's my bullet point pitch:
>>
>> - Python is batteries included. Ubelt contains extra batteries: its
>> functions are the extra batteries.
>> - Most functions in ubelt are fast. All 222 tests take 7.33 seconds.
>> - Ubelt has 100% test coverage (sans `# nocover` locations).
>> - I'm only championing a subset of the functions in ubelt. There are
>> certainly functions in there that do not belong in the standard library.
>> - I have a Jupyter notebook that gives a demo of some select functions
>> (not necessarily the same as the ones proposed here):
>> https://github.com/Erotemic/ubelt/blob/master/docs/notebooks/Ubelt%20Demo.ipynb
>> - I do have documentation (mostly in docstrings) and in the docs
>> folder, but I've been having trouble auto-updating read-the-docs. Here
>> is the link anyway: https://ubelt.readthedocs.io/en/latest/
>>
>> Here is a tentative list of interesting functions.
>> Hopefully the names are descriptive (if not, see docstrings:
>> https://github.com/Erotemic/ubelt)
>>
>> ub.cmd
>> ub.compressuser
>> ub.group_items
>> ub.dict_hist
>> ub.find_duplicates
>> ub.AutoDict
>> ub.import_module_from_path
>> ub.import_module_from_name
>> ub.modname_to_modpath,
>> ub.modpath_to_modname
>> ub.ProgIter
>> ub.ensuredir
>> ub.expandpath
>>
>> almost everything in util_list:
>>
>> allsame, argmax, argmin, argsort, argunique,
>> chunks, flatten, iter_window, take, unique
>>
>> These functions might be worth modifying into dictionary methods:
>>
>> ub.dict_subset
>> ub.dict_take
>> ub.map_vals
>> ub.map_keys
>> ub.Timerit
>> ub.Timer
>>
>> Because I built the library, I tend to like all the functions. It's
>> difficult to decide if they are stdlib worthy, so there might be some
>> false positives / negatives.
>>
>> I'm on the fence about:
>> CacheStamp, Cacher, NoParam, argflag, argval, dzip, delete, hash_data,
>> hash_file, memoize, memoize_method, NiceRepr, augpath, userhome,
>> ensure_app_cache_dir, ensure_app_resource_dir, find_exe, find_path,
>> get_app_cache_dir, get_app_resource_dir, platform_cache_dir,
>> platform_resource_dir, CaptureStdout, codeblock, ensure_unicode, hzcat,
>> indent, OrderedSet
>>
>> It's my hope that some of these are actually useful. Let me know any of
>> the following: what you think, if there are any questions, if something
>> else needs to be done, or what the next steps are.
>>
>> --
>> -Jon
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
> --
> Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier
> - http://linkedin.com/in/sfermigier
> Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/
> Chairman, Free&OSS Group @ Systematic Cluster -
> https://systematic-paris-region.org/fr/groupe-thematique-logiciel-libre/
> Co-Chairman, National Council for Free & Open Source Software (CNLL) -
> http://cnll.fr/
> Founder & Organiser, PyParis & PyData Paris - http://pyparis.org/ &
> http://pydata.fr/

-- 
-Jon
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rene at pylots.com  Sat Nov 17 00:10:02 2018
From: rene at pylots.com (Rene Nejsum)
Date: Sat, 17 Nov 2018 06:10:02 +0100
Subject: [Python-ideas] f-string "debug" conversion
Message-ID: <21400AAD-4D76-441B-A45A-CA13FE512EEB@pylots.com>

+1 for this, I would use it all the time for debugging and tracing
programs.

Breakpoints and IDEs can be nice, but my code is filled with lines like:

logger.debug(f'transaction_id={transaction_id}, state={state},
amount={amount}, etc={etc}')

So yeah, well +10 actually :-)

/rene

From steve at pearwood.info  Mon Nov 19 19:19:51 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 20 Nov 2018 11:19:51 +1100
Subject: [Python-ideas] Enhancing range object string displays
Message-ID: <20181120001951.GA4319@ando.pearwood.info>

On the bug tracker, there is a proposal to enhance range objects so that
printing them will display a snapshot of the values included, including
the end points. For example:

print(range(10))

currently displays "range(10)". The proposal is for the __str__ method
to instead return "<range [0, 1, 2, ..., 8, 9]>".
https://bugs.python.org/issue35200

print(range(2, 200, 3)) would display
<range [2, 5, 8, ..., 194, 197]>

Note that the original proposal was for range objects' __repr__ to
display this behaviour. But given the loss of eval(repr(obj)) round
tripping, and the risk of breaking backwards compatibility, it was
decided that isn't acceptable, but using the same display for __str__
(and hence produced by print) would be nearly as useful without the
downsides.

The developer who proposed the feature, Julien, now wants to reject the
feature request. I think it is still a useful feature for range objects.
What do others think? Is this worth re-opening?

-- 
Steve

From danish.bluecheese at gmail.com  Mon Nov 19 20:09:25 2018
From: danish.bluecheese at gmail.com (danish bluecheese)
Date: Mon, 19 Nov 2018 17:09:25 -0800
Subject: [Python-ideas] Enhancing range object string displays
In-Reply-To: <20181120001951.GA4319@ando.pearwood.info>
References: <20181120001951.GA4319@ando.pearwood.info>
Message-ID: 

I think it is kind of a useless effort. If somebody is using range(),
they probably already know about it. Also, there are already some
workarounds to inspect a range() result. Like:

[*range(10)]

or, if it is big:

[*range(10000000)[:10]]

On Mon, Nov 19, 2018 at 4:25 PM Steven D'Aprano wrote:

> On the bug tracker, there is a proposal to enhance range objects so that
> printing them will display a snapshot of the values included, including
> the end points. For example:
>
> print(range(10))
>
> currently displays "range(10)". The proposal is for the __str__ method
> to instead return "<range [0, 1, 2, ..., 8, 9]>".
>
> https://bugs.python.org/issue35200
>
> print(range(2, 200, 3)) would display
> <range [2, 5, 8, ..., 194, 197]>
>
> Note that the original proposal was for range objects' __repr__ to
> display this behaviour. But given the loss of eval(repr(obj)) round
> tripping, and the risk of breaking backwards compatibility, it was
> decided that isn't acceptable, but using the same display for __str__
> (and hence produced by print) would be nearly as useful without the
> downsides.
>
> The developer who proposed the feature, Julien, now wants to reject the
> feature request. I think it is still a useful feature for range objects.
> What do others think? Is this worth re-opening?
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Mon Nov 19 21:09:27 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 20 Nov 2018 13:09:27 +1100
Subject: [Python-ideas] Enhancing range object string displays
In-Reply-To: 
References: <20181120001951.GA4319@ando.pearwood.info>
Message-ID: <20181120020926.GA5054@ando.pearwood.info>

On Mon, Nov 19, 2018 at 05:09:25PM -0800, danish bluecheese wrote:

> I think it is kind of a useless effort. If somebody is using range(),
> they probably already know about it.

For experienced users, sure, but this is an enhancement to help
beginners who may be confused by the half-open end points.

Even non-beginners may find it nice to be able to easily see the end
points when the step size is not 1.

If range objects had this, I'd use it in the REPL to check the end
points. Sure, I could convert to a list and take a slice, but giving the
object a nicer print output makes less work for the user.
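Just to make the proposal concrete, the display could be produced by
something like this (a sketch only, not the actual patch from the
tracker; the exact format, and the name range_str, are mine):

def range_str(r):
    # Show the first three and last two values, with an ellipsis in
    # between for long ranges. Slicing a range returns a range, so no
    # large list is ever built.
    if len(r) <= 6:
        body = ', '.join(str(x) for x in r)
    else:
        head = ', '.join(str(x) for x in r[:3])
        tail = ', '.join(str(x) for x in r[-2:])
        body = f'{head}, ..., {tail}'
    return f'<range [{body}]>'

>>> range_str(range(10))
'<range [0, 1, 2, ..., 8, 9]>'
>>> range_str(range(2, 200, 3))
'<range [2, 5, 8, ..., 194, 197]>'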
-- 
Steve

From python at mrabarnett.plus.com  Mon Nov 19 21:17:34 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 20 Nov 2018 02:17:34 +0000
Subject: [Python-ideas] Enhancing range object string displays
In-Reply-To: <20181120001951.GA4319@ando.pearwood.info>
References: <20181120001951.GA4319@ando.pearwood.info>
Message-ID: <24ecfa43-2ba7-a721-625b-843203e1dc9c@mrabarnett.plus.com>

On 2018-11-20 00:19, Steven D'Aprano wrote:
> On the bug tracker, there is a proposal to enhance range objects so that
> printing them will display a snapshot of the values included, including
> the end points. For example:
>
> print(range(10))
>
> currently displays "range(10)". The proposal is for the __str__ method
> to instead return "<range [0, 1, 2, ..., 8, 9]>".
>
> https://bugs.python.org/issue35200
>
> print(range(2, 200, 3)) would display
> <range [2, 5, 8, ..., 194, 197]>
>
> Note that the original proposal was for range objects' __repr__ to
> display this behaviour. But given the loss of eval(repr(obj)) round
> tripping, and the risk of breaking backwards compatibility, it was
> decided that isn't acceptable, but using the same display for __str__
> (and hence produced by print) would be nearly as useful without the
> downsides.
>
> The developer who proposed the feature, Julien, now wants to reject the
> feature request. I think it is still a useful feature for range objects.
> What do others think? Is this worth re-opening?
>
Well, if it's not going to round-trip, and it's going to be more verbose,
then I think it shouldn't be making the step size implicit. Maybe
something more like:

<range(2, 200, 3) [2, 5, 8, ..., 194, 197]>

But, overall, I'm ±0.

From rosuav at gmail.com  Mon Nov 19 21:22:16 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 20 Nov 2018 13:22:16 +1100
Subject: [Python-ideas] Enhancing range object string displays
In-Reply-To: <20181120020926.GA5054@ando.pearwood.info>
References: <20181120001951.GA4319@ando.pearwood.info>
	<20181120020926.GA5054@ando.pearwood.info>
Message-ID: 

On Tue, Nov 20, 2018 at 1:10 PM Steven D'Aprano wrote:
>
> On Mon, Nov 19, 2018 at 05:09:25PM -0800, danish bluecheese wrote:
> > I think it is kind of a useless effort. If somebody is using range(),
> > they probably already know about it.
>
> For experienced users, sure, but this is an enhancement to help
> beginners who may be confused by the half-open end points.
>
> Even non-beginners may find it nice to be able to easily see the end
> points when the step size is not 1.
>
> If range objects had this, I'd use it in the REPL to check the end
> points. Sure, I could convert to a list and take a slice, but giving the
> object a nicer print output makes less work for the user.
>

I'm a fairly experienced Python programmer, and I still just fire up a
REPL to confirm certain uses of range() with steps.

What would it be like if the string form looked like this:

>>> range(1, 30, 3)
range([1, 4, 7, ..., 25, 28])

In theory, this could actually be made legal, and could be a cool
feature for an enhanced REPL to support. (All you have to do is define
'range' as a function that checks if it's been given a list, and if not,
passes it on unchanged.)

Whether this form or the original, I think this would be an improvement.
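A toy version of that wrapper, just to show it really is legal (my
sketch, shadowing the builtin in the REPL, not anything a REPL actually
ships):

import builtins

def range(*args):
    # If given a single list -- e.g. the proposed display pasted back
    # into the REPL -- return it unchanged; otherwise defer to the
    # real builtin.
    if len(args) == 1 and isinstance(args[0], list):
        return args[0]
    return builtins.range(*args)

>>> range(1, 30, 3)            # still the real builtin underneath
range(1, 30, 3)
>>> range([1, 4, 7, ..., 25, 28])  # even parses: ... is Ellipsis
[1, 4, 7, Ellipsis, 25, 28]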
ChrisA

From njs at pobox.com  Tue Nov 20 00:02:25 2018
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 19 Nov 2018 21:02:25 -0800
Subject: [Python-ideas] Enhancing range object string displays
In-Reply-To: <20181120020926.GA5054@ando.pearwood.info>
References: <20181120001951.GA4319@ando.pearwood.info>
	<20181120020926.GA5054@ando.pearwood.info>
Message-ID: 

On Mon, Nov 19, 2018 at 6:09 PM Steven D'Aprano wrote:
>
> On Mon, Nov 19, 2018 at 05:09:25PM -0800, danish bluecheese wrote:
> > I think it is kind of a useless effort. If somebody is using range(),
> > they probably already know about it.
>
> For experienced users, sure, but this is an enhancement to help
> beginners who may be confused by the half-open end points.
>
> Even non-beginners may find it nice to be able to easily see the end
> points when the step size is not 1.
>
> If range objects had this, I'd use it in the REPL to check the end
> points. Sure, I could convert to a list and take a slice, but giving the
> object a nicer print output makes less work for the user.

I feel like the kind of users who would benefit the most from this are
exactly the same users who are baffled by the distinction between str()
and repr() and which one is used when, and thus would struggle to
benefit from it?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

From p.f.moore at gmail.com  Tue Nov 20 04:10:03 2018
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 20 Nov 2018 09:10:03 +0000
Subject: [Python-ideas] Enhancing range object string displays
In-Reply-To: 
References: <20181120001951.GA4319@ando.pearwood.info>
	<20181120020926.GA5054@ando.pearwood.info>
Message-ID: 

On Tue, 20 Nov 2018 at 02:23, Chris Angelico wrote:
> I'm a fairly experienced Python programmer, and I still just fire up a
> REPL to confirm certain uses of range() with steps.
>
> What would it be like if the string form looked like this:
>
> >>> range(1, 30, 3)
> range([1, 4, 7, ..., 25, 28])

Wouldn't that use the repr, which is *not* changing in this proposal?

> In theory, this could actually be made legal, and could be a cool
> feature for an enhanced REPL to support. (All you have to do is define
> 'range' as a function that checks if it's been given a list, and if
> not, passes it on unchanged.)
>
> Whether this form or the original, I think this would be an improvement.

I do like the improved display, but I don't know how useful it would be
in practice, given that I don't *often* use raw range objects (as
opposed to "for x in range(...)") and the default repr display won't
change.

I am inclined to think that we're overthinking the problem - changing
the str() of a range object is unlikely to break anything, is a small
but clear usability improvement, and more time has probably been spent
debating whether it's a good idea than it would have cost to just make
the change...

Paul

From celelibi at gmail.com  Sun Nov 25 11:52:37 2018
From: celelibi at gmail.com (Celelibi)
Date: Sun, 25 Nov 2018 17:52:37 +0100
Subject: [Python-ideas] hybrid regex engine: backtracking + Thompson NFA
Message-ID: 

Hello,

I found this topic discussed on the python-dev ML back in 2010 [1]. I'm
bringing it up 8 years later with a variation.

In short: the article [2] highlights that backtracking-based regex
engines (like SRE in Python) have pathological cases that run in time
exponential in the input, while they could run in linear time. Not
mentioned by the article is that even quadratic time can be a problem
with large inputs.
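(As a rough illustration of that quadratic blow-up, using the failing
match from the example just below -- absolute timings are
machine-dependent, but doubling the input size roughly quadruples the
time:)

import re
import timeit

pattern = re.compile(r'.*?,(.*),,')
for n in (1000, 2000, 4000, 8000):
    s = "a," * n  # contains no ",,", so the match must fail
    t = timeit.timeit(lambda: pattern.match(s), number=1)
    print(n, round(t, 3))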
This happens pretty often when you're looking for delimited stuff like
this:
re.match(r'.*?,(.*),,', "a,"*10000)

Of course, there's a catch. Backreferences most likely cannot be
implemented in guaranteed linear time, and some cases of look-behind
assertions might prove difficult as well. But other features like
alternatives (foo|bar) and repetitions (.*) are no problem.

The general idea of Thompson's algorithm is that the simulation of the
NFA is basically in all the reachable states at the same time while
parsing the input string character by character. Of course, for regex
engines that use a VM (like Python's SRE) you can also make the
execution of the equivalent byte-code be in several states at once.

The 2010 discussion seems to be about having two separate engines and
selecting the best one for a given regex. What I'm proposing here is to
resort to backtracking only for the regex features that need it. Meaning
that within a regex like r'(.*),.* .*,\1' the evaluation of the middle
part would use Thompson's algorithm, while the \1 could trigger the
backtracking mechanism if the string doesn't match.

What do you think about it?
Has this already been discussed and rejected?
Is it just a matter of showing the code? (_sre.c seems... non-trivial)
Has this already been judged not worth the maintenance effort?

AFAIK, there aren't many hybrid regex engines out there. But one notable
implementation is the third version of Henry Spencer's regex engine [3].
He doesn't seem to have documented it publicly, but Postgres has done a
pretty good job at reverse-engineering a high-level view of it [4]. I'm
still unsure how the backtracking of some parts of the regex interacts
with the multi-state evaluation of the other parts. But at least it
exists and works, so it is feasible.

Best regards,
Celelibi

[1] https://mail.python.org/pipermail/python-dev/2010-March/098354.html
[2] https://swtch.com/~rsc/regexp/regexp1.html
[3] https://github.com/garyhouston/hsrex
[4] https://github.com/postgres/postgres/tree/master/src/backend/regex

From python at mrabarnett.plus.com  Sun Nov 25 12:59:23 2018
From: python at mrabarnett.plus.com (MRAB)
Date: Sun, 25 Nov 2018 17:59:23 +0000
Subject: [Python-ideas] hybrid regex engine: backtracking + Thompson NFA
In-Reply-To: 
References: 
Message-ID: <017cffe5-0608-6a10-8631-ff5cc5f37047@mrabarnett.plus.com>

On 2018-11-25 16:52, Celelibi wrote:
> Hello,
>
> I found this topic discussed on the python-dev ML back in 2010 [1]. I'm
> bringing it up 8 years later with a variation.
>
> In short: the article [2] highlights that backtracking-based regex
> engines (like SRE in Python) have pathological cases that run in time
> exponential in the input, while they could run in linear time. Not
> mentioned by the article is that even quadratic time can be a problem
> with large inputs. This happens pretty often when you're looking for
> delimited stuff like this:
> re.match(r'.*?,(.*),,', "a,"*10000)
>
> Of course, there's a catch. Backreferences most likely cannot be
> implemented in guaranteed linear time, and some cases of look-behind
> assertions might prove difficult as well. But other features like
> alternatives (foo|bar) and repetitions (.*) are no problem.
>
> The general idea of Thompson's algorithm is that the simulation of
> the NFA is basically in all the reachable states at the same time
> while parsing the input string character by character.
> Of course, for regex engines that use a VM (like Python's SRE) you can
> also make the execution of the equivalent byte-code be in several
> states at once.
>
> The 2010 discussion seems to be about having two separate engines and
> selecting the best one for a given regex. What I'm proposing here is
> to resort to backtracking only for the regex features that need it.
> Meaning that within a regex like r'(.*),.* .*,\1' the evaluation of
> the middle part would use Thompson's algorithm, while the \1 could
> trigger the backtracking mechanism if the string doesn't match.
>
> What do you think about it?
> Has this already been discussed and rejected?
> Is it just a matter of showing the code? (_sre.c seems... non-trivial)
> Has this already been judged not worth the maintenance effort?
>
> AFAIK, there aren't many hybrid regex engines out there. But one
> notable implementation is the third version of Henry Spencer's regex
> engine [3]. He doesn't seem to have documented it publicly, but
> Postgres has done a pretty good job at reverse-engineering a
> high-level view of it [4]. I'm still unsure how the backtracking of
> some parts of the regex interacts with the multi-state evaluation of
> the other parts. But at least it exists and works, so it is feasible.
>
> Best regards,
> Celelibi
>
> [1] https://mail.python.org/pipermail/python-dev/2010-March/098354.html
> [2] https://swtch.com/~rsc/regexp/regexp1.html
> [3] https://github.com/garyhouston/hsrex
> [4] https://github.com/postgres/postgres/tree/master/src/backend/regex
>
This is open source. Nothing gets done unless someone decides to do it.

So, yes, it's (just?) a matter of showing the code, and, if you want it
in the re module, to persuade the core devs that it's worth doing and
that you're willing to maintain it and fix the bugs.

From kale at thekunderts.net  Mon Nov 26 16:29:21 2018
From: kale at thekunderts.net (Kale Kundert)
Date: Mon, 26 Nov 2018 13:29:21 -0800
Subject: [Python-ideas] __len__() for map()
Message-ID: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>

I just ran into the following behavior, and found it surprising:

>>> len(map(float, [1,2,3]))
TypeError: object of type 'map' has no len()

I understand that map() could be given an infinite sequence and
therefore might not always have a length. But in this case, it seems
like map() should've known that its length was 3. I also understand
that I can just call list() on the whole thing and get a list, but the
nice thing about map() is that it doesn't copy data, so it's unfortunate
to lose that advantage for no particular reason.

My proposal is to delegate map.__len__() to the underlying iterable.
Similarly, map.__getitem__() could be implemented if the underlying
iterable supports item access:

class map:

    def __init__(self, func, iterable):
        self.func = func
        self.iterable = iterable

    def __iter__(self):
        yield from (self.func(x) for x in self.iterable)

    def __len__(self):
        return len(self.iterable)

    def __getitem__(self, key):
        return self.func(self.iterable[key])

Let me know if there are any downsides to this that I'm not seeing.
From my perspective, it seems like there would be a number of (small)
advantages:

- Less surprising
- Avoid some unnecessary copies
- Backwards compatible

-Kale
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mike at selik.org Mon Nov 26 17:06:52 2018 From: mike at selik.org (Michael Selik) Date: Mon, 26 Nov 2018 14:06:52 -0800 Subject: [Python-ideas] __len__() for map() In-Reply-To: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: If you know the input is sizeable, why not check its length instead of the map's? On Mon, Nov 26, 2018, 1:35 PM Kale Kundert I just ran into the following behavior, and found it surprising: > > >>> len(map(float, [1,2,3])) > TypeError: object of type 'map' has no len() > > I understand that map() could be given an infinite sequence and therefore > might not always have a length. But in this case, it seems like map() > should've known that its length was 3. I also understand that I can just > call list() on the whole thing and get a list, but the nice thing about > map() is that it doesn't copy data, so it's unfortunate to lose that > advantage for no particular reason. > > My proposal is to delegate map.__len__() to the underlying iterable. > Similarly, map.__getitem__() could be implemented if the underlying > iterable supports item access: > > class map: > > def __init__(self, func, iterable): > self.func = func > self.iterable = iterable > > def __iter__(self): > yield from (self.func(x) for x in self.iterable) > > def __len__(self): > return len(self.iterable) > > def __getitem__(self, key): > return self.func(self.iterable[key]) > > Let me know if there any downsides to this that I'm not seeing. From my > perspective, it seems like there would be only a number of (small) > advantages: > > - Less surprising > - Avoid some unnecessary copies > - Backwards compatible > > -Kale > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Mon Nov 26 17:14:58 2018 From: jfine2358 at gmail.com (Jonathan Fine) Date: Mon, 26 Nov 2018 22:14:58 +0000 Subject: [Python-ideas] __len__() for map() In-Reply-To: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: Hi Kale Thank you for the sample code. It's most helpful. Please consider >>> it = iter(range(4)) >>> list(map(float, it)) [0.0, 1.0, 2.0, 3.0] >>> it = iter(range(4)) >>> list(zip(it, it)) [(0, 1), (2, 3)] >>> list(zip(range(4), range(4))) [(0, 0), (1, 1), (2, 2), (3, 3)] A sequence is iterable. An iterator is iterable. There are other things that are iterable. A random number generator is an iterator, whose underlying object does not have a length. Briefly, I don't like your suggestion because many important iterables don't have a length! -- Jonathan From rosuav at gmail.com Mon Nov 26 17:36:08 2018 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 27 Nov 2018 09:36:08 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Tue, Nov 27, 2018 at 9:15 AM Jonathan Fine wrote: > Briefly, I don't like your suggestion because many important iterables > don't have a length! That part's fine. The implication is that mapping over an iterable with a length would give a map with a known length, and mapping over something without a length wouldn't. 
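Something like this toy wrapper shows that implication -- len() works
exactly when the underlying object's len() does (illustrative only, not
a proposed implementation):

class sized_map:
    def __init__(self, func, iterable):
        self.func = func
        self.iterable = iterable

    def __iter__(self):
        return (self.func(x) for x in self.iterable)

    def __len__(self):
        # Raises TypeError if the underlying iterable is unsized.
        return len(self.iterable)

>>> len(sized_map(float, [1, 2, 3]))
3
>>> len(sized_map(float, iter([1, 2, 3])))
Traceback (most recent call last):
  ...
TypeError: object of type 'list_iterator' has no len()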
But I think there are enough odd edge cases (for instance, is it okay to
call the function twice if you __getitem__ twice, or should you cache
it?) that it's probably best to keep the built-in map() simple and
reliable.

Don't forget, too, that map() can take more than one iterable, and some
may not have lengths. (You can define enumerate in terms of map and
itertools.count; what is the length of the resulting enumeration?)

If you want a map-like object that takes specifically a single list, and
is a mapped view to that list, then go for it - but that can be its own
beast, not related to the map() built-in function.

Also, it may be of value to check out more-itertools; you might find
something there that you like.

ChrisA

From steve at pearwood.info  Mon Nov 26 17:37:20 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 27 Nov 2018 09:37:20 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: <20181126223720.GJ4319@ando.pearwood.info>

On Mon, Nov 26, 2018 at 02:06:52PM -0800, Michael Selik wrote:
> If you know the input is sizeable, why not check its length instead of
> the map's?

The consumer of map may not be the producer of map.

You might know that alist supports len(), but by the time I see it, I
only see map(f, alist), not alist itself.

-- 
Steve

From steve at pearwood.info  Mon Nov 26 17:39:16 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 27 Nov 2018 09:39:16 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: <20181126223915.GK4319@ando.pearwood.info>

On Mon, Nov 26, 2018 at 10:14:58PM +0000, Jonathan Fine wrote:

> Briefly, I don't like your suggestion because many important iterables
> don't have a length!

But many important iterables do.

-- 
Steve

From danish.bluecheese at gmail.com  Mon Nov 26 18:08:40 2018
From: danish.bluecheese at gmail.com (danish bluecheese)
Date: Mon, 26 Nov 2018 15:08:40 -0800
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181126223915.GK4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181126223915.GK4319@ando.pearwood.info>
Message-ID: 

> On Mon, Nov 26, 2018 at 10:14:58PM +0000, Jonathan Fine wrote:
>
> > Briefly, I don't like your suggestion because many important iterables
> > don't have a length!
>
> But many important iterables do.
>
I agree that many important iterables do have one.

On Mon, Nov 26, 2018 at 02:06:52PM -0800, Michael Selik wrote:
> If you know the input is sizeable, why not check its length instead of
> the map's?
>
> The consumer of map may not be the producer of map.

Very good point. Honestly, I like the proposal but would love to see
more reviews of the idea. Maybe I am missing something.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From steve at pearwood.info Mon Nov 26 18:37:13 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Nov 2018 10:37:13 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181126233713.GL4319@ando.pearwood.info> On Mon, Nov 26, 2018 at 01:29:21PM -0800, Kale Kundert wrote: > I just ran into the following behavior, and found it surprising: > > >>> len(map(float, [1,2,3])) > TypeError: object of type 'map' has no len() > > I understand that map() could be given an infinite sequence and therefore might > not always have a length.? But in this case, it seems like map() should've known > that its length was 3. This seems straightforward, but I think there's more complexity than you might realise, a nasty surprise which I expect is going to annoy people no matter what decision we make, and the usefulness is probably less than you might think. First, the usefulness: we still have to wrap the call to len() in a try...except block, even if we know we have a map object, because we won't know whether the underlying iterable supports len. So it won't reduce the amount of code we have to write. At best it will allow us to take a fast-path when len() returns a value, and a slow-path when it raises. Here's the definition of the Sized abc: https://docs.python.org/3/library/collections.abc.html#collections.abc.Sized and the implementation simply checks for the existence of __len__. We (rightly) assume that if __len__ exists, the object has a known length, and that calling len() on it will succeed or at least not raise TypeError. Your proposal will break that expectation. map objects will be sized, but since sometimes the underlying iterator won't be, they may still raise TypeError. Of course there are ways to work around this. We could just change our expectations: even Sized objects might not be *actually* sized. Or map() could catch the TypeError and raise instead a ValueError, or something. Or we could rethink the whole length concept (see below), which after all was invented back in Python 1 days and is looking a bit old. As for the nasty surprise... do you agree that this ought to be an invariant for sized iterables? count = len(it) i = 0 for obj in it: i += 1 assert i == count That's the invariant I expect, and breaking that will annoy me (and I expect many other people) greatly. But that means that map() cannot just delegate its length to the underlying iterable. The implementation must be more complex, keeping track of how many items it has seen. And consider this case: it = map(lambda x: x, [1, 2, 3, 4, 5]) x = next(it) x = next(it) assert len(it) == 5 # underlying length of the iterable assert len(list(it)) == 3 # but only three items left assert len(it) == 5 # still 5 assert len(list(it)) == 0 # but nothing left So the length of the iterable has to vary as you iterate over it, or you break the invariant shown above. But that's going to annoy other people for another reason: we rightly expect that iterables shouldn't change their length just because you iterate over them! The length should only change if you *modify* them. So these two snippets should do the same: # 1 n = len(it) x = sum(it) # 2 x = sum(it) n = len(it) but if map() updates its length as it goes, it will break that invariant. So *whichever* behaviour we choose, we're going to break *something*. 
Either the reported length isn't necessarily the same as the actual length you get from iterating over the items, which will be annoying and confusing, or it varies as you iterate, which will ALSO be annoying and confusing. Either way, this apparently simple and obvious change will be annoying and confusing. Rethinking object length ------------------------ len() was invented back in Python 1 days, or earlier, when we effectively had only one kind of iterable: sequences like lists, with a known length. Today, iterables can have: 1. a known, finite length; 2. a known infinite length; 3. An unknown length (and usually no way to estimate it). At least. The len() protocol is intentionally simple, it only supports the first case, with the expectation that iterables will simply not define __len__ in the other two cases. Perhaps there is a case for updating the len() concept to explicitly handle cases 2 and 3, instead of simply not defining __len__. Perhaps it could return -1 for unknown and -2 for infinite. Or raise some other exception apart from TypeError. (I know there have been times I've wanted to know if an iterable was infinite, before spending the rest of my life iterating over it...) And perhaps we can come up with a concept of total length, versus length of items remaining. But these aren't simple issues with obvious solutions, it would surely need a PEP. And the benefit isn't obvious either. -- Steve From steve at pearwood.info Mon Nov 26 18:41:19 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 27 Nov 2018 10:41:19 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181126234119.GM4319@ando.pearwood.info> On Tue, Nov 27, 2018 at 09:36:08AM +1100, Chris Angelico wrote: > Don't forget, too, that map() can take more than one iterable I forgot about that! But in this case, I think the answer is obvious: the length of the map object is the *smallest* length of the iterables, ignoring any unsized or infinite ones. Same would apply to zip(). But as per my previous post, there are other problems with this concept that aren't so easy to solve. -- Steve From toddrjen at gmail.com Mon Nov 26 18:45:06 2018 From: toddrjen at gmail.com (Todd) Date: Mon, 26 Nov 2018 18:45:06 -0500 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181126223720.GJ4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181126223720.GJ4319@ando.pearwood.info> Message-ID: On Mon, Nov 26, 2018, 17:38 Steven D'Aprano On Mon, Nov 26, 2018 at 02:06:52PM -0800, Michael Selik wrote: > > If you know the input is sizeable, why not check its length instead of > the > > map's? > > The consumer of map may not be the producer of map. > > You might know that alist supports len(), but by the time I see it, I > only see map(f, alist), not alist itself. > Then you have no way of knowing whether it is safe to use "len" or not. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From rosuav at gmail.com Mon Nov 26 19:02:31 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 27 Nov 2018 11:02:31 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181126234119.GM4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181126234119.GM4319@ando.pearwood.info>
Message-ID: 

On Tue, Nov 27, 2018 at 10:41 AM Steven D'Aprano wrote:
>
> On Tue, Nov 27, 2018 at 09:36:08AM +1100, Chris Angelico wrote:
>
> > Don't forget, too, that map() can take more than one iterable
>
> I forgot about that!
>
> But in this case, I think the answer is obvious: the length of the map
> object is the *smallest* length of the iterables, ignoring any unsized
> or infinite ones.

Equally obvious and valid answer: The length is the smallest length of
its iterables, ignoring any infinite ones, but if any iterable is
unsized, the map is unsized.

And both answers will surprise people.

I still think there's room in the world for a "mapped list view" type,
which retains a reference to an underlying list, plus a function, and
proxies everything through to the function. It would NOT have the
flexibility of map(), but it would be able to directly subscript, it
wouldn't need any cache, etc, etc.

ChrisA

From kale at thekunderts.net Mon Nov 26 20:37:11 2018
From: kale at thekunderts.net (Kale Kundert)
Date: Mon, 26 Nov 2018 17:37:11 -0800
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181126233713.GL4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181126233713.GL4319@ando.pearwood.info>
Message-ID: <82dbbb7d-7f8b-ac9f-f78b-9d8c2505e65a@thekunderts.net>

Hi Steven,

Thanks for the good feedback.

> First, the usefulness: we still have to wrap the call to
> len() in a try...except block, even if we know we have a map object,
> because we won't know whether the underlying iterable supports len. So
> it won't reduce the amount of code we have to write. At best it will
> allow us to take a fast-path when len() returns a value, and a slow-path
> when it raises.

I think most of the time you would know whether the underlying iterable
was sized or not. After all, if you need the length, whatever code
you're writing would probably not work on an infinite/unsized iterable.

> So the length of the iterable has to vary as you iterate over it, or you
> break the invariant shown above.

I think I see the problem here. map() is an iterator, where I was
thinking of it as a wrapper around an iterable. Since an iterator is
really just a pointer into an iterable, it doesn't really make sense
for it to have a length. Give it one, and you end up with the
inconsistencies you describe.

I guess I probably would have disagreed with the decision to make map()
an iterator rather than a wrapper around an iterable. Such a prominent
function should have an API geared towards usability, not towards
implementing a low-level protocol (in my opinion). But clearly that
ship has sailed.
-Kale

From kale at thekunderts.net Mon Nov 26 20:44:55 2018
From: kale at thekunderts.net (Kale Kundert)
Date: Mon, 26 Nov 2018 17:44:55 -0800
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181126234119.GM4319@ando.pearwood.info>
Message-ID: <778a8ecb-24cf-abdf-5987-86dd7df0b402@thekunderts.net>

> Equally obvious and valid answer: The length is the smallest length of
> its iterables, ignoring any infinite ones, but if any iterable is
> unsized, the map is unsized.
>
> And both answers will surprise people.
>
> I still think there's room in the world for a "mapped list view" type,
> which retains a reference to an underlying list, plus a function, and
> proxies everything through to the function. It would NOT have the
> flexibility of map(), but it would be able to directly subscript, it
> wouldn't need any cache, etc, etc.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

I don't really agree that there are multiple surprising answers here.
If you iterate through the whole map, that will produce some number of
elements, and that's the length. Whether you can calculate that number
in __len__() depends on the particular iterables you have, which is
fine, but I don't think the definition of length is ambiguous.

But I think Steven is right that you can't implement __len__() for an
iterator without running into some inconsistencies. It's just
unfortunate that map() is an iterator.

-Kale

From rosuav at gmail.com Mon Nov 26 20:45:02 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 27 Nov 2018 12:45:02 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <82dbbb7d-7f8b-ac9f-f78b-9d8c2505e65a@thekunderts.net>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181126233713.GL4319@ando.pearwood.info>
	<82dbbb7d-7f8b-ac9f-f78b-9d8c2505e65a@thekunderts.net>
Message-ID: 

On Tue, Nov 27, 2018 at 12:37 PM Kale Kundert wrote:
> I guess I probably would have disagreed with the decision to make map() an
> iterator rather than a wrapper around an iterable. Such a prominent function
> should have an API geared towards usability, not towards implementing a
> low-level protocol (in my opinion). But clearly that ship has sailed.

For map() to return an iterable that can be used more than once, it
has to be mapping over an iterable that can be used more than once.
That limits it. The way map is currently defined, it can accept any
iterable, and it returns a one-shot iterable (which happens to be its
own iterator).

That's why I think the best solution is to create a separate
mapped-sequence-view that depends on its iterable being an actual
sequence, and exposes itself as a sequence also. (Yes, I said "list"
in my previous post, but any sequence would work.) It can carry the
length through, it can directly support subscripting, etc, etc, etc.
Both it and map() would have their places.
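A rough, untested sketch of what I have in mind (all names
hypothetical):

    from collections.abc import Sequence

    class MappedSequenceView(Sequence):
        # A lazy mapped view over a real sequence: sized, subscriptable
        # and re-iterable, with no cache -- the function is simply
        # called again on each access.
        def __init__(self, func, seq):
            self._func, self._seq = func, seq
        def __len__(self):
            return len(self._seq)
        def __getitem__(self, index):
            if isinstance(index, slice):
                return MappedSequenceView(self._func, self._seq[index])
            return self._func(self._seq[index])

    # squares = MappedSequenceView(lambda x: x * x, [1, 2, 3])
    # len(squares) -> 3; squares[1] -> 4; list(squares) -> [1, 4, 9]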
ChrisA

From tjreedy at udel.edu Tue Nov 27 12:47:47 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 27 Nov 2018 12:47:47 -0500
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: 

On 11/26/2018 4:29 PM, Kale Kundert wrote:
> I just ran into the following behavior, and found it surprising:
>
> >>> len(map(float, [1,2,3]))
> TypeError: object of type 'map' has no len()
>
> I understand that map() could be given an infinite sequence and
> therefore might not always have a length.

The len function is defined as always returning the length, an int >= 0.
Hence .__len__ methods should always do the same.
https://docs.python.org/3/reference/datamodel.html#object.__len__
Objects that cannot do that should not have this method.

The previous discussion of this issue led to function
operator.length_hint and special method object.__length_hint__ in 3.4.

https://docs.python.org/3/library/operator.html#operator.length_hint
"""
operator.length_hint(obj, default=0)

    Return an estimated length for the object o. First try to return
    its actual length, then an estimate using object.__length_hint__(),
    and finally return the default value.

    New in version 3.4.
"""

https://docs.python.org/3/reference/datamodel.html#object.__length_hint__
"""
object.__length_hint__(self)

    Called to implement operator.length_hint(). Should return an
    estimated length for the object (which may be greater or less than
    the actual length). The length must be an integer >= 0. This method
    is purely an optimization and is never required for correctness.

    New in version 3.4.
"""

> But in this case, it seems
> like map() should've known that its length was 3.

As others have pointed out, this is not true. If not infinite, the
size, defined as the number of items to be yielded, and hence the size
of list(iterator), shrinks by 1 after every next call, just as with pop
methods.

>>> it = iter([1,2,3])
>>> it.__length_hint__()
3
>>> next(it)
1
>>> it.__length_hint__()
2
>>> list(it)
[2, 3]
>>> it.__length_hint__()
0

Last I heard, list() uses length_hint for its initial allocation. But
this is an undocumented implementation detail.

Built-in map does not have .__length_hint__, for the reasons others
gave for it not having .__len__. But for private code, you are free to
define a subclass that does, with the definition you want.

-- 
Terry Jan Reedy

From abedillon at gmail.com Tue Nov 27 15:21:55 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Tue, 27 Nov 2018 14:21:55 -0600
Subject: [Python-ideas] Make None a subclass of int [alternative to iNaN]
In-Reply-To: 
References: 
Message-ID: 

I'm -1 on this idea. None is and should remain domain-independent.
Specific domains may require additional special values like "NaN",
"+/-inf", etc. for floating point math, in which case it makes more
sense to define a domain-specific special value than compromise the
independence of None.

Doing so could break code that assumes None is only an instance of
NoneType:

    if isinstance(x, int):
        handle_integer(x)
    elif x is None:
        handle_none()

On Sun, Sep 30, 2018 at 1:06 AM Ken Hilton wrote:

> Hi all,
>
> Reading the iNaN discussion, most of the opposition seems to be that
> adding iNaN would add a new special value to integers and therefore add new
> complexity.
>
> I propose, instead, that we make None a subclass of int (or even a certain
> value of int) to represent iNaN.
Therefore: > > >>> None + 1, None - 1, None * 2, None / 2, None // 2 > (None, None, None, nan, None) # mathematical operations on NaN return > NaN > >>> None & 1, None | 1, None ^ 1 > # I'm not sure about this one. The following could be plausible: > (0, 1, 1) > # or this might make more sense, as this *is* NaN we're talking about: > (None, None, None) > >>> isinstance(None, int) > True # the whole point of this idea > >>> issubclass(type(None), int) > True # no matter whether None *is* an int or just a subclass, this > will be true as issubclass(int, int) is True > > I know this is a crazy idea, but I thought it could have some merit, so > why not throw it out here? > > Sharing, > Ken Hilton; > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Tue Nov 27 23:47:06 2018 From: abedillon at gmail.com (Abe Dillon) Date: Tue, 27 Nov 2018 22:47:06 -0600 Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs Message-ID: I've been pulling a lot of ideas from the recent discussion on design by contract (DBC), the elegance and drawbacks of doctests , and the amazing talk given by Hillel Wayne at this year's PyCon entitled "Beyond Unit Tests: Taking your Tests to the Next Level". To recap a lot of previous discussions: - Documentation should tell you: A) What a variable represents B) What kind of thing a variable is C) The acceptable values a variable can take - Typing and Tests can partially take the place of documentation by filling in B and C (respectively) and sometimes A can be inferred from decent naming and context. - Contracts can take the place of many tests (especially when combined with a library like hypothesis) - Contracts/assertions can provide "stable" documentation in the sense that it can't get out of sync with the code. - Attempts to implement contracts using standard Python syntax are verbose and noisy because they rely heavily on decorators that add a lot of repetitive preamble to the methods being decorated. They may also require a metaclass which restricts their use to code that doesn't already use a metaclass. - There was some discussion about the importance of "what a variable represents" which pointed to this article by Philip J. Guo (author of the magnificent pythontutor.com). I believe Guo's usage of "in-the-small" and "in-the-large" are confusing because a well decoupled program shouldn't yield functions that know or care how they're being used in the grand machinations of your project. The examples he gives are of functions that could use a doc string and some type annotations, but don't actually say how they relate to the rest of the project. One thing that caught me about Hillel Wayne's talk was that some of his examples were close to needing practically no code. 
He starts with:

    def tail(lst: List[Any]) -> List[Any]:
        assert len(lst) > 0, "precondition"
        result = lst[1:]
        assert [lst[0]] + result == lst, "postcondition"
        return result

He then re-writes the function using a contracts library:

    @require("lst must not be empty", lambda args: len(args.lst) > 0)
    @ensure("result is tail of lst",
            lambda args, result: [args.lst[0]] + result == args.lst)
    def tail(lst: List[Any]) -> List[Any]:
        return lst[1:]

He then writes a unit test for the function:

    @given(lists(integers(), 1))
    def test_tail(lst):
        tail(lst)

What strikes me as interesting is that the test pretty-much doesn't
need to be written. The 'given' statement should be redundant based on
the type annotation and the precondition. Anyone who knows hypothesis,
just imagine the @require is a hypothesis 'assume' call. Furthermore,
hypothesis should be able to build strategies for more complex objects
based on class invariants and attribute types:

    @invariant("no overdrafts", lambda self: self.balance >= 0)
    class Account:
        def __init__(self, number: int, balance: float = 0):
            super().__init__()
            self.number: int = number
            self.balance: float = balance

A library like hypothesis should be able to generate valid account
objects. Hypothesis also has stateful testing but I think the
implementation could use some work. As it is, you have to inherit from
a class that uses a metaclass AND you have to pollute your class's
name-space with helper objects and methods.

If we could figure out a cleaner syntax for defining invariants,
preconditions, and postconditions we'd be half-way to automated testing
UTOPIA! (ok, maybe I'm being a little over-zealous)

I think there are two missing pieces to this testing problem:
side-effect verification and failure verification.

Failure verification should test that the expected exceptions get
thrown when known bad data is passed in or when an object is put in a
known illegal state. This should be doable by allowing Hypothesis to
probe the bounds of unacceptable input data or states, though it might
seem a bit silly because if you've already added a precondition,
"x >= 0" to a function, then it obviously should raise a
PreconditionViolated when passed any x < 0. It may be important,
however; if for performance reasons, you need to disable invariant
checking but you still want certain bad input to raise exceptions, or
your system has two components that interact with slightly mis-matched
invariants and you want to make sure the components handle the
edge-condition correctly. You can think of Types from a set-theory
perspective where the Integer type is conceptually the set of all
integers, and invariants would specify a smaller subset than Typing
alone, however if the set of all valid outputs of one component is not
completely contained within the set of all valid inputs to another
component, then there will be edge-cases resulting from the mismatch.
In that sense, some of the invariant verification could be static-ish
(as much as Python allows).

Side-effect verification is usually done by mocking dependencies. You
pass in a mock database connection and make sure my object sends and
receives data as expected. As crazy as it sounds, this too can be
almost completely automated away if all of the above tools are in
place AND if Python gained support for Exception annotations. I wrote
a Java (yuck) library at work that does this.
I want to port it to Python and share it, but it basically enumerates a
bunch of stuff: the "sources" and "destinations" of the system, how
those relate to dependencies, how they relate to each other (if
dependency X is unresponsive, I can't get sources A, B, or G and if I
can't get source B, I can't write destination Y), the dependency
failure modes (Exceptions raised, timeouts, unrecognized key, missing
data, etc.), all the public methods of the class under test and what
sources and destinations they use. Then I enumerate 'k' from 0 to some
limit for the max number of simultaneous faults to test for. Then for
each method that can have n >= k simultaneous faults I test all
(n choose k) combinations of faults for that method against the
desired behavior. I'm sure that explanation is as clear as mud. I will
try to get a working Python example at some point to demonstrate.

Finally, in the PyCon video, Hillel Wayne shows an example of testing
that an "add" function is commutative. It seems that once you write
that invariant, it might apply to many different functions. A similar
invariant may be "reversibility" like:

    @given(text())
    def test_reversable_codex(s):
        assert s == decode(encode(s)), "not reversible"

That might be a common property that other functions share:

    @invariant(reversible(decode))
    def encode(s: str) -> bytes:
        ...

Having said all that, I wanted to brainstorm some possible solutions
for implementing some or all of the above in Python without drowning
your code in decorators. NOTE: Please don't get hung up on specific
syntax suggestions! Try to see the forest through the trees!

An example syntax could be:

    #Instead of this
    @require("lst must not be empty", lambda args: len(args.lst) > 0)
    @ensure("result is tail of lst",
            lambda args, result: [args.lst[0]] + result == args.lst)
    def tail(lst: List[Any]) -> List[Any]:
        return lst[1:]

    #Maybe this?
    non_empty = invariant("Must not be empty", lambda x: len(x) > 0)  # can be re-used

    def tail(lst: List[Any] d"Description of what this param represents.
             {non_empty}") -> List[Any] d"Description of return value
             {lst == [lst[0]] + __result__}":
        """ Description of function """
        return lst[1:]

Python could build the full doc string like so:

    """
    Description of function

    Args:
        lst: Description of what this param represents. Must not be empty.

    Returns:
        Description of return value.
    """

d-strings have some description followed by some terminator after which
either invariant objects or [optionally strings] followed by an
expression on the arguments and __return__? I'm sorry this is so
half-baked. I don't really like the d-string concept and I'm pretty
sure there are a million problems with it. I'll try to flesh out the
side-effect verification concept more later along with all the other
poorly explained stuff. I just wanted to get these thoughts out for
discussion, but now it's super late and I have to go!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From marko.ristin at gmail.com Wed Nov 28 02:12:41 2018
From: marko.ristin at gmail.com (Marko Ristin-Kaufmann)
Date: Wed, 28 Nov 2018 08:12:41 +0100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: 
Message-ID: 

Hi Abe,

> I've been pulling a lot of ideas from the recent discussion on design by
> contract (DBC), the elegance and drawbacks of doctests, and the amazing
> talk given by Hillel Wayne at this year's PyCon entitled "Beyond Unit
> Tests: Taking your Tests to the Next Level".
Have you looked at the recent discussions regarding design-by-contract
on this list
(https://groups.google.com/forum/m/#!topic/python-ideas/JtMgpSyODTU
and the following forked threads)?

You might want to have a look at static checking techniques such as
abstract interpretation. I hope to be able to work on such a tool for
Python in some two years from now. We can stay in touch if you are
interested.

Re decorators: to my own surprise, using decorators in a larger code
base is completely practical including the readability and maintenance
of the code. It's neither that ugly nor that problematic as it might
seem at first look.

We use our https://github.com/Parquery/icontract at the company. Most
of the design choices come from practical issues we faced -- so you
might want to read the doc even if you don't plan to use the library.

Some of the aspects we still haven't figured out are: how to approach
multi-threading (locking around the whole function with an additional
decorator?) and granularity of contract switches (right now we use
always/optimized, production/non-optimized and testing/slow, but it
seems that a larger system requires finer categories).

Cheers,
Marko
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info Wed Nov 28 04:44:43 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 28 Nov 2018 20:44:43 +1100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: 
Message-ID: <20181128094443.GS4319@ando.pearwood.info>

On Tue, Nov 27, 2018 at 10:47:06PM -0600, Abe Dillon wrote:

> If we could figure out a cleaner syntax for defining invariants,
> preconditions, and postconditions we'd be half-way to automated testing
> UTOPIA! (ok, maybe I'm being a little over-zealous)

You should look at the state of the art in Design By Contract. In
Eiffel, DBC is integrated in the language:

https://www.eiffel.com/values/design-by-contract/introduction/
https://www.eiffel.org/doc/eiffel/ET-_Design_by_Contract_%28tm%29%2C_Assertions_and_Exceptions

Eiffel uses a rather Pythonic block structure to define invariants.
The syntax is not identical to Python's (Eiffel eschews the colons)
but it also comes close to executable pseudo-code. I trust this syntax
requires little explanation:

    require
        ... preconditions, tested on function entry
    do
        ... body of the function
    ensure
        ... postconditions, tested on function exit
    end

There is a similar invariant block for classes.

Cobra is a language which intentionally modeled its syntax on Python.
It too has contracts integrated with the language:

http://cobra-language.com/how-to/DeclareContracts/
http://cobra-language.com/trac/cobra/wiki/Contracts

-- 
Steve

From solipsis at pitrou.net Wed Nov 28 09:18:10 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 28 Nov 2018 15:18:10 +0100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
References: 
Message-ID: <20181128151810.7c2393c2@fsol>

On Tue, 27 Nov 2018 22:47:06 -0600
Abe Dillon wrote:
>
> If we could figure out a cleaner syntax for defining invariants,
> preconditions, and postconditions we'd be half-way to automated testing
> UTOPIA! (ok, maybe I'm being a little over-zealous)

I think utopia is the word here. Fuzz testing can be useful, but it's
not a replacement for manual testing of carefully selected values.

Also, the idea that fuzz testing will automatically find edge cases in
your code is idealistic.
It depends on the algorithm you've implemented and the distribution of values chosen by the tester. Showcasing trivially wrong examples (such as an addition function that always returns 0, or a tail function that doesn't return the tail) isn't very helpful for a real-world analysis, IMHO. In the end, you have to be rigorous when writing tests, and for most non-trivial functions it requires that you devise the distribution of input values depending on the implemented algorithm, not leave that distribution to a third-party library that knows nothing about your program. Regards Antoine. From erik.m.bray at gmail.com Wed Nov 28 09:27:25 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Wed, 28 Nov 2018 15:27:25 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert wrote: > > I just ran into the following behavior, and found it surprising: > > >>> len(map(float, [1,2,3])) > TypeError: object of type 'map' has no len() > > I understand that map() could be given an infinite sequence and therefore might not always have a length. But in this case, it seems like map() should've known that its length was 3. I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason. > > My proposal is to delegate map.__len__() to the underlying iterable. Similarly, map.__getitem__() could be implemented if the underlying iterable supports item access: I mostly agree with the existing objections, though I have often found myself wanting this too, especially now that `map` does not simply return a list. This problem alone (along with the same problem for filter) has had a ridiculously outsized impact on the Python 3 porting effort for SageMath, and I find it really irritating at times. As a simple counter-proposal which I believe has fewer issues, I would really like it if the built-in `map()` and `filter()` at least provided a Python-level attribute to access the underlying iterables. This is necessary because if I have a function that used to take, say, a list as an argument, and it receives a `map` object, I now have to be able to deal with map()s, and I may have checks I want to perform on the underlying iterables before, say, I try to iterate over the `map`. Exactly what those checks are and whether or not they're useful may be highly application-specific, which is why say a generic `map.__len__` is not workable. However, if I can at least inspect those iterables I can make my own choices on how to handle the map. Exposing the underlying iterables to Python also has dangers in that I could directly call `next()` on them and possibly create some confusion, but consenting adults and all that... From jfine2358 at gmail.com Wed Nov 28 09:45:36 2018 From: jfine2358 at gmail.com (Jonathan Fine) Date: Wed, 28 Nov 2018 14:45:36 +0000 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Wed, Nov 28, 2018 at 2:28 PM E. Madison Bray wrote: > I mostly agree with the existing objections, though I have often found > myself wanting this too, especially now that `map` does not simply > return a list. 
This problem alone (along with the same problem for > filter) has had a ridiculously outsized impact on the Python 3 porting > effort for SageMath, and I find it really irritating at times. I'm a mathematician, so understand your concerns. Here's what I hope is a helpful suggestion. Create a module, say sage.itertools that contains (not tested) def py2map(iterable): return list(map(iterable)) The porting to Python 3 (for map) is now reduced to writing from .itertools import py2map as map at the head of each module. Please let me know if this helps. -- Jonathan From rosuav at gmail.com Wed Nov 28 09:54:18 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 29 Nov 2018 01:54:18 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Thu, Nov 29, 2018 at 1:46 AM Jonathan Fine wrote: > > On Wed, Nov 28, 2018 at 2:28 PM E. Madison Bray wrote: > > > I mostly agree with the existing objections, though I have often found > > myself wanting this too, especially now that `map` does not simply > > return a list. This problem alone (along with the same problem for > > filter) has had a ridiculously outsized impact on the Python 3 porting > > effort for SageMath, and I find it really irritating at times. > > I'm a mathematician, so understand your concerns. Here's what I hope > is a helpful suggestion. > > Create a module, say sage.itertools that contains (not tested) > > def py2map(iterable): > return list(map(iterable)) With the nitpick that the arguments should be (func, *iterables) rather than just the single iterable, yes, this is a viable transition strategy. In fact, it's very similar to what 2to3 would do, except that 2to3 would do it at the call site. If any Py3 porting process is being held up significantly by this, I would strongly recommend giving 2to3 an eyeball - run it on some of your code, then either accept its changes or just learn from the diffs. It's not perfect (nothing is), but it's a useful tool. ChrisA From steve at pearwood.info Wed Nov 28 10:03:49 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Nov 2018 02:03:49 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181128150348.GT4319@ando.pearwood.info> On Wed, Nov 28, 2018 at 03:27:25PM +0100, E. Madison Bray wrote: > I mostly agree with the existing objections, though I have often found > myself wanting this too, especially now that `map` does not simply > return a list. This problem alone (along with the same problem for > filter) has had a ridiculously outsized impact on the Python 3 porting > effort for SageMath, and I find it really irritating at times. *scratches head* I presume that SageMath worked fine with Python 2 map and filter? You can have them back again: # put this in a module called py2 _map = map def map(*args): return list(_map(*args)) And similarly for filter. The only annoying part is to import this new map at the start of every module that needs it, but while that's annoying, I wouldn't call it a "ridiculously outsized impact". Its one line at the top of each module. from py2 import map, filter What am I missing? > As a simple counter-proposal which I believe has fewer issues, I would > really like it if the built-in `map()` and `filter()` at least > provided a Python-level attribute to access the underlying iterables. 
> This is necessary because if I have a function that used to take, say, > a list as an argument, and it receives a `map` object, I now have to > be able to deal with map()s, and I may have checks I want to perform > on the underlying iterables before, say, I try to iterate over the > `map`. > > Exactly what those checks are and whether or not they're useful may be > highly application-specific, which is why say a generic `map.__len__` > is not workable. However, if I can at least inspect those iterables I > can make my own choices on how to handle the map. Can you give a concrete example of what you would do in practice? I'm having trouble thinking of how and when this sort of thing would be useful. Aside from extracting the length of the iterable(s), under what circumstances would you want to bypass the call to map() or filter() and access the iterables directly? > Exposing the underlying iterables to Python also has dangers in that I > could directly call `next()` on them and possibly create some > confusion, but consenting adults and all that... I don't think that's worse than what we can already do if you hold onto a reference to the underlying iterable: py> a = [1, 2, 3] py> it = map(lambda x: x+100, a) py> next(it) 101 py> a.insert(0, None) py> next(it) 101 -- Steve From erik.m.bray at gmail.com Wed Nov 28 10:04:33 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Wed, 28 Nov 2018 16:04:33 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Wed, Nov 28, 2018 at 3:54 PM Chris Angelico wrote: > > On Thu, Nov 29, 2018 at 1:46 AM Jonathan Fine wrote: > > > > On Wed, Nov 28, 2018 at 2:28 PM E. Madison Bray wrote: > > > > > I mostly agree with the existing objections, though I have often found > > > myself wanting this too, especially now that `map` does not simply > > > return a list. This problem alone (along with the same problem for > > > filter) has had a ridiculously outsized impact on the Python 3 porting > > > effort for SageMath, and I find it really irritating at times. > > > > I'm a mathematician, so understand your concerns. Here's what I hope > > is a helpful suggestion. > > > > Create a module, say sage.itertools that contains (not tested) > > > > def py2map(iterable): > > return list(map(iterable)) > > With the nitpick that the arguments should be (func, *iterables) > rather than just the single iterable, yes, this is a viable transition > strategy. In fact, it's very similar to what 2to3 would do, except > that 2to3 would do it at the call site. If any Py3 porting process is > being held up significantly by this, I would strongly recommend giving > 2to3 an eyeball - run it on some of your code, then either accept its > changes or just learn from the diffs. It's not perfect (nothing is), > but it's a useful tool. That effort is already mostly done and adding a helper function would not have worked as users *passing* map(...) as an argument to some function just expect it to work. The only alternative would have been replacing the builtin map with something else at the globals level. 2to3 is mostly useless since a major portion of Sage is written in Cython anyways. I just mentioned that porting effort for background. I still believe that the actual proposal of making the arguments to a map(...) call accessible from Python as attributes of the map object (ditto filter, zip, etc.) is useful in its own right, rather than just having this completely opaque iterator. 
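As a rough sketch of what I mean (untested; a pure-Python stand-in, and
the attribute names .func and .iters are hypothetical, not an existing
API):

    class introspectable_map:
        # Same iteration behaviour as the built-in map, but the
        # function and the underlying iterables stay visible.
        def __init__(self, func, *iters):
            self.func = func
            self.iters = iters
            self._it = map(func, *iters)  # delegate actual iteration
        def __iter__(self):
            return self
        def __next__(self):
            return next(self._it)

A consumer could then, for example, try len(m.iters[0]) and fall back
to generic iteration when that raises TypeError.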
From erik.m.bray at gmail.com Wed Nov 28 10:05:50 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Wed, 28 Nov 2018 16:05:50 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181128150348.GT4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128150348.GT4319@ando.pearwood.info> Message-ID: On Wed, Nov 28, 2018 at 4:04 PM Steven D'Aprano wrote: > > On Wed, Nov 28, 2018 at 03:27:25PM +0100, E. Madison Bray wrote: > > > I mostly agree with the existing objections, though I have often found > > myself wanting this too, especially now that `map` does not simply > > return a list. This problem alone (along with the same problem for > > filter) has had a ridiculously outsized impact on the Python 3 porting > > effort for SageMath, and I find it really irritating at times. > > *scratches head* > > I presume that SageMath worked fine with Python 2 map and filter? You > can have them back again: > > # put this in a module called py2 > _map = map > def map(*args): > return list(_map(*args)) > > > And similarly for filter. The only annoying part is to import this new > map at the start of every module that needs it, but while that's > annoying, I wouldn't call it a "ridiculously outsized impact". Its one > line at the top of each module. > > from py2 import map, filter > > > What am I missing? Large amounts of context; size of code base. From steve at pearwood.info Wed Nov 28 10:14:13 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Nov 2018 02:14:13 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181128151412.GU4319@ando.pearwood.info> On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote: > That effort is already mostly done and adding a helper function would > not have worked as users *passing* map(...) as an argument to some > function just expect it to work. Ah, that's what I was missing. But... surely the function will still work if they pass an opaque iterator *other* than map() and/or filter? it = (func(x) for x in something if condition(x)) some_sage_function(it) You surely don't expect to be able to peer inside every and any iterator that you are given? So if you have to handle the opaque iterator case anyway, how is it *worse* when the user passes map() or filter() instead of a generator like the above? > I just mentioned that porting effort for background. I still believe > that the actual proposal of making the arguments to a map(...) call > accessible from Python as attributes of the map object (ditto filter, > zip, etc.) is useful in its own right, rather than just having this > completely opaque iterator. Perhaps... I *want* to agree with this, but I'm having trouble thinking of when and how it would be useful. Some concrete examples would help justify it. -- Steve From erik.m.bray at gmail.com Wed Nov 28 10:14:24 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Wed, 28 Nov 2018 16:14:24 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181128150348.GT4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128150348.GT4319@ando.pearwood.info> Message-ID: On Wed, Nov 28, 2018 at 4:04 PM Steven D'Aprano wrote: > > On Wed, Nov 28, 2018 at 03:27:25PM +0100, E. 
Madison Bray wrote:
> > I mostly agree with the existing objections, though I have often found
> > myself wanting this too, especially now that `map` does not simply
> > return a list. This problem alone (along with the same problem for
> > filter) has had a ridiculously outsized impact on the Python 3 porting
> > effort for SageMath, and I find it really irritating at times.
> >
> > As a simple counter-proposal which I believe has fewer issues, I would
> > really like it if the built-in `map()` and `filter()` at least
> > provided a Python-level attribute to access the underlying iterables.
> >
> > This is necessary because if I have a function that used to take, say,
> > a list as an argument, and it receives a `map` object, I now have to
> > be able to deal with map()s, and I may have checks I want to perform
> > on the underlying iterables before, say, I try to iterate over the
> > `map`.
> >
> > Exactly what those checks are and whether or not they're useful may be
> > highly application-specific, which is why say a generic `map.__len__`
> > is not workable. However, if I can at least inspect those iterables I
> > can make my own choices on how to handle the map.
>
> Can you give a concrete example of what you would do in practice? I'm
> having trouble thinking of how and when this sort of thing would be
> useful. Aside from extracting the length of the iterable(s), under what
> circumstances would you want to bypass the call to map() or filter() and
> access the iterables directly?

For example, some function that used to expect some finite-sized
sequence such as a list or tuple is now passed a "map", possibly
wrapping one or more iterables of arbitrary, possibly non-finite size.
For the purposes of some algorithm I have this is not useful and I
need to convert it to a sequence anyways but don't want to do that
without some guarantee that I won't blow up the user's memory usage.
So I might want to check:

    finite_definite = True
    for it in my_map.iters:
        try:
            len(it)
        except TypeError:
            finite_definite = False

    if finite_definite:
        my_seq = list(my_map)
    else:
        # some other algorithm

Of course, some arbitrary object could lie about its __len__ but I'm
not concerned about pathological cases here. There may be other
opportunities for optimization as well that are otherwise hidden.
Either way, I don't see any reason to hide this data; it's a couple of
slot attributes and instantly better introspection capability.

From erik.m.bray at gmail.com Wed Nov 28 10:18:35 2018
From: erik.m.bray at gmail.com (E. Madison Bray)
Date: Wed, 28 Nov 2018 16:18:35 +0100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181128151412.GU4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181128151412.GU4319@ando.pearwood.info>
Message-ID: 

On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano wrote:
>
> On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote:
>
> > That effort is already mostly done and adding a helper function would
> > not have worked as users *passing* map(...) as an argument to some
> > function just expect it to work.
>
> Ah, that's what I was missing.
>
> But... surely the function will still work if they pass an opaque
> iterator *other* than map() and/or filter?
>
> it = (func(x) for x in something if condition(x))
> some_sage_function(it)

That one is admittedly tricky. For that matter it might be nice to
have more introspection of generator expressions too, but there at
least we have .gi_code if nothing else.
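(For what it's worth, under CPython -- an implementation detail, not a
language guarantee -- the outermost iterable of a genexp is even
reachable through the frame local named '.0':

    gen = (x * x for x in [1, 2, 3])
    underlying = gen.gi_frame.f_locals['.0']  # a list_iterator here

but that is far too fragile to rely on in real code.)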
But those are a far less common example in my case, whereas map() is *everywhere* in math code :) From rosuav at gmail.com Wed Nov 28 10:24:05 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 29 Nov 2018 02:24:05 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128151412.GU4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 2:19 AM E. Madison Bray wrote: > > On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano wrote: > > > > On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote: > > > > > That effort is already mostly done and adding a helper function would > > > not have worked as users *passing* map(...) as an argument to some > > > function just expect it to work. > > > > Ah, that's what I was missing. > > > > But... surely the function will still work if they pass an opaque > > iterator *other* than map() and/or filter? > > > > it = (func(x) for x in something if condition(x)) > > some_sage_function(it) > > That one is admittedly tricky. For that matter it might be nice to > have more introspection of generator expressions too, but there at > least we have .gi_code if nothing else. Considering that a genexp can do literally anything, I doubt you'll get anywhere with that introspection. > But those are a far less common example in my case, whereas map() is > *everywhere* in math code :) Perhaps then, the problem is that math code treats "map" as something that is more akin to "instrumented list" than it is to a generator. If you know for certain that you're mapping a low-cost pure function over an immutable collection, the best solution may be to proxy through to the original list than to generate values on the fly. And if that's the case, you don't want the Py3 map *or* the Py2 one, although the Py2 one can behave this way, at the cost of crazy amounts of efficiency. ChrisA From erik.m.bray at gmail.com Wed Nov 28 10:31:35 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Wed, 28 Nov 2018 16:31:35 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128151412.GU4319@ando.pearwood.info> Message-ID: On Wed, Nov 28, 2018 at 4:24 PM Chris Angelico wrote: > > On Thu, Nov 29, 2018 at 2:19 AM E. Madison Bray wrote: > > > > On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano wrote: > > > > > > On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote: > > > > > > > That effort is already mostly done and adding a helper function would > > > > not have worked as users *passing* map(...) as an argument to some > > > > function just expect it to work. > > > > > > Ah, that's what I was missing. > > > > > > But... surely the function will still work if they pass an opaque > > > iterator *other* than map() and/or filter? > > > > > > it = (func(x) for x in something if condition(x)) > > > some_sage_function(it) > > > > That one is admittedly tricky. For that matter it might be nice to > > have more introspection of generator expressions too, but there at > > least we have .gi_code if nothing else. > > Considering that a genexp can do literally anything, I doubt you'll > get anywhere with that introspection. > > > But those are a far less common example in my case, whereas map() is > > *everywhere* in math code :) > > Perhaps then, the problem is that math code treats "map" as something > that is more akin to "instrumented list" than it is to a generator. 
If you know for certain that you're mapping a low-cost pure function
over an immutable collection, the best solution may be to proxy through
to the original list than to generate values on the fly. And if that's
the case, you don't want the Py3 map *or* the Py2 one, although the
Py2 one can behave this way, at the cost of crazy amounts of
efficiency.

Yep, that's a great example where it might be possible to introspect a
given `map` object and take it apart to do something more efficient
with it. This is less of a problem with internal code where it's easy
to just not use map() at all, and that is often the case. But a lot
of the people who develop code for Sage are mathematicians, not
engineers, and they may not be aware of this, so they write code that
passes `map()` objects to more internal machinery. And users will do
the same even moreso.

I can (and have) written horrible C-level hacks--not for this specific
issue, but others like it--and am sometimes tempted to do the same
here :(

From steve at pearwood.info Wed Nov 28 10:33:06 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 29 Nov 2018 02:33:06 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181128150348.GT4319@ando.pearwood.info>
Message-ID: <20181128153305.GV4319@ando.pearwood.info>

On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:

> For example, some function that used to expect some finite-sized
> sequence such as a list or tuple is now passed a "map", possibly
> wrapping one or more iterable of arbitrary, possibly non-finite size.
> For the purposes of some algorithm I have this is not useful and I
> need to convert it to a sequence anyways but don't want to do that
> without some guarantee that I won't blow up the user's memory usage.
> So I might want to check:
>
>     finite_definite = True
>     for it in my_map.iters:
>         try:
>             len(it)
>         except TypeError:
>             finite_definite = False
>
>     if finite_definite:
>         my_seq = list(my_map)
>     else:
>         # some other algorithm

But surely you didn't need to do this just because of *map*. Users could
have passed an infinite, unsized iterable going back to Python 1 days
with the old sequence protocol. They certainly could pass a generator or
other opaque iterator apart from map. So I'm having trouble seeing why
the Python 2/3 change to map made things worse for SageMath.

But in any case, this example comes back to the question of len again,
and we've already covered why this is problematic. In case you missed
it, let's take a toy example which demonstrates the problem:

    def mean(it):
        if isinstance(it, map):
            # Hypothetical attribute access to the underlying iterable.
            n = len(it.iterable)
            return sum(it)/n

Now let's pass a map object to it:

    data = [1, 2, 3, 4, 5]
    it = map(lambda x: x, data)
    assert len(it.iterable) == 5
    next(it); next(it); next(it)

    assert mean(it) == 4.5
    # fails, as it actually returns 9/5 instead of 9/2

-- 
Steve

From marcos.eliziario at gmail.com Wed Nov 28 10:41:07 2018
From: marcos.eliziario at gmail.com (Marcos Eliziario)
Date: Wed, 28 Nov 2018 13:41:07 -0200
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: <20181128151810.7c2393c2@fsol>
References: <20181128151810.7c2393c2@fsol>
Message-ID: 

> In the end, you have to be rigorous when writing tests, and for most
> non-trivial functions it requires that you devise the distribution of
> input values depending on the implemented algorithm, not leave that
> distribution to a third-party library that knows nothing about your
> program.

Indeed. But the great thing about the "hypothesis" tool is that it
allows me to somewhat automate the generation of sets of input values
based on my specific requirements derived from my knowledge of my
program. It allows me to think about what is the reasonable
distribution of values for each argument in a function by either using
existing strategies, using their arguments, combining and extending
them, and then letting the tool do the grunt work of running the test
for lots of different equivalence classes of argument values.

I think that as long as the tool user keeps what you said in mind and
uses the tool accordingly it can be a great helper, and probably even
force the average programmer to think more rigorously about the input
values to be tested, not to mention the whole class of trivial mistakes
and forgetfulness we are all bound to be subject to when writing test
cases.

Best,

On Wed, Nov 28, 2018 at 12:18 PM Antoine Pitrou wrote:

> On Tue, 27 Nov 2018 22:47:06 -0600
> Abe Dillon wrote:
> >
> > If we could figure out a cleaner syntax for defining invariants,
> > preconditions, and postconditions we'd be half-way to automated testing
> > UTOPIA! (ok, maybe I'm being a little over-zealous)
>
> I think utopia is the word here. Fuzz testing can be useful, but it's
> not a replacement for manual testing of carefully selected values.
>
> Also, the idea that fuzz testing will automatically find edge cases in
> your code is idealistic. It depends on the algorithm you've
> implemented and the distribution of values chosen by the tester.
> Showcasing trivially wrong examples (such as an addition function that
> always returns 0, or a tail function that doesn't return the tail)
> isn't very helpful for a real-world analysis, IMHO.
>
> In the end, you have to be rigorous when writing tests, and for most
> non-trivial functions it requires that you devise the distribution of
> input values depending on the implemented algorithm, not leave that
> distribution to a third-party library that knows nothing about your
> program.
>
> Regards
>
> Antoine.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Marcos Elizário Santos
mobile/whatsapp/telegram: +55(21) 9-8027-0156
skype: marcos.eliziario at gmail.com
linked-in: https://www.linkedin.com/in/eliziario/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jfine2358 at gmail.com Wed Nov 28 10:52:17 2018
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Wed, 28 Nov 2018 15:52:17 +0000
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181128153305.GV4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181128150348.GT4319@ando.pearwood.info>
	<20181128153305.GV4319@ando.pearwood.info>
Message-ID: 

Suppose itr_1 is an iterator. Now consider

    itr_2 = map(lambda x: x, itr_1)
    itr_3 = itr_1

We now have itr_1, itr_2 and itr_3. They are all, effectively, the
same iterator (unless we do an 'x is y' comparison).

I conclude that this suggestion amounts to having a __len__ for ANY
iterator, and not just a map. In other words, this suggestion has
broader scope and consequences than were presented in the original
post.

-- 
Jonathan

From erik.m.bray at gmail.com Wed Nov 28 11:04:26 2018
From: erik.m.bray at gmail.com (E. Madison Bray)
Date: Wed, 28 Nov 2018 17:04:26 +0100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181128151412.GU4319@ando.pearwood.info>
Message-ID: 

One thing I'd like to add real quick to this (I'm on my phone so
apologies for crappy quoting):

Although there are existing cases where there is a loss of efficiency
over Python 2 map() when dealing with the opaque, iterable Python 3
map(), the latter also presents many opportunities for enhancements
that weren't possible before.

For example, previously a user might pass map(func, some_list) where
func is some pure function and the iterable is almost always a list of
some kind. Previously that map() call would be evaluated (often slowly)
first. But now we can treat a map as something a little more formal, as
a container for a function and one or more iterables, which happens to
have this special functionality when you iterate over it, but is
otherwise just a special container. This is technically already the
case, we just can't directly access it as a container.

If we could, it would be possible to implement various optimizations
that might not otherwise have been obvious to the user. This is
especially the case if the iterable is a simple list, which is
something we can check. The function in this case very likely might
actually be a C function that was wrapped with Cython. I can easily
convert this on the user's behalf to a simple C loop or possibly even
some other more optimal vectorized code.

These are application-specific special cases of course, but many such
cases become easily accessible if map() and friends are usable as
specialized containers.

On Wed, Nov 28, 2018, 16:31 E. Madison Bray
For that matter it might be nice to
> > > have more introspection of generator expressions too, but there at
> > > least we have .gi_code if nothing else.
> >
> > Considering that a genexp can do literally anything, I doubt you'll
> > get anywhere with that introspection.
> >
> > > But those are a far less common example in my case, whereas map() is
> > > *everywhere* in math code :)
> >
> > Perhaps then, the problem is that math code treats "map" as something
> > that is more akin to "instrumented list" than it is to a generator. If
> > you know for certain that you're mapping a low-cost pure function over
> > an immutable collection, the best solution may be to proxy through to
> > the original list than to generate values on the fly. And if that's
> > the case, you don't want the Py3 map *or* the Py2 one, although the
> > Py2 one can behave this way, at the cost of crazy amounts of
> > efficiency.
>
> Yep, that's a great example where it might be possible to introspect a
> given `map` object and take it apart to do something more efficient
> with it. This is less of a problem with internal code where it's easy
> to just not use map() at all, and that is often the case. But a lot
> of the people who develop code for Sage are mathematicians, not
> engineers, and they may not be aware of this, so they write code that
> passes `map()` objects to more internal machinery. And users will do
> the same even moreso.
>
> I can (and have) written horrible C-level hacks--not for this specific
> issue, but others like it--and am sometimes tempted to do the same
> here :(
> -------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From erik.m.bray at gmail.com Wed Nov 28 11:16:30 2018
From: erik.m.bray at gmail.com (E. Madison Bray)
Date: Wed, 28 Nov 2018 17:16:30 +0100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181128153305.GV4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
	<20181128150348.GT4319@ando.pearwood.info>
	<20181128153305.GV4319@ando.pearwood.info>
Message-ID: 

Probably the most prevalent reason it made things *worse* is that many
functions that can take collections as arguments--in fact probably
most--were never written to accept arbitrary iterables in the first
place. Perhaps they should have been, but the majority of that was
before my time, so I and others who worked on the Python 3 port were
stuck with that.

Sure, the fix is simple enough: check if the object is iterable (itself
not always as simple as one might assume) and then call list() on it.
But we're talking thousands upon thousands of functions that need to be
updated where examples involving map previously would have just worked.

But on top of the obvious workarounds I would now like to do things
like protect users, where possible, from doing things like passing
arbitrarily sized data to relatively flimsy C libraries, or as I
mentioned in my last message make new optimizations that weren't
possible before.

Of course this isn't always possible when dealing with an arbitrary
opaque iterator, or in some pathological cases. But I'm concerned more
about doing the best we can in the most common cases (lists, tuples,
vectors, etc.), which are *vastly* more common.

I use SageMath as an example but I'm sure others could come up with
their own clever use cases.
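For instance, assuming the hypothetical .iters attribute discussed
earlier in this thread (again, not an existing API), the guard I
sketched before collapses into a small reusable helper (untested):

    def safe_list(m, limit=10**7):
        # Materialize a map only when every input iterable reports a
        # length and the result is guaranteed to stay within `limit`.
        try:
            n = min(len(it) for it in m.iters)
        except TypeError:
            raise ValueError("refusing to materialize: unsized input")
        if n > limit:
            raise ValueError("refusing to materialize: result too large")
        return list(m)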
I know there are other cases where I've wanted to at least try to get the len of a map, at least in cases where it was unambiguous (for example making a progress meter or something).

On Wed, Nov 28, 2018, 16:33 Steven D'Aprano wrote:

> On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:
>
> > For example, some function that used to expect some finite-sized
> > sequence such as a list or tuple is now passed a "map", possibly
> > wrapping one or more iterable of arbitrary, possibly non-finite size.
> > For the purposes of some algorithm I have this is not useful and I
> > need to convert it to a sequence anyways but don't want to do that
> > without some guarantee that I won't blow up the user's memory usage.
> > So I might want to check:
> >
> >     finite_definite = True
> >     for it in my_map.iters:
> >         try:
> >             len(it)
> >         except TypeError:
> >             finite_definite = False
> >
> >     if finite_definite:
> >         my_seq = list(my_map)
> >     else:
> >         # some other algorithm
>
> But surely you didn't need to do this just because of *map*. Users could have passed an infinite, unsized iterable going back to Python 1 days with the old sequence protocol. They certainly could pass a generator or other opaque iterator apart from map. So I'm having trouble seeing why the Python 2/3 change to map made things worse for SageMath.
>
> But in any case, this example comes back to the question of len again, and we've already covered why this is problematic. In case you missed it, let's take a toy example which demonstrates the problem:
>
>     def mean(it):
>         if isinstance(it, map):
>             # Hypothetical attribute access to the underlying iterable.
>             n = len(it.iterable)
>             return sum(it)/n
>
> Now let's pass a map object to it:
>
>     data = [1, 2, 3, 4, 5]
>     it = map(lambda x: x, data)
>     assert len(it.iterable) == 5
>     next(it); next(it); next(it)
>
>     assert mean(it) == 4.5
>     # fails, as it actually returns 9/5 instead of 9/2
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From erik.m.bray at gmail.com  Wed Nov 28 11:25:03 2018
From: erik.m.bray at gmail.com (E. Madison Bray)
Date: Wed, 28 Nov 2018 17:25:03 +0100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128150348.GT4319@ando.pearwood.info> <20181128153305.GV4319@ando.pearwood.info>
Message-ID: 

I should add, I know the history here of bitterness surrounding Python 3 complaints and this is not one. I defend most things Python 3 and have ported many projects (Sage just being the largest by orders of magnitude, with every Python 3 porting quirk represented and often magnified). I agree with the new iterable map(), filter(), and zip() and welcomed that change. But I think making them more introspectable would be a useful enhancement.

On Wed, Nov 28, 2018, 17:16 E. Madison Bray wrote:

> Probably the most prevalent reason it made things *worse* is that many functions that can take collections as arguments--in fact probably most--were never written to accept arbitrary iterables in the first place. Perhaps they should have been, but the majority of that was before my time so I and others who worked on the Python 3 port were stuck with that.
> Sure the fix is simple enough: check if the object is iterable (itself not always as simple as one might assume) and then call list() on it. But we're talking thousands upon thousands of functions that need to be updated where examples involving map previously would have just worked.
>
> But on top of the obvious workarounds I would now like to do things like protect users, where possible, from doing things like passing arbitrarily sized data to relatively flimsy C libraries, or as I mentioned in my last message make new optimizations that weren't possible before.
>
> Of course this isn't always possible, such as when dealing with an arbitrary opaque iterator, or some pathological cases. But I'm concerned more about doing the best we can in the most common cases (lists, tuples, vectors, etc.) which are *vastly* more common.
>
> I use SageMath as an example but I'm sure others could come up with their own clever use cases. I know there are other cases where I've wanted to at least try to get the len of a map, at least in cases where it was unambiguous (for example making a progress meter or something).
>
> On Wed, Nov 28, 2018, 16:33 Steven D'Aprano wrote:
>
>> On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:
>>
>> > For example, some function that used to expect some finite-sized
>> > sequence such as a list or tuple is now passed a "map", possibly
>> > wrapping one or more iterable of arbitrary, possibly non-finite size.
>> > For the purposes of some algorithm I have this is not useful and I
>> > need to convert it to a sequence anyways but don't want to do that
>> > without some guarantee that I won't blow up the user's memory usage.
>> > So I might want to check:
>> >
>> >     finite_definite = True
>> >     for it in my_map.iters:
>> >         try:
>> >             len(it)
>> >         except TypeError:
>> >             finite_definite = False
>> >
>> >     if finite_definite:
>> >         my_seq = list(my_map)
>> >     else:
>> >         # some other algorithm
>>
>> But surely you didn't need to do this just because of *map*. Users could have passed an infinite, unsized iterable going back to Python 1 days with the old sequence protocol. They certainly could pass a generator or other opaque iterator apart from map. So I'm having trouble seeing why the Python 2/3 change to map made things worse for SageMath.
>>
>> But in any case, this example comes back to the question of len again, and we've already covered why this is problematic. In case you missed it, let's take a toy example which demonstrates the problem:
>>
>>     def mean(it):
>>         if isinstance(it, map):
>>             # Hypothetical attribute access to the underlying iterable.
>>             n = len(it.iterable)
>>             return sum(it)/n
>>
>> Now let's pass a map object to it:
>>
>>     data = [1, 2, 3, 4, 5]
>>     it = map(lambda x: x, data)
>>     assert len(it.iterable) == 5
>>     next(it); next(it); next(it)
>>
>>     assert mean(it) == 4.5
>>     # fails, as it actually returns 9/5 instead of 9/2
>>
>> --
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jfine2358 at gmail.com  Wed Nov 28 11:28:49 2018
From: jfine2358 at gmail.com (Jonathan Fine)
Date: Wed, 28 Nov 2018 16:28:49 +0000
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128150348.GT4319@ando.pearwood.info> <20181128153305.GV4319@ando.pearwood.info>
Message-ID: 

Hi Madison

Is there a URL somewhere where I can view code written to port sage to Python3? I've already found https://trac.sagemath.org/search?q=python3

And because I'm a bit interested in cluster algebra, I went to https://git.sagemath.org/sage.git/commit/?id=3a6f494ac1d4dbc1e22b0ecbebdbc639f6c7f6d3

Is this a good example of the change required? Are there other examples worth looking at?

-- Jonathan

From boxed at killingar.net  Wed Nov 28 11:37:39 2018
From: boxed at killingar.net (Anders Hovmöller)
Date: Wed, 28 Nov 2018 17:37:39 +0100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: 

> I just mentioned that porting effort for background. I still believe
> that the actual proposal of making the arguments to a map(...) call
> accessible from Python as attributes of the map object (ditto filter,
> zip, etc.) is useful in its own right, rather than just having this
> completely opaque iterator.

+1. Throwing away information is almost always a bad idea. That was fixed with classes and kwargs in 3.6, which removes a lot of fiddly workarounds for example. Throwing away data needlessly is also why 2to3, baron, Parso and probably many more had to reimplement a Python parser instead of using the built in.

We should have information preservation and transparency be general design goals imo. Not because we can see the obvious use now but because it keeps the door open to discover uses later.

/ Anders

From tjreedy at udel.edu  Wed Nov 28 14:53:50 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 28 Nov 2018 14:53:50 -0500
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: 

On 11/28/2018 9:27 AM, E. Madison Bray wrote:
> On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert wrote:
>>
>> I just ran into the following behavior, and found it surprising:
>>
>>     >>> len(map(float, [1,2,3]))
>>     TypeError: object of type 'map' has no len()
>>
>> I understand that map() could be given an infinite sequence and therefore might not always have a length. But in this case, it seems like map() should've known that its length was 3. I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason.
>>
>> My proposal is to delegate map.__len__() to the underlying iterable.

One of the guidelines in the Zen of Python is "Special cases aren't special enough to break the rules."

This proposal claims that the Python 3 built-in iterator class 'map' is so special that it should break the rule that iterators in general cannot and therefore do not have .__len__ methods because their size may be infinite, unknowable until exhaustion, or declining with each .__next__ call.

For iterators, 3.4 added an optional __length_hint__ method. This makes sense for iterators, like tuple_iterator, list_iterator, range_iterator, and dict_keyiterator, based on a known finite collection.
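A rough, illustrative sketch of what one such length-hinting subclass might look like (nothing like this exists in the stdlib, and the hint is computed from the original inputs, so it can overestimate once the map has been partly consumed):

    import operator

    class sized_map(map):
        """Illustrative only: a map subclass that reports a length hint."""
        def __new__(cls, func, *iterables):
            self = super().__new__(cls, func, *iterables)
            self._sources = iterables  # keep the original inputs around
            return self
        def __length_hint__(self):
            # 0 is the conventional "don't know"; map() stops at its shortest input.
            hints = [operator.length_hint(it, 0) for it in self._sources]
            return min(hints) if all(hints) else 0

    # operator.length_hint(sized_map(float, [1, 2, 3])) -> 3, but only as a hint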
At the time, map.__length_hint__ was proposed and rejected as problematic, for obvious reasons, and insufficiently useful.

The proposal above amounts to adding an unspecified __length_hint__ misnamed as __len__. Won't happen. Instead, proponents should define and test one or more specific implementations of __length_hint__ in map subclass(es).

> I mostly agree with the existing objections, though I have often found
> myself wanting this too, especially now that `map` does not simply
> return a list.

What makes the map class special among all built-in iterator classes? It appears not to be a property of the class itself, as an iterator class, but of its name. In Python 2, 'map' was bound to a different implementation of the map idea, a function that produced a list, which has a length. I suspect that if Python 3 were the original Python, we would not have this discussion.

> As a simple counter-proposal which I believe has fewer issues, I would
> really like it if the built-in `map()` and `filter()` at least
> provided a Python-level attribute to access the underlying iterables.

This proposes to make map (and filter) special in a different way, by adding other special (dunder) attributes. In general, built-in callables do not attach their args to their output, for obvious reasons. If they do, they do not expose them. If input data must be saved, the details are implementation dependent. A C-coded callable would not necessarily save information in the form of Python objects.

Again, it seems to me that the only thing special about these two, versus the other iterators left in itertools, is the history of the names.

> This is necessary because if I have a function that used to take, say,
> a list as an argument, and it receives a `map` object, I now have to
> be able to deal with map()s,

If a function is documented as requiring a list, or a sequence, or a length object, it is a user bug to pass an iterator. The only thing special about map and filter as errors is the rebinding of the names between Py2 and Py3, so that the same code may be good in 2.x and bad in 3.x.

Perhaps 2.7, in addition to future imports of text as unicode and print as a function, should have had one to make map and filter be the 3.x iterators.

Perhaps Sage needs something like

    def size_map(func, *iterables):
        for it in iterables:
            if not hasattr(it, '__len__'):
                raise TypeError(f'iterable {repr(it)} has no size')
        return map(func, *iterables)

https://docs.python.org/3/library/functions.html#map says "map(function, iterable, ...) Return an iterator [...]" The wording is intentional. The fact that map is a class and the iterator an instance of the class is a CPython implementation detail. Another implementation could use the generator function equivalent given in the Python 2 itertools doc, or a translation thereof. I don't know what pypy and other implementations do. The fact that CPython itertools callables are (now) C-coded classes instead of Python-coded generator functions, or C translations thereof (which is tricky) is for performance and ease of maintenance.

-- Terry Jan Reedy

From abedillon at gmail.com  Wed Nov 28 15:28:16 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 14:28:16 -0600
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: 
Message-ID: 

[Marko Ristin-Kaufmann]
> Have you looked at the recent discussions regarding design-by-contract on this list

I tried to read through them all before posting, but I may have missed some of the forks.
There was a lot of good discussion!

[Marko Ristin-Kaufmann]
> You might want to have a look at static checking techniques such as abstract interpretation. I hope to be able to work on such a tool for Python in some two years from now. We can stay in touch if you are interested.

I'll look into that! I'm very interested!

[Marko Ristin-Kaufmann]
> Re decorators: to my own surprise, using decorators in a larger code base is completely practical including the readability and maintenance of the code. It's neither that ugly nor problematic as it might seem at first look.

Interesting. In the thread you linked on DBC, it seemed like Steve D'Aprano and David Mertz (and possibly others) were put off by the verbosity and noisiness of the decorator-based solution you provided with icontract (though I think there are ways to streamline that solution). It seems like syntactic support could offer a more concise and less noisy implementation.

One thing that I can get on a soap-box about is the benefit of putting the most relevant information to the reader in the order of top to bottom and left to right whenever possible. I've written many posts about this. I think a lot of Python syntax gets this right. It would have been easy to follow the same order as for-loops when designing comprehensions, but expressions allow you some freedom to order things differently, so now comprehensions read:

    squares = ...                                 # squares is
    squares = [...                                # squares is a list
    squares = [number*number...                   # squares is a list of num squared
    squares = [number*number for num in numbers]  # squares is a list of num squared 'from' numbers

I think decorators sort-of break this rule because they can put a lot of less important information (like, that a function is logged or timed) before more important information (like the function's name, signature, doc-string, etc...). It's not a huge deal because they tend to be de-emphasized by my IDE and there typically aren't dozens of them on each function, but I definitely prefer Eiffel's syntax over decorators for that reason.

I understand that syntax changes have a very high bar for very good reasons. Hillel Wayne's PyCon talk got me thinking that we might be close enough to a really great solution to a wide variety of testing problems that it might justify some new syntax or perhaps someone has an idea that wouldn't require new syntax that I didn't think of.

[Marko Ristin-Kaufmann]
> Some of the aspects we still haven't figured out are: how to approach multi-threading (locking around the whole function with an additional decorator?) and granularity of contract switches (right now we use always/optimized, production/non-optimized and testing/slow, but it seems that a larger system requires finer categories).

Yeah... I don't know anything about testing concurrent or parallel code.

On Wed, Nov 28, 2018 at 1:12 AM Marko Ristin-Kaufmann <marko.ristin at gmail.com> wrote:

> Hi Abe,
>
> > I've been pulling a lot of ideas from the recent discussion on design by
> > contract (DBC), the elegance and drawbacks of doctests, and the amazing talk
> > given by Hillel Wayne at this year's PyCon entitled "Beyond Unit Tests: Taking
> > your Tests to the Next Level".
>
> Have you looked at the recent discussions regarding design-by-contract on this list (https://groups.google.com/forum/m/#!topic/python-ideas/JtMgpSyODTU and the following forked threads)?
>
> You might want to have a look at static checking techniques such as
> abstract interpretation.
> I hope to be able to work on such a tool for Python in some two years from now. We can stay in touch if you are interested.
>
> Re decorators: to my own surprise, using decorators in a larger code base is completely practical including the readability and maintenance of the code. It's neither that ugly nor problematic as it might seem at first look.
>
> We use our https://github.com/Parquery/icontract at the company. Most of the design choices come from practical issues we faced -- so you might want to read the doc even if you don't plan to use the library.
>
> Some of the aspects we still haven't figured out are: how to approach multi-threading (locking around the whole function with an additional decorator?) and granularity of contract switches (right now we use always/optimized, production/non-optimized and testing/slow, but it seems that a larger system requires finer categories).
>
> Cheers Marko

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg.ewing at canterbury.ac.nz  Wed Nov 28 15:41:41 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 29 Nov 2018 09:41:41 +1300
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: <5BFEFD85.3060700@canterbury.ac.nz>

E. Madison Bray wrote:
> if I have a function that used to take, say,
> a list as an argument, and it receives a `map` object, I now have to
> be able to deal with map()s, and I may have checks I want to perform
> on the underlying iterables before, say, I try to iterate over the
> `map`.

This sounds like a backwards way to address the issue. If you have a function that expects a list in particular, it's up to its callers to make sure they give it one. Instead of making the function do a bunch of looking before it leaps, it would be better to define something like

    def lmap(f, *args):
        return list(map(f, *args))

and then replace 'map' with 'lmap' elsewhere in your code.

-- Greg

From abedillon at gmail.com  Wed Nov 28 16:31:54 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 15:31:54 -0600
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: <20181128094443.GS4319@ando.pearwood.info>
References: <20181128094443.GS4319@ando.pearwood.info>
Message-ID: 

[Steven D'Aprano]
> You should look at the state of the art in Design By Contract. In Eiffel, DBC is integrated in the language:
> https://www.eiffel.com/values/design-by-contract/introduction/
>
> https://www.eiffel.org/doc/eiffel/ET-_Design_by_Contract_%28tm%29%2C_Assertions_and_Exceptions
>
> Eiffel uses a rather Pythonic block structure to define invariants. The syntax is not identical to Python's (Eiffel eschews the colons) but it also comes close to executable pseudo-code.

Thank you! I forgot to mention this (or look into how other languages solve this problem). I saw your example syntax in the recent DBC main thread and liked it a lot.

One thought I keep coming back to is this comparison between docstring formats. It seems obvious that the "Sphynxy" style is the noisiest, most verbose, and ugliest format. Instead of putting ":arg ...:" and ":type ...:" for each parameter and the return value, it makes much more sense to open up an Args: section and use a concise notation for type.
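For concreteness, here is the same (hypothetical) function documented in both of those real, widely used conventions:

    def format_exception_sphinx(etype, value):
        """Format the exception with a traceback.

        :param etype: what etype represents
        :type etype: str
        :param value: what value represents
        :type value: int
        :return: what the return value represents
        :rtype: str
        """

    def format_exception_args(etype, value):
        """Format the exception with a traceback.

        Args:
            etype (str): what etype represents
            value (int): what value represents

        Returns:
            str: what the return value represents
        """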
The decorator-based pre and post conditions seem like they suffer from the same redundant, noisy, verbosity problem as the Sphynxy docstring format, but make it worse by putting all that noise before the function declaration itself. It makes sense to me that a docstring might have a markdown-style syntax like

    def format_exception(etype, value):
        """
        Format the exception with a traceback.

        Args:
            etype (str): what etype represents
                [some constraint on etype](precondition)
                [another constraint on etype](in_line_precondition?)
            value (int): what value represents
                [some constraint on value](precondition)

        [some constraints across multiple params](precondition)

        Returns:
            What the return value represents  # usually very similar to the description at the top
            [some constraint on return](postcondition)
        """
        ...

That ties most bits of the documentation to some code that enforces the correctness of the documentation. And if it's a little noisy, we could take another page from markdown's book and offer alternate ways to reference precondition and postcondition logic. I'm worried that such a style would carry a lot of the same drawbacks as doctests.

Also, my sense of coding style has been heavily influenced by [this talk](https://vimeo.com/74316116), particularly the part where he shoves a mangled Hamlet Soliloquy into the margins, so now many of my functions adopt the following style:

    def someDescriptiveName(
            arg1: SomeType,
            arg2: AnotherType[Thing],
            ...
            argN: SomeOtherType = default_value) -> ReturnType:
        """ what the function does

        Args:
            arg1: what arg1 represents
            arg2: what arg2 represents
            ...
        """
        ...

This highlights a rather obvious duplication of code. We declare an arguments section in code and list all the arguments, then we do so again in the doc string. If you want your doc string to stay in sync with the code, this duplication is a problem. It makes more sense to tie the documentation for an argument to said argument:

    def someDescriptiveName(  # what the function does
            arg1: SomeType,  # what arg1 represents
            arg2: AnotherType[Thing],  # what arg2 represents
            ...
            argN: SomeOtherType = default_value  # what argN represents
            ) -> ReturnType:  # what the return value represents
        ...

I think it especially makes sense if you consider the preconditions, postconditions, and invariants as a sort-of extension of typing, in the sense that typing narrows the set of acceptable values to a set of types and contracts restrict that set further.

I hope that clarifies my thought process. I don't like the d-strings that I proposed. I'd prefer syntax closer to Eiffel, but the above is the line of thought I was following to arrive at d-strings.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abedillon at gmail.com  Wed Nov 28 16:58:24 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 15:58:24 -0600
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: <20181128151810.7c2393c2@fsol>
References: <20181128151810.7c2393c2@fsol>
Message-ID: 

[Antoine Pitrou]
> I think utopia is the word here. Fuzz testing can be useful, but it's not a replacement for manual testing of carefully selected values.

First, they aren't mutually exclusive. It's trivial to add manually selected cases to a hypothesis test. Second, from my experience, people rarely choose between carefully selected optimal values and fuzz testing; they usually choose between manually selected trivial values and no test at all.
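To ground that first point, a minimal sketch using hypothesis (given, example, and strategies are the library's actual API; the property under test here is just integer addition):

    from hypothesis import example, given, strategies as st

    @given(x=st.integers(), y=st.integers())
    @example(x=0, y=0)           # manually selected trivial case, always run
    @example(x=-1, y=10**100)    # another hand-picked edge case
    def test_addition_commutes(x, y):
        # generated values and the pinned examples all run through here
        assert x + y == y + x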
Thirdly, Computers are very good at exhaustively searching multidimensional spaces. If your tool sucks so bad at that that a human can do it better, then your tool needs work. Improving the tool saves way more time than reverting to manual testing.

There was a post long ago (I think I read it on Digg.com, which gives some indication of how long ago) about how to run a cloud-based system correctly. One of the controversial practices the article advocated was disabling ssh on the machine instances. The rationale is that you never want to waste your time fiddling with an instance that's not behaving properly. In cloud-systems, instances should not be special. If they fail, blow them away and bring up another. If the failure persists, it's a problem with the *system*, not the instance. If you care about individual instances, YOU'RE DOING IT WRONG. You need to re-design the system.

On Wed, Nov 28, 2018 at 8:19 AM Antoine Pitrou wrote:

> On Tue, 27 Nov 2018 22:47:06 -0600 Abe Dillon wrote:
> >
> > If we could figure out a cleaner syntax for defining invariants, preconditions, and postconditions we'd be half-way to automated testing UTOPIA! (ok, maybe I'm being a little over-zealous)
>
> I think utopia is the word here. Fuzz testing can be useful, but it's not a replacement for manual testing of carefully selected values.
>
> Also, the idea that fuzz testing will automatically find edge cases in your code is idealistic. It depends on the algorithm you've implemented and the distribution of values chosen by the tester. Showcasing trivially wrong examples (such as an addition function that always returns 0, or a tail function that doesn't return the tail) isn't very helpful for a real-world analysis, IMHO.
>
> In the end, you have to be rigorous when writing tests, and for most non-trivial functions it requires that you devise the distribution of input values depending on the implemented algorithm, not leave that distribution to a third-party library that knows nothing about your program.
>
> Regards
>
> Antoine.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Wed Nov 28 17:03:23 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 29 Nov 2018 09:03:23 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: <20181128220323.GX4319@ando.pearwood.info>

On Wed, Nov 28, 2018 at 05:37:39PM +0100, Anders Hovmöller wrote:

> > I just mentioned that porting effort for background. I still believe
> > that the actual proposal of making the arguments to a map(...) call
> > accessible from Python as attributes of the map object (ditto filter,
> > zip, etc.) is useful in its own right, rather than just having this
> > completely opaque iterator.
>
> +1. Throwing away information is almost always a bad idea.

"Almost always"? Let's take this seriously, and think about the consequences if we actually believed that. If I created a series of integers:

    a = 23
    b = 0x17
    c = 0o27
    d = 0b10111
    e = int('1b', 12)

your assertion would say it is a bad idea to throw away the information about how they were created, and hence we ought to treat all five values as distinct and distinguishable. So much for the small integer cache...
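(Concretely, all five spellings are indistinguishable after creation -- and in CPython they even end up as the same cached object:)

    >>> a = 23; b = 0x17; c = 0o27; d = 0b10111; e = int('1b', 12)
    >>> a == b == c == d == e
    True
    >>> a is e   # CPython caches small ints, so they are one object
    True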
Perhaps every single object we create ought to hold onto an AST representing the literal or expression which was used to create it.

Let's not exaggerate the benefit, and ignore the costs, of "throwing away information". Sometimes we absolutely do want to throw away information, or at least make it inaccessible to the consumer of our data structures. Sometimes the right thing to do is *not* to open up interfaces unless there is a clear need for it to be open. Doing so adds bloat to the interface, prevents many changes in implementation including potential optimizations, and may carry significant memory burdens.

Bringing this discussion back to the concrete proposal in this thread: as I said earlier, I want to agree with this proposal. I too like the idea of having map (and filter, and zip...) objects expose their arguments, and for the same reason: "it might be useful some day". But every time we scratch beneath the surface and try to think about how and when we would actually use that information, we run into conceptual and practical problems which suggest strongly to me that doing this will turn it into a serious bug magnet, an anti-feature which sounds good but causes more problems than it solves.

I'm really hoping someone can convince me this is a good idea, but so far the proposal seems like an attractive nuisance and not a feature.

> We should have information preservation and transparency be general
> design goals imo. Not because we can see the obvious use now but
> because it keeps the door open to discover uses later.

While that is a reasonable position to take in some circumstances, in others it goes completely against YAGNI.

-- Steve

From solipsis at pitrou.net  Wed Nov 28 17:08:26 2018
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 28 Nov 2018 23:08:26 +0100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
References: <20181128151810.7c2393c2@fsol>
Message-ID: <20181128230826.39ce721c@fsol>

On Wed, 28 Nov 2018 15:58:24 -0600 Abe Dillon wrote:

> Thirdly, Computers are very good at exhaustively searching multidimensional spaces.

How long do you think it will take your computer to exhaustively search the space of possible input values to a 2-integer addition function?

Do you think it can finish before the Earth gets engulfed by the Sun?

Regards

Antoine.

From abedillon at gmail.com  Wed Nov 28 17:24:50 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 16:24:50 -0600
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: <20181128230826.39ce721c@fsol>
References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol>
Message-ID: 

[Antoine Pitrou]
> How long do you think it will take your computer to exhaustively search
> the space of possible input values to a 2-integer addition function?
> Do you think it can finish before the Earth gets engulfed by the Sun?

Yes, ok. I used the word "exhaustively" wrong. Sorry about that. I don't think humans are made of a magical substance that can exhaustively search the space of possible pairs of integers before the heat-death of the universe. I think humans use strategies based, hopefully, in logic to come up with test examples, and that it's often more valuable to capture said strategies in code than to make a human run the algorithms. In cases where domain-knowledge helps inform the search strategy, there should be easy-to-use tools to build a domain-specific search strategy.
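A sketch of what such a domain-specific strategy can look like in hypothesis (st.composite is real API; the grid-point domain is invented for illustration):

    from hypothesis import given, strategies as st

    @st.composite
    def grid_points(draw, height=5, width=10):
        # Domain knowledge lives in the strategy: only valid grid points are drawn.
        y = draw(st.integers(0, height - 1))
        x = draw(st.integers(0, width - 1))
        return (y, x)

    @given(pt=grid_points())
    def test_point_is_on_grid(pt):
        y, x = pt
        assert 0 <= y < 5 and 0 <= x < 10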
On Wed, Nov 28, 2018 at 4:09 PM Antoine Pitrou wrote:

> On Wed, 28 Nov 2018 15:58:24 -0600 Abe Dillon wrote:
> > Thirdly, Computers are very good at exhaustively searching multidimensional spaces.
>
> How long do you think it will take your computer to exhaustively search the space of possible input values to a 2-integer addition function?
>
> Do you think it can finish before the Earth gets engulfed by the Sun?
>
> Regards
>
> Antoine.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From steve at pearwood.info  Wed Nov 28 17:27:14 2018
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 29 Nov 2018 09:27:14 +1100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: <20181128222714.GY4319@ando.pearwood.info>

On Wed, Nov 28, 2018 at 02:53:50PM -0500, Terry Reedy wrote:

> One of the guidelines in the Zen of Python is "Special cases aren't special enough to break the rules."
>
> This proposal claims that the Python 3 built-in iterator class 'map' is so special that it should break the rule that iterators in general cannot and therefore do not have .__len__ methods because their size may be infinite, unknowable until exhaustion, or declining with each .__next__ call.
>
> For iterators, 3.4 added an optional __length_hint__ method. This makes sense for iterators, like tuple_iterator, list_iterator, range_iterator, and dict_keyiterator, based on a known finite collection. At the time, map.__length_hint__ was proposed and rejected as problematic, for obvious reasons, and insufficiently useful.

Thanks for the background, Terry, but doesn't that suggest that sometimes special cases ARE special enough to break the rules? *wink*

Unfortunately, I don't think it is obvious why map.__length_hint__ is problematic. It only needs to return the *maximum* length, or some sentinel (zero?) to say "I don't know". It doesn't need to be accurate, unlike __len__ itself. Perhaps we should rethink the decision not to give map() and filter() a length hint?

[...]
> What makes the map class special among all built-in iterator classes? It appears not to be a property of the class itself, as an iterator class, but of its name. In Python 2, 'map' was bound to a different implementation of the map idea, a function that produced a list, which has a length. I suspect that if Python 3 were the original Python, we would not have this discussion.

No, in fairness, I too have often wanted to know the length of an arbitrary iterator, including map(), without consuming it. In general this is an unsolvable problem, but sometimes it is (or at least, at first glance *seems*) solvable. map() is one of those cases. If we could solve it, that would be great -- but I'm not convinced that it is solvable, since the solution seems worse than the problem it aims to solve. But I live in hope that somebody cleverer than me can point out the flaws in my argument.

[...]
> If a function is documented as requiring a list, or a sequence, or a length object, it is a user bug to pass an iterator. The only thing special about map and filter as errors is the rebinding of the names between Py2 and Py3, so that the same code may be good in 2.x and bad in 3.x.
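Concretely, the rebinding in question (the object address below is elided):

    $ python2.7 -c "print map(len, ['a', 'bc'])"
    [1, 2]
    $ python3 -c "print(map(len, ['a', 'bc']))"
    <map object at 0x...>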
> Perhaps 2.7, in addition to future imports of text as unicode and print
> as a function, should have had one to make map and filter be the 3.x
> iterators.

I think that's future_builtins:

[steve at ando ~]$ python2.7 -c "from future_builtins import *; print map(len, [])"

But that wouldn't have helped E. Madison Bray or SageMath, since their difficulty is not their own internal use of map(), but their users' use of map(). Unless they simply ban any use of iterators at all, which I imagine will be a backwards-incompatible change (and for that matter an excessive overreaction for many uses), SageMath can't prevent users from providing map() objects or other iterator arguments.

-- Steve
From greg.ewing at canterbury.ac.nz  Wed Nov 28 17:45:05 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 29 Nov 2018 11:45:05 +1300
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: <5BFF1A71.203@canterbury.ac.nz>

E. Madison Bray wrote:
> I still believe that the actual proposal of making the arguments to a map(...) call accessible from Python as attributes of the map object (ditto filter, zip, etc.) is useful in its own right, rather than just having this completely opaque iterator.

But it will only help if the user passes a map object in particular, and not some other kind of iterator. Also it won't help if the inputs to the map are themselves iterators that aren't amenable to inspection. This smells like exposing an implementation detail of your function in its API.

I don't see how it would help with your Sage port either, since the original code only got the result of the mapping and wouldn't have been able to inspect the underlying iterables.

I wonder whether it's too late to redefine map() so that it returns a view object instead of an iterator, as was done when merging dict.{items, iter_items} etc. Alternatively, add a mapped() builtin that returns a view.

-- Greg

From greg.ewing at canterbury.ac.nz  Wed Nov 28 17:59:27 2018
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 29 Nov 2018 11:59:27 +1300
Subject: [Python-ideas] __len__() for map()
In-Reply-To: 
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128150348.GT4319@ando.pearwood.info>
Message-ID: <5BFF1DCF.5040003@canterbury.ac.nz>

E. Madison Bray wrote:
> So I might want to check:
>
>     finite_definite = True
>     for it in my_map.iters:
>         try:
>             len(it)
>         except TypeError:
>             finite_definite = False
>
>     if finite_definite:
>         my_seq = list(my_map)
>     else:
>         # some other algorithm

If map is being passed into your function, you can still do this check before calling map. If the user is doing the mapping themselves, then in Python 2 it would have blown up anyway before your function even got called, so nothing is any worse.

-- Greg

From abedillon at gmail.com  Wed Nov 28 18:14:54 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 17:14:54 -0600
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net>
Message-ID: 

I raised a related problem a while back when I found that random.sample can only take a sequence. The example I gave was randomly sampling points on a 2D grid to initialize a board for Conway's Game of Life:

    >>> def random_board(height: int, width: int, ratio: float = 0.5) -> Set[Tuple[int, int]]:
    ...     """ produce a set of points randomly chosen from a height x width grid """
    ...     all_points = itertools.product(range(height), range(width))
    ...     num_samples = ratio*height*width
    ...     return set(random.sample(all_points, num_samples))
    ...
    >>> random_board(height=5, width=10, ratio=0.25)
    TypeError: Population must be a sequence or set.  For dicts, use list(d).

It seems like there should be some way to pass along the information that the size *is* known, but I couldn't think of any way of passing that info along without adding massive amounts of complexity everywhere. If map is able to support len() under certain circumstances, it makes sense that other iterators and generators would be able to do the same.
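A minimal sketch of passing that information along -- illustrative only; random.sample would still reject it because it needs random access, but len() and progress meters would work:

    import itertools

    class SizedIter:
        """An iterator that also reports how many items remain."""
        def __init__(self, iterable, length):
            self._it = iter(iterable)
            self._remaining = length
        def __iter__(self):
            return self
        def __next__(self):
            value = next(self._it)   # raises StopIteration when exhausted
            self._remaining -= 1
            return value
        def __len__(self):
            return max(self._remaining, 0)

    all_points = SizedIter(itertools.product(range(5), range(10)), 5 * 10)
    assert len(all_points) == 50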
You might even want a way to annotate a generator function with logic about how it might support len(). I don't have an answer to this problem, but I hope this provides some sense of the scope of what you're asking.

On Mon, Nov 26, 2018 at 3:36 PM Kale Kundert wrote:

> I just ran into the following behavior, and found it surprising:
>
>     >>> len(map(float, [1,2,3]))
>     TypeError: object of type 'map' has no len()
>
> I understand that map() could be given an infinite sequence and therefore might not always have a length. But in this case, it seems like map() should've known that its length was 3. I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason.
>
> My proposal is to delegate map.__len__() to the underlying iterable. Similarly, map.__getitem__() could be implemented if the underlying iterable supports item access:
>
>     class map:
>
>         def __init__(self, func, iterable):
>             self.func = func
>             self.iterable = iterable
>
>         def __iter__(self):
>             yield from (self.func(x) for x in self.iterable)
>
>         def __len__(self):
>             return len(self.iterable)
>
>         def __getitem__(self, key):
>             return self.func(self.iterable[key])
>
> Let me know if there are any downsides to this that I'm not seeing. From my perspective, it seems like there would be only a number of (small) advantages:
>
> - Less surprising
> - Avoid some unnecessary copies
> - Backwards compatible
>
> -Kale
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mertz at gnosis.cx  Wed Nov 28 18:24:19 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 28 Nov 2018 18:24:19 -0500
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: <20181128230826.39ce721c@fsol>
References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol>
Message-ID: 

That's easy, Antoine. On a reasonable modern multi-core workstation, I can do 4 billion additions per second. A year is just over 30 million seconds. For 32-bit ints, I can whiz through the task in only 130,000 years. We have at least several hundred million years before the sun engulfs us.

On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:

> On Wed, 28 Nov 2018 15:58:24 -0600 Abe Dillon wrote:
> > Thirdly, Computers are very good at exhaustively searching multidimensional spaces.
>
> How long do you think it will take your computer to exhaustively search the space of possible input values to a 2-integer addition function?
>
> Do you think it can finish before the Earth gets engulfed by the Sun?
>
> Regards
>
> Antoine.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com  Wed Nov 28 18:27:03 2018
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 29 Nov 2018 10:27:03 +1100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol>
Message-ID: 

On Thu, Nov 29, 2018 at 10:25 AM David Mertz wrote:
>
> That's easy, Antoine. On a reasonable modern multi-core workstation, I can do 4 billion additions per second. A year is just over 30 million seconds. For 32-bit ints, I can whiz through the task in only 130,000 years. We have at least several hundred million years before the sun engulfs us.
>

Python ints are not 32-bit ints. Have fun. :)

ChrisA

From antoine at python.org  Wed Nov 28 18:42:50 2018
From: antoine at python.org (Antoine Pitrou)
Date: Thu, 29 Nov 2018 00:42:50 +0100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol>
Message-ID: 

But Python integers are variable-sized, and their size is basically limited by available memory or address space.

Let's take a typical 64-bit Python build, assuming 4 GB RAM available. Let's also assume that 90% of those 4 GB can be readily allocated for Python objects (there's overhead, etc.).

Also let's take a look at the Python integer representation:

    >>> sys.int_info
    sys.int_info(bits_per_digit=30, sizeof_digit=4)

This means that every 4 bytes of integer object store 30 bits of actual integer data.

So, how many bits has the largest allocatable integer on that system, assuming 90% of 4 GB are available for allocation?

    >>> nbits = (2**32)*0.9*30/4
    >>> nbits
    28991029248.0

Now how many possible integers are there in that number of bits?

    >>> x = 1 << int(nbits)
    >>> x.bit_length()
    28991029249

(yes, that number was successfully allocated in full. And the Python process occupies 3.7 GB RAM at that point, which validates the estimate.)

Let's try to have a readable approximation of that number. Convert it to a float perhaps?

    >>> float(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: int too large to convert to float

Well, of course. So let's just extract a power of 10:

    >>> math.log10(x)
    8727169408.819794
    >>> 10**0.819794
    6.603801339268099

(yes, math.log10() works on non-float-convertible integers. I'm impressed!)

So the number of representable integers on that system is approximately 6.6e8727169408. Let's hope the Sun takes its time.

(and of course, what is true for ints is true for any variable-sized input, such as strings, lists, dicts, sets, etc.)

Regards

Antoine.

Le 29/11/2018 à 00:24, David Mertz a écrit :
> That's easy, Antoine. On a reasonable modern multi-core workstation, I can do 4 billion additions per second. A year is just over 30 million seconds. For 32-bit ints, I can whiz through the task in only 130,000 years. We have at least several hundred million years before the sun engulfs us.
>
> On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:
>
> > On Wed, 28 Nov 2018 15:58:24 -0600 Abe Dillon wrote:
> > > Thirdly, Computers are very good at exhaustively searching multidimensional spaces.
> >
> > How long do you think it will take your computer to exhaustively search the space of possible input values to a 2-integer addition function?
> >
> > Do you think it can finish before the Earth gets engulfed by the Sun?
> >
> > Regards
> >
> > Antoine.
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/

From marcos.eliziario at gmail.com  Wed Nov 28 20:22:20 2018
From: marcos.eliziario at gmail.com (Marcos Eliziario)
Date: Wed, 28 Nov 2018 23:22:20 -0200
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol>
Message-ID: 

But nobody is talking about exhausting the combinatoric space of all possible values. Property Based Testing looks like Fuzzy Testing but it is not quite the same thing. Property based testing is not about just generating random values till the heat death of the universe, but generating sensible values in a configurable way to cover all equivalence classes we can think of. If my function takes two floating point numbers as arguments, hypothesis "strategies" won't try all possible combinations of all possible floating point values, but instead all possible combinations of interesting values (NaN, Infinity, too big, too small, positive, negative, zero, None, decimal fractions, etc..), something that an experienced programmer probably would end up doing by himself with a lot of test cases, but that can be better done with less effort by the automation provided by the hypothesis package.

It could well be that just by using such a tool, a naive programmer could end up being convinced of the fact that maybe he probably would better be served by sticking to decimal arithmetic :-)

Em qua, 28 de nov de 2018 às 21:43, Antoine Pitrou escreveu:

> But Python integers are variable-sized, and their size is basically limited by available memory or address space.
>
> Let's take a typical 64-bit Python build, assuming 4 GB RAM available. Let's also assume that 90% of those 4 GB can be readily allocated for Python objects (there's overhead, etc.).
>
> Also let's take a look at the Python integer representation:
>
> >>> sys.int_info
> sys.int_info(bits_per_digit=30, sizeof_digit=4)
>
> This means that every 4 bytes of integer object store 30 bits of actual integer data.
>
> So, how many bits has the largest allocatable integer on that system, assuming 90% of 4 GB are available for allocation?
>
> >>> nbits = (2**32)*0.9*30/4
> >>> nbits
> 28991029248.0
>
> Now how many possible integers are there in that number of bits?
>
> >>> x = 1 << int(nbits)
> >>> x.bit_length()
> 28991029249
>
> (yes, that number was successfully allocated in full. And the Python process occupies 3.7 GB RAM at that point, which validates the estimate.)
>
> Let's try to have a readable approximation of that number. Convert it to a float perhaps?
>
> >>> float(x)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: int too large to convert to float
>
> Well, of course. So let's just extract a power of 10:
>
> >>> math.log10(x)
> 8727169408.819794
> >>> 10**0.819794
> 6.603801339268099
>
> (yes, math.log10() works on non-float-convertible integers. I'm impressed!)
>
> So the number of representable integers on that system is approximately 6.6e8727169408. Let's hope the Sun takes its time.
>
> (and of course, what is true for ints is true for any variable-sized input, such as strings, lists, dicts, sets, etc.)
>
> Regards
>
> Antoine.
>
> Le 29/11/2018 à 00:24, David Mertz a écrit :
> > That's easy, Antoine.
> > On a reasonable modern multi-core workstation, I can do 4 billion additions per second. A year is just over 30 million seconds. For 32-bit ints, I can whiz through the task in only 130,000 years. We have at least several hundred million years before the sun engulfs us.
> >
> > On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:
> >
> > > On Wed, 28 Nov 2018 15:58:24 -0600 Abe Dillon wrote:
> > > > Thirdly, Computers are very good at exhaustively searching multidimensional spaces.
> > >
> > > How long do you think it will take your computer to exhaustively search the space of possible input values to a 2-integer addition function?
> > >
> > > Do you think it can finish before the Earth gets engulfed by the Sun?
> > >
> > > Regards
> > >
> > > Antoine.
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Marcos Eliziário Santos
mobile/whatsapp/telegram: +55(21) 9-8027-0156
skype: marcos.eliziario at gmail.com
linked-in : https://www.linkedin.com/in/eliziario/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abedillon at gmail.com  Wed Nov 28 20:26:17 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 19:26:17 -0600
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: 
Message-ID: 

Marko, I have a few thoughts that might improve icontract. First, multiple clauses per decorator:

    @pre(lambda x: x >= 0,
         lambda y: y >= 0,
         lambda width: width >= 0,
         lambda height: height >= 0,
         lambda x, width, img: x + width <= width_of(img),
         lambda y, height, img: y + height <= height_of(img))
    @post(lambda self: (self.x, self.y) in self,
          lambda self: (self.x+self.width-1, self.y+self.height-1) in self,
          lambda self: (self.x+self.width, self.y+self.height) not in self)
    def __init__(self, img: np.ndarray, x: int, y: int, width: int, height: int) -> None:
        self.img = img[y : y+height, x : x+width].copy()
        self.x = x
        self.y = y
        self.width = width
        self.height = height

    def __contains__(self, pt: Tuple[int, int]) -> bool:
        x, y = pt
        return (self.x <= x < self.x + self.width) and (self.y <= y < self.y + self.height)

You might be able to get away with some magic by decorating a method just to flag it as using contracts:

    @contract  # <- does byte-code and/or AST voodoo
    def __init__(self, img: np.ndarray, x: int, y: int, width: int, height: int) -> None:
        pre(x >= 0,
            y >= 0,
            width >= 0,
            height >= 0,
            x + width <= width_of(img),
            y + height <= height_of(img))

        # this would probably be declared at the class level
        inv(lambda self: (self.x, self.y) in self,
            lambda self: (self.x+self.width-1, self.y+self.height-1) in self,
            lambda self: (self.x+self.width, self.y+self.height) not in self)

        self.img = img[y : y+height, x : x+width].copy()
        self.x = x
        self.y = y
        self.width = width
        self.height = height

That might be super tricky to implement, but it saves you some lambda noise. Also, I saw a forked thread in which you were considering some sort of transpiler with similar syntax to the above example. That also works.
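A rough sketch of how the multi-clause @pre could be implemented -- matching each lambda's parameter names against the wrapped function's arguments; error reporting kept minimal:

    import functools
    import inspect

    def pre(*conditions):
        """Sketch: each condition's parameter names select the arguments it checks."""
        def decorator(func):
            sig = inspect.signature(func)
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                bound = sig.bind(*args, **kwargs)
                bound.apply_defaults()
                for cond in conditions:
                    wanted = inspect.signature(cond).parameters
                    if not cond(**{name: bound.arguments[name] for name in wanted}):
                        raise AssertionError(
                            "precondition on %s failed" % ", ".join(wanted))
                return func(*args, **kwargs)
            return wrapper
        return decorator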
Another thing to consider is that the role of descriptors overlaps some with the role of invariants. I don't know what to do with that knowledge, but it seems like it might be useful.

Anyway, I hope those half-baked thoughts have *some* value...

On Wed, Nov 28, 2018 at 1:12 AM Marko Ristin-Kaufmann <marko.ristin at gmail.com> wrote:

> Hi Abe,
>
> > I've been pulling a lot of ideas from the recent discussion on design by
> > contract (DBC), the elegance and drawbacks of doctests, and the amazing talk
> > given by Hillel Wayne at this year's PyCon entitled "Beyond Unit Tests: Taking
> > your Tests to the Next Level".
>
> Have you looked at the recent discussions regarding design-by-contract on this list (https://groups.google.com/forum/m/#!topic/python-ideas/JtMgpSyODTU and the following forked threads)?
>
> You might want to have a look at static checking techniques such as abstract interpretation. I hope to be able to work on such a tool for Python in some two years from now. We can stay in touch if you are interested.
>
> Re decorators: to my own surprise, using decorators in a larger code base is completely practical including the readability and maintenance of the code. It's neither that ugly nor problematic as it might seem at first look.
>
> We use our https://github.com/Parquery/icontract at the company. Most of the design choices come from practical issues we faced -- so you might want to read the doc even if you don't plan to use the library.
>
> Some of the aspects we still haven't figured out are: how to approach multi-threading (locking around the whole function with an additional decorator?) and granularity of contract switches (right now we use always/optimized, production/non-optimized and testing/slow, but it seems that a larger system requires finer categories).
>
> Cheers Marko

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mertz at gnosis.cx  Wed Nov 28 20:46:35 2018
From: mertz at gnosis.cx (David Mertz)
Date: Wed, 28 Nov 2018 20:46:35 -0500
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: 
References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol>
Message-ID: 

I was assuming it was a Numba-ized function since it's purely numeric. ;-)

FWIW, the theoretical limit of Python ints is limited by the fact 'int.bit_length()' is a platform native int. So my system cannot store ints larger than (2**(2**63-1)). It'll take a lot more memory than my measly 4GiB to store that number though.

So yes, that's way longer than heat-death-of-universe even before 128-bit machines are widespread.

On Wed, Nov 28, 2018, 6:43 PM Antoine Pitrou wrote:

> But Python integers are variable-sized, and their size is basically limited by available memory or address space.
>
> Let's take a typical 64-bit Python build, assuming 4 GB RAM available. Let's also assume that 90% of those 4 GB can be readily allocated for Python objects (there's overhead, etc.).
>
> Also let's take a look at the Python integer representation:
>
> >>> sys.int_info
> sys.int_info(bits_per_digit=30, sizeof_digit=4)
>
> This means that every 4 bytes of integer object store 30 bits of actual integer data.
>
> So, how many bits has the largest allocatable integer on that system, assuming 90% of 4 GB are available for allocation?
>
> >>> nbits = (2**32)*0.9*30/4
> >>> nbits
> 28991029248.0
>
> Now how many possible integers are there in that number of bits?
> > >>> x = 1 << int(nbits) > >>> x.bit_length() > 28991029249 > > (yes, that number was successfully allocated in full. And the Python > process occupies 3.7 GB RAM at that point, which validates the estimate.) > > Let's try to have a readable approximation of that number. Convert it > to a float perhaps? > > >>> float(x) > Traceback (most recent call last): > File "", line 1, in > OverflowError: int too large to convert to float > > Well, of course. So let's just extract a power of 10: > > >>> math.log10(x) > 8727169408.819794 > >>> 10**0.819794 > 6.603801339268099 > > (yes, math.log10() works on non-float-convertible integers. I'm > impressed!) > > So the number of representable integers on that system is approximately > 6.6e8727169408. Let's hope the Sun takes its time. > > (and of course, what is true for ints is true for any variable-sized > input, such as strings, lists, dicts, sets, etc.) > > Regards > > Antoine. > > > Le 29/11/2018 ? 00:24, David Mertz a ?crit : > > That's easy, Antoine. On a reasonable modern multi-core workstation, I > > can do 4 billion additions per second. A year is just over 30 million > > seconds. For 32-bit ints, I can whiz through the task in only 130,000 > > years. We have at least several hundred million years before the sun > > engulfs us. > > > > On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou > wrote: > > > > On Wed, 28 Nov 2018 15:58:24 -0600 > > Abe Dillon > wrote: > > > Thirdly, Computers are very good at exhaustively searching > > multidimensional > > > spaces. > > > > How long do you think it will take your computer to exhaustively > search > > the space of possible input values to a 2-integer addition function? > > > > Do you think it can finish before the Earth gets engulfed by the Sun? > > > > Regards > > > > Antoine. > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > Code of Conduct: http://python.org/psf/codeofconduct/ > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abedillon at gmail.com Wed Nov 28 20:49:24 2018 From: abedillon at gmail.com (Abe Dillon) Date: Wed, 28 Nov 2018 19:49:24 -0600 Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs In-Reply-To: References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol> Message-ID: OK. I know I made a mistake by saying, "computers are very good at *exhaustively* searching multidimensional spaces." I should have said, "computers are very good at enumerating examples from multi-dimensional spaces" or something to that effect. Now that we've had our fun, can you guys please continue in a forked conversation so it doesn't derail the conversation? On Wed, Nov 28, 2018 at 7:47 PM David Mertz wrote: > I was assuming it was a Numba-ized function since it's purely numeric. ;-) > > FWIW, the theoretical limit of Python ints is limited by the fact > 'int.bit_length()' is a platform native int. So my system cannot store ints > larger than (2**(2**63-1)). It'll take a lot more memory than my measly > 4GiB to store that number though. > > So yes, that's way longer that heat-death-of-universe even before 128-bit > machines are widespread. 
> > On Wed, Nov 28, 2018, 6:43 PM Antoine Pitrou >> >> But Python integers are variable-sized, and their size is basically >> limited by available memory or address space. >> >> Let's take a typical 64-bit Python build, assuming 4 GB RAM available. >> Let's also assume that 90% of those 4 GB can be readily allocated for >> Python objects (there's overhead, etc.). >> >> Also let's take a look at the Python integer representation: >> >> >>> sys.int_info >> sys.int_info(bits_per_digit=30, sizeof_digit=4) >> >> This means that every 4 bytes of integer object store 30 bit of actual >> integer data. >> >> So, how many bits has the largest allocatable integer on that system, >> assuming 90% of 4 GB are available for allocation? >> >> >>> nbits = (2**32)*0.9*30/4 >> >>> nbits >> 28991029248.0 >> >> Now how many possible integers are there in that number of bits? >> >> >>> x = 1 << int(nbits) >> >>> x.bit_length() >> 28991029249 >> >> (yes, that number was successfully allocated in full. And the Python >> process occupies 3.7 GB RAM at that point, which validates the estimate.) >> >> Let's try to have a readable approximation of that number. Convert it >> to a float perhaps? >> >> >>> float(x) >> Traceback (most recent call last): >> File "", line 1, in >> OverflowError: int too large to convert to float >> >> Well, of course. So let's just extract a power of 10: >> >> >>> math.log10(x) >> 8727169408.819794 >> >>> 10**0.819794 >> 6.603801339268099 >> >> (yes, math.log10() works on non-float-convertible integers. I'm >> impressed!) >> >> So the number of representable integers on that system is approximately >> 6.6e8727169408. Let's hope the Sun takes its time. >> >> (and of course, what is true for ints is true for any variable-sized >> input, such as strings, lists, dicts, sets, etc.) >> >> Regards >> >> Antoine. >> >> >> Le 29/11/2018 ? 00:24, David Mertz a ?crit : >> > That's easy, Antoine. On a reasonable modern multi-core workstation, I >> > can do 4 billion additions per second. A year is just over 30 million >> > seconds. For 32-bit ints, I can whiz through the task in only 130,000 >> > years. We have at least several hundred million years before the sun >> > engulfs us. >> > >> > On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou > > wrote: >> > >> > On Wed, 28 Nov 2018 15:58:24 -0600 >> > Abe Dillon > >> wrote: >> > > Thirdly, Computers are very good at exhaustively searching >> > multidimensional >> > > spaces. >> > >> > How long do you think it will take your computer to exhaustively >> search >> > the space of possible input values to a 2-integer addition function? >> > >> > Do you think it can finish before the Earth gets engulfed by the >> Sun? >> > >> > Regards >> > >> > Antoine. >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> > Code of Conduct: http://python.org/psf/codeofconduct/ >> > >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... 
From abedillon at gmail.com  Wed Nov 28 21:47:07 2018
From: abedillon at gmail.com (Abe Dillon)
Date: Wed, 28 Nov 2018 20:47:07 -0600
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: References: Message-ID:

One thought I had pertains to a very narrow sub-set of cases, but may
provide a starting point. For the cases where a precondition, invariant,
or postcondition only involves a single parameter, attribute, or the
return value (respectively) and it's reasonably simple, one could write it
as an expression acting directly on the type annotation:

    def encabulate(
        reactive_inductance: 1 >= float > 0,  # description
        capacitive_diractance: int > 1,       # description
        delta_winding: bool                   # description
    ) -> len(Set[DingleArm]) > 0:  # ??? I don't know how you would handle more complex objects...
        do_stuff
        with_things
        ....

Anyway. Just more food for thought...

On Tue, Nov 27, 2018 at 10:47 PM Abe Dillon wrote:

> I've been pulling a lot of ideas from the recent discussion on design by
> contract (DBC), the elegance and drawbacks of doctests, and the amazing
> talk given by Hillel Wayne at this year's PyCon entitled "Beyond Unit
> Tests: Taking your Tests to the Next Level".
>
> To recap a lot of previous discussions:
>
> - Documentation should tell you:
>   A) What a variable represents
>   B) What kind of thing a variable is
>   C) The acceptable values a variable can take
>
> - Typing and Tests can partially take the place of documentation by
> filling in B and C (respectively) and sometimes A can be inferred from
> decent naming and context.
>
> - Contracts can take the place of many tests (especially when combined
> with a library like hypothesis)
>
> - Contracts/assertions can provide "stable" documentation in the sense
> that it can't get out of sync with the code.
>
> - Attempts to implement contracts using standard Python syntax are verbose
> and noisy because they rely heavily on decorators that add a lot of
> repetitive preamble to the methods being decorated. They may also require a
> metaclass which restricts their use to code that doesn't already use a
> metaclass.
>
> - There was some discussion about the importance of "what a variable
> represents" which pointed to this article by Philip J. Guo (author of
> the magnificent pythontutor.com). I believe Guo's usage of "in-the-small"
> and "in-the-large" are confusing because a well decoupled program shouldn't
> yield functions that know or care how they're being used in the grand
> machinations of your project. The examples he gives are of functions that
> could use a doc string and some type annotations, but don't actually say
> how they relate to the rest of the project.
>
> One thing that caught me about Hillel Wayne's talk was that some of his
> examples were close to needing practically no code. He starts with:
>
>     def tail(lst: List[Any]) -> List[Any]:
>         assert len(lst) > 0, "precondition"
>         result = lst[1:]
>         assert [lst[0]] + result == lst, "postcondition"
>         return result
>
> He then re-writes the function using a contracts library:
>
>     @require("lst must not be empty", lambda args: len(args.lst) > 0)
>     @ensure("result is tail of lst",
>             lambda args, result: [args.lst[0]] + result == args.lst)
>     def tail(lst: List[Any]) -> List[Any]:
>         return lst[1:]
>
> He then writes a unit test for the function:
>
>     @given(lists(integers(), 1))
>     def test_tail(lst):
>         tail(lst)
>
> What strikes me as interesting is that the test pretty-much doesn't need
> to be written.
The 'given' statement should be redundant based on the type > annotation and the precondition. Anyone who knows hypothesis, just imagine > the @require is a hypothesis 'assume' call. Furthermore, hypothesis should > be able to build strategies for more complex objects based on class > invariants and attribute types: > > @invariant("no overdrafts", lambda self: self.balance >= 0) > class Account: > def __init__(self, number: int, balance: float = 0): > super().__init__() > self.number: int = number > self.balance: float = balance > > A library like hypothesis should be able to generate valid account > objects. Hypothesis also has stateful testing > but I think > the implementation could use some work. As it is, you have inherit from a > class that uses a metaclass AND you have to pollute your class's name-space > with helper objects and methods. > > If we could figure out a cleaner syntax for defining invariants, > preconditions, and postconditions we'd be half-way to automated testing > UTOPIA! (ok, maybe I'm being a little over-zealous) > > I think there are two missing pieces to this testing problem: side-effect > verification and failure verification. > > Failure verification should test that the expected exceptions get thrown > when known bad data is passed in or when an object is put in a known > illegal state. This should be doable by allowing Hypothesis to probe the > bounds of unacceptable input data or states, though it might seem a bit > silly because if you've already added a precondition, "x >= 0" to a > function, then it obviously should raise a PreconditionViolated when passed > any x < 0. It may be important, however; if for performance reasons, you > need to disable invariant checking but you still want certain bad input to > raise exceptions, or your system has two components that interact with > slightly mis-matched invariants and you want to make sure the components > handle the edge-condition correctly. You can think of Types from a > set-theory perspective where the Integer type is conceptually the set of > all integers, and invariants would specify a smaller subset than Typing > alone, however if the set of all valid outputs of one component is not > completely contained within the set of all valid inputs to another > component, then there will be edge-cases resulting from the mismatch. In > that sense, some of the invariant verification could be static-ish (as much > as Python allows). > > Side-effect verification is usually done by mocking dependencies. You pass > in a mock database connection and make sure my object sends and receives > data as expected. As crazy as it sounds, this too can be almost completely > automated away if all of the above tools are in place AND if Python gained > support for Exception annotations. I wrote a Java (yuck) library at work > that does this. I wan't to port it to Python and share it, but it basically > enumerates a bunch of stuff: the "sources" and "destinations" of the > system, how those relate to dependencies, how they relate to each other (if > dependency X is unresponsive, I can't get sources A, B, or G and if I can't > get source B, I can't write destination Y), the dependency failure modes > (Exceptions raised, timeouts, unrecognized key, missing data, etc.), all > the public methods of the class under test and what sources and > destinations they use. 
> > Then I enumerate 'k' from 0 to some limit for the max number of > simultaneous faults to test for: > Then for each method that can have n >= k simultaneous faults I test > all (n choose k) combinations of faults for that method against the desired > behavior. > > I'm sure that explanation is as clear as mud. I will try to get a working > Python example at some point to demonstrate. > > Finally, in the PyCon video; Hillel Wayne shows an example of testing that > an "add" function is commutative. It seems that once you write that > invariant, it might apply to many different functions. A similar invariant > may be "reversibility" like: > > @given(text()) > def test_reversable_codex(s): > assert s == decode(encode(s)), "not reversible" > > That might be a common property that other functions share: > > @invariant(reversible(decode)) > def encode(s: str) -> bytes: ... > > Having said all that, I wanted to brainstorm some possible solutions for > implementing some or all of the above in Python without drowning you code > in decorators. > > NOTE: Please don't get hung up on specific syntax suggestions! Try to see > the forest through the trees! > > An example syntax could be: > > #Instead of this > @require("lst must not be empty", lambda args: len(args.lst) > 0) > @ensure("result is tail of lst", lambda args, result: [args.lst[0]] + > result == args.lst) > def tail(lst: List[Any]) -> List[Any]: > return lst[1:] > > #Maybe this? > non_empty = invariant("Must not be empty", lambda x: len(x) > 0) # can be > re-used > > def tail(lst: List[Any] d"Description of what this param represents. > {non_empty}") -> List[Any] d"Description of return value {lst == [lst[0]] > + __result__}": > """ > Description of function > """ > return lst[1:] > > Python could build the full doc string like so: > > """ > Description of function > > Args: > lst: Description of what this param represents. Must not be empty. > > Returns: > Description of return value. > """ > > d-strings have some description followed by some terminator after which > either invariant objects or [optionally strings] followed by an expression > on the arguments and __return__? > > I'm sorry this is so half-baked. I don't really like the d-string concept > and I'm pretty sure there are a million problems with it. I'll try to flesh > out the side-effect verification concept more later along with all the > other poorly explained stuff. I just wanted to get these thoughts out for > discussion, but now it's super late and I have to go! > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From boxed at killingar.net Thu Nov 29 00:06:51 2018 From: boxed at killingar.net (=?utf-8?Q?Anders_Hovm=C3=B6ller?=) Date: Thu, 29 Nov 2018 06:06:51 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181128220323.GX4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128220323.GX4319@ando.pearwood.info> Message-ID: >> +1. Throwing away information is almost always a bad idea. > > "Almost always"? Let's take this seriously, and think about the > consequences if we actually believed that. If I created a series of > integers: ?Almost". It?s part of my sentence. 
I have known about addition for many years in fact :) / Anders From marko.ristin at gmail.com Thu Nov 29 01:25:31 2018 From: marko.ristin at gmail.com (Marko Ristin-Kaufmann) Date: Thu, 29 Nov 2018 07:25:31 +0100 Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs In-Reply-To: References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol> Message-ID: Hi, Property based testing is not about just generating random values till the > heath death of the universe, but generating sensible values in a > configurable way to cover all equivalence classes we can think of. if my > function takes two floating point numbers as arguments, hypothesis > "strategies" won't try all possible combinations of all possible floating > point values, but instead all possible combination of interesting values > (NaN, Infinity, too big, too small, positive, negative, zero, None, decimal > fractions, etc..), something that an experienced programmer probably would > end up doing by himself with a lot of test cases, but that can be better > done with less effort by the automation provided by the hypothesis package. > Exactly. A tool can go a step further and, based on the assertions and contracts, generate the tests automatically or prove that certain properties of the program always hold. I would encourage people interested in automatic testing to have a look at the scientific literature on the topic (formal static analysis). Abstract interpretation has been already mentioned: https://en.wikipedia.org/wiki/Abstract_interpretation. For some bleeding edge, have a look what they do at this lab with the machine learning: https://eth-sri.github.io/publications/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marko.ristin at gmail.com Thu Nov 29 02:05:00 2018 From: marko.ristin at gmail.com (Marko Ristin-Kaufmann) Date: Thu, 29 Nov 2018 08:05:00 +0100 Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs In-Reply-To: References: Message-ID: Hi Abe, Thanks for your suggestions! We actually already considered the two alternatives you propose. *Multiple predicates per decorator. *The problem is that you can not deal with toggling/describing individual contracts easily. While you can hack your way through it (considering the arguments in the sequence, for example), we found it clearer to have separate decorators. Moreover, tracebacks are much easier to read, which is important when you debug a program. *AST magic. *The problem with any approach based on parsing (be it parsing the code or the description) is that parsing is slow so you end up spending a lot of cycles on contracts which might not be enabled (many contracts are applied only in the testing environment, not int he production). Hence you must have an approach that offers practically zero overhead cost to importing a module when its contracts are turned off. Decoding byte-code does not work as current decoding libraries can not keep up with the changes in the language and the compiler hence they are always lagging behind. *Practicality of decorators. *We have retrospective meetings at the company and I frequently survey the opinions related to the contracts (explicitly asking about the readability and maintainability) -- so far nobody had any difficulties and nobody was bothered by the noisy syntax. The decorator syntax is simply not beautiful, no discussion about that. 
But when it comes to maintenance, there's a linter included ( https://github.com/Parquery/pyicontract-lint), and if you want contracts rendered in an appealing way, there's a documentation tool for sphinx ( https://github.com/Parquery/sphinx-icontract). The linter facilitates the maintainability a lot and sphinx tool gives you nice documentation for a library so that you don't even have to look into the source code that often if you don't want to. We need to be careful not to mistake issues of aesthetics for practical issues. Something might not be beautiful, but can be useful unless it's unreadable. *Conclusion. *What we do need at this moment, IMO, is a broad practical experience of using contracts in Python. Once you make a change to the language, it's impossible to undo. In contrast to what has been suggested in the previous discussions (including my own voiced opinions), I actually now don't think that introducing a language change would be beneficial *at this precise moment*. We don't know what the use cases are, and there is no practical experience to base the language change on. I'd prefer to hear from people who actually use contracts in their professional Python programming -- apart from the noisy syntax, how was the experience? Did it help you catch bugs (and how many)? Were there big problems with maintainability? Could you easily refactor? What were the limits of the contracts you encountered? What kind of snapshot mechanism do we need? How did you deal with multi-threading? And so on. icontract library is already practically usable and, if you don't use inheritance, dpcontracts is usable as well. I would encourage everybody to try out programming with contracts using an existing library and just hold their nose when writing the noisy syntax. Once we unearthed deeper problems related to contracts, I think it will be much easier and much more convincing to write a proposal for introducing contracts in the core language. If I had to write a proposal right now, it would be only based on the experience of writing a humble 100K code base by a team of 5-10 people. Not very convincing. Cheers, Marko On Thu, 29 Nov 2018 at 02:26, Abe Dillon wrote: > Marko, I have a few thoughts that might improve icontract. 
> First, multiple clauses per decorator: > > @pre( > *lambda* x: x >= 0, > *lambda* y: y >= 0, > *lambda* width: width >= 0, > *lambda* height: height >= 0, > *lambda* x, width, img: x + width <= width_of(img), > *lambda* y, height, img: y + height <= height_of(img)) > @post( > *lambda* self: (self.x, self.y) in self, > *lambda* self: (self.x+self.width-1, self.y+self.height-1) in self, > *lambda* self: (self.x+self.width, self.y+self.height) not in self) > *def* __init__(self, img: np.ndarray, x: int, y: int, width: int, height: > int) -> None: > self.img = img[y : y+height, x : x+width].copy() > self.x = x > self.y = y > self.width = width > self.height = height > > *def* __contains__(self, pt: Tuple[int, int]) -> bool: > x, y = pt > return (self.x <= x < self.x + self.width) and (self.y <= y < self.y + > self.height) > > > You might be able to get away with some magic by decorating a method just > to flag it as using contracts: > > > @contract # <- does byte-code and/or AST voodoo > *def* __init__(self, img: np.ndarray, x: int, y: int, width: int, height: > int) -> None: > pre(x >= 0, > y >= 0, > width >= 0, > height >= 0, > x + width <= width_of(img), > y + height <= height_of(img)) > > # this would probably be declared at the class level > inv(*lambda* self: (self.x, self.y) in self, > *lambda* self: (self.x+self.width-1, self.y+self.height-1) in > self, > *lambda* self: (self.x+self.width, self.y+self.height) not in > self) > > self.img = img[y : y+height, x : x+width].copy() > self.x = x > self.y = y > self.width = width > self.height = height > > That might be super tricky to implement, but it saves you some lambda > noise. Also, I saw a forked thread in which you were considering some sort > of transpiler with similar syntax to the above example. That also works. > Another thing to consider is that the role of descriptors > overlaps > some with the role of invariants. I don't know what to do with that > knowledge, but it seems like it might be useful. > > Anyway, I hope those half-baked thoughts have *some* value... > > On Wed, Nov 28, 2018 at 1:12 AM Marko Ristin-Kaufmann < > marko.ristin at gmail.com> wrote: > >> Hi Abe, >> >> I've been pulling a lot of ideas from the recent discussion on design by >>> contract (DBC), the elegance and drawbacks >>> of doctests >>> , and the amazing talk >>> given by Hillel Wayne at >>> this year's PyCon entitled "Beyond Unit Tests: Taking your Tests to the >>> Next Level". >>> >> >> Have you looked at the recent discussions regarding design-by-contract on >> this list ( >> https://groups.google.com/forum/m/#!topic/python-ideas/JtMgpSyODTU >> and the following forked threads)? >> >> You might want to have a look at static checking techniques such as >> abstract interpretation. I hope to be able to work on such a tool for >> Python in some two years from now. We can stay in touch if you are >> interested. >> >> Re decorators: to my own surprise, using decorators in a larger code base >> is completely practical including the readability and maintenance of the >> code. It's neither that ugly nor problematic as it might seem at first look. >> >> We use our https://github.com/Parquery/icontract at the company. Most of >> the design choices come from practical issues we faced -- so you might want >> to read the doc even if you don't plant to use the library. >> >> Some of the aspects we still haven't figured out are: how to approach >> multi-threading (locking around the whole function with an additional >> decorator?) 
and granularity of contract switches (right now we use
>> always/optimized, production/non-optimized and testing/slow, but it seems
>> that a larger system requires finer categories).
>>
>> Cheers Marko

From ricocotam at gmail.com  Thu Nov 29 02:28:28 2018
From: ricocotam at gmail.com (Adrien Ricocotam)
Date: Thu, 29 Nov 2018 08:28:28 +0100
Subject: [Python-ideas] __len__() for map()
In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128220323.GX4319@ando.pearwood.info>
Message-ID: <76635835-61DC-41F3-9F76-036698D38477@gmail.com>

Hi everyone, first participation in Python's mailing list, don't be too
hard on me.

Some suggested above changing the definition of len in the long term. I
think it could be interesting to define len as follows:

- If the object has a finite length: return that length (the way it works
  now)
- If the object has an infinite length: return infinity
- If the object has no length: return None

There's an issue with this solution: having None returned adds complexity
to the usage of len, so I suggest having a wrapper over __len__ methods
that throws the current error.

But still, there's a problem with infinite-length objects. If people code:

    for i in range(len(infinite_list)):
        # Something

it's not clear whether they actually want to do this. It's open to
discussion and is just a suggestion.

If we now consider map, then the length of a map (or filter, or any other
generator based on an iterator) is the same as that of the underlying
iterator, which could be either infinite or undefined.

Cheers

> On 29 Nov 2018, at 06:06, Anders Hovmöller wrote:
>
>>> +1. Throwing away information is almost always a bad idea.
>>
>> "Almost always"? Let's take this seriously, and think about the
>> consequences if we actually believed that. If I created a series of
>> integers:
>
> "Almost". It's part of my sentence. I have known about addition for many
> years in fact :)
>
> / Anders
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From julien at tayon.net  Thu Nov 29 03:55:09 2018
From: julien at tayon.net (julien tayon)
Date: Thu, 29 Nov 2018 09:55:09 +0100
Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs
In-Reply-To: References: Message-ID:

I wrote a lib specifically for the validator case that also overrides the
documentation: by default, if the name of the function plus its args
speaks for itself, then only that is added to the docstring. For example,
@require_odd_numbers() would add require_odd_numbers at the end of
__doc__. There is also the possibility of adding docstring templates.

https://github.com/jul/check_arg

From tjreedy at udel.edu  Thu Nov 29 04:25:38 2018
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 29 Nov 2018 04:25:38 -0500
Subject: [Python-ideas] __len__() for map()
In-Reply-To: <20181128222714.GY4319@ando.pearwood.info>
References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info>
Message-ID:

On 11/28/2018 5:27 PM, Steven D'Aprano wrote:
> On Wed, Nov 28, 2018 at 02:53:50PM -0500, Terry Reedy wrote:
>
>> One of the guidelines in the Zen of Python is
>> "Special cases aren't special enough to break the rules."
>>
>> This proposal claims that the Python 3 built-in iterator class 'map' is
>> so special that it should break the rule that iterators in general
>> cannot and therefore do not have .__len__ methods because their size may
>> be infinite, unknowable until exhaustion, or declining with each
>> .__next__ call.
>>
>> For iterators, 3.4 added an optional __length_hint__ method. This makes
>> sense for iterators, like tuple_iterator, list_iterator, range_iterator,
>> and dict_keyiterator, based on a known finite collection. At the time,
>> map.__length_hint__ was proposed and rejected as problematic, for
>> obvious reasons, and insufficiently useful.
>
> Thanks for the background Terry, but doesn't that suggest that sometimes
> special cases ARE special enough to break the rules? *wink*

Yes, but these cases are not special enough to break the rules for len
and __len__, especially when an alternative already exists.

> Unfortunately, I don't think it is obvious why map.__length_hint__ is
> problematic.

It is less obvious (there are more details to fill in) than the (exact)
length_hints for the list, tuple, range, and dict iterators. These are
*always* based on a sized collection. Map is *sometimes* based on sized
collection(s). It is the other cases that are problematic, as illustrated
by your next sentence.

> It only needs to return the *maximum* length, or
> sentinel (zero?) to say "I don't know". It doesn't
> need to be accurate, unlike __len__ itself.

> Perhaps we should rethink the decision not to give map() and filter() a
> length hint?

I should have said this more explicitly. This is why I suggested that
someone define and test one or more specific map.__length_hint__
implementations. Someone doing so should look into the C code for list to
see how list handles iterators with a length hint. I suspect that low
estimates are better than high estimates. Does list recognize any value
as "I don't know"?

>> What makes the map class special among all built-in iterator classes?
>> It appears not to be a property of the class itself, as an iterator
>> class, but of its name. In Python 2, 'map' was bound to a different
>> implementation of the map idea, a function that produced a list, which
>> has a length. I suspect that if Python 3 were the original Python, we
>> would not have this discussion.
>
> No, in fairness, I too have often wanted to know the length of an
> arbitrary iterator, including map(), without consuming it. In general
> this is an unsolvable problem, but sometimes it is (or at least, at first
> glance *seems*) solvable. map() is one of those cases.
>
> If we could solve it, that would be great -- but I'm not convinced that
> it is solvable, since the solution seems worse than the problem it aims
> to solve. But I live in hope that somebody cleverer than me can point
> out the flaws in my argument.

The current situation with length_hint reminds me a bit of the situation
with annotations before the addition of typing. Perhaps it is time to
think about conventions for the non-obvious 'other cases'.

>> Perhaps 2.7, in addition to future imports of text as unicode and print
>> as a function, should have had one to make map and filter be the 3.x
>> iterators.
>
> I think that's future_builtins:
>
> [steve at ando ~]$ python2.7 -c "from future_builtins import *; print map(len, [])"
>

Thanks for the info.

> But that wouldn't have helped E. Madison Bray or SageMath, since their
> difficulty is not their own internal use of map(), but their users' use
> of map().
In particular, by people who are not vividly aware that we broke the back-compatibility rule by rebinding 'map' and 'filter' in 3.0. Breaking back-compatibility *again* by redefining len (to mean something like operator.length) is not the right solution to problems caused by the 3.0 break. > Unless they simply ban any use of iterators at all, which I imagine will > be a backwards-incompatible change (and for that matter an excessive > overreaction for many uses), SageMath can't prevent users from providing > map() objects or other iterator arguments. I think their special case problem requires some special case solutions. At this point, I am refraining from making suggestions. -- Terry Jan Reedy From erik.m.bray at gmail.com Thu Nov 29 05:32:20 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 11:32:20 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <5BFF1DCF.5040003@canterbury.ac.nz> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128150348.GT4319@ando.pearwood.info> <5BFF1DCF.5040003@canterbury.ac.nz> Message-ID: On Wed, Nov 28, 2018 at 11:59 PM Greg Ewing wrote: > > E. Madison Bray wrote: > > So I might want to check: > > > > finite_definite = True > > for it in my_map.iters: > > try: > > len(it) > > except TypeError: > > finite_definite = False > > > > if finite_definite: > > my_seq = list(my_map) > > else: > > # some other algorithm > > If map is being passed into your function, you can still do this > check before calling map. > > If the user is doing the mapping themselves, then in Python 2 it > would have blown up anyway before your function even got called, > so nothing is any worse. You either missed, or completely ignored, my previous message where I addressed this: "For example, previously a user might pass map(func, some_list) where func is some pure function and the iterable is almost always a list of some kind. Previously that map() call would be evaluated (often slowly) first. But now we can treat a map as something a little more formal, as a container for a function and one or more iterables, which happens to have this special functionality when you iterate over it, but is otherwise just a special container. This is technically already the case, we just can't directly access it as a container. If we could, it would be possible to implement various optimizations that a user might not have otherwise been obvious to the user. This is especially the case of the iterable is a simple list, which is something we can check. The function in this case very likely might actually be a C function that was wrapped with Cython. I can easily convert this on the user's behalf to a simple C loop or possibly even some other more optimal vectorized code. These are application-specific special cases of course, but many such cases become easily accessible if map() and friends are usable as specialized containers." From erik.m.bray at gmail.com Thu Nov 29 05:37:19 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 11:37:19 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181128220323.GX4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128220323.GX4319@ando.pearwood.info> Message-ID: On Wed, Nov 28, 2018 at 11:04 PM Steven D'Aprano wrote: > > On Wed, Nov 28, 2018 at 05:37:39PM +0100, Anders Hovm?ller wrote: > > > > > > > I just mentioned that porting effort for background. 
I still believe > > > that the actual proposal of making the arguments to a map(...) call > > > accessible from Python as attributes of the map object (ditto filter, > > > zip, etc.) is useful in its own right, rather than just having this > > > completely opaque iterator. > > > > +1. Throwing away information is almost always a bad idea. > > "Almost always"? Let's take this seriously, and think about the > consequences if we actually believed that. If I created a series of > integers: > > a = 23 > b = 0x17 > c = 0o27 > d = 0b10111 > e = int('1b', 12) > > your assertion would say it is a bad idea to throw away the information > about how they were created, and hence we ought to treat all five values > as distinct and distinguishable. So much for the small integer cache... Not to go too off-topic but I don't think this is a great example either. Although as a practical consideration I agree Python shouldn't preserve the base representation from which an integer were created I often *wish* it would. It's useful information to have. There's nothing I hate more than doing hex arithmetic in Python and having it print out decimal results, then having to wrap everything in hex(...) before displaying. Base representation is still meaningful, often useful information. From solipsis at pitrou.net Thu Nov 29 05:58:04 2018 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Nov 2018 11:58:04 +0100 Subject: [Python-ideas] [Brainstorm] Testing with Documented ABCs References: <20181128151810.7c2393c2@fsol> <20181128230826.39ce721c@fsol> Message-ID: <20181129115804.426cca20@fsol> On Wed, 28 Nov 2018 23:22:20 -0200 Marcos Eliziario wrote: > But nobody is talking about exhausting the combinatoric space of all > possible values. Property Based Testing looks like Fuzzy Testing but it is > not quite the same thing. Well, the OP did talk about "exhaustively searching the multidimensional space". But I agree mere sampling is useful. I might give hypothesis a try someday. Usually I prefer hand-rolling my own stress testing routines. Regards Antoine. From erik.m.bray at gmail.com Thu Nov 29 06:13:56 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 12:13:56 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Wed, Nov 28, 2018 at 8:54 PM Terry Reedy wrote: > > On 11/28/2018 9:27 AM, E. Madison Bray wrote: > > On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert wrote: > >> > >> I just ran into the following behavior, and found it surprising: > >> > >>>>> len(map(float, [1,2,3])) > >> TypeError: object of type 'map' has no len() > >> > >> I understand that map() could be given an infinite sequence and therefore might not always have a length. But in this case, it seems like map() should've known that its length was 3. I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason. > >> > >> My proposal is to delegate map.__len__() to the underlying iterable. > > One of the guidelines in the Zen of Python is > "Special cases aren't special enough to break the rules." This seems to be replying to the OP, whom I was quoting. On one hand I would argue that this is cherry-picking the "Zen" since not all rules are special in the first place. 
But in this case I agree that map should not have a length or possibly
even a length hint (although the latter is more justifiable).

> > As a simple counter-proposal which I believe has fewer issues, I would
> > really like it if the built-in `map()` and `filter()` at least
> > provided a Python-level attribute to access the underlying iterables.
>
> This proposes to make map (and filter) special in a different way, by
> adding other special (dunder) attributes. In general, built-in
> callables do not attach their args to their output, for obvious reasons.
> If they do, they do not expose them. If input data must be saved, the
> details are implementation dependent. A C-coded callable would not
> necessarily save information in the form of Python objects.

Who said anything about "special", or adding "special (dunder)
attributes"? Nor did I make any general statement about all built-ins.
For arbitrary functions it doesn't necessarily make sense to hold on to
their arguments, but in the case of something like map() its arguments
are the only thing that give it meaning at all. The fact remains that for
something like a map in particular it can be treated in a formal sense as
a collection of a function and some sequence of arguments (possibly
unbounded) on which that function is to be evaluated (perhaps not
immediately). As an analogy, a series is an object in its own right
without having to evaluate the entire series: lots of information can be
gleaned from the properties of a series without having to evaluate it.
Just because you don't see the use doesn't mean others can't find one.

The CPython map() implementation already carries this data on it as
"func" and "iters" members in its struct. It's trivial to expose those to
Python as ".func" and ".iters" attributes. Nothing "special" about it.
However, that brings me to...

> https://docs.python.org/3/library/functions.html#map says
> "map(function, iterable, ...)
> Return an iterator [...]"
>
> The wording is intentional. The fact that map is a class and the
> iterator an instance of the class is a CPython implementation detail.
> Another implementation could use the generator function equivalent given
> in the Python 2 itertools doc, or a translation thereof. I don't know
> what pypy and other implementations do. The fact that CPython itertools
> callables are (now) C-coded classes instead of Python-coded generator
> functions, or C translations thereof (which is tricky) is for
> performance and ease of maintenance.

Exactly how intentional is that wording though? If it returns an iterator
it has to return *some object* that implements iteration in the manner
prescribed by map. Generator functions could theoretically allow
attributes attached to them. Roughly speaking:

    def map(func, *iters):
        def map_inner():
            for args in zip(*iters):
                yield func(*args)

        gen = map_inner()
        gen.func = func
        gen.iters = iters
        return gen

As it happens this won't work in CPython since it does not allow
attribute assignment on generator objects. Perhaps there's some good
reason for that, but AFAICT--though I may be missing a PEP or
something--this fact is not prescribed anywhere and is also particular to
CPython.

Point being, I don't think it's a massive leap or imposition on any
implementation to go from "Return an iterator [...]" to "Return an
iterator that has these attributes [...]"

P.S.
> > This is necessary because if I have a function that used to take, say, > > a list as an argument, and it receives a `map` object, I now have to > > be able to deal with map()s, > > If a function is documented as requiring a list, or a sequence, or a > length object, it is a user bug to pass an iterator. The only thing > special about map and filter as errors is the rebinding of the names > between Py2 and Py3, so that the same code may be good in 2.x and bad in > 3.x. It's not a user bug if you're porting a massive computer algebra application that happens to use Python as its implementation language (rather than inventing one from scratch) and your users don't need or want to know too much about Python 2 vs Python 3. Besides, the fact that they are passing an iterator now is probably in many cases a good thing for them, but it takes away my ability as a developer to find out more about what they're trying to do, as opposed to say just being given a list of finite size. That said, I regret bringing up Sage; I was using it as an example but I think the point stands on its own. From erik.m.bray at gmail.com Thu Nov 29 06:16:37 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 12:16:37 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181128222714.GY4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> Message-ID: On Wed, Nov 28, 2018 at 11:27 PM Steven D'Aprano wrote: > > On Wed, Nov 28, 2018 at 02:53:50PM -0500, Terry Reedy wrote: > > What makes the map class special among all built-in iterator classes? > > It appears not to be a property of the class itself, as an iterator > > class, but of its name. In Python 2, 'map' was bound to a different > > implementation of the map idea, a function that produced a list, which > > has a length. I suspect that if Python 3 were the original Python, we > > would not have this discussion. > > No, in fairness, I too have often wanted to know the length of an > arbitrary iterator, including map(), without consuming it. In general > this is an unsolvable problem, but sometimes it is (or at least, at first > glance *seems*) solvable. map() is one of those cases. > > If we could solve it, that would be great -- but I'm not convinced that > it is solvable, since the solution seems worse than the problem it aims > to solve. But I live in hope that somebody cleverer than me can point > out the flaws in my argument. In general it's unsolvable, so no attempt should be made to provide a pre-baked attempt at a solution that won't always work. But in many, if not the majority of cases, it *is* solvable. So let's give intelligent people the tools they need to solve it in those cases that they know they can solve it :) > But that wouldn't have helped E. Madison Bray or SageMath, since their > difficulty is not their own internal use of map(), but their users' use > of map(). > > Unless they simply ban any use of iterators at all, which I imagine will > be a backwards-incompatible change (and for that matter an excessive > overreaction for many uses), SageMath can't prevent users from providing > map() objects or other iterator arguments. That is the majority of the case I was concerned about, yes. 
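To make the "solvable cases" concrete, here is a minimal best-effort
sketch. It assumes the proposed (and currently hypothetical) `.iters`
attribute on map objects discussed above; nothing like it exists in
CPython today:

    from collections.abc import Sized

    def map_len(m):
        # Best-effort length of a map object: map stops at its shortest
        # input, so when every input is Sized the answer is the minimum
        # of their lengths. Returns None when it is genuinely unknowable.
        try:
            iters = m.iters  # hypothetical attribute from the proposal
        except AttributeError:
            return None
        if all(isinstance(it, Sized) for it in iters):
            return min(len(it) for it in iters)
        return None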
From rosuav at gmail.com Thu Nov 29 06:16:37 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 29 Nov 2018 22:16:37 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Thu, Nov 29, 2018 at 10:14 PM E. Madison Bray wrote: > P.S. > > > > This is necessary because if I have a function that used to take, say, > > > a list as an argument, and it receives a `map` object, I now have to > > > be able to deal with map()s, > > > > If a function is documented as requiring a list, or a sequence, or a > > length object, it is a user bug to pass an iterator. The only thing > > special about map and filter as errors is the rebinding of the names > > between Py2 and Py3, so that the same code may be good in 2.x and bad in > > 3.x. > > It's not a user bug if you're porting a massive computer algebra > application that happens to use Python as its implementation language > (rather than inventing one from scratch) and your users don't need or > want to know too much about Python 2 vs Python 3. Besides, the fact > that they are passing an iterator now is probably in many cases a good > thing for them, but it takes away my ability as a developer to find > out more about what they're trying to do, as opposed to say just being > given a list of finite size. If that's the case, then it should be no problem to rebind builtins.map to return a list. Problem solved. ChrisA From erik.m.bray at gmail.com Thu Nov 29 06:18:33 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 12:18:33 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Thu, Nov 29, 2018 at 12:16 PM Chris Angelico wrote: > > On Thu, Nov 29, 2018 at 10:14 PM E. Madison Bray wrote: > > P.S. > > > > > > This is necessary because if I have a function that used to take, say, > > > > a list as an argument, and it receives a `map` object, I now have to > > > > be able to deal with map()s, > > > > > > If a function is documented as requiring a list, or a sequence, or a > > > length object, it is a user bug to pass an iterator. The only thing > > > special about map and filter as errors is the rebinding of the names > > > between Py2 and Py3, so that the same code may be good in 2.x and bad in > > > 3.x. > > > > It's not a user bug if you're porting a massive computer algebra > > application that happens to use Python as its implementation language > > (rather than inventing one from scratch) and your users don't need or > > want to know too much about Python 2 vs Python 3. Besides, the fact > > that they are passing an iterator now is probably in many cases a good > > thing for them, but it takes away my ability as a developer to find > > out more about what they're trying to do, as opposed to say just being > > given a list of finite size. > > If that's the case, then it should be no problem to rebind > builtins.map to return a list. Problem solved. Rebind where? How? In sage.__init__? How do you think that will fly with other packages loaded in the same interpreter? From rosuav at gmail.com Thu Nov 29 06:21:15 2018 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 29 Nov 2018 22:21:15 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Thu, Nov 29, 2018 at 10:18 PM E. 
Madison Bray wrote: > > On Thu, Nov 29, 2018 at 12:16 PM Chris Angelico wrote: > > > > On Thu, Nov 29, 2018 at 10:14 PM E. Madison Bray wrote: > > > P.S. > > > > > > > > This is necessary because if I have a function that used to take, say, > > > > > a list as an argument, and it receives a `map` object, I now have to > > > > > be able to deal with map()s, > > > > > > > > If a function is documented as requiring a list, or a sequence, or a > > > > length object, it is a user bug to pass an iterator. The only thing > > > > special about map and filter as errors is the rebinding of the names > > > > between Py2 and Py3, so that the same code may be good in 2.x and bad in > > > > 3.x. > > > > > > It's not a user bug if you're porting a massive computer algebra > > > application that happens to use Python as its implementation language > > > (rather than inventing one from scratch) and your users don't need or > > > want to know too much about Python 2 vs Python 3. Besides, the fact > > > that they are passing an iterator now is probably in many cases a good > > > thing for them, but it takes away my ability as a developer to find > > > out more about what they're trying to do, as opposed to say just being > > > given a list of finite size. > > > > If that's the case, then it should be no problem to rebind > > builtins.map to return a list. Problem solved. > > Rebind where? How? In sage.__init__? How do you think that will fly > with other packages loaded in the same interpreter? Either this is Python, or it's just an algebra language that happens to be implemented in Python. If the former, the Py2/Py3 distinction should matter to your users, since they are programming in Python. If the latter, it's all about Sage, ergo you can rebind map to mean what you expect it to mean. Take your pick. ChrisA From steve at pearwood.info Thu Nov 29 07:38:24 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 29 Nov 2018 23:38:24 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> Message-ID: <20181129123823.GB4319@ando.pearwood.info> On Thu, Nov 29, 2018 at 12:16:37PM +0100, E. Madison Bray wrote: > On Wed, Nov 28, 2018 at 11:27 PM Steven D'Aprano wrote: ["it" below being the length of an arbitrary iterator] > > If we could solve it, that would be great -- but I'm not convinced that > > it is solvable, since the solution seems worse than the problem it aims > > to solve. But I live in hope that somebody cleverer than me can point > > out the flaws in my argument. > > In general it's unsolvable, so no attempt should be made to provide a > pre-baked attempt at a solution that won't always work. But in many, > if not the majority of cases, it *is* solvable. So let's give > intelligent people the tools they need to solve it in those cases that > they know they can solve it :) So you say, but the solutions made so far seem fatally flawed to me. Just repeating the assertion that it is solvable isn't very convincing. -- Steve From erik.m.bray at gmail.com Thu Nov 29 08:12:19 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 14:12:19 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Thu, Nov 29, 2018 at 12:21 PM Chris Angelico wrote: > > On Thu, Nov 29, 2018 at 10:18 PM E. 
Madison Bray wrote:
> > > > On Thu, Nov 29, 2018 at 12:16 PM Chris Angelico wrote:
> > > > >
> > > > > On Thu, Nov 29, 2018 at 10:14 PM E. Madison Bray wrote:
> > > > > > P.S.
> > > > > >
> > > > > > This is necessary because if I have a function that used to take, say,
> > > > > > a list as an argument, and it receives a `map` object, I now have to
> > > > > > be able to deal with map()s,
> > > > >
> > > > > If a function is documented as requiring a list, or a sequence, or a
> > > > > length object, it is a user bug to pass an iterator. The only thing
> > > > > special about map and filter as errors is the rebinding of the names
> > > > > between Py2 and Py3, so that the same code may be good in 2.x and bad in
> > > > > 3.x.
> > > >
> > > > It's not a user bug if you're porting a massive computer algebra
> > > > application that happens to use Python as its implementation language
> > > > (rather than inventing one from scratch) and your users don't need or
> > > > want to know too much about Python 2 vs Python 3. Besides, the fact
> > > > that they are passing an iterator now is probably in many cases a good
> > > > thing for them, but it takes away my ability as a developer to find
> > > > out more about what they're trying to do, as opposed to say just being
> > > > given a list of finite size.
> > >
> > > If that's the case, then it should be no problem to rebind
> > > builtins.map to return a list. Problem solved.
> >
> > Rebind where? How? In sage.__init__? How do you think that will fly
> > with other packages loaded in the same interpreter?
>
> Either this is Python, or it's just an algebra language that happens
> to be implemented in Python. If the former, the Py2/Py3 distinction
> should matter to your users, since they are programming in Python.

Porque no los dos? (Why not both?) Sage is a superset of Python, and
while on some level (in terms of advanced programming constructs) users
will need to care about the distinction, most users don't really know
exactly what it does when they pass something like map(a_func, a_list) as
an argument to a function call. They don't necessarily appreciate the
distinction that, depending on how that function is implemented, an
arbitrary iterable has to be treated very differently than a list.

I certainly don't mind supporting arbitrary iterables--I think they
should be supported. But now there are optimizations I can't make that I
could have made before when map() just returned a list. In most cases I
didn't have to make these optimizations manually because the code is
written in Cython. It's true that when a user called map() previously
some opportunities for optimization were already lost, but now it's even
worse because I have to treat a simple map of a list on par with the
necessarily slower arbitrary iterator case, when technically-speaking
there is no reason that has to be the case. Cython could even handle that
case automatically as well by turning a map(<wrapped_c_function>, <list>)
into something like:

    list = map.iters[0];
    for (idx = 0; idx < PyList_Length(list); idx++) {
        wrapped_c_function(PyList_GET_ITEM(list, idx));
    }

> If the latter, it's all about Sage, ergo you can rebind map to mean what
> you expect it to mean. Take your pick.

I'm still not sure what makes you think one can just blithely replace a
builtin with something that doesn't work how all other Python libraries
expect that builtin to work.
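(For concreteness, the kind of subclass discussed in the next paragraph
could be sketched roughly as follows -- hypothetical name, not an actual
Sage or CPython class:

    class ExposedMap(map):
        # Sketch: a map subclass that records its construction arguments.
        # Instances of a subclass grow a __dict__, which is exactly the
        # per-object overhead complained about below.
        def __new__(cls, func, *iters):
            self = super().__new__(cls, func, *iters)
            self.func = func
            self.iters = iters
            return self

    m = ExposedMap(str, [1, 2, 3])
    assert m.iters == ([1, 2, 3],)
    assert list(m) == ['1', '2', '3']

)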
At best I could subclass map() and add this functionality but now you're adding at least three pointers to every map() that are not necessary since the information is already there in the C struct. For most cases this isn't too bad in terms of overhead but consider cases (which I've seen plenty of), like:

    list_of_lists = [map(int, x) for x in list_of_lists]

Now the user who previously expected to have a list of lists has a list of maps. It's already bad enough that each map holds a pointer to a function but I wouldn't want to make that worse. Anyways, I'd love to get off the topic of Sage and just ask why you would object to useful introspection capabilities? I don't even care if it were CPython-specific. From erik.m.bray at gmail.com Thu Nov 29 08:16:48 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 14:16:48 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181129123823.GB4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> <20181129123823.GB4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 1:38 PM Steven D'Aprano wrote: > > On Thu, Nov 29, 2018 at 12:16:37PM +0100, E. Madison Bray wrote: > > On Wed, Nov 28, 2018 at 11:27 PM Steven D'Aprano wrote: > > ["it" below being the length of an arbitrary iterator] > > > > If we could solve it, that would be great -- but I'm not convinced that > > > it is solvable, since the solution seems worse than the problem it aims > > > to solve. But I live in hope that somebody cleverer than me can point > > > out the flaws in my argument. > > > > In general it's unsolvable, so no attempt should be made to provide a > > pre-baked attempt at a solution that won't always work. But in many, > > if not the majority of cases, it *is* solvable. So let's give > > intelligent people the tools they need to solve it in those cases that > > they know they can solve it :) > > So you say, but the solutions made so far seem fatally flawed to me. > > Just repeating the assertion that it is solvable isn't very convincing. Okay, let's keep it simple:

    m = map(str, [1, 2, 3])
    len_of_m = None
    if len(m.iters) == 1 and isinstance(m.iters[0], Sized):
        len_of_m = len(m.iters[0])

You can give me pathological cases where that isn't true, but you can't say there's no context in which that would be virtually guaranteed, and consenting adults can decide whether or not that's a safe-enough assumption in their own code. From steve at pearwood.info Thu Nov 29 08:13:09 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Nov 2018 00:13:09 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181129131308.GC4319@ando.pearwood.info> On Thu, Nov 29, 2018 at 10:21:15PM +1100, Chris Angelico wrote: > On Thu, Nov 29, 2018 at 10:18 PM E. Madison Bray wrote: > > > > On Thu, Nov 29, 2018 at 12:16 PM Chris Angelico wrote: > > > > > > On Thu, Nov 29, 2018 at 10:14 PM E. Madison Bray wrote: [...] > > > If that's the case, then it should be no problem to rebind > > > builtins.map to return a list. Problem solved. > > > > Rebind where? How? In sage.__init__? How do you think that will fly > > with other packages loaded in the same interpreter? > > Either this is Python, or it's just an algebra language that happens > to be implemented in Python. False dichotomy. 
Sage is *all* of these things:

- a stand-alone application which is (partially) written in Python;
- an application which runs under IPython/Jupyter;
- a package which has to interoperate with other Python packages;
- an algebra language.

> If the former, the Py2/Py3 distinction > should matter to your users, since they are programming in Python. Even if they know, and care, about the difference between iterators and lists, they cannot be expected to know or care about how the hundreds of Sage functions process lists differently from iterators. Which would be implementation details of the Sage functions, and subject to change without warning. I sympathise with this proposal. In my own tiny little way, I've had to grapple with something similar for the stdlib statistics library, and I'm not totally happy with the work-around I came up with. And I have a few ideas for the future which will either render the difference moot, or make the problem worse, I'm not sure which :-) > If > the latter, it's all about Sage, ergo you can rebind map to mean what > you expect it to mean. Take your pick. Sage wraps a number of Python libraries, such as numpy, sympy and others, and itself can run under IPython which for all we know may already have monkeypatched the builtins for its own ~~nefarious~~ useful purposes. Are you really comfortable with monkeypatching the builtins in this way in such a complex ecosystem of packages? Maybe it will work, but I think you're being awfully gung-ho about the suggestion. (At least my earlier suggestion didn't involve monkey-patching the builtin map, merely shadowing it.) Personally, even if monkeypatching in this way solved the problem, as a (potential) user of SageMath I'd be really, really peeved if it patched map() in the way you suggest and regressed map() to the 2.x version. -- Steve From rosuav at gmail.com Thu Nov 29 08:22:01 2018 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 30 Nov 2018 00:22:01 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181129131308.GC4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181129131308.GC4319@ando.pearwood.info> Message-ID: On Fri, Nov 30, 2018 at 12:18 AM Steven D'Aprano wrote: > Sage wraps a number of Python libraries, such as numpy, sympy and > others, and itself can run under IPython which for all we know may > already have monkeypatched the builtins for its own ~~nefarious~~ useful > purposes. Are you really comfortable with monkeypatching the builtins in > this way in such a complex ecosystem of packages? Maybe it will work, > but I think you're being awfully gung-ho about the suggestion. To be quite honest, no, I am not comfortable with it. But I *am* comfortable with expecting Python programmers to program in Python, and thus deeming that breakage as a result of user code being migrated from Py2 to Py3 is to be fixed by the user. You can mess around with map(), but there are plenty of other things you can't mess with, so I don't see why this one thing should be Sage's problem. ChrisA From erik.m.bray at gmail.com Thu Nov 29 09:05:02 2018 From: erik.m.bray at gmail.com (E. 
Madison Bray) Date: Thu, 29 Nov 2018 15:05:02 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181129131308.GC4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 2:22 PM Chris Angelico wrote: > > On Fri, Nov 30, 2018 at 12:18 AM Steven D'Aprano wrote: > > Sage wraps a number of Python libraries, such as numpy, sympy and > > others, and itself can run under IPython which for all we know may > > already have monkeypatched the builtins for its own ~~nefarious~~ useful > > purposes. Are you really comfortable with monkeypatching the builtins in > > this way in such a complex ecosystem of packages? Maybe it will work, > > but I think you're being awfully gung-ho about the suggestion. > > To be quite honest, no, I am not comfortable with it. But I *am* > comfortable with expecting Python programmers to program in Python, > and thus deeming that breakage as a result of user code being migrated > from Py2 to Py3 is to be fixed by the user. You can mess around with > map(), but there are plenty of other things you can't mess with, so I > don't see why this one thing should be Sage's problem. The users--often scientists--of SageMath and many other scientific Python packages* are not "Python programmers" as such**. My job as a software engineer is to make the lower-level libraries they use for their day-to-day research work _just work_, and in particular _optimize_ that lower-level code in as many ways as I can find to. In some cases we do have to tell them about Python 2 vs Python 3 things (especially w.r.t. print()) but most of the time it is relatively transparent, as it should be. Steven has the right idea about it. Not every detail can be made perfectly transparent in terms of how users use or misuse them, no. But there are lots of areas where they should absolutely not have to care (e.g. like Steven wrote they cannot be expected to know how every single function might treat an iterator like map() over a finite sequence distinctly from the original finite sequence itself). In the case of map(), although maybe I have not articulated it well, I can say for sure that I've had perfectly valid use cases that were stymied merely by a semi-arbitrary decision to hide the data wrapped by the "iterator returned by map()" (if you want to be pedantic about it). I'm willing to accept some explanation for why that would be actively harmful, but as someone with concrete problems to solve I'm less convinced by appeals to abstractions, or "why not just X" as if I hadn't considered "X" and found it flawed (which is not to say that I mind any new idea being put thoroughly through its paces.) * (Pandas, SymPy, Astropy, and even lower-level packages like NumPy, not to mention Jupyter which implements kernels for dozens of languages, but is primarily implemented in Python) ** With an obligatory asterisk to counter a common refrain from those who experience impostor syndrome, that if you are using this software then yes you are in fact a Python programmer, you just haven't realized it yet ;) From jfine2358 at gmail.com Thu Nov 29 09:28:07 2018 From: jfine2358 at gmail.com (Jonathan Fine) Date: Thu, 29 Nov 2018 14:28:07 +0000 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181129131308.GC4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 2:05 PM E. 
Madison Bray wrote: > The users--often scientists--of SageMath and many other scientific > Python packages* are not "Python programmers" as such**. My job as a > software engineer is to make the lower-level libraries they use for > their day-to-day research work _just work_, and in particular > _optimize_ that lower-level code in as many ways as I can find to. In > some cases we do have to tell them about Python 2 vs Python 3 things > (especially w.r.t. print()) but most of the time it is relatively > transparent, as it should be. Well said. Unlike for many people on this list, programming Python is not their top skill. For example, Paul Romer, the 2018 Economics Nobel Memorial Laureate. His strength is economics. Python is one of the many tools he uses. But it's not his top skill (smile). https://developers.slashdot.org/story/18/10/09/0042240/economics-nobel-laureate-paul-romer-is-a-python-programming-convert In some sense, I think, what Madison wants is an internal domain specific language (IDSL) that works well for Sage users. Just as Django is an IDSL that works well for many web developers. See, for example https://martinfowler.com/books/dsl.html for the general idea. We might not agree on the specifics. But that's perhaps mostly a matter for the domain experts, such as Madison and Sage users. -- Jonathan From steve at pearwood.info Thu Nov 29 09:43:12 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 30 Nov 2018 01:43:12 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> <20181129123823.GB4319@ando.pearwood.info> Message-ID: <20181129144311.GD4319@ando.pearwood.info> On Thu, Nov 29, 2018 at 02:16:48PM +0100, E. Madison Bray wrote: > Okay, let's keep it simple:
>
>     m = map(str, [1, 2, 3])
>     len_of_m = None
>     if len(m.iters) == 1 and isinstance(m.iters[0], Sized):
>         len_of_m = len(m.iters[0])
>
> You can give me pathological cases where that isn't true, but you > can't say there's no context in which that would be virtually > guaranteed Yes I can, and they aren't pathological cases. They are ordinary cases working the way iterators are designed to work. All you get is a map object. You have no way of knowing how many times the iterator has been advanced by calling next(). Consequently, there is no guarantee that len(m.iters[0]) == len(list(m)) except by the merest accident that the map object hasn't had next() called on it yet. *This is not pathological behaviour*. This is how iterators are designed to work. The ability to partially advance an iterator, pause, then pass it on to another function to be completed is a huge benefit of the iterator protocol. I've written code like this on more than one occasion:

    # toy example
    for x in it:
        process(x)
        if condition(x):
            for y in it:
                do_something_else(y)
            # Strictly speaking, this isn't needed, since "it" is consumed.
            break

If I pass the partially consumed map iterator to your function, it will use the wrong length and give me back inaccurate results. (Assuming it actually uses the length as part of the calculated result.) You might say that your users are not so advanced, or that they're naive enough not to even know they could do that, but that's a pretty unsafe assumption as well as being rather insulting to your own users, some of whom are surely advanced Python coders not just naive dabblers. 
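(To see the failure concretely: everything below is standard Python today, with the proposed behaviour described only in the comment:)

    mo = map(str, [1, 2, 3, 4, 5])
    next(mo); next(mo)       # partially consume the map iterator
    # A length taken from the underlying list -- via the hypothetical
    # mo.iters[0] of the proposal -- would still claim 5 items, yet:
    print(len(list(mo)))     # -> 3
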
Even if only one in a hundred users knows that they can partially iterate over the map, and only one in a hundred of those actually do so, you're still making an unsafe assumption that will return inaccurate results based on an invalid value of len_of_m. > and consenting adults can decide whether or not that's a > safe-enough assumption in their own code. Which consenting adults? How am I, wearing the hat of a Sage user, supposed to know which of the hundreds of Sage functions make this "safe-enough" assumption and return inaccurate results as a consequence? -- Steve From erik.m.bray at gmail.com Thu Nov 29 10:32:00 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Thu, 29 Nov 2018 16:32:00 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181129144311.GD4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> <20181129123823.GB4319@ando.pearwood.info> <20181129144311.GD4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 3:43 PM Steven D'Aprano wrote: > > On Thu, Nov 29, 2018 at 02:16:48PM +0100, E. Madison Bray wrote: > > > Okay, let's keep it simple: > > > > m = map(str, [1, 2, 3]) > > len_of_m = None > > if len(m.iters) == 1 and isinstance(m.iters[0], Sized): > > len_of_m = len(m.iters[0]) > > > > You can give me pathological cases where that isn't true, but you > > can't say there's no context in which that wouldn't be virtually > > guaranteed > > Yes I can, and they aren't pathological cases. They are ordinary cases > working the way iterators are designed to work. > > All you get is a map object. You have no way of knowing how many times > the iterator has been advanced by calling next(). Consequently, there is > no guarantee that len(m.iters[0]) == len(list(m)) except by the merest > accident that the map object hasn't had next() called on it yet. > > *This is not pathological behaviour*. This is how iterators are designed > to work. > > The ability to partially advance an iterator, pause, then pass it on to > another function to be completed is a huge benefit of the iterator > protocol. I've written code like this on more than one occasion: That's a fair point and probably the killer flaw in this proposal (or any involving getting the lengths of iterators). I still think it would be useful to be able to introspect map objects, but this does throw some doubt on the overall reliability of this. I'd say that in most cases it would still work, but you're right it's harder to guarantee in this context. One obvious workaround would be to attach a flag indicating whether or not __next__ has been called (or as long as you have such a flag, why not a counter for the number of times __next__ has been called)? That would effectively solve the problem, but I admit it's a taller order in terms of adding API surface. From mertz at gnosis.cx Thu Nov 29 11:39:56 2018 From: mertz at gnosis.cx (David Mertz) Date: Thu, 29 Nov 2018 11:39:56 -0500 Subject: [Python-ideas] __len__() for map() In-Reply-To: <76635835-61DC-41F3-9F76-036698D38477@gmail.com> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128220323.GX4319@ando.pearwood.info> <76635835-61DC-41F3-9F76-036698D38477@gmail.com> Message-ID: On Thu, Nov 29, 2018 at 2:29 AM Adrien Ricocotam wrote: > Some suggested above to change the definition of len in the long term. 
> Then I think it could be interesting to define len such as :
>
> - If it has a finite length : return that length (the way it works now)
> - If it has a length that is infinity : return infinity
> - If it has no length : return None
>
Do you anticipate that the `len()` function will be able to solve the Halting Problem? It is simply not possible to know whether a given iterator will produce finitely many or infinitely many elements. Even those that will produce finitely many do not, in general, have a knowable length without running them until exhaustion. Here's a trivial example:

    >>> from random import random
    >>> def seq():
    ...     while random() > 0.1:
    ...         yield 1
    >>> len(seq())
    # What answer do you want here?

Here's a slightly less trivial one:

    In [1]: from itertools import count

    In [2]: def mandelbrot(z):
       ...:     "Yield each value until escape iteration"
       ...:     c = z
       ...:     for n in count():
       ...:         if abs(z) > 2:
       ...:             return n
       ...:         yield z
       ...:         z = z*z + c

What should len(mandelbrot(my_complex_number)) be? Hint, depending on the complex number chosen, it might be any Natural Number (or it might not terminate). -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ricocotam at gmail.com Thu Nov 29 11:45:47 2018 From: ricocotam at gmail.com (Adrien Ricocotam) Date: Thu, 29 Nov 2018 17:45:47 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128220323.GX4319@ando.pearwood.info> <76635835-61DC-41F3-9F76-036698D38477@gmail.com> Message-ID: Alright, I didn't see those problems. Though I was suggesting that for functions like map, we just let the underlying iterator answer. This is interesting. Thanks for this On Thu 29 Nov 2018 at 17:40, David Mertz wrote: > On Thu, Nov 29, 2018 at 2:29 AM Adrien Ricocotam > wrote: > >> Some suggested above to change the definition of len in the long term. >> Then I think it could be interesting to define len such as :
>>
>> - If it has a finite length : return that length (the way it works now)
>> - If it has a length that is infinity : return infinity
>> - If it has no length : return None
>>
> > Do you anticipate that the `len()` function will be able to solve the > Halting Problem? > > It is simply not possible to know whether a given iterator will produce > finitely many or infinitely many elements. Even those that will produce > finitely many do not, in general, have a knowable length without running > them until exhaustion. > > Here's a trivial example:
>
>     >>> from random import random
>     >>> def seq():
>     ...     while random() > 0.1:
>     ...         yield 1
>     >>> len(seq())
>     # What answer do you want here?
>
> Here's a slightly less trivial one:
>
>     In [1]: from itertools import count
>
>     In [2]: def mandelbrot(z):
>        ...:     "Yield each value until escape iteration"
>        ...:     c = z
>        ...:     for n in count():
>        ...:         if abs(z) > 2:
>        ...:             return n
>        ...:         yield z
>        ...:         z = z*z + c
>
> What should len(mandelbrot(my_complex_number)) be? Hint, depending on the > complex number chosen, it might be any Natural Number (or it might not > terminate). > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. 
Intellectual property is > to the 21st century what the slave trade was to the 16th. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfine2358 at gmail.com Thu Nov 29 13:13:50 2018 From: jfine2358 at gmail.com (Jonathan Fine) Date: Thu, 29 Nov 2018 18:13:50 +0000 Subject: [Python-ideas] __len__() for map() In-Reply-To: <20181129144311.GD4319@ando.pearwood.info> References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> <20181129123823.GB4319@ando.pearwood.info> <20181129144311.GD4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 2:44 PM Steven D'Aprano wrote: > You might say that your users are not so advanced, or that they're naive > enough not to even know they could do that, but that's a pretty unsafe > assumption as well as being rather insulting to your own users, some of > whom are surely advanced Python coders not just naive dabblers. I think that what above all unites Sage users is knowledge of mathematics. Use of Python would be secondary. The goal surely is to discover and develop conventions and interfaces that work for such a group of users. In this area the original poster is probably the expert, and I think should be respected as such. Steve's post divides Sage users into "advanced Python coders" and "naive dabblers". This misses the point, which is to get something that works well for all users. This, I'd say, is one of the features of Python's success. Most Python users are people who want to get something done. By the way, I'd expect that most Sage users fall into the middle range of Python expertise. I think that to focus on the extremes is both unhelpful and divisive. -- Jonathan From tjreedy at udel.edu Thu Nov 29 15:36:29 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 29 Nov 2018 15:36:29 -0500 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On 11/29/2018 6:13 AM, E. Madison Bray wrote: > On Wed, Nov 28, 2018 at 8:54 PM Terry Reedy wrote: > The CPython map() implementation already carries this data on it as > "func" and "iters" members in its struct. It's trivial to expose > those to Python as ".funcs" and ".iters" attributes. Nothing > "special" about it. However, that brings me to... I will come back to this when you do. >> https://docs.python.org/3/library/functions.html#map says >> "map(function, iterable, ...) >> Return an iterator [...]" >> >> The wording is intentional. The fact that map is a class and the >> iterator an instance of the class is a CPython implementation detail. >> Another implementation could use the generator function equivalent given >> in the Python 2 itertools doc, or a translation thereof. I don't know >> what pypy and other implementations do. The fact that CPython itertools >> callables are (now) C-coded classes instead Python-coded generator >> functions, or C translations thereof (which is tricky) is for >> performance and ease of maintenance. > > Exactly how intentional is that wording though? The use of 'iterator' is exactly intended, and the iterator protocol is *intentionally minimal*, with one iterator specific __next__ method and one boilerplate __iter__ method returning self. This is more minimal than some might like. An argument against the addition of length_hint and __length_hint__ was that it might be seen as extending at least the 'expected' iterator protocol. The docs were written to avoid this. 
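(For reference, that compromise is exposed today as operator.length_hint(), added by PEP 424: it consults __length_hint__ or __len__ when available, without making either part of the iterator protocol proper. A quick illustration:)

    from operator import length_hint

    it = iter([1, 2, 3])
    print(length_hint(it))        # -> 3; list_iterator implements __length_hint__
    next(it)
    print(length_hint(it))        # -> 2; the hint tracks remaining items
    gen = (x for x in [1, 2, 3])
    print(length_hint(gen, -1))   # -> -1; generators give no hint, default returned
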
> If it returns an > iterator it has to return *some object* that implements iteration in > the manner prescribed by map. > Generator functions could theoretically > allow attributes attached to them. Roughly speaking:
>
>     def map(func, *iters):
>         def map_inner():
>             for args in zip(*iters):
>                 yield func(*args)
>
>         gen = map_inner()
>         gen.func = func
>         gen.iters = iters
>
>         return gen
>
> As it happens this won't work in CPython since it does not allow > attribute assignment on generator objects. Perhaps there's some good > reason for that, but AFAICT--though I may be missing a PEP or > something--this fact is not prescribed anywhere and is also particular > to CPython. Instances of C-coded classes generally cannot be augmented. But set this issue aside. > Point being, I don't think it's a massive leap or > imposition on any implementation to go from "Return an iterator [...]" > to "Return an iterator that has these attributes [...]" Do you propose exposing the inner struct members of *all* C-coded iterators? (And would you propose that all Python-coded iterators should use public names for the equivalents?) Some subset thereof? (What choice rule?) Or only for map? If the latter, why do you consider map so special? >>> This is necessary because if I have a function that used to take, say, >>> a list as an argument, and it receives a `map` object, I now have to >>> be able to deal with map()s, In both 2 and 3, the function has to deal with iterator inputs one way or another. In both 2 and 3, possible iterator inputs include maps passed as generator comprehensions, '(<expression> for x in iterable)'. >> If a function is documented as requiring a list, or a sequence, or a >> length object, it is a user bug to pass an iterator. The only thing >> special about map and filter as errors is the rebinding of the names >> between Py2 and Py3, so that the same code may be good in 2.x and bad in >> 3.x. > > It's not a user bug if you're porting a massive computer algebra > application that happens to use Python as its implementation language > (rather than inventing one from scratch) and your users don't need or > want to know too much about Python 2 vs Python 3. As a former 'scientist who programs' I can understand the desire for ignorance of such details. As a Python core developer, I would say that if you want Sage to allow and cater to such ignorance, you have to either make Sage a '2 and 3' environment, without burdening Python 3, or make future Sage a strictly Python 3 environment (as many scientific stack packages are doing or planning to do). ... > That said, I regret bringing up Sage; I was using it as an example but > I think the point stands on its own. Yes, the issues of hiding versus exposing implementation details, and that of saving versus deleting and, when needed, recreating 'redundant' information, are independent of Sage and 2 versus 3. -- Terry Jan Reedy From tjreedy at udel.edu Thu Nov 29 16:28:19 2018 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 29 Nov 2018 16:28:19 -0500 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> <20181129123823.GB4319@ando.pearwood.info> Message-ID: On 11/29/2018 8:16 AM, E. 
Madison Bray wrote: > Okay, let's keep it simple:
>
>     m = map(str, [1, 2, 3])
>     len_of_m = None
>     if len(m.iters) == 1 and isinstance(m.iters[0], Sized):
>         len_of_m = len(m.iters[0])
>
As I have noted before, the existing sized collection __length_hint__ methods (properly) return the remaining items = len(underlying_iterable) - items_already_produced. This is fairly easy at the C level. The following seems to work in Python.

    class map1:
        def __init__(self, func, sized):
            if isinstance(sized, (list, tuple, range, dict)):
                self._iter = iter(sized)
                self._gen = (func(x) for x in self._iter)
            else:
                raise TypeError(f'{sized} not one of list, tuple, range, dict')
        def __iter__(self):
            return self
        def __next__(self):
            return next(self._gen)
        def __length_hint__(self):
            return self._iter.__length_hint__()

    m = map1(int, [1.0, 2.0, 3.0])
    print(m.__length_hint__())
    print('first item', next(m))
    print(m.__length_hint__())
    print('remainder', list(m))
    print(m.__length_hint__())

    # prints, as expected and desired
    3
    first item 1
    2
    remainder [2, 3]
    0

A package could include a version of this, possibly compiled, for use when applicable. -- Terry Jan Reedy From abedillon at gmail.com Thu Nov 29 18:03:57 2018 From: abedillon at gmail.com (Abe Dillon) Date: Thu, 29 Nov 2018 17:03:57 -0600 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128220323.GX4319@ando.pearwood.info> <76635835-61DC-41F3-9F76-036698D38477@gmail.com> Message-ID: [David Mertz] > Do you anticipate that the `len()` function will be able to solve the > Halting Problem? > It is simply not possible to know whether a given iterator will produce > finitely many or infinitely many elements. Even those that will produce > finitely many do not, in general, have a knowable length without running > them until exhaustion. You don't have to solve the halting problem. You simply ask the object. The default behavior would be "I don't know" whether that's communicated by returning None or some other sentinel value (NaN?) or by raising a special exception. Then you simply override the default behavior for cases where the object does or at least might know. itertools.repeat, for example, would have an infinite length unless "times" is provided, in which case its length would be the value of "times". map would return the length of the shortest iterable unless there is an unknown-sized iterable, in which case len would be unknown; if all iterables are infinite, the length would be infinite. We could add a decorator for length and/or length hints on generator functions:

    @length(lambda times: times or float("+inf"))
    def repeat(obj, times=None):
        if times is None:
            while True:
                yield obj
        else:
            for i in range(times):
                yield obj

On Thu, Nov 29, 2018 at 10:40 AM David Mertz wrote: > On Thu, Nov 29, 2018 at 2:29 AM Adrien Ricocotam > wrote: > >> Some suggested above to change the definition of len in the long term. >> Then I think it could be interesting to define len such as :
>>
>> - If it has a finite length : return that length (the way it works now)
>> - If it has a length that is infinity : return infinity
>> - If it has no length : return None
>>
> > Do you anticipate that the `len()` function will be able to solve the > Halting Problem? > > It is simply not possible to know whether a given iterator will produce > finitely many or infinitely many elements. Even those that will produce > finitely many do not, in general, have a knowable length without running > them until exhaustion. > 
> Here's a trivial example:
>
>     >>> from random import random
>     >>> def seq():
>     ...     while random() > 0.1:
>     ...         yield 1
>     >>> len(seq())
>     # What answer do you want here?
>
> Here's a slightly less trivial one:
>
>     In [1]: from itertools import count
>
>     In [2]: def mandelbrot(z):
>        ...:     "Yield each value until escape iteration"
>        ...:     c = z
>        ...:     for n in count():
>        ...:         if abs(z) > 2:
>        ...:             return n
>        ...:         yield z
>        ...:         z = z*z + c
>
> What should len(mandelbrot(my_complex_number)) be? Hint, depending on the > complex number chosen, it might be any Natural Number (or it might not > terminate). > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Thu Nov 29 19:31:13 2018 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Thu, 29 Nov 2018 20:31:13 -0400 Subject: [Python-ideas] [off-topic?] Unwinding generators Message-ID: This is code from the Twisted library: https://github.com/twisted/twisted/blob/trunk/src/twisted/internet/defer.py#L1542-L1614 It "unwinds" a generator to yield a result before others. I don't have hard evidence, but my experience is that that kind of manipulation leaks resources, especially if exceptions escape from the final callback. Is there a bug in exception handling in the generator logic, or is unwinding just inherently wrong? How could the needs that unwinding tries to solve be handled with async? -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul-python at svensson.org Thu Nov 29 20:13:12 2018 From: paul-python at svensson.org (Paul Svensson) Date: Thu, 29 Nov 2018 20:13:12 -0500 (EST) Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert wrote: > > I just ran into the following behavior, and found it surprising: > >>>> len(map(float, [1,2,3])) > TypeError: object of type 'map' has no len() > > I understand that map() could be given an infinite sequence and therefore might not always have a length. But in this case, it seems like map() should've known that its length was 3. I also understand that I can just call list() on the whole thing and get a list, but the nice thing about map() is that it doesn't copy data, so it's unfortunate to lose that advantage for no particular reason. > > My proposal is to delegate map.__len__() to the underlying iterable. Similarly, map.__getitem__() could be implemented if the underlying iterable supports item access: > Excellent proposal, followed by a flood of confused replies, which I will mostly disregard, since all miss the obvious. What's being proposed is simple, either:

* len(map(f, x)) == len(x), or
* both raise TypeError

That implies, loosely speaking:

* map(f, Iterable) -> Iterable, and
* map(f, Sequence) -> Sequence

But, *not*:

* map(f, Iterable|Sequence) -> Magic.

So, the map() function becomes a factory, returning an object with __len__ or without, depending on what it was called with. /Paul From abedillon at gmail.com Thu Nov 29 23:59:43 2018 From: abedillon at gmail.com (Abe Dillon) Date: Thu, 29 Nov 2018 22:59:43 -0600 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: That would be great especially if it returned objects of a subclass of map so that it didn't break any code that checks isinstance; however, I think this goes a little beyond map. I've run into cases using itertools where I wished the iterators could support len. I suppose you could turn those all into factories too, but I wonder if that's the most elegant solution. On Thu, Nov 29, 2018 at 7:22 PM Paul Svensson wrote: > On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert > wrote: > > > > I just ran into the following behavior, and found it surprising: > > > >>>> len(map(float, [1,2,3])) > > TypeError: object of type 'map' has no len() > > > > I understand that map() could be given an infinite sequence and > therefore might not always have a length. But in this case, it seems like > map() should've known that its length was 3. I also understand that I can > just call list() on the whole thing and get a list, but the nice thing > about map() is that it doesn't copy data, so it's unfortunate to lose that > advantage for no particular reason. > > > > My proposal is to delegate map.__len__() to the underlying iterable. > Similarly, map.__getitem__() could be implemented if the underlying > iterable supports item access: > > > > Excellent proposal, followed by a flood of confused replies, > which I will mostly disregard, since all miss the obvious. > > What's being proposed is simple, either:
> * len(map(f, x)) == len(x), or
> * both raise TypeError
>
> That implies, loosely speaking:
> * map(f, Iterable) -> Iterable, and
> * map(f, Sequence) -> Sequence
>
> But, *not*:
> * map(f, Iterable|Sequence) -> Magic.
>
> So, the map() function becomes a factory, returning an object > with __len__ or without, depending on what it was called with. > > /Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.m.bray at gmail.com Fri Nov 30 03:54:47 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Fri, 30 Nov 2018 09:54:47 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> <20181128222714.GY4319@ando.pearwood.info> <20181129123823.GB4319@ando.pearwood.info> <20181129144311.GD4319@ando.pearwood.info> Message-ID: On Thu, Nov 29, 2018 at 7:16 PM Jonathan Fine wrote: > > On Thu, Nov 29, 2018 at 2:44 PM Steven D'Aprano wrote: > > > You might say that your users are not so advanced, or that they're naive > > enough not to even know they could do that, but that's a pretty unsafe > > assumption as well as being rather insulting to your own users, some of > > whom are surely advanced Python coders not just naive dabblers. > > I think that what above all unites Sage users is knowledge of > mathematics. Use of Python would be secondary. The goal surely is to > discover and develop conventions and interfaces that work for such a > group of users. 
In this area the original poster is probably the > expert, and I think should be respected as such. > > Steve's post divides Sage users into "advanced Python coders" and > "naive dabblers". This misses the point, which is to get something > that works well for all users. This, I'd say, is one of the features > of Python's success. Most Python users are people who want to get > something done. > > By the way, I'd expect that most Sage users fall into the middle range > of Python expertise. I think that to focus on the extremes is both > unhelpful and divisive. Yes, thank you. They are all very smart people--most of them much moreso than I. The vast majority are mathematicians first, and software developers second, third, fourth, or even further down the line. Some of the most prolific contributors to Sage barely know how to use git without some wrappers we've provided around it (not that they couldn't learn, but let's be honest git is a terrible tool for anyone who isn't Linus Torvalds). They still write good code and sometimes brilliant algorithms. But they're not all Python experts. Many of them are also students who are only using Python because Sage uses it, and not using Sage because it uses Python. The Sagebook [1] may be their first introduction to Python, and even then it only introduces Python programming in drips and drabs as needed for the topics at hand (e.g. variables, loops, functions). I'm trying to consider users at all levels. [1] http://dl.lateralis.org/public/sagebook/sagebook-ba6596d.pdf From erik.m.bray at gmail.com Fri Nov 30 04:32:31 2018 From: erik.m.bray at gmail.com (E. Madison Bray) Date: Fri, 30 Nov 2018 10:32:31 +0100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: On Thu, Nov 29, 2018 at 9:36 PM Terry Reedy wrote: > >> https://docs.python.org/3/library/functions.html#map says > >> "map(function, iterable, ...) > >> Return an iterator [...]" > >> > >> The wording is intentional. The fact that map is a class and the > >> iterator an instance of the class is a CPython implementation detail. > >> Another implementation could use the generator function equivalent given > >> in the Python 2 itertools doc, or a translation thereof. I don't know > >> what pypy and other implementations do. The fact that CPython itertools > >> callables are (now) C-coded classes instead Python-coded generator > >> functions, or C translations thereof (which is tricky) is for > >> performance and ease of maintenance. > > > > Exactly how intentional is that wording though? > > The use of 'iterator' is exactly intended, and the iterator protocol is > *intentionally minimal*, with one iterator specific __next__ method and > one boilerplate __iter__ method returning self. This is more minimal > than some might like. An argument against the addition of length_hint > and __length_hint__ was that it might be seen as extending at least the > 'expected' iterator protocol. The docs were written to avoid this. You still seem to be confusing my point. I'm not advocating even for __length_hint__ (I think there are times that would be useful but it's still pretty problematic). I admit one thing I'm a little stuck on though is that map() currently just immediately calls iter() on its arguments to get their iterators, and does not store references to the original iterables. It would be nice if more iterators could have an exposed reference to the objects they're iterating, in cases where that's even meaningful. 
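(A library can already build such a wrapper for itself in pure Python -- a rough sketch, where the name source_iter and its .source attribute are made up here for illustration:)

    class source_iter:
        """Iterator wrapper that keeps a reference to what it iterates."""
        def __init__(self, iterable):
            self.source = iterable        # exposed, for introspection
            self._it = iter(iterable)
        def __iter__(self):
            return self
        def __next__(self):
            return next(self._it)

    it = source_iter([1, 2, 3])
    next(it)
    print(it.source)                      # -> [1, 2, 3]
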
For some reason I thought, for example, that a list_iterator could give me a reference back to the list itself. This was probably omitted intentionally but it still feels pretty limiting :( > > Point being, I don't think it's a massive leap or > > imposition on any implementation to go from "Return an iterator [...]" > > to "Return an iterator that has these attributes [...]" > > Do you propose exposing the inner struct members of *all* C-coded > iterators? (And would you propose that all Python-coded iterators > should use public names for the equivalents?) Some subset thereof? > (What choice rule?) Or only for map? If the latter, why do you > consider map so special? Not necessarily, no. But certainly a few: I'm using map() as an example but at the very least map() and filter(). An exact choice rule is something worth thinking about but I don't think you're going to find an "objective" rule. I think it goes without saying that map() is special in a way: It's one of the most basic extensions to function application and is a fundamental construct in functional programming and from a category-theoretical perspective. I'm not saying Python's built-in map() needs to represent anything mathematically formal, but it's certainly quite fundamental which is why it's a built-in in the first place. > >>> This is necessary because if I have a function that used to take, say, > >>> a list as an argument, and it receives a `map` object, I now have to > >>> be able to deal with map()s, > > In both 2 and 3, the function has to deal with iterator inputs one way > or another. In both 2 and 3, possible iterator inputs include maps > passed as generator comprehensions, '(<expression> for x in > iterable)'. Yes, but those are still less common, and generator expressions were not even around when Sage was first started: I've been around long enough to remember when they were added to the language, and were well predated by map() and filter(). The Sagebook [1] introduces them around page 60. I'm not sure if it even introduces generator expressions at all. I think a lot of Python and C++ experts don't realize that the "iterator" concept is not at all immediately obvious to a lot of non-programmers. Most iterator inputs supplied by users are things like sized collections for which it's easy to think about "going over them one by one" and not more abstract iterators. This is true whether the user is a Python expert or not. > >> If a function is documented as requiring a list, or a sequence, or a > >> length object, it is a user bug to pass an iterator. The only thing > >> special about map and filter as errors is the rebinding of the names > >> between Py2 and Py3, so that the same code may be good in 2.x and bad in > >> 3.x. > > > > It's not a user bug if you're porting a massive computer algebra > > application that happens to use Python as its implementation language > > (rather than inventing one from scratch) and your users don't need or > > want to know too much about Python 2 vs Python 3. > > As a former 'scientist who programs' I can understand the desire for > ignorance of such details. As a Python core developer, I would say that > if you want Sage to allow and cater to such ignorance, you have to > either make Sage a '2 and 3' environment, without burdening Python 3, or > make future Sage a strictly Python 3 environment (as many scientific > stack packages are doing or planning to do). "ignorance" is not a word I would use here, frankly. > ... 
> > That said, I regret bringing up Sage; I was using it as an example but > > I think the point stands on its own. > > Yes, the issues of hiding versus exposing implementation details, and > that of saving versus deleting and, when needed, recreating 'redundant' > information, are independent of Sage and 2 versus 3. I agree there: this is not really an argument about Sage or Python 2/3. Though I don't think this is an "implementation detail". In an abstract sense a map is a special container for a function and a sequence that has special semantics. As far as I'm concerned this is what it *is* in some ontological sense, and this fact is not a mere implementation detail. [1] http://dl.lateralis.org/public/sagebook/sagebook-ba6596d.pdf From steve at pearwood.info Fri Nov 30 10:45:01 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 1 Dec 2018 02:45:01 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181130154500.GJ4319@ando.pearwood.info> On Fri, Nov 30, 2018 at 10:32:31AM +0100, E. Madison Bray wrote: > I think it goes without saying that > map() is special in a way: It's one of the most basic extensions to > function application and is a fundamental construct in functional > programming and from a category-theoretical perspective. I'm not > saying Python's built-in map() needs to represent anything > mathematically formal, but it's certainly quite fundamental which is > why it's a built-in in the first place. It's a built-in in the first place, because back in Python 0.9 or 1.0 or thereabouts, a fan of Lisp added it to the builtins (together with filter and reduce) and nobody objected (possibly because they didn't notice) at the time. It was much easier to add things to the language back then. During the transition to Python 3, Guido wanted to remove all three (as well as lambda): https://www.artima.com/weblogs/viewpost.jsp?thread=98196 Although map, filter and lambda have stayed, reduce has been relegated to the functools module. From steve at pearwood.info Fri Nov 30 20:17:35 2018 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 1 Dec 2018 12:17:35 +1100 Subject: [Python-ideas] __len__() for map() In-Reply-To: References: <3e46b3e5-09e0-b53e-16f3-a1605c88df3f@thekunderts.net> Message-ID: <20181201011734.GN4319@ando.pearwood.info> On Thu, Nov 29, 2018 at 08:13:12PM -0500, Paul Svensson wrote: > Excellent proposal, followed by a flood of confused replies, > which I will mostly disregard, since all miss the obvious. When everyone around you is making technical responses which you think are "confused", it is wise to consider the possibility that it is you who is missing something rather than everyone else. > What's being proposed is simple, either: > * len(map(f, x)) == len(x), or > * both raise TypeError Simple, obvious, and problematic. Here's a map object I prepared earlier:

    from itertools import islice
    mo = map(lambda x: x, "aardvark")
    list(islice(mo, 3))

If I now pass you the map object, mo, what should len(mo) return? Five or eight? No matter which choice you make, you're going to surprise and annoy people, and there will be circumstances where that choice will introduce bugs into their code. > That implies, loosely speaking: > * map(f, Iterable) -> Iterable, and > * map(f, Sequence) -> Sequence But map objects aren't sequences. They're iterators. 
Just adding a __len__ method isn't going to make them sequences (not even "loosely speaking") or solve the problem above. In principle, we could make this work by turning the output of map() into a view like dict.keys() etc, or a lazy sequence type like range(), wrapping the underlying sequence. That might be worth exploring. I can't think of any obvious problems with a view-like interface, but that doesn't mean there aren't any. I've spent like 30 seconds thinking about it, so the fact that I can't see any problems with it means little. But it's also a big change, not just a matter of exposing the __len__ method of the underlying iterable (or iterables).
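(For concreteness, such a view might look something like this minimal sketch -- a single sized input only, with slicing and multiple iterables left out:)

    from collections.abc import Sequence

    class MapView(Sequence):
        """Lazy, sized, re-iterable map over one underlying sequence."""
        def __init__(self, func, seq):
            self.func = func
            self.seq = seq
        def __len__(self):
            return len(self.seq)
        def __getitem__(self, index):
            return self.func(self.seq[index])

    squares = MapView(lambda x: x * x, range(10))
    print(len(squares))       # -> 10
    print(squares[3])         # -> 9
    print(list(squares)[:4])  # -> [0, 1, 4, 9]

-- Steve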