From musicdenotation at gmail.com  Sun Sep  1 10:31:54 2013
From: musicdenotation at gmail.com (Musical Notation)
Date: Sun, 1 Sep 2013 15:31:54 +0700
Subject: [Python-ideas] Another indentation style
Message-ID: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>

In Haskell, you can write:

let x=1
    y=2

In Python, why can't you write:

if True: x=x+1
         y=x

?

From steve at pearwood.info  Sun Sep  1 11:21:00 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 01 Sep 2013 19:21:00 +1000
Subject: [Python-ideas] Another indentation style
In-Reply-To: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
Message-ID: <522306FC.3070202@pearwood.info>

On 01/09/13 18:31, Musical Notation wrote:
> In Haskell, you can write:
>
> let x=1
>     y=2
>
> In Python, why can't you write:
>
> if True: x=x+1
>          y=x
>
> ?

Because allowing that does not let you do anything different or new that
you couldn't do before, it would not make code any clearer or more
understandable, and it would decrease the readability of the code.

The one-line-per-block form:

if condition: do_this()

avoids emphasizing the one-line block, and puts the focus on the `if`.
As far as I am concerned, it is not much more than a convenience for the
interactive interpreter. Dropping the `if` block onto a second line
shares focus between the two equally:

if condition:
    do_this()

If Python allowed the form you want with multi-line blocks:

if condition: do_this()
              do_that()
              do_something_else()

the call to `do_this` would be lost, up there on the same line as the
test. The block structure looks like this:

..............BLOCK
....BLOCK
....BLOCK

instead of:

.............
....BLOCK
....BLOCK
....BLOCK

and that hurts readability.
Worse is the temptation to waste time trying to line everything up:

if condition: do_this()
              do_that()
              do_something_else()
if flag: do_this()
         do_that()
         do_something_else()
if really_long_clause_in_a_boolean_context: do_this()
                                            do_that()
                                            do_something_else()

which obscures the fact that all three `if` blocks are at the same
indent level. Even worse:

if condition:                               do_this()
                                            do_that()
                                            do_something_else()
if flag:                                    do_this()
                                            do_that()
                                            do_something_else()
if really_long_clause_in_a_boolean_context: do_this()
                                            do_that()
                                            do_something_else()

which is just abominable. Python doesn't prevent you from writing ugly
code, but neither does it allow syntax which encourages you to write
ugly code.

-- 
Steven

From rob.cliffe at btinternet.com  Sun Sep  1 13:10:19 2013
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Sun, 01 Sep 2013 12:10:19 +0100
Subject: [Python-ideas] Another indentation style
In-Reply-To: <522306FC.3070202@pearwood.info>
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
	<522306FC.3070202@pearwood.info>
Message-ID: <5223209B.9020806@btinternet.com>

On 01/09/2013 10:21, Steven D'Aprano wrote:
> On 01/09/13 18:31, Musical Notation wrote:
>> In Haskell, you can write:
>>
>> let x=1
>>     y=2
>>
>> In Python, why can't you write:
>>
>> if True: x=x+1
>>          y=x
>>
>> ?
>
> Because allowing that does not let you do anything different or new
> that you couldn't do before, it would not make code any clearer or
> more understandable, and it would decrease the readability of the code.
>
If I can be forgiven for slightly changing the subject (sorry, Musical
Notation): Talking of unconventional ways of indenting: In Python,
writing multiple code statements on a single line, especially with a
semi-colon, appears to be a taboo roughly on a par with appearing naked
in public. But I believe that there are times when it is the clearest
way of writing code, viz. when it makes visually obvious a *pattern* in
the code.
Here is one example, not very different from some "real" code that I
wrote:

def test(condition, a, b):
    if condition=='equals'          : return a==b
    if condition=='is greater than' : return a>b
    if condition=='contains'        : return b in a
    if condition=='starts with'     : return a.startswith(b)
    etc.

And here is some real code that I wrote (not worth explaining in
detail). I am sorry that it breaks another convention, having lines
longer than 80 characters - this happens not to be inconvenient for me,
and it was the best authentic *real* example I could find without
spending a long time searching:

assert PN[0].isalpha()    ; FirstPart  = PN[0] ; PN = PN[1:].lstrip(Seps) # Must be a letter
if PN[0].isalpha()        : FirstPart += PN[0] ; PN = PN[1:].lstrip(Seps) # May be a second letter
assert PN[0].isdigit()    ; FirstPart += PN[0] ; PN = PN[1:].lstrip(Seps) # Must be a digit
if PN and PN[0].isalnum() : FirstPart += PN[0] ; PN = PN[1:]              # May be a letter or digit

(These examples look best with the colons/semicolons/equals
signs/statements lined up vertically. They will probably look ragged in
an e-mail. They should look as intended if they are cut and pasted into
a (fixed-size font) editor.)

Writing the code like this makes apparent:
(1) There is a pattern to the code.
(2) Where the pattern is not quite consistent. E.g. in my second
    example the first line contains "FirstPart =", the other lines
    contain "FirstPart +=". *Seeing* this is half-way to understanding
    it.
(3) The conceptual separation of the whole chunk of code from what
    precedes and what follows it (which can be emphasised by putting a
    blank line before and after it).

*None* of this would be so apparent if the code were written one
statement per line. (Is 'statement' the correct technical term? Please
correct me.) There is also a minor advantage to writing fewer lines of
code - you can see more of the program in one screenful at a time.
(And: that you may find a smarter way of rewriting these specific
examples is not really the point.
In my younger days I might have written:

if wkday==0: return 'Monday'
if wkday==1: return 'Tuesday'
etc.

Nowadays I would probably write something like

return { 0 : 'Monday', 1 : 'Tuesday' ... etc. }[wkday]

And you may have an even better way. Again - not really the point.)

Best wishes,
Rob Cliffe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abarnert at yahoo.com  Sun Sep  1 15:21:39 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 1 Sep 2013 06:21:39 -0700
Subject: [Python-ideas] Another indentation style
In-Reply-To: <5223209B.9020806@btinternet.com>
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
	<522306FC.3070202@pearwood.info>
	<5223209B.9020806@btinternet.com>
Message-ID: <25F9189B-567A-44E6-91B2-9DD5868FAE96@yahoo.com>

On Sep 1, 2013, at 4:10, Rob Cliffe wrote:

> def test(condition, a, b):
>     if condition=='equals'          : return a==b
>     if condition=='is greater than' : return a>b
>     if condition=='contains'        : return b in a
>     if condition=='starts with'     : return a.startswith(b)

This isn't _terrible_... But it argues against the original proposal
even further, because it's an example of a one-line if statement that's
explicitly set off in a way that's unmistakable, because the statements
can't be continued.

That being said, I think it would be much better to write:

_ops = {
    'equals': eq,
    'is greater than': gt,
    ...
}

def test(condition, a, b):
    return _ops[condition](a, b)

... Or maybe the equivalent with methods instead of functions from
operator. Or, if some of the conditions don't already have functions
with nice names to map to (e.g. "within 1%" mapping to
.99*b <= a <= 1.01*b), maybe even inline lambdas.
Besides reducing all the boilerplate repetition, having the mapping in
data instead of code gives you the flexibility to do all kinds of
things that would otherwise be impossible--register new conditions
dynamically, inspect the conditions, reuse them in another function
without a parallel chain of if statements, etc.

From jon at jon-foster.co.uk  Mon Sep  2 01:28:05 2013
From: jon at jon-foster.co.uk (Jon Foster)
Date: Mon, 02 Sep 2013 00:28:05 +0100
Subject: [Python-ideas] ipaddress: Interface inheriting from Address
In-Reply-To: <521CA73D.4010703@trueblade.com>
References: <52154558.4080102@jon-foster.co.uk>
	<521CA73D.4010703@trueblade.com>
Message-ID: <5223CD85.7080100@jon-foster.co.uk>

Hi all,

After looking at ipaddress some more, I've got a few patches to
suggest. I've pushed them all to a Mercurial repository, which you can
see here:
https://bitbucket.org/jonfoster/python-ipaddress/commits/all?page=3

The "Interface not inheriting from Address" patch is:
https://bitbucket.org/jonfoster/python-ipaddress/commits/146d1ffa832fd0b72696a57c806995e6c53601a3

It depends on some refactoring, which I did in a separate commit:
https://bitbucket.org/jonfoster/python-ipaddress/commits/af480dbe385f65da3cf6b20d85a31854bf233772

There are a few other ipaddress patches in that repository, too. I'd
appreciate any feedback you have on these.

Kind regards,

Jon

P.S. I have just signed a Contributor Agreement.

On 27/08/2013 14:18, Eric V. Smith wrote:
> On 08/21/2013 06:55 PM, Jon Foster wrote:
>> Hi all,
>>
>> I'd like to propose changing ipaddress.IPv[46]Interface to not inherit
>> from IPv[46]Address.
>
> I agree that it's odd that an [x]Interface would inherit from an
> [x]Address. I think it should be a has-a relationship, as you describe
> with the "ip" property.
>
>> If there is interest in this idea, I'll try to put together a patch next
>> week.
>
> I'd review the patch.
>
> Eric.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From steve at pearwood.info  Mon Sep  2 02:30:25 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 02 Sep 2013 10:30:25 +1000
Subject: [Python-ideas] Another indentation style
In-Reply-To:
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
	<522306FC.3070202@pearwood.info>
Message-ID: <5223DC21.6010206@pearwood.info>

On 01/09/13 21:49, Musical Notation wrote:
> View my original proposal in a fixed-width font and you will
> understand it.

Your assumption that I didn't use a fixed-width font is wrong. I
*always* use fixed-width fonts for email. And don't imagine that the
only reason I could disagree with your proposal is that I don't
understand it. I understand your proposal very well, and still
disagree.

> That indentation style is quite idiomatic in Haskell.

Irrelevant. Python does not have a "let name=value" statement, so the
Haskell "let" idiom does not apply.
There is an enormous difference between the leading fixed-width "let"
and variable-width "if" clauses:

let x = value
    y = name

let extremely_long_name_that_goes_on_and_on = value
    name = value

let foo = value
    bar = value

versus:

if f: do_this()
      do_that()

if long_condition: do_this()
                   do_that()

elif flag: do_this()
           do_that()

-- 
Steven

From steve at pearwood.info  Mon Sep  2 03:28:53 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 02 Sep 2013 11:28:53 +1000
Subject: [Python-ideas] Another indentation style
In-Reply-To: <5223209B.9020806@btinternet.com>
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
	<522306FC.3070202@pearwood.info>
	<5223209B.9020806@btinternet.com>
Message-ID: <5223E9D5.4040502@pearwood.info>

On 01/09/13 21:10, Rob Cliffe wrote:
> In Python, writing multiple code statements on a single line,
> especially with a semi-colon, appears to be a taboo roughly on a par
> with appearing naked in public. But I believe that there are times
> when it is the clearest way of writing code, viz. when it makes
> visually obvious a *pattern* in the code.
> Here is one example, not very different from some "real" code that I
> wrote:
>
> def test(condition, a, b):
>     if condition=='equals'          : return a==b
>     if condition=='is greater than' : return a>b
>     if condition=='contains'        : return b in a
>     if condition=='starts with'     : return a.startswith(b)
>     etc.

I believe that the pattern is *more* readily apparent written the
conventional way:

def test(condition, a, b):
    if condition=='equals':
        return a==b
    if condition=='is greater than':
        return a>b
    if condition=='contains':
        return b in a
    if condition=='starts with':
        return a.startswith(b)

You can run your eye down indent level 2 and see "condition return
condition return condition return", which avoids needing to scan
left-to-right (a significant slowdown when skimming code), and the
distraction of that great big river of whitespace running down the
middle of the block.
Another issue is the time spent deleting and inserting spaces in the
middle of the lines to keep the return statements lined up after edits.
With no clear benefit, that's just unproductive make-work. (But good if
you're paid by the hour and your boss doesn't cotton on to what you are
doing *wink*)

Later in your post, you write:

> And you may have an even better way.
> Again - not really the point.

But that precisely is the point! If the layout of code is obscuring the
ways it can be simplified, generalized or refactored, then the layout
is *actively* harmful. Now, maybe you have a reason for preferring a
long list of if...return statements instead of the more usual idiom of
a dict lookup. I don't understand your code well enough to comment on
that. But it is obvious to me that splitting the if...return over two
lines certainly doesn't hurt the ability to visualise the pattern in
the code, and probably helps make it even more clear.

> And here is some real code that I wrote (not worth explaining in
> detail). I am sorry that it breaks another convention, having lines
> longer than 80 characters - this happens not to be inconvenient for
> me, and was the best authentic *real* example I could find without
> spending a long time searching:
>
> assert PN[0].isalpha() ; FirstPart = PN[0] ; PN = PN[1:].lstrip(Seps) # Must be a letter
> if PN[0].isalpha() : FirstPart += PN[0] ; PN = PN[1:].lstrip(Seps) # May be a second letter

That second line uses a layout that I wish was a SyntaxError, because
it is ambiguous whether

if cond: statementA; statementB

should be grouped as follows (using braces as visual aids):

if cond:
    { statementA; statementB }

or like this:

{ if cond: statementA }
statementB

Such a shame that it is allowed. I can just count myself fortunate that
I've never seen it before in the wild.

-- 
Steven

From stephen at xemacs.org  Mon Sep  2 05:39:33 2013
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 02 Sep 2013 12:39:33 +0900
Subject: [Python-ideas] Another indentation style
In-Reply-To: <5223E9D5.4040502@pearwood.info>
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
	<522306FC.3070202@pearwood.info>
	<5223209B.9020806@btinternet.com>
	<5223E9D5.4040502@pearwood.info>
Message-ID: <8738pndh8q.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> def test(condition, a, b):
>     if condition=='equals':
>         return a==b
>     if condition=='is greater than':
>         return a>b
>     if condition=='contains':
>         return b in a
>     if condition=='starts with':
>         return a.startswith(b)
>
> You can run your eye down indent level 2 and see "condition return
> condition return condition return", which avoids needing to scan
> left-to-right

No, you can't, FVO "you" == "me". In fact, I can only read about 20
characters without moving my eyes, and your format encourages my eyes
to zigzag. I find it *much* easier to scan the OP's format for a
particular condition or a particular return expression.

> Another issue is the time spent deleting and inserting spaces in
> the middle of the lines to keep the return statements lined up
> after edits.

"Obviously" you're not an Emacs user (or other editor with powerful
on-the-fly scripting capability). It takes about 10 minutes to write
enough Lisp to maintain that table with one keystroke, including
detecting the beginning and end of the table, the column widths, and
so on. Not everybody uses their editor that way, but people who make
such suites in tabular form *should* -- time is not an issue here.

*Despite* the above, I don't like the OP's format. With Andrew
Barnert's suggestion of a dictionary, I get all the above benefits,
with *less* detritus (no "if", "condition==", "return"; jus' the facs,
ma'am) in the tabular format, conformity to common practice
("intuitive" is just an alternative spelling of "familiar"), and it's
monkey-patchable if I need to add a new condition at runtime.
> > And here is some real code that I wrote (not worth explaining in
> > detail).
> >
> > assert PN[0].isalpha() ; FirstPart = PN[0] ; PN = PN[1:].lstrip(Seps) # Must be a letter
> > if PN[0].isalpha() : FirstPart += PN[0] ; PN = PN[1:].lstrip(Seps) # May be a second letter
>
> That second line uses a layout that I wish was a SyntaxError,
> because it is ambiguous whether
>
> if cond: statementA; statementB
>
> should be grouped as follows (using braces as visual aids):
>
> if cond:
>     { statementA; statementB }
>
> or like this:
>
> { if cond: statementA }
> statementB

I don't have a problem with it from this point of view, because the
clear intent is that the semicolons separate the statements of the
suite controlled by the if, and that's what they do. I do have a
problem with the fact that 'assert' does not introduce a suite.
"Syntax must not look like grit on Tim's screen!"

Besides, it would screw up my Lisp.

From rob.cliffe at btinternet.com  Tue Sep  3 13:47:49 2013
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Tue, 03 Sep 2013 12:47:49 +0100
Subject: [Python-ideas] Another indentation style
In-Reply-To: <1378159467.10776.YahooMailNeo@web184703.mail.ne1.yahoo.com>
References: <470AF9E3-9750-4B11-9BAF-257535DE245E@gmail.com>
	<522306FC.3070202@pearwood.info>
	<5223209B.9020806@btinternet.com>
	<25F9189B-567A-44E6-91B2-9DD5868FAE96@yahoo.com>
	<522367A1.1010804@btinternet.com>
	<4F21A705-17D0-42C9-81FB-24E362272696@yahoo.com>
	<52247D6B.5030900@btinternet.com>
	<3137AD20-5BA2-4034-91EE-DF687EF5A215@yahoo.com>
	<5224AEE3.9020309@btinternet.com>
	<1378159467.10776.YahooMailNeo@web184703.mail.ne1.yahoo.com>
Message-ID: <5225CC65.5070100@btinternet.com>

On 02/09/2013 23:04, Andrew Barnert wrote:
> From: Rob Cliffe
> Sent: Monday, September 2, 2013 8:29 AM
>> You seem to have replied just to me, not to the list, which is a
>> pity, because it means this reply is going just to you.
> Your last message was just to me, so I didn't think you wanted it
> going to the list.
Sorry, my mistake, I'm attempting to copy this to the list now.
>
>>>> What do you think is easier and quicker to write (or indeed, to
>>>> understand, if you're not a Python expert)? Say at 1am when you're
>>>> trying to meet a deadline and keeping awake with coffee?
>>> I can write either one easily.
>> OK, you can. (How many years have you been programming in Python?)
>> That does not mean that everybody can. (I can too, but maybe not as
>> easily as you. But I might write it in my simple way at first, to
>> get it working, then come back later and smarten it up.)
> Novices looking for help on Stack Overflow can create dictionaries.
> My coworker who just learned Python over the last two months can.
> This is in the tutorial; it's not some deep magic for experts only.
>
>>> The question is which one I can write without making a stupid
>>> mistake that I'll have to debug six weeks later. Less repetition
>>> and less visual noise means fewer places to make a mistake, and
>>> easier to spot it when you do.
>
>> But the very repetition (and vertical alignment) mean that many of
>> the possible mistakes stand out like a sore thumb. The human brain
>> is good at seeing patterns.
> Repetitive code is exactly where people make the most mistakes. And
> if you've ever debugged any serious project, it really can be hard to
> notice that you used ssock instead of csock in one of eight
> near-identical blocks of code. If you just write one block of code
> instead of eight, it's impossible to make that mistake in the first
> place.
But if the occurrences of csock are vertically aligned, the mistake
stands out.
>> My version (boring, repetitive, unimaginative ... but simple and
>> straightforward), or something involving dictionaries, lambda
>> functions, and having to look up the docs (I didn't actually know
>> about __contains__ or operator.contains)?
>>> The whole point of programming is using your imagination to
>>> eliminate boring, repetitive, and simple tasks.
Er, no. No. The point of programming varies.
And "good" and "bad" code do not exist in isolation; context matters
too (commercial environment? academic?).
>
> No, it really doesn't. Except for learning and language-research
> purposes, when you write a program, it's to accomplish something that
> otherwise you or another person would have to do manually, which
> would be tedious, or difficult, or error-prone. You could go through
> a spreadsheet and count up the number of unique users (column 3) in
> each state (column 2), but it would take days, and you'd make dozens
> of mistakes, and you'd be miserable. Or you could write a program in
> a few minutes or hours.
>
>>>> Obviously this is a judgement call, but I know what my view is.
>>>> As a touchstone: my version could be rewritten in just about any
>>>> other language with minimal effort, just because it is so simple
>>>> and uses no "tricks". How many languages could yours be rewritten
>>>> in as quickly (assuming you're not already an expert in the target
>>>> language)?
>>> Putting functions into a dictionary is not some advanced "trick",
>>> it's a basic idiom. It's used in the tutorial, the FAQ, the stdlib,
>>> etc.
>> Sure it is (in Python). But if I wanted to translate this code into
>> another language (especially in a hurry), I would (as I said before)
>> need minimal knowledge of that language to translate my boring
>> version. (LISP?)
True, but it does add one level of abstraction, and one's brain has to
go through that level to understand the code. The more levels of
abstraction, the less intuitive and harder to understand code becomes
(look at Twisted, possibly the ultimate example).
> Why would you need to translate it into another language in a hurry?
Who knows? We live in an unpredictable world; we adapt to it or die.
If I could predict all the things my manager asks me to do ... well, I
guess I wouldn't need a manager.
>
>> We have got a bit bogged down discussing this particular example
>> (partly my fault).
>> I was simply trying to make the point that there may be
>> circumstances when it is OK, even a good thing, to put 2, 3 or
>> (heaven forbid) 4 statements on a single line, and to illustrate the
>> point with a couple of examples. That (you find that) the examples
>> are less than perfect does not, in itself, mean that my point is
>> entirely wrong.
>
> I said right at the beginning that your code isn't terrible; you're
> the one who insisted that other people will say it is.
>
> And there are certainly examples where writing two statements on a
> line makes sense. I have code like this:
>
> if stop: break
>
> x += a; y += b
>
> So I agree with you that sometimes putting multiple statements on a
> line is a good thing.
Good. I also quite often (not always) write 1-line if-suites like that
(as in my first example). I'm glad we agree on something.
> But doing it so you can avoid using one of Python's fundamental
> features because you're afraid you might have to translate the code
> to a language you barely know is not a good reason to do it.
We'll have to disagree on that. I think sometimes it might be. But my
primary motive was to write the code so that it was simple to write
and simple to understand.

Another (invented) example occurred to me (before I saw yours above):

Version 1:

x1 += 1
x2 += 1
x3 += 1
y1 += 1
y2 += 1
y2 += 1
z1 += 1
z2 += 1
z3 += 1

Version 2 (Rob's version):

x1 += 1 ; x2 += 1 ; x3 += 1
y1 += 1 ; y2 += 1 ; y2 += 1
z1 += 1 ; z2 += 1 ; z3 += 1

In which version is it easier to grasp what this code does? In which
version is it easier to spot the deliberate mistake? (And please don't
rubbish the example by saying it should have been written differently
in the first place. Circumstances *do* alter cases; I could be making
a minor alteration to a huge inherited program which it is not
practicable or necessary to rewrite.)

Rob Cliffe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From techtonik at gmail.com  Wed Sep 11 18:05:22 2013
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 11 Sep 2013 19:05:22 +0300
Subject: [Python-ideas] AST Hash
Message-ID:

Hi,

We need a checksum for code pieces. The goal of the checksum is to
reliably detect pieces of code with absolutely identical behaviour.
Borders of such a checksum can be functions, classes, modules, ...
Practical applications for such checksums are:

- detecting usage of recipes and examples across PyPI packages
- detecting usage of standard stdlib calls
- creating execution safe serialization formats for data
- choosing the class to deserialize data fields of an object based on
  its hash
- enabling consistent validation and testing of results across various
  AST tools

There can be two approaches to building such a checksum:
1. Code Section Hash
2. AST Hash

A Code Section Hash is built from a substring of the source code, cut
on function or class boundaries. This hash is flaky - whitespace and
comment differences ruin it, even when behaviour (and bytecode) stays
the same. It is possible to reduce the effect of whitespace and
comment changes by normalizing the substring - dedenting, reindenting
with 4 spaces, stripping empty lines, comments and trailing
whitespace. And it will still be unreliable and affected by whitespace
changes in the middle of the string. Therefore the 2nd way of hashing
is preferable.

An AST Hash is built on the AST. This excludes any comments,
whitespace etc. and makes the hash strict and reliable. This is the
canonical Default AST Hash.

There are cases when the Default AST Hash may not be enough for
comparison. For example, if local variables are renamed, or docstrings
changed, the behaviour of a function may not change, but its AST hash
will. In these cases additional normalization rules apply, such as
changing all local variable names to var1, var2, ... in order of
appearance, stripping docstrings, etc. Every set of such normalization
rules should have a name.
This will also be the name of the resulting custom AST Hash.

Explicit naming of AST Hashes, and hard-linking of names to the rules
that are used to build them, will settle common ground (a base) for
AST tool interoperability and research papers. As such, it will most
likely require a separate PEP.
--
anatoly t.

From techtonik at gmail.com  Wed Sep 11 18:54:00 2013
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 11 Sep 2013 19:54:00 +0300
Subject: [Python-ideas] Python Sound API (Was: Cross Platform Python
	Sound Module/Library)
Message-ID:

On Sat, Apr 27, 2013 at 11:59 PM, M.-A. Lemburg wrote:
> On 27.04.2013 22:19, anatoly techtonik wrote:
>> On Sat, Apr 27, 2013 at 1:51 PM, Antoine Pitrou wrote:
>>
>>> On Fri, 26 Apr 2013 20:39:30 -0700
>>> Andrew Barnert wrote:
>>>> On Apr 26, 2013, at 16:59, Greg Ewing wrote:
>>>>
>>>>> Oleg Broytman wrote:
>>>>>> Are there cross-platform audio libraries that Python could wrap?
>>>>>
>>>>> There's OpenAL:
>>>>>
>>>>> http://connect.creativelabs.com/openal/default.aspx
>>>>>
>>>> There's actually a bunch of options.
>>>>
>>>> The hard question is picking one and endorsing it as "right", or
>>>> at least "good enough to enshrine in stdlib ala tkinter".
>>>
>>> When you notice how "good enough" tkinter is (and has been for 10
>>> years at least), you realize the trap hidden in this question.
>>>
>>> Really, see my message earlier in this thread. This is better left
>>> to third-party libraries (which already exist, please do some
>>> research).
>>>
>> From the other side, if 80% of cases can be covered without Python
>> packaging problems - that's already an advantage. For example, most
>> people find the date/time functionality in Python enough to avoid
>> using mxDateTime as a dependency. As for audio, most people find it
>> insufficient.
>
> I'm not sure whether 3D audio support is really needed as a core
> feature in a general purpose programming language ;-)
>
> I'd suggest to have a look at http://www.libsdl.org/, which can
> be used from Python via http://pygame.org/

3D audio support is not the basic common cross-platform base layer.
Many devices that run Python are mono-only. What the stdlib should
concentrate on are the basic audio operations needed by people, with
an accent on a pure Python implementation of everything that does not
represent the system layer: safe sound synthesis and sound output in a
canonical format. If people need advanced algorithms and operations,
they are free to use SDL2, OpenAL, FFmpeg and other libs that are
inherently insecure due to the amount of low level C code. Audio even
on Android devices doesn't require any advanced privileges. It is a
basic need for many programs, and an attraction for many creative
people who may use Python as an auxiliary language in their works.

From the stdlib I would also expect abstract scheduling and buffering
algorithms, with explanations and documentation with pictures. I
expect a super simple API for all basic cases, and the ability to use
this API on any platform. I'd expect the Audio API to be multi-level:

Level 1: Beep - make an audio signal to attract attention, at the most
         basic level the OS provides
Level 2: Customize the audio signal used to attract attention (query
         signals, choose, beep)
Level 3: Play a pre-rendered waveform (such as a WAV file) at the most
         basic OS level (default format)
Level 4: Play a continuous pre-rendered stream
Level 5: Mix pre-rendered streams and waveforms
Level 6: Synthesize sound in pure Python
Level 7: Synthesize sound indirectly (using GPU, MIDI interfaces,
         external libs, ...)
Level 8: Audio device control - formats, channels, volumes -
         everything hardware specific
--
anatoly t.
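For what it's worth, Levels 3 and 6 are already within reach of
today's stdlib; a minimal sketch that synthesizes a sine-wave beep in
pure Python and writes it as a canonical WAV file (the filename, tone
frequency and duration are arbitrary choices for illustration):

```python
import math
import struct
import wave

RATE = 44100        # samples per second (CD quality)
FREQ = 440.0        # beep frequency in Hz (A4), an arbitrary choice
SECONDS = 0.5
AMPLITUDE = 0.5     # fraction of full scale, kept below 1 to avoid clipping

# Render one channel of 16-bit signed samples and write them as a WAV
# file, the "canonical format" the proposal mentions.
with wave.open('beep.wav', 'wb') as w:
    w.setnchannels(1)      # mono: the lowest common denominator
    w.setsampwidth(2)      # 2 bytes per sample = 16-bit audio
    w.setframerate(RATE)
    frames = bytearray()
    for i in range(int(RATE * SECONDS)):
        sample = AMPLITUDE * math.sin(2 * math.pi * FREQ * i / RATE)
        frames += struct.pack('<h', int(sample * 32767))
    w.writeframes(bytes(frames))
```

Playing the resulting file back is exactly the part that still needs a
platform-specific layer, which is the gap the proposed API would fill.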
From amauryfa at gmail.com  Wed Sep 11 19:05:52 2013
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 11 Sep 2013 19:05:52 +0200
Subject: [Python-ideas] AST Hash
In-Reply-To:
References:
Message-ID:

2013/9/11 anatoly techtonik

> Hi,
>
> We need a checksum for code pieces. The goal of the checksum is to
> reliably detect pieces of code with absolutely identical behaviour.
> Borders of such checksum can be functions, classes, modules, ...

This looks like a nice project; I think this should first take the
form of an external package. I'm sure there are many details to iron
out before this kind of technique can be widely adopted. For example:

- Is there only one kind of hash? You suggested erasing the
  differences in variable names; are there other possible
  customizations?
- To detect common patterns, is it interesting to hash and index all
  the nodes of an AST tree?
- Is there a central repository to store hashes of recipes? Is Google
  Search enough?

I don't need answers, only a reference implementation that people can
discuss!

Good luck,

> Practical application for such checksums are:
>
> - detecting usage of recipes and examples across PyPI packages
> - detecting usage of standard stdlib calls
> - creating execution safe serialization formats for data
> - choosing class to deserialize data fields of the object based on
>   its hash
> - enable consistent validation and testing of results across various
>   AST tools
>
> There can be two approaches to build such checksum:
> 1. Code Section Hash
> 2. AST Hash
>
> Code Section Hash is built from a substring of a source code, cut on
> function or class boundaries. This hash is flaky - whitespace and
> comment differences ruin it, even when behaviour (and bytecode) stays
> the same. It is possible to reduce the effect of whitespace and
> comment changes by normalizing the substring - dedenting, reindenting
> with 4 spaces, stripping empty lines, comments and trailing
> whitespace.
> And it still will be unreliable and affected by whitespace
> changes in the middle of the string. Therefore a 2nd way of hashing
> is more preferable.
>
> AST Hash is built on the AST. This excludes any comments, whitespace
> etc. and makes the hash strict and reliable. This is a canonical
> Default AST Hash.
>
> There are cases when Default AST Hash may not be enough for
> comparison. For example, if local variables are renamed, or
> docstrings changed, the behaviour of a function may not change, but
> its AST hash will. In these cases additional normalization rules
> apply. Such as changing all local variable names to var1, var2, ...
> in order of appearance, stripping docstrings etc. Every set of such
> normalization rules should have a name. This will also be the name of
> the resulting custom AST Hash.
>
> Explicit naming of AST Hashes and hardlinking of names to rules that
> are used to build them will settle common ground (base) for AST tools
> interoperability and research papers. As such, it most likely
> requires a separate PEP.
> --
> anatoly t.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
>

-- 
Amaury Forgeot d'Arc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kn0m0n3 at gmail.com  Wed Sep 11 20:49:48 2013
From: kn0m0n3 at gmail.com (Jason Bursey)
Date: Wed, 11 Sep 2013 13:49:48 -0500
Subject: [Python-ideas] AST Hash
In-Reply-To:
References:
Message-ID:

On Wed, Sep 11, 2013 at 11:05 AM, anatoly techtonik wrote:

> Hi,
>
> We need a checksum for code pieces. The goal of the checksum is to
> reliably detect pieces of code with absolutely identical behaviour.
> Borders of such checksum can be functions, classes, modules, ...
> Practical application for such checksums are: > > - detecting usage of recipes and examples across PyPI packages > - detecting usage of standard stdlib calls > - creating execution safe serialization formats for data > - choosing class to deserialize data fields of the object based on its > hash > - enable consistent validation and testing of results across various AST > tools > > There can be two approaches to build such checksum: > 1. Code Section Hash > 2. AST Hash > > Code Section Hash is built from a substring of a source code, cut on > function or class boundaries. This hash is flaky - whitespace and > comment differences ruin it, even when behaviour (and bytecode) stays > the same. It is possible to reduce the effect of whitespace and > comment changes by normalizing the substring - dedenting, reindenting > with 4 spaces, stripping empty lines, comments and trailing > whitespace. And it still will be unreliable and affected by whitespace > changes in the middle of the string. Therefore a 2nd way of hashing is > more preferable. > > AST Hash is build on AST. This excludes any comments, whitespace etc. > and makes the hash strict and reliable. This is a canonical Default > AST Hash. > > There are cases when Default AST Hash may not be enough for > comparison. For example, if local variables are renamed, or docstrings > changed, the behaviour of a function may not change, but its AST hash > will. In these cases additional normalization rules apply. Such as > changing all local variable names to var1, var2, ... in order of > appearance, stripping docstrings etc. Every set of such normalization > rules should have a name. This will also be the name of resulting > custom AST Hash. > > Explicit naming of AST Hashes and hardlinking of names to rules that > are used to build them will settle common ground (base) for AST tools > interoperability and research papers. As such, it most likely require > a separate PEP. > -- > anatoly t. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Wed Sep 11 22:53:39 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 11 Sep 2013 22:53:39 +0200 Subject: [Python-ideas] AST Hash In-Reply-To: References: Message-ID: <5230D853.6090108@egenix.com> On 11.09.2013 18:05, anatoly techtonik wrote: > Hi, > > We need a checksum for code pieces. The goal of the checksum is to > reliably detect pieces of code with absolutely identical behaviour. > Borders of such checksum can be functions, classes, modules,. > Practical application for such checksums are: > > - detecting usage of recipes and examples across PyPI packages > - detecting usage of standard stdlib calls > - creating execution safe serialization formats for data > - choosing class to deserialize data fields of the object based on its hash > - enable consistent validation and testing of results across various AST tools > > There can be two approaches to build such checksum: > 1. Code Section Hash > 2. AST Hash > > Code Section Hash is built from a substring of a source code, cut on > function or class boundaries. This hash is flaky - whitespace and > comment differences ruin it, even when behaviour (and bytecode) stays > the same. It is possible to reduce the effect of whitespace and > comment changes by normalizing the substring - dedenting, reindenting > with 4 spaces, stripping empty lines, comments and trailing > whitespace. And it still will be unreliable and affected by whitespace > changes in the middle of the string. Therefore a 2nd way of hashing is > more preferable. > > AST Hash is build on AST. This excludes any comments, whitespace etc. > and makes the hash strict and reliable. This is a canonical Default > AST Hash. 
> > There are cases when Default AST Hash may not be enough for > comparison. For example, if local variables are renamed, or docstrings > changed, the behaviour of a function may not change, but its AST hash > will. In these cases additional normalization rules apply. Such as > changing all local variable names to var1, var2, ... in order of > appearance, stripping docstrings etc. Every set of such normalization > rules should have a name. This will also be the name of resulting > custom AST Hash. > > Explicit naming of AST Hashes and hardlinking of names to rules that > are used to build them will settle common ground (base) for AST tools > interoperability and research papers. As such, it most likely require > a separate PEP. You might want to have a look at this paper which discussed AST compression (for Java, but the ideas apply to Python just as well): http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.135.5917&rep=rep1&type=pdf If you compress the AST into a string and take its hash, you should pretty much have what you want. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 11 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-04: Released eGenix pyOpenSSL 0.13.2 ... http://egenix.com/go48 2013-09-20: PyCon UK 2013, Coventry, UK ... 9 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tjreedy at udel.edu Thu Sep 12 00:07:10 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Sep 2013 18:07:10 -0400 Subject: [Python-ideas] AST Hash In-Reply-To: <5230D853.6090108@egenix.com> References: <5230D853.6090108@egenix.com> Message-ID: On 9/11/2013 4:53 PM, M.-A. Lemburg wrote: > You might want to have a look at this paper which discussed > AST compression (for Java, but the ideas apply to Python just > as well): > > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.135.5917&rep=rep1&type=pdf The prototype implementation is written in Python! (p.3, right). -- Terry Jan Reedy From mistersheik at gmail.com Thu Sep 12 00:18:01 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 11 Sep 2013 15:18:01 -0700 (PDT) Subject: [Python-ideas] Replace option set/get methods through the standard library with a ChainMap; add a context manager to ChainMap Message-ID: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com> With numpy print options, for example, the usual pattern is to save some of the print options, set some of them, and then restore the old options. Why not expose the options as a ChainMap called numpy.printoptions? ChainMap could then expose a context manager that pushes a new dictionary on entry and pops it on exit via, say, child_context that accepts a dictionary. Now, instead of:

    saved_precision = np.get_printoptions()['precision']
    np.set_printoptions(precision=23)
    do_something()
    np.set_printoptions(precision=saved_precision)

You can do the same with a context manager, which I think is stylistically better (as it's impossible to forget to reset the option, and no explicit temporary invades the local variables):

    with np.printoptions.child_context({'precision': 23}):
        do_something()

Best, Neil -------------- next part -------------- An HTML attachment was scrubbed...
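Neil's `child_context` does not exist on `collections.ChainMap` today, and numpy does not expose its print options as a mapping; the sketch below only illustrates the proposed behaviour, with the `OptionMap` subclass name and the options dict invented for the example:

```python
from collections import ChainMap
from contextlib import contextmanager

class OptionMap(ChainMap):
    """ChainMap with the proposed push/pop context manager."""

    @contextmanager
    def child_context(self, overrides):
        # Push the overrides as a new front map on entry...
        self.maps.insert(0, dict(overrides))
        try:
            yield self
        finally:
            # ...and pop it on exit, restoring the previous options.
            del self.maps[0]

printoptions = OptionMap({'precision': 8, 'suppress': False})

with printoptions.child_context({'precision': 23}):
    print(printoptions['precision'])  # overridden inside the block

print(printoptions['precision'])      # back to the original value
```

Lookups fall through to the earlier maps for any key not overridden, which is exactly the save/restore behaviour the get/set pattern emulates by hand.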
URL: From mistersheik at gmail.com Wed Sep 11 23:44:11 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 11 Sep 2013 14:44:11 -0700 (PDT) Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: <34b72f2c-94bf-4452-9ec3-54336e6ed562@googlegroups.com> You can just use a memoize decorator to automatically convert your recursive solution to a fast linear time one. Search for "python memoization decorator". Best, Neil On Sunday, May 26, 2013 8:00:13 AM UTC-4, Ram Rachum wrote: > > Hi everybody, > > Here's an idea I had a while ago. Now, I'm an ignoramus when it comes to > how programming languages are implemented, so this idea will most likely be > either (a) completely impossible or (b) trivial knowledge. > > I was thinking about the implementation of the factorial in Python. I was > comparing in my mind 2 different solutions: The recursive one, and the one > that uses a loop. Here are example implementations for them: > > def factorial_recursive(n): > if n == 1: > return 1 > return n * factorial_recursive(n - 1) > > def factorial_loop(n): > result = 1 > for i in range(1, n + 1): > result *= i > return result > > > I know that the recursive one is problematic, because it's putting a lot > of items on the stack. In fact it's using the stack as if it was a loop > variable. The stack wasn't meant to be used like that. > > Then the question came to me, why? Maybe the stack could be built to > handle this kind of (ab)use? > > I read about tail-call optimization on Wikipedia. If I understand > correctly, the gist of it is that the interpreter tries to recognize, on a > frame-by-frame basis, which frames could be completely eliminated, and then > it eliminates those. Then I read Guido's blog post explaining why he > doesn't want it in Python. 
In that post he outlined 4 different reasons why > TCO shouldn't be implemented in Python. > > But then I thought, maybe you could do something smarter than eliminating > individual stack frames. Maybe we could create something that is to the > current implementation of the stack what `xrange` is to the old-style > `range`. A smart object that allows access to any of a long list of items > in it, without actually having to store those items. This would solve the > first argument that Guido raises in his post, which I found to be the most > substantial one. > > What I'm saying is: Imagine the stack of the interpreter when it runs the > factorial example above for n=1000. It has around 1000 items in it and it's > just about to explode. But then, if you'd look at the contents of that > stack, you'd see it's embarrassingly regular, a compression algorithm's > wet dream. It's just the same code location over and over again, with a > different value for `n`. > > So what I'm suggesting is an algorithm to compress that stack on the fly. > An algorithm that would detect regularities in the stack and instead of > saving each individual frame, save just the pattern. Then, there wouldn't > be any problem with showing informative stack trace: Despite not storing > every individual frame, each individual frame could still be *accessed*, > similarly to how `xrange` allow access to each individual member without > having to store each of them. > > Then, the stack could store a lot more items, and tasks that currently > require recursion (like pickling using the standard library) will be able > to handle much deeper recursions. > > What do you think? > > > Ram. > -------------- next part -------------- An HTML attachment was scrubbed... 
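The memoize decorator Neil mentions is available in the stdlib as `functools.lru_cache`; a minimal sketch on the classic Fibonacci example, where caching turns the exponential call tree into a linear one (note that it does not reduce the recursion *depth*, only the number of distinct calls):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Without the cache this is exponential; with it, each n is
    # computed once and later calls are dictionary lookups.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040
```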
URL: From mistersheik at gmail.com Thu Sep 12 00:07:45 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 11 Sep 2013 15:07:45 -0700 (PDT) Subject: [Python-ideas] Replace numpy get_printoptions/set_printoptions, and similar patterns with a ChainMap; add a context manager to ChainMap In-Reply-To: <00094cfb-73b9-422c-b9d0-50e290813819@googlegroups.com> References: <00094cfb-73b9-422c-b9d0-50e290813819@googlegroups.com> Message-ID: Just to be clear, my proposal is to replace all such get/set options patterns throughout Python's standard library. On Wednesday, September 11, 2013 5:13:55 PM UTC-4, Neil Girdhar wrote: > > With numpy, the usual pattern is to get_printoptions, set some of them, > and then restore the old options. Why not expose the options in a ChainMap > as numpy.printoptions? ChainMap could then expose a context manager that > pushes a new dictionary on entry and pops it on exit via, say, > child_context that accepts a dictionary. Now, instead of: > > saved_precision = np.get_printoptions()['precision'] > np.set_printoptions(precision=23) > do_something() > np.set_printoptions(precision=saved_precision) > > You can do the same with a context manager, which I think is stylistically > better (as it's impossible to forget to reset the option, and no explicit > temporary invades the local variables): > > with np.printoptions.child_context({'precision', 23}): > do_something() > > Best, > > Neil > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 12 00:15:49 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 11 Sep 2013 15:15:49 -0700 (PDT) Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: You can just use a memoize decorator to automatically convert your recursive solution to a fast linear time one. 
Search for "python memoization decorator". This would make a much broader range of recursive solutions run in linear time. Best, Neil On Sunday, May 26, 2013 8:00:13 AM UTC-4, Ram Rachum wrote: > > Hi everybody, > > Here's an idea I had a while ago. Now, I'm an ignoramus when it comes to > how programming languages are implemented, so this idea will most likely be > either (a) completely impossible or (b) trivial knowledge. > > I was thinking about the implementation of the factorial in Python. I was > comparing in my mind 2 different solutions: The recursive one, and the one > that uses a loop. Here are example implementations for them: > > def factorial_recursive(n): > if n == 1: > return 1 > return n * factorial_recursive(n - 1) > > def factorial_loop(n): > result = 1 > for i in range(1, n + 1): > result *= i > return result > > > I know that the recursive one is problematic, because it's putting a lot > of items on the stack. In fact it's using the stack as if it was a loop > variable. The stack wasn't meant to be used like that. > > Then the question came to me, why? Maybe the stack could be built to > handle this kind of (ab)use? > > I read about tail-call optimization on Wikipedia. If I understand > correctly, the gist of it is that the interpreter tries to recognize, on a > frame-by-frame basis, which frames could be completely eliminated, and then > it eliminates those. Then I read Guido's blog post explaining why he > doesn't want it in Python. In that post he outlined 4 different reasons why > TCO shouldn't be implemented in Python. > > But then I thought, maybe you could do something smarter than eliminating > individual stack frames. Maybe we could create something that is to the > current implementation of the stack what `xrange` is to the old-style > `range`. A smart object that allows access to any of a long list of items > in it, without actually having to store those items.
This would solve the > first argument that Guido raises in his post, which I found to be the most > substantial one. > > What I'm saying is: Imagine the stack of the interpreter when it runs the > factorial example above for n=1000. It has around 1000 items in it and it's > just about to explode. But then, if you'd look at the contents of that > stack, you'd see it's embarrassingly regular, a compression algorithm's > wet dream. It's just the same code location over and over again, with a > different value for `n`. > > So what I'm suggesting is an algorithm to compress that stack on the fly. > An algorithm that would detect regularities in the stack and instead of > saving each individual frame, save just the pattern. Then, there wouldn't > be any problem with showing informative stack trace: Despite not storing > every individual frame, each individual frame could still be *accessed*, > similarly to how `xrange` allow access to each individual member without > having to store each of them. > > Then, the stack could store a lot more items, and tasks that currently > require recursion (like pickling using the standard library) will be able > to handle much deeper recursions. > > What do you think? > > > Ram. > -------------- next part -------------- An HTML attachment was scrubbed... 
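Ram's two factorial variants make the stack pressure he describes easy to see on a stock CPython 3 (default recursion limit around 1000; in Python 2 the same overflow surfaced as a RuntimeError):

```python
import sys

def factorial_recursive(n):
    if n == 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_loop(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

# The loop version is limited only by memory for the huge integer...
assert factorial_loop(5000) > 0

# ...while the recursive version exhausts the frame stack long before that.
try:
    factorial_recursive(5000)
except RecursionError:
    print("recursion limit hit; sys.getrecursionlimit() =", sys.getrecursionlimit())
```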
URL: From anikom15 at gmail.com Thu Sep 12 06:12:43 2013 From: anikom15 at gmail.com (=?UTF-8?Q?Westley_Mart=C3=ADnez?=) Date: Wed, 11 Sep 2013 21:12:43 -0700 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: <002a01ceaf6e$55629330$0027b990$@gmail.com> -----Original Message----- From: Westley Martínez [mailto:anikom15 at gmail.com] Sent: Wednesday, September 11, 2013 9:03 PM To: 'Ram Rachum'; 'python-ideas at googlegroups.com' Cc: 'Ram Rachum' Subject: RE: [Python-ideas] Idea: Compressing the stack on the fly > -----Original Message----- > From: Python-ideas [mailto:python-ideas- > bounces+anikom15=gmail.com at python.org] On Behalf Of Ram Rachum > Sent: Sunday, May 26, 2013 5:00 AM > To: python-ideas at googlegroups.com > Cc: Ram Rachum > Subject: [Python-ideas] Idea: Compressing the stack on the fly > > So what I'm suggesting is an algorithm to compress that stack on the > fly. An algorithm that would detect regularities in the stack and > instead of saving each individual frame, save just the pattern. Then, > there wouldn't be any problem with showing informative stack trace: > Despite not storing every individual frame, each individual frame > could still be accessed, similarly to how `xrange` allow access to > each individual member without having to store each of them. > > > Then, the stack could store a lot more items, and tasks that currently > require recursion (like pickling using the standard library) will be > able to handle much deeper recursions. > > > What do you think? I think this is an interesting idea. It sounds possible, but the question is whether or not it can be efficiently done with Python. I'd heed Guido's advice in first implementing this. It could probably be done effectively with a compiled language like C, but I'd imagine it'd be too difficult for Python. The other question is usability. What would this actually be used for?
I'm not a fan of recursion. I think anything that uses recursion could be restructured into something simpler. A lot of people find recursion to be elegant. For me it just hurts my brain. From joshua at landau.ws Thu Sep 12 06:29:12 2013 From: joshua at landau.ws (Joshua Landau) Date: Thu, 12 Sep 2013 05:29:12 +0100 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: <002a01ceaf6e$55629330$0027b990$@gmail.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> Message-ID: Does anyone actually write recursive Python code where the recursion is a significant bottleneck? The only such code I can think of is either for a tree, in which case stack depth is irrelevant, or bad code. Why would anyone care, basically? From clay.sweetser at gmail.com Thu Sep 12 06:46:42 2013 From: clay.sweetser at gmail.com (Clay Sweetser) Date: Thu, 12 Sep 2013 00:46:42 -0400 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> Message-ID: This sounds like something that the PyPy team might be interested in. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Sep 12 07:12:23 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 11 Sep 2013 22:12:23 -0700 (PDT) Subject: [Python-ideas] AST Hash In-Reply-To: References: Message-ID: <1378962743.33918.YahooMailNeo@web184705.mail.ne1.yahoo.com> From: anatoly techtonik Sent: Wednesday, September 11, 2013 9:05 AM > We need a checksum for code pieces. The goal of the checksum is to > reliably detect pieces of code with absolutely identical behaviour. > Borders of such checksum can be functions, classes, modules,. Cool idea. But why not also fragments?
A single expression, statement, or suite could be useful in many contexts without having to artificially wrap it in a function, couldn't it? > Practical application for such checksums are: > > - detecting usage of recipes and examples across PyPI packages > - detecting usage of standard stdlib calls > - creating execution safe serialization formats for data > - choosing class to deserialize data fields of the object based on its hash > - enable consistent validation and testing of results across various AST tools > > There can be two approaches to build such checksum: > 1. Code Section Hash > 2. AST Hash I'm not sure either of these is right. You want to treat two functions as equal if they have renamed locals:

    def f1():
        i = 0
        return i

    def f2():
        j = 0
        return j

But I'm guessing you _don't_ want to treat them as equal if they reference different globals:

    i, j = 0, 1

    def f1():
        return i

    def f2():
        return j

If it's not obvious why you want those two to be different, consider this identical case:

    def f1():
        return os.open('foo')

    def f2():
        return zipfile.open('foo')

But the difference in the ASTs in this case looks identical to the difference in the local-renaming case (every Name node with id i/os has it changed to j/zipfile). Unless you implement the exact same logic as the compiler to distinguish local and global names, there's no way to allow local renaming but not global renaming. And it gets even worse if you consider closures; two functions could have identical ASTs but different meanings, and even applying the compiler's name-distinguishing logic doesn't help, unless you also look at the context around the definition. But there's an obvious answer here that takes care of all of this: Just hash the compiled code objects.
I'm not sure _exactly_ which attributes you want, but something like co_code, co_flags, co_consts, co_names, co_freevars, and maybe the ones related to the parameters. I don't think you need anything from the function, class, or module that owns the code. (Yes, two functions with identical __code__ (including co_names and co_freevars) but different __globals__ or __closure__ will act differently, but I don't think there's any reasonable rule you could apply except either (a) ignore them, or (b) raise an exception if f.__globals__ != globals() or f.__closure__. Besides, the functions will also act differently if you just change the values of global variables. So I think just ignore them.) Anyway, all of these attributes are easy to hash: one's a bytes, one's a fixed-size int, and the rest are tuples of strings. There are some obvious downsides to this, but I don't think any of them are too serious. For example, you can't hash anything that doesn't both parse and compile, while you can build an AST from code that just parses. But practically, there aren't too many good examples of things that parse but won't compile; if you really want to be able to hash invalid code, you really need to stick with source. But I'm sure someone will come up with something big and obvious that I'm missing. From abarnert at yahoo.com Thu Sep 12 07:15:15 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 11 Sep 2013 22:15:15 -0700 (PDT) Subject: [Python-ideas] AST Hash In-Reply-To: <1378962743.33918.YahooMailNeo@web184705.mail.ne1.yahoo.com> References: <1378962743.33918.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <1378962915.56806.YahooMailNeo@web184706.mail.ne1.yahoo.com> From: Andrew Barnert Sent: Wednesday, September 11, 2013 10:12 PM > But there's an obvious answer here that takes care of all of this: Just has > the compiled code objects. ... > There are some obvious downsides to this, but I don't think any of them are > too serious. ...
> But I'm sure someone will come up with something big and obvious that > I'm missing. Actually, I just thought of one. Let's say mod.py looks like this:

    def f():
        def g():
            pass

How do we get the code for the g function? With an AST, it's obvious: you may not get sufficient/correct context, but at least you can get to it. With a compiled module, whether we explicitly compile mod.py or import it or whatever... well, there's a code object for g compiled in there, and it even knows that it's part of module mod, and from line 1 of mod.py, and so on, but you can't access it as mod.g, or anything else obvious, without picking through the compiled code format.
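Andrew's two points can be sketched together: hash a code object's behaviour-relevant attributes (the attribute selection and the `code_hash` helper below are illustrative choices, not a settled design), recursing into `co_consts` both to avoid the address-bearing `repr` of nested code objects and to reach a nested function like `g`:

```python
import hashlib

def code_hash(code):
    """Hash the behaviour-relevant parts of a code object (illustrative set)."""
    h = hashlib.sha256()
    h.update(code.co_code)
    h.update(repr(code.co_names).encode())     # globals/attributes referenced
    h.update(repr(code.co_freevars).encode())  # closure variables
    for const in code.co_consts:
        if hasattr(const, "co_code"):          # nested function/class body
            h.update(code_hash(const).encode())
        else:
            h.update(repr(const).encode())
    return h.hexdigest()

def f1():
    i = 0
    return i

def f2():
    j = 0
    return j

def g1():
    return i   # global reference: recorded in co_names

def g2():
    return j

# Renamed locals leave co_code and co_names untouched; renamed globals don't.
assert code_hash(f1.__code__) == code_hash(f2.__code__)
assert code_hash(g1.__code__) != code_hash(g2.__code__)

# Nested code objects are reachable by walking co_consts of the module code.
module_code = compile("def f():\n    def g():\n        pass\n", "mod.py", "exec")
f_code = next(c for c in module_code.co_consts if hasattr(c, "co_code"))
g_code = next(c for c in f_code.co_consts if hasattr(c, "co_code"))
print(g_code.co_name)  # 'g'
```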
----- Original Message ----- > From: Westley Martínez > To: python-ideas at python.org > Cc: > Sent: Wednesday, September 11, 2013 9:12 PM > Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly > > > > -----Original Message----- > From: Westley Martínez [mailto:anikom15 at gmail.com] > Sent: Wednesday, September 11, 2013 9:03 PM > To: 'Ram Rachum'; 'python-ideas at googlegroups.com' > Cc: 'Ram Rachum' > Subject: RE: [Python-ideas] Idea: Compressing the stack on the fly > >> -----Original Message----- >> From: Python-ideas [mailto:python-ideas- >> bounces+anikom15=gmail.com at python.org] On Behalf Of Ram Rachum >> Sent: Sunday, May 26, 2013 5:00 AM >> To: python-ideas at googlegroups.com >> Cc: Ram Rachum >> Subject: [Python-ideas] Idea: Compressing the stack on the fly >> >> So what I'm suggesting is an algorithm to compress that stack on the >> fly. An algorithm that would detect regularities in the stack and >> instead of saving each individual frame, save just the pattern. Then, >> there wouldn't be any problem with showing informative stack trace: >> Despite not storing every individual frame, each individual frame >> could still be accessed, similarly to how `xrange` allow access to >> each individual member without having to store each of them. >> >> >> Then, the stack could store a lot more items, and tasks that currently >> require recursion (like pickling using the standard library) will be >> able to handle much deeper recursions. >> >> >> What do you think? > > I think this is an interesting idea. It sounds possible, but the > question is whether or not it can be efficiently done with Python. > > I'd heed Guido's advice in first implementing this. It could probably > be done effectively with a compiled language like C, but I'd imagine > it'd be too difficult for Python. > > The other question is usability. What would this actually be used for? > I'm not a fan of recursion.
I think anything that uses recursion could > be restructured into something simpler.? A lot of people find recursion > to be elegant.? For me it just hurts my brain. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > From mal at egenix.com Thu Sep 12 08:59:40 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 12 Sep 2013 08:59:40 +0200 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> Message-ID: <5231665C.207@egenix.com> On 12.09.2013 06:29, Joshua Landau wrote: > Does anyone actually write recursive Python code where the recursion > in a significant bottleneck? The only such code I can think of is > either for a tree, in which case stack depth is irrelevant, or bad > code. Any kind of backtracking algorithm will need recursion or a separate stack data structure to keep track of the various decisions made up to a certain point on the path. The C stack is rather limited in size, so a recursive parser can easily blow up if it uses the C stack alone for managing backtracking. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 12 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-04: Released eGenix pyOpenSSL 0.13.2 ... http://egenix.com/go48 2013-09-20: PyCon UK 2013, Coventry, UK ... 8 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From joshua at landau.ws Thu Sep 12 09:03:08 2013 From: joshua at landau.ws (Joshua Landau) Date: Thu, 12 Sep 2013 08:03:08 +0100 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: <5231665C.207@egenix.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> <5231665C.207@egenix.com> Message-ID: On 12 September 2013 07:59, M.-A. Lemburg wrote: > On 12.09.2013 06:29, Joshua Landau wrote: >> Does anyone actually write recursive Python code where the recursion >> in a significant bottleneck? The only such code I can think of is >> either for a tree, in which case stack depth is irrelevant, or bad >> code. > > Any kind of backtracking algorithm will need recursion or a separate > stack data structure to keep track of the various decisions made > up to a certain point on the path. > > The C stack is rather limited in size, so a recursive parser can > easily blow up if it uses the C stack alone for managing > backtracking. What sort of algorithm would backtrack that many times? I doubt a parser would and I can't think of anything worse ATM. From rosuav at gmail.com Thu Sep 12 10:21:05 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Sep 2013 18:21:05 +1000 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> <5231665C.207@egenix.com> Message-ID: On Thu, Sep 12, 2013 at 5:03 PM, Joshua Landau wrote: > On 12 September 2013 07:59, M.-A. Lemburg wrote: >> On 12.09.2013 06:29, Joshua Landau wrote: >>> Does anyone actually write recursive Python code where the recursion >>> in a significant bottleneck? The only such code I can think of is >>> either for a tree, in which case stack depth is irrelevant, or bad >>> code. 
>> >> Any kind of backtracking algorithm will need recursion or a separate >> stack data structure to keep track of the various decisions made >> up to a certain point on the path. >> >> The C stack is rather limited in size, so a recursive parser can >> easily blow up if it uses the C stack alone for managing >> backtracking. > > What sort of algorithm would backtrack that many times? I doubt a > parser would and I can't think of anything worse ATM. Solve chess. ChrisA From mal at egenix.com Thu Sep 12 10:34:42 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 12 Sep 2013 10:34:42 +0200 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> <5231665C.207@egenix.com> Message-ID: <52317CA2.50206@egenix.com> On 12.09.2013 09:03, Joshua Landau wrote: > On 12 September 2013 07:59, M.-A. Lemburg wrote: >> On 12.09.2013 06:29, Joshua Landau wrote: >>> Does anyone actually write recursive Python code where the recursion >>> in a significant bottleneck? The only such code I can think of is >>> either for a tree, in which case stack depth is irrelevant, or bad >>> code. >> >> Any kind of backtracking algorithm will need recursion or a separate >> stack data structure to keep track of the various decisions made >> up to a certain point on the path. >> >> The C stack is rather limited in size, so a recursive parser can >> easily blow up if it uses the C stack alone for managing >> backtracking. > > What sort of algorithm would backtrack that many times? I doubt a > parser would and I can't think of anything worse ATM. Oh, that's easy. It just depends on the given data set that you're working on and how often you have to branch when working on it. http://en.wikipedia.org/wiki/Backtracking lists a few problems. 
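A small sketch of the "separate stack data structure" alternative: a backtracking subset-sum search (the problem is chosen only for illustration) that keeps its own list of pending branches, so it can backtrack arbitrarily often without deepening the C or Python call stack:

```python
def subset_sum(numbers, target):
    """Find a subset of non-negative numbers summing to target, iteratively."""
    # Each stack entry is (next index to consider, choices made so far).
    stack = [(0, [])]
    while stack:
        i, chosen = stack.pop()
        total = sum(chosen)
        if total == target:
            return chosen
        if i == len(numbers) or total > target:
            continue  # dead end: backtracking is just popping the next entry
        stack.append((i + 1, chosen))                 # branch: skip numbers[i]
        stack.append((i + 1, chosen + [numbers[i]]))  # branch: take numbers[i]
    return None

print(subset_sum([3, 34, 4, 12, 5, 2], 9))  # prints a subset summing to 9
```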
Here's a regular expression example that would blow the stack, if the re module were still using it (it was fixed in 2003 so that it no longer does):

re.match('(.*a|.*b|x)+', 'x' * 100000)

The expression still uses exponential time, though.

With Python 2.3, you see the stack limit error:

Python 2.3.5 (#1, Aug 24 2011, 15:52:42)
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match('(.*a|.*b|x)+', 'x' * 100000)
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/python-2.3-ucs2/lib/python2.3/sre.py", line 132, in match
    return _compile(pattern, flags).match(string)
RuntimeError: maximum recursion limit exceeded

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 12 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2013-09-11: Released eGenix PyRun 1.3.0 ...       http://egenix.com/go49
2013-09-04: Released eGenix pyOpenSSL 0.13.2 ...  http://egenix.com/go48
2013-09-20: PyCon UK 2013, Coventry, UK ...                8 days to go

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From joshua at landau.ws Thu Sep 12 11:11:00 2013 From: joshua at landau.ws (Joshua Landau) Date: Thu, 12 Sep 2013 10:11:00 +0100 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> <5231665C.207@egenix.com> Message-ID: On 12 September 2013 09:21, Chris Angelico wrote: > On Thu, Sep 12, 2013 at 5:03 PM, Joshua Landau wrote: >> On 12 September 2013 07:59, M.-A. Lemburg wrote: >>> On 12.09.2013 06:29, Joshua Landau wrote: >>>> Does anyone actually write recursive Python code where the recursion >>>> in a significant bottleneck? The only such code I can think of is >>>> either for a tree, in which case stack depth is irrelevant, or bad >>>> code. >>> >>> Any kind of backtracking algorithm will need recursion or a separate >>> stack data structure to keep track of the various decisions made >>> up to a certain point on the path. >>> >>> The C stack is rather limited in size, so a recursive parser can >>> easily blow up if it uses the C stack alone for managing >>> backtracking. >> >> What sort of algorithm would backtrack that many times? I doubt a >> parser would and I can't think of anything worse ATM. > > Solve chess. If you're managing to simulate more than 1000 moves ahead either you're doing depth first or you've got a *blisteringly* fast computer. From joshua at landau.ws Thu Sep 12 11:18:48 2013 From: joshua at landau.ws (Joshua Landau) Date: Thu, 12 Sep 2013 10:18:48 +0100 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: <52317CA2.50206@egenix.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> <5231665C.207@egenix.com> <52317CA2.50206@egenix.com> Message-ID: On 12 September 2013 09:34, M.-A. 
Lemburg wrote: > On 12.09.2013 09:03, Joshua Landau wrote: >> On 12 September 2013 07:59, M.-A. Lemburg wrote: >>> On 12.09.2013 06:29, Joshua Landau wrote: >>>> Does anyone actually write recursive Python code where the recursion >>>> in a significant bottleneck? The only such code I can think of is >>>> either for a tree, in which case stack depth is irrelevant, or bad >>>> code. >>> >>> Any kind of backtracking algorithm will need recursion or a separate >>> stack data structure to keep track of the various decisions made >>> up to a certain point on the path. >>> >>> The C stack is rather limited in size, so a recursive parser can >>> easily blow up if it uses the C stack alone for managing >>> backtracking. >> >> What sort of algorithm would backtrack that many times? I doubt a >> parser would and I can't think of anything worse ATM. > > Oh, that's easy. It just depends on the given data set that you're > working on and how often you have to branch when working on it. > > http://en.wikipedia.org/wiki/Backtracking lists a few problems. > > Here's a regular expression example that would blow the stack, > if the re module were still using it (it was fixed in 2003 to > no longer do): > > re.match('(.*a|.*b|x)+', 'x' * 100000) > > The expression still uses exponential time, though. Ah, Regex. 'Could'a guessed. I'll file that under "bad code". *wink* From oscar.j.benjamin at gmail.com Thu Sep 12 11:57:18 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 12 Sep 2013 10:57:18 +0100 Subject: [Python-ideas] FW: Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <002a01ceaf6e$55629330$0027b990$@gmail.com> Message-ID: On 12 September 2013 05:29, Joshua Landau wrote: > Does anyone actually write recursive Python code where the recursion > in a significant bottleneck? The only such code I can think of is > either for a tree, in which case stack depth is irrelevant, or bad > code. 
> > Why would anyone care, basically? I think you're asking this question the wrong way. Recursion isn't a bottleneck that slows down your program. When you hit the recursion limit your program just blows up. Since Python doesn't have the optimisations that make any particular kind of recursion scale well people do not generally use it unless they know that the depth is small enough. Currently code that is susceptible to hitting the recursion limit is "bad code" because it depends on optimisations that don't exist. However, if the optimisations did exist then people could choose to take advantage of them. As an example, I once implemented Tarjan's algorithm in Python using the recursive form shown here: http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm#The_algorithm_in_pseudocode After implementing it and confirming that it worked I immediately found that it hit the recursion limit in my real problem. So I reimplemented it without the recursion. Had there been optimisations that would have made the reimplementation unnecessary I would have happily stuck with the first form since it was easier to understand than the explicit stack of iterators version that I ended up with. For the same reasons you won't see much code out there where recursion is a bottleneck unless, as you say, it is "bad code". Oscar From oscar.j.benjamin at gmail.com Thu Sep 12 12:04:29 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 12 Sep 2013 11:04:29 +0100 Subject: [Python-ideas] Replace option set/get methods through the standard library with a ChainMap; add a context manager to ChainMap In-Reply-To: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com> References: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com> Message-ID: On 11 September 2013 23:18, Neil Girdhar wrote: > > With numpy print options, for example, the usual pattern is to save some of > the print options, set some of them, and then restore the old options. 
> Why not expose the options as a ChainMap called numpy.printoptions?
> ChainMap could then expose a context manager that pushes a new dictionary
> on entry and pops it on exit via, say, child_context that accepts a
> dictionary. Now, instead of:
>
>     saved_precision = np.get_printoptions()['precision']
>     np.set_printoptions(precision=23)
>     do_something()
>     np.set_printoptions(precision=saved_precision)
>
> You can do the same with a context manager, which I think is stylistically
> better (as it's impossible to forget to reset the option, and no explicit
> temporary invades the local variables):
>
>     with np.printoptions.child_context({'precision': 23}):
>         do_something()

You can write this yourself if you like (untested):

    from contextlib import contextmanager

    @contextmanager
    def print_options(**opts):
        oldopts = np.get_printoptions()
        newopts = oldopts.copy()
        newopts.update(opts)
        try:
            np.set_printoptions(**newopts)
            yield
        finally:
            np.set_printoptions(**oldopts)

    with print_options(precision=23):
        do_something()

Generally speaking numpy doesn't use context managers much. You may be right that it should use them more but this isn't the right place to make that suggestion since numpy is not part of core Python or of the standard library. I suggest that you ask this on the scipy-users mailing list.

Oscar

From oscar.j.benjamin at gmail.com Thu Sep 12 12:12:27 2013
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Thu, 12 Sep 2013 11:12:27 +0100
Subject: [Python-ideas] Replace option set/get methods through the standard library with a ChainMap; add a context manager to ChainMap
In-Reply-To: References: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com>
Message-ID:

On 12 September 2013 11:06, Neil Girdhar wrote:
>
> Exactly. I just assumed when I wrote my comment that this was a general
> problem in the standard library, but as was pointed out to me, it seems to
My suggestion is for printoptions to be > implemented as a ChainMap to facilitate things on the numpy end of things as > well. Your suggestion doesn't seem unreasonable to me. However you're asking on the wrong mailing list: http://www.scipy.org/scipylib/mailing-lists.html Oscar From mistersheik at gmail.com Thu Sep 12 12:14:54 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 12 Sep 2013 06:14:54 -0400 Subject: [Python-ideas] Replace option set/get methods through the standard library with a ChainMap; add a context manager to ChainMap In-Reply-To: References: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com> Message-ID: Thank you. I will ask there about adding numpy context managers. However, the extra member function to ChainMap to use it as a context manager would be a question for this mailing list, right? Best, Neil On Thu, Sep 12, 2013 at 6:12 AM, Oscar Benjamin wrote: > On -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Thu Sep 12 12:24:05 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 12 Sep 2013 11:24:05 +0100 Subject: [Python-ideas] Replace option set/get methods through the standard library with a ChainMap; add a context manager to ChainMap In-Reply-To: References: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com> Message-ID: On 12 September 2013 11:14, Neil Girdhar wrote: > > Thank you. I will ask there about adding numpy context managers. However, > the extra member function to ChainMap to use it as a context manager would > be a question for this mailing list, right? Perhaps you could spell out that part of the idea in more detail then. Why in particular would it need to be a ChainMap and not a regular dict? Does the method return a new ChainMap instance? What would be seen by other code that holds references to the same ChainMap? 
Oscar From g.rodola at gmail.com Thu Sep 12 20:59:46 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 12 Sep 2013 20:59:46 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count Message-ID: This is a follow up of a feature request which recently appeared on psutil bug tracker: https://code.google.com/p/psutil/issues/detail?id=427 I don't know whether the proposal makes sense for psutil per-se but it certainly made me think about multiprocessing.cpu_count() and the fact that it currently returns the number of virtual CPUs (physical + logical). Given that multiple processes cannot take any advantage of hyper threading technology then maybe it makes sense for multiprocessing to expose a physical_cpu_count() function in order to preemptively figure out how many processes to spawn. Same thing is discussed here: https://groups.google.com/forum/#!msg/nzpug/_5sFW9BEMQ4/Y4laXRNlXkMJ Thoughts? --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Sep 12 21:10:04 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Sep 2013 21:10:04 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count References: Message-ID: <20130912211004.74746942@fsol> On Thu, 12 Sep 2013 20:59:46 +0200 "Giampaolo Rodola'" wrote: > This is a follow up of a feature request which recently appeared on psutil > bug tracker: > https://code.google.com/p/psutil/issues/detail?id=427 > > I don't know whether the proposal makes sense for psutil per-se but it > certainly made me think about multiprocessing.cpu_count() and the fact that > it currently returns the number of virtual CPUs (physical + logical). > > Given that multiple processes cannot take any advantage of hyper threading > technology Of course they can. 
The CPU doesn't distinguish between different kinds of "threads", they can either belong to the same process or to different ones. Regards Antoine. From g.rodola at gmail.com Thu Sep 12 21:26:07 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 12 Sep 2013 21:26:07 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: <20130912211004.74746942@fsol> References: <20130912211004.74746942@fsol> Message-ID: On Thu, Sep 12, 2013 at 9:10 PM, Antoine Pitrou wrote: > On Thu, 12 Sep 2013 20:59:46 +0200 > "Giampaolo Rodola'" > wrote: > > This is a follow up of a feature request which recently appeared on > psutil > > bug tracker: > > https://code.google.com/p/psutil/issues/detail?id=427 > > > > I don't know whether the proposal makes sense for psutil per-se but it > > certainly made me think about multiprocessing.cpu_count() and the fact > that > > it currently returns the number of virtual CPUs (physical + logical). > > > > Given that multiple processes cannot take any advantage of hyper > threading > > technology > > Of course they can. The CPU doesn't distinguish between different > kinds of "threads", they can either belong to the same process or to > different ones. Of course you're right, I'm sorry. I should have phrased my statement more carefully before sending the email. Then the question is whether having physical CPU cores count can be useful. --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shibturn at gmail.com Thu Sep 12 21:27:41 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Thu, 12 Sep 2013 20:27:41 +0100 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: References: Message-ID: On 12/09/2013 7:59pm, Giampaolo Rodola' wrote: > Given that multiple processes cannot take any advantage of hyper > threading technology then maybe it makes sense for multiprocessing to > expose a physical_cpu_count() function in order to preemptively figure > out how many processes to spawn. Do you have a reference? Wikipedia may not be reliable, but it seems to think otherwise: Hyper-threading works by duplicating certain sections of the processor -- those that store the architectural state -- but not duplicating the main execution resources. This allows a hyper-threading processor to appear as the usual "physical" processor and an extra "logical" processor to the host operating system (HTT-unaware operating systems see two "physical" processors), allowing the operating system to schedule two threads or processes simultaneously and appropriately.
^^^^^^^^^ -- Richard From solipsis at pitrou.net Thu Sep 12 21:32:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Sep 2013 21:32:51 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count References: <20130912211004.74746942@fsol> Message-ID: <20130912213251.3d12b714@fsol> On Thu, 12 Sep 2013 21:26:07 +0200 "Giampaolo Rodola'" wrote: > On Thu, Sep 12, 2013 at 9:10 PM, Antoine Pitrou wrote: > > > On Thu, 12 Sep 2013 20:59:46 +0200 > > "Giampaolo Rodola'" > > wrote: > > > This is a follow up of a feature request which recently appeared on > > psutil > > > bug tracker: > > > https://code.google.com/p/psutil/issues/detail?id=427 > > > > > > I don't know whether the proposal makes sense for psutil per-se but it > > > certainly made me think about multiprocessing.cpu_count() and the fact > > that > > > it currently returns the number of virtual CPUs (physical + logical). > > > > > > Given that multiple processes cannot take any advantage of hyper > > threading > > > technology > > > > Of course they can. The CPU doesn't distinguish between different > > kinds of "threads", they can either belong to the same process or to > > different ones. > > > Of course you're right, I'm sorry. I should have phrased my statement more > carefully before sending the email. > Then the question is whether having physical CPU cores count can be useful. I suppose it doesn't hurt :-) I don't think it belongs specifically in multiprocessing, though. Perhaps in the platform module? (unless you want to contribute psutil to the stdlib?) Regards Antoine. 
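For concreteness, here is a rough sketch (not from the thread) of what such a physical-core count could do on Linux, counting unique (physical id, core id) pairs in /proc/cpuinfo; the fields are Linux-specific and the parsing here is a simplification, so real code should prefer psutil:

```python
# Sketch: count physical cores on Linux by parsing /proc/cpuinfo,
# falling back to the logical count everywhere else.
import multiprocessing

def physical_cpu_count():
    cores = set()
    try:
        with open("/proc/cpuinfo") as f:
            phys = core = None
            for line in f:
                if line.startswith("physical id"):
                    phys = line.split(":", 1)[1].strip()
                elif line.startswith("core id"):
                    core = line.split(":", 1)[1].strip()
                elif not line.strip():  # blank line ends a processor entry
                    if phys is not None and core is not None:
                        cores.add((phys, core))
                    phys = core = None
            if phys is not None and core is not None:
                cores.add((phys, core))  # flush a trailing entry
    except OSError:
        pass
    if cores:
        return len(cores)
    return multiprocessing.cpu_count()  # fall back to logical CPUs
```

On a hyper-threaded machine this returns half of multiprocessing.cpu_count(); on platforms without those /proc fields it simply falls back to the logical count.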
From g.rodola at gmail.com Thu Sep 12 21:35:13 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 12 Sep 2013 21:35:13 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: References: Message-ID: On Thu, Sep 12, 2013 at 9:27 PM, Richard Oudkerk wrote: > On 12/09/2013 7:59pm, Giampaolo Rodola' wrote: > >> Given that multiple processes cannot take any advantage of hyper >> threading technology then maybe it makes sense for multiprocessing to >> expose a physical_cpu_count() function in order to preemptively figure >> out how many processes to spawn. >> > > Do you have a reference? Wikipedia may not be reliable, but it seems to > think otherwise: > > Hyper-threading works by duplicating certain sections of the processor -- > those that store the architectural state -- but not duplicating the main > execution resources. This allows a hyper-threading processor to appear > as the usual "physical" processor and an extra "logical" processor to > the host operating system (HTT-unaware operating systems see two > "physical" processors), allowing the operating system to schedule two > threads or processes simultaneously and appropriately. > ^^^^^^^^^ > No, I was wrong. Please ignore that statement. I got confused by the name "hyper-threading" and erroneously thought it only affected threads. =) --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ckaynor at zindagigames.com Thu Sep 12 21:20:08 2013 From: ckaynor at zindagigames.com (Chris Kaynor) Date: Thu, 12 Sep 2013 12:20:08 -0700 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: <20130912211004.74746942@fsol> References: <20130912211004.74746942@fsol> Message-ID: On Thu, Sep 12, 2013 at 12:10 PM, Antoine Pitrou wrote: > On Thu, 12 Sep 2013 20:59:46 +0200 > "Giampaolo Rodola'" > wrote: > > This is a follow up of a feature request which recently appeared on > psutil > > bug tracker: > > https://code.google.com/p/psutil/issues/detail?id=427 > > > > I don't know whether the proposal makes sense for psutil per-se but it > > certainly made me think about multiprocessing.cpu_count() and the fact > that > > it currently returns the number of virtual CPUs (physical + logical). > > > > Given that multiple processes cannot take any advantage of hyper > threading > > technology > > Of course they can. The CPU doesn't distinguish between different > kinds of "threads", they can either belong to the same process or to > different ones. > > Regards > > Antoine. > Antoine's claim is backed by a document written by Intel: http://software.intel.com/en-us/articles/performance-insights-to-intel-hyper-threading-technology/. Specifically, in the section "Software Use of Intel HT Technology". -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From g.rodola at gmail.com Thu Sep 12 21:51:17 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 12 Sep 2013 21:51:17 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: <20130912213251.3d12b714@fsol> References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> Message-ID: On Thu, Sep 12, 2013 at 9:32 PM, Antoine Pitrou wrote: > On Thu, 12 Sep 2013 21:26:07 +0200 > "Giampaolo Rodola'" > wrote: > > On Thu, Sep 12, 2013 at 9:10 PM, Antoine Pitrou > wrote: > > > > > On Thu, 12 Sep 2013 20:59:46 +0200 > > > "Giampaolo Rodola'" > > > wrote: > > > > This is a follow up of a feature request which recently appeared on > > > psutil > > > > bug tracker: > > > > https://code.google.com/p/psutil/issues/detail?id=427 > > > > > > > > I don't know whether the proposal makes sense for psutil per-se but > it > > > > certainly made me think about multiprocessing.cpu_count() and the > fact > > > that > > > > it currently returns the number of virtual CPUs (physical + logical). > > > > > > > > Given that multiple processes cannot take any advantage of hyper > > > threading > > > > technology > > > > > > Of course they can. The CPU doesn't distinguish between different > > > kinds of "threads", they can either belong to the same process or to > > > different ones. > > > > > > Of course you're right, I'm sorry. I should have phrased my statement > more > > carefully before sending the email. > > Then the question is whether having physical CPU cores count can be > useful. > > I suppose it doesn't hurt :-) I don't think it belongs specifically in > multiprocessing, though. Perhaps in the platform module? > I'd be +0.5 for multiprocessing because: - cpu_count() is already there - physical_cpu_count() will likely be used by multiprocessing users only ...but my main concern was first figuring out whether it might actually make sense to distinguish between virtual and physical CPUs in a real world app. 
> (unless you want to contribute psutil to the stdlib?) That's something I'd be happy to do if there's general approval but I guess that's for another thread. --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Thu Sep 12 22:19:45 2013 From: christian at python.org (Christian Heimes) Date: Thu, 12 Sep 2013 22:19:45 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> Message-ID: Am 12.09.2013 21:51, schrieb Giampaolo Rodola': > I'd be +0.5 for multiprocessing because: > > - cpu_count() is already there > - physical_cpu_count() will likely be used by multiprocessing users only > > ...but my main concern was first figuring out whether it might actually > make sense to distinguish between virtual and physical CPUs in a real > world app. I would go one step further and expose the topology of the CPUs. It's much, much more complicated than just physical and logical CPUs. For example with Intel CPUs, two hyper-threading units have different registers but share the same L1 and L2 cache. All CPU core inside a physical processor share a common L3 cache. Multiple processor on machines with several processor slots have to communicate through QPI (QuickPath Interconnect). ccNUMA (cache coherent non-uniform memory access) ensures that memory barriers syncs these caches when a process uses multiple processors. Every processor has its own memory banks so 'remote' memory is more expensive to access. Other processors have a different internal structure. Some aren't ccNUMA ... 
Christian From victor.stinner at gmail.com Thu Sep 12 23:03:30 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 12 Sep 2013 23:03:30 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> Message-ID: Python 3.4 has os.cpu_count(). Victor On 12 Sep 2013 21:52, "Giampaolo Rodola'" wrote: > > On Thu, Sep 12, 2013 at 9:32 PM, Antoine Pitrou wrote: >> >> On Thu, 12 Sep 2013 21:26:07 +0200 >> "Giampaolo Rodola'" >> wrote: >> > On Thu, Sep 12, 2013 at 9:10 PM, Antoine Pitrou wrote: >> > >> > > On Thu, 12 Sep 2013 20:59:46 +0200 >> > > "Giampaolo Rodola'" >> > > wrote: >> > > > This is a follow up of a feature request which recently appeared on >> > > psutil >> > > > bug tracker: >> > > > https://code.google.com/p/psutil/issues/detail?id=427 >> > > > >> > > > I don't know whether the proposal makes sense for psutil per-se but it >> > > > certainly made me think about multiprocessing.cpu_count() and the fact >> > > that >> > > > it currently returns the number of virtual CPUs (physical + logical). >> > > > >> > > > Given that multiple processes cannot take any advantage of hyper >> > > threading >> > > > technology >> > > >> > > Of course they can. The CPU doesn't distinguish between different >> > > kinds of "threads", they can either belong to the same process or to >> > > different ones. >> > >> > >> > Of course you're right, I'm sorry. I should have phrased my statement more >> > carefully before sending the email. >> > Then the question is whether having physical CPU cores count can be useful. >> >> I suppose it doesn't hurt :-) I don't think it belongs specifically in >> multiprocessing, though. Perhaps in the platform module?
> > > I'd be +0.5 for multiprocessing because: > > - cpu_count() is already there > - physical_cpu_count() will likely be used by multiprocessing users only > > ...but my main concern was first figuring out whether it might actually make sense to distinguish between virtual and physical CPUs in a real world app. > >> >> (unless you want to contribute psutil to the stdlib?) > > > That's something I'd be happy to do if there's general approval but I guess that's for another thread. > > --- Giampaolo > https://code.google.com/p/pyftpdlib/ > https://code.google.com/p/psutil/ > https://code.google.com/p/pysendfile/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Sep 12 23:15:57 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Sep 2013 23:15:57 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> Message-ID: <20130912231557.443806b3@fsol> On Thu, 12 Sep 2013 22:19:45 +0200 Christian Heimes wrote: > Am 12.09.2013 21:51, schrieb Giampaolo Rodola': > > I'd be +0.5 for multiprocessing because: > > > > - cpu_count() is already there > > - physical_cpu_count() will likely be used by multiprocessing users only > > > > ...but my main concern was first figuring out whether it might actually > > make sense to distinguish between virtual and physical CPUs in a real > > world app. > > I would go one step further and expose the topology of the CPUs. It's > much, much more complicated than just physical and logical CPUs. I'm not sure what the point would be. From the point of the view of an application programmer, the CPU topology is an almost esoteric detail. 
This would be appropriate for a third-party "system information" package, IMO (with memory speed, number of PCIe channels, cache associativity, etc.). Regards Antoine. From rymg19 at gmail.com Fri Sep 13 01:31:08 2013 From: rymg19 at gmail.com (Ryan) Date: Thu, 12 Sep 2013 18:31:08 -0500 Subject: [Python-ideas] AST Pretty Printer Message-ID: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> I always encounter one problem when dealing with Python ASTs: When I print it, it looks like Lisp(aka Lots of Irritated Superfluous Parenthesis). In short: it's a mess. My idea is an AST pretty printer built on ast.NodeVisitor. If anyone finds this interesting, I can probably have a prototype of the class between later today and sometime tomorrow. -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Fri Sep 13 01:40:22 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 12 Sep 2013 16:40:22 -0700 Subject: [Python-ideas] AST Pretty Printer In-Reply-To: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> References: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> Message-ID: I would be interested in it; would have made developing macropy much easier if there was a way to nicely print large blobs of AST (i.e. nicer than ast.dump). On Thu, Sep 12, 2013 at 4:31 PM, Ryan wrote: > I always encounter one problem when dealing with Python ASTs: When I print > it, it looks like Lisp(aka Lots of Irritated Superfluous Parenthesis). In > short: it's a mess. > > My idea is an AST pretty printer built on ast.NodeVisitor. If anyone finds > this interesting, I can probably have a prototype of the class between > later today and sometime tomorrow. > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Fri Sep 13 02:01:46 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 12 Sep 2013 17:01:46 -0700 Subject: [Python-ideas] AST Pretty Printer In-Reply-To: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> References: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> Message-ID: <0759F11C-2B77-475D-9B2D-C71BD5A95582@yahoo.com> On Sep 12, 2013, at 16:31, Ryan wrote: > I always encounter one problem when dealing with Python ASTs: When I print it, it looks like Lisp(aka Lots of Irritated Superfluous Parenthesis). Why are the parentheses irritated? Have you been taunting them? :) > In short: it's a mess. > > My idea is an AST pretty printer built on ast.NodeVisitor. If anyone finds this interesting, I can probably have a prototype of the class between later today and sometime tomorrow. Yes please! I'll bet most people who play with ASTs want this, build something half-assed, never finish it, and lose it by the next time they look at ASTs again three years later... So if you finish something, that'll save effort for hundreds of people in the future (who have no idea they'll want it one day). 
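For anyone curious what such a pretty printer might look like, here is a minimal sketch (illustrative only, not Ryan's prototype; it uses plain recursion over ast.iter_fields rather than ast.NodeVisitor) that prints one indented field per line instead of ast.dump's single long s-expression:

```python
# Minimal AST pretty printer sketch: one field per indented line.
import ast

def pretty(node, indent=0):
    pad = "  " * indent
    if isinstance(node, ast.AST):
        lines = [pad + type(node).__name__]
        for name, value in ast.iter_fields(node):
            lines.append(pad + "  " + name + ":")
            lines.append(pretty(value, indent + 2))
        return "\n".join(lines)
    if isinstance(node, list):
        if not node:
            return pad + "[]"
        return "\n".join(pretty(item, indent) for item in node)
    return pad + repr(node)

print(pretty(ast.parse("x = 1 + 2")))
```

A fuller version would also track line numbers and elide empty fields, but even this short form makes a nested BinOp far easier to scan than the flat repr.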
From anikom15 at gmail.com Fri Sep 13 04:17:41 2013 From: anikom15 at gmail.com (=?iso-8859-1?Q?Westley_Mart=EDnez?=) Date: Thu, 12 Sep 2013 19:17:41 -0700 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: <20130912231557.443806b3@fsol> References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> <20130912231557.443806b3@fsol> Message-ID: <001801ceb027$6e811710$4b834530$@gmail.com> > From: Python-ideas [mailto:python-ideas- > bounces+anikom15=gmail.com at python.org] On Behalf Of Antoine Pitrou > Sent: Thursday, September 12, 2013 2:16 PM > To: python-ideas at python.org > Subject: Re: [Python-ideas] multiprocessing and physical CPU cores > count > > I'm not sure what the point would be. From the point of the view of an > application programmer, the CPU topology is an almost esoteric detail. > This would be appropriate for a third-party "system information" > package, IMO (with memory speed, number of PCIe channels, cache > associativity, etc.). > > Regards > > Antoine. Isn't the whole point of a high-level language to be able to not have to know about the hardware? From paultag at debian.org Fri Sep 13 04:44:14 2013 From: paultag at debian.org (Paul Tagliamonte) Date: Thu, 12 Sep 2013 22:44:14 -0400 Subject: [Python-ideas] AST Pretty Printer In-Reply-To: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> References: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> Message-ID: <20130913024414.GA18064@leliel> On Thu, Sep 12, 2013 at 06:31:08PM -0500, Ryan wrote: > I always encounter one problem when dealing with Python ASTs: When I print > it, it looks like Lisp(aka Lots of Irritated Superfluous Parenthesis). In > short: it's a mess. Bwahah; well, to each their own. As some might remember from PyCon this year[1], I actually wrote a lisp front-end (OK, not *really* lisp) to Python AST. 
Works pretty well (even smoothed out the 2.x and 3.x differences, so most code is valid between the two) https://github.com/hylang/hy Yes, it's hilarious. No, parens aren't ugly :) > My idea is an AST pretty printer built on ast.NodeVisitor. If anyone finds > this interesting, I can probably have a prototype of the class between > later today and sometime tomorrow. I'd enjoy such a thing! [1]: http://pyvideo.org/video/1853/friday-evening-lightning-talks http://hylang.org/ http://www.youtube.com/watch?v=ulekCWvDFVI -- .''`. Paul Tagliamonte : :' : Proud Debian Developer `. `'` 4096R / 8F04 9AD8 2C92 066C 7352 D28A 7B58 5B30 807C 2A87 `- http://people.debian.org/~paultag -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: Digital signature URL: From abarnert at yahoo.com Fri Sep 13 04:37:09 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 12 Sep 2013 19:37:09 -0700 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: <001801ceb027$6e811710$4b834530$@gmail.com> References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> <20130912231557.443806b3@fsol> <001801ceb027$6e811710$4b834530$@gmail.com> Message-ID: <63D98CDD-EFCD-4BD3-8ADE-589B8967822C@yahoo.com> On Sep 12, 2013, at 19:17, Westley Mart?nez wrote: >> From: Python-ideas [mailto:python-ideas- >> bounces+anikom15=gmail.com at python.org] On Behalf Of Antoine Pitrou >> Sent: Thursday, September 12, 2013 2:16 PM >> To: python-ideas at python.org >> Subject: Re: [Python-ideas] multiprocessing and physical CPU cores >> count >> >> I'm not sure what the point would be. From the point of the view of > an >> application programmer, the CPU topology is an almost esoteric detail. >> This would be appropriate for a third-party "system information" >> package, IMO (with memory speed, number of PCIe channels, cache >> associativity, etc.). >> >> Regards >> >> Antoine. 
> > Isn't the whole point of a high-level language to be able to not > have to know about the hardware? Most programmers won't care; they'll just use the default value for multiprocessing.Pool. But the implementation of multiprocessing, or any similar third-party module like pp, needs that information, so it can pick a good default value so the programmers don't have to. Also, very occasionally, you need to build a pool of processes manually. So if the module has the info, it might as well expose it. From mal at egenix.com Fri Sep 13 10:17:40 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 13 Sep 2013 10:17:40 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> Message-ID: <5232CA24.5070804@egenix.com> On 12.09.2013 21:51, Giampaolo Rodola' wrote: > On Thu, Sep 12, 2013 at 9:32 PM, Antoine Pitrou wrote: > >>> Then the question is whether having physical CPU cores count can be >> useful. >> >> I suppose it doesn't hurt :-) I don't think it belongs specifically in >> multiprocessing, though. Perhaps in the platform module? >> > > I'd be +0.5 for multiprocessing because: > > - cpu_count() is already there > - physical_cpu_count() will likely be used by multiprocessing users only > > ...but my main concern was first figuring out whether it might actually > make sense to distinguish between virtual and physical CPUs in a real world > app. I'm with Antoine here: both APIs would make more sense in the platform or os module. Victor mentioned that there already is an os.cpu_count() in Python 3.4, so perhaps add it there. Do you need C code for determining the physical count ? >> (unless you want to contribute psutil to the stdlib?) > > > That's something I'd be happy to do if there's general approval but I guess > that's for another thread. 
I'd love to see psutils in the stdlib, but also be warned: once the code lives in the stdlib, a) making changes is difficult and adding new features as well, b) you are bound by the Python release cycle. For a package such psutil, it may actually be better to keep it outside the stdlib, since the outside world changes regularly and doesn't adhere to the Python release cycle or feature for patch level releases ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 13 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-04: Released eGenix pyOpenSSL 0.13.2 ... http://egenix.com/go48 2013-09-20: PyCon UK 2013, Coventry, UK ... 7 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.rodola at gmail.com Fri Sep 13 14:54:31 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Fri, 13 Sep 2013 14:54:31 +0200 Subject: [Python-ideas] multiprocessing and physical CPU cores count In-Reply-To: <5232C94C.6090201@python.org> References: <20130912211004.74746942@fsol> <20130912213251.3d12b714@fsol> <5232C94C.6090201@python.org> Message-ID: On Fri, Sep 13, 2013 at 10:14 AM, M.-A. Lemburg wrote: > On 12.09.2013 21:51, Giampaolo Rodola' wrote: > > On Thu, Sep 12, 2013 at 9:32 PM, Antoine Pitrou > wrote: > > > >>> Then the question is whether having physical CPU cores count can be > >> useful. > >> > >> I suppose it doesn't hurt :-) I don't think it belongs specifically in > >> multiprocessing, though. Perhaps in the platform module? 
> >> > > > > I'd be +0.5 for multiprocessing because: > > > > - cpu_count() is already there > > - physical_cpu_count() will likely be used by multiprocessing users only > > > > ...but my main concern was first figuring out whether it might actually > > make sense to distinguish between virtual and physical CPUs in a real > world > > app. > > I'm with Antoine here: both APIs would make more sense in the > platform module. In the end it appears the os module would probably be better as cpu_count() already ended up there (http://bugs.python.org/issue17914) as pointed out by Victor a couple of emails ago. I have the impression no one is opposed so I can probably start working on a patch and submit it on the bug tracker. > Do you need C code for determining the physical count ? Yes, except on Linux where you'll just read /proc/cpuinfo. > >> (unless you want to contribute psutil to the stdlib?) > > > > > > That's something I'd be happy to do if there's general approval but I > guess > > that's for another thread. > > I'd love to see psutils in the stdlib, but also be warned: once > the code lives in the stdlib, > > a) making changes is difficult and adding new features as well, > > b) you are bound by the Python release cycle. > > For a package such psutil, it may actually be better to keep it > outside the stdlib, since the outside world changes regularly > and doesn't adhere to the Python release cycle or feature > for patch level releases ;-) > Yeah, you're probably right,and there's at least a couple of high priority functionalities I'd like to add first (to say one: dragonfly/open/net BSD support). --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rymg19 at gmail.com Fri Sep 13 18:14:19 2013 From: rymg19 at gmail.com (Ryan) Date: Fri, 13 Sep 2013 11:14:19 -0500 Subject: [Python-ideas] AST Pretty Printer In-Reply-To: <20130913024414.GA18064@leliel> References: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com> <20130913024414.GA18064@leliel> Message-ID: <1701a638-4ddc-492d-99ef-f7cc2042d7f4@email.android.com> Honestly, it wasn't really the parenthesis that made me hate Lisp; it was everything else. The lack of line numbers,... It made Perl look nice for a second there. Paul Tagliamonte wrote: >On Thu, Sep 12, 2013 at 06:31:08PM -0500, Ryan wrote: >> I always encounter one problem when dealing with Python ASTs: When >I print >> it, it looks like Lisp(aka Lots of Irritated Superfluous >Parenthesis). In >> short: it's a mess. > >Bwahah; well, to each their own. > >As some might remember from PyCon this year[1], I actually wrote a lisp >front-end (OK, not *really* lisp) to Python AST. Works pretty well >(even >smoothed out the 2.x and 3.x differences, so most code is valid between >the two) > > https://github.com/hylang/hy > >Yes, it's hilarious. No, parens aren't ugly :) > >> My idea is an AST pretty printer built on ast.NodeVisitor. If >anyone finds >> this interesting, I can probably have a prototype of the class >between >> later today and sometime tomorrow. > >I'd enjoy such a thing! > > >[1]: http://pyvideo.org/video/1853/friday-evening-lightning-talks > http://hylang.org/ > http://www.youtube.com/watch?v=ulekCWvDFVI > >-- > .''`. Paul Tagliamonte >: :' : Proud Debian Developer >`. `'` 4096R / 8F04 9AD8 2C92 066C 7352 D28A 7B58 5B30 807C 2A87 > `- http://people.debian.org/~paultag -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rymg19 at gmail.com Sat Sep 14 02:25:25 2013 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Fri, 13 Sep 2013 19:25:25 -0500 Subject: [Python-ideas] AST Pretty Printer Message-ID: Note: I didn't know who to reply to, so I just restarted the thread with the same subject. Here is the code:

class astpp(ast.NodeVisitor):
    def __init__(self, tree):
        super(ast.NodeVisitor, self).__init__()
        self.indent = 0
        self.visit(tree)

    def _print(self, text):
        print(' ' * self.indent + text)

    def generic_visit(self, node):
        self._print(node.__class__.__name__ + '(')
        self.indent += 1
        for name, item in node.__dict__.iteritems():
            if isinstance(item, ast.AST):
                self._print(name + '=')
                self.indent += 1
                self.generic_visit(item)
                self.indent -= 1
            elif isinstance(item, list):
                self._print(name + '=[')
                self.indent += 1
                [self.generic_visit(attr) for attr in item]
                self.indent -= 1
                self._print(']')
            else:
                self._print(name + '=' + str(item))
        self.indent -= 1
        self._print(')')

Sample usage:

astpp('''len('My friends are my power!')''')

Output:

Module(
 body=[
  Expr(
   lineno=1
   value=
    Call(
     col_offset=0
     starargs=None
     args=[
      Str(
       s=My friends are my power!
       lineno=1
       col_offset=4
      )
     ]
     lineno=1
     func=
      Name(
       ctx=
        Load(
        )
       id=len
       col_offset=0
       lineno=1
      )
     kwargs=None
     keywords=[
     ]
    )
   col_offset=0
  )
 ]
)

-- Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From clay.sweetser at gmail.com Sat Sep 14 07:36:25 2013 From: clay.sweetser at gmail.com (Clay Sweetser) Date: Sat, 14 Sep 2013 01:36:25 -0400 Subject: [Python-ideas] Style for multi-line generator expressions Message-ID: PEP 8 currently lacks any suggestions for how multi-line generator expressions and list comprehensions should be formatted. In the absence of any official style suggestion (that I can find), I suggest the style used the most in the standard library.

[<expression>
    for <variable> in <iterable>
    if <condition>]

Note, lines could still be combined where it makes sense, eg, the first two lines could be combined if they aren't too long.
-- "Evil begins when you begin to treat people as things." - Terry Pratchett -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Sat Sep 14 07:49:46 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 14 Sep 2013 07:49:46 +0200 Subject: [Python-ideas] Style for multi-line generator expressions In-Reply-To: References: Message-ID: On 09/14/2013 07:36 AM, Clay Sweetser wrote: > PEP 8 currently lacks any suggestions for how multi-line generator expressions > and list comprehensions should be formatted. In the absence of any official > style suggestion (that I can find), I suggest the style used the most in the > standard library. > > [ > for in > if ] > > Note, lines could still be combined where it makes sense, eg, the first two > lines could be combined if they aren't too long. But that amounts to simply respecting the line length limit and using logical breakpoints. I don't think that requires a special mention in PEP 8. cheers, Georg From storchaka at gmail.com Sat Sep 14 17:09:15 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 14 Sep 2013 18:09:15 +0300 Subject: [Python-ideas] Add dict.getkey() and set.get() Message-ID: I propose to add two methods: dict.getkey(key) returns original key stored in the dict which is equal to specified key. E.g. >>> d = {2: 'a', 5.0: 'b'} >>> d.getkey(2.0) 2 >>> d.getkey(5) 5.0 >>> d.getkey(17) Traceback (most recent call last): File "", line 1, in KeyError: 17 set.get(value) returns original value stored in the set which is equal to specified value. E.g. >>> s = {2, 5.0} >>> s.get(2.0) 2 >>> s.get(5) 5.0 >>> s.get(17) Traceback (most recent call last): File "", line 1, in KeyError: 17 From victor.stinner at gmail.com Sat Sep 14 18:19:48 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 14 Sep 2013 18:19:48 +0200 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: What is the use case of such methods? 
Why not use dict.keys() and tuple(set)?

Victor

On 14 Sep 2013 17:09, "Serhiy Storchaka" wrote:

> I propose to add two methods:
>
> dict.getkey(key) returns original key stored in the dict which is equal to
> specified key. E.g.
>
> >>> d = {2: 'a', 5.0: 'b'}
> >>> d.getkey(2.0)
> 2
> >>> d.getkey(5)
> 5.0
> >>> d.getkey(17)
> Traceback (most recent call last):
>   File "", line 1, in
> KeyError: 17
>
> set.get(value) returns original value stored in the set which is equal to
> specified value. E.g.
>
> >>> s = {2, 5.0}
> >>> s.get(2.0)
> 2
> >>> s.get(5)
> 5.0
> >>> s.get(17)
> Traceback (most recent call last):
>   File "", line 1, in
> KeyError: 17
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
>

-------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sat Sep 14 18:52:30 2013 From: mertz at gnosis.cx (David Mertz) Date: Sat, 14 Sep 2013 09:52:30 -0700 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: One way you can spell it currently is:

>>> getexact = lambda m, v: [x for x in m if x==v][0]
>>> d = {2: 'a', 5.0: 'b'}
>>> s = {2, 5.0}
>>> getexact(d, 2.0)
2
>>> getexact(d, 17)
Traceback (most recent call last):
  File "", line 1, in
  File "", line 1, in
IndexError: list index out of range
>>> getexact(s, 2.0)
2

It's true that the exception leaves a little to be desired here. If you actually want to use this function much, and worry about the extra equality comparisons in my one-line version, maybe:

def getexact(m, v):
    for x in m:
        if x==v: return x
    else:
        raise KeyError(v)

Like Victor, I'd want to see an actual use case before wanting these as extra methods of the actual data types. My function versions have the advantage also that they work on ANY iterable, not only dict and set, and also hence have one spelling for the same conceptual operation.
On Sat, Sep 14, 2013 at 8:09 AM, Serhiy Storchaka wrote: > I propose to add two methods: > > dict.getkey(key) returns original key stored in the dict which is equal to > specified key. E.g. > > >>> d = {2: 'a', 5.0: 'b'} > >>> d.getkey(2.0) > 2 > >>> d.getkey(5) > 5.0 > >>> d.getkey(17) > Traceback (most recent call last): > File "", line 1, in > KeyError: 17 > > set.get(value) returns original value stored in the set which is equal to > specified value. E.g. > > >>> s = {2, 5.0} > >>> s.get(2.0) > 2 > >>> s.get(5) > 5.0 > >>> s.get(17) > Traceback (most recent call last): > File "", line 1, in > KeyError: 17 > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Sep 14 18:59:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Sep 2013 02:59:38 +1000 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: I'll also note that in any case where I've needed to be able to determine the "canonical form" of a key, I've known the transform to use (e.g. str.lower, int, operator.index) rather than (or in addition to) having a container holding the canonical forms. If this is inspired by the transform dict PEP, then it may be better to expose the conversion function rather than a way to ask the container to do the conversion itself (yes, I'm aware I suggested the latter approach the other day - this thread is making me reconsider). Cheers, Nick. 
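Nick's alternative — expose the conversion function rather than grow the container API — can be sketched in a few lines. This is a hypothetical illustration; the class name and behaviour below are invented for the example, not taken from the TransformDict proposal:

```python
class CanonicalDict(dict):
    """Sketch: a dict that canonicalises keys through an explicit,
    user-visible transform. Callers who want the 'canonical form' of
    a key apply the exposed transform themselves, so no getkey()
    method is needed."""

    def __init__(self, transform):
        super().__init__()
        self.transform = transform  # deliberately public, per Nick's point

    def __setitem__(self, key, value):
        super().__setitem__(self.transform(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self.transform(key))


d = CanonicalDict(str.lower)
d['Foo'] = 1
print(d['FOO'])            # 1 -- looked up via the canonical form
print(d.transform('FOO'))  # 'foo' -- the canonical key, computed directly
```

The container stores only canonical keys; anything else a caller needs falls out of having the transform in hand.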
From mertz at gnosis.cx Sat Sep 14 19:04:03 2013 From: mertz at gnosis.cx (David Mertz) Date: Sat, 14 Sep 2013 10:04:03 -0700 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: Perhaps being pedantic, but there is not necessarily ONE key in the original collection which is equal to the search value:

>>> d = {2: 'a', 5.0: 'b', 7.2: 'low', 7.6: 'high'}
>>> class FuzzyNumber(float):
...     def __eq__(self, other):
...         return abs(self-other) < 0.5
...
>>> fn = FuzzyNumber(7.5)
>>> getexact(d, fn)  # What gets returned here?!
7.6

ANY implementation of this idea would either have to pick the arbitrary first match, or ... well, something else. Equality isn't actually transitive across all Python objects... and even my quick example isn't a completely absurd data type (it would need to be fleshed out better, but a FuzzyNumber could well have sensible purposes).

On Sat, Sep 14, 2013 at 9:52 AM, David Mertz wrote:

> One way you can spell it currently is:
>
> >>> getexact = lambda m, v: [x for x in m if x==v][0]
> >>> d = {2: 'a', 5.0: 'b'}
> >>> s = {2, 5.0}
> >>> getexact(d, 2.0)
> 2
> >>> getexact(d, 17)
> Traceback (most recent call last):
>   File "", line 1, in
>   File "", line 1, in
> IndexError: list index out of range
> >>> getexact(s, 2.0)
> 2
>
> It's true that the exception leaves a little to be desired here. If you
> actually want to use this function much, and worry about the extra equality
> comparisons in my one-line version, maybe:
>
> def getexact(m, v):
>     for x in m:
>         if x==v: return x
>     else:
>         raise KeyError(v)
>
> Like Victor, I'd want to see an actual use case before wanting these as
> extra methods of the actual data types. My function versions have the
> advantage also that they work on ANY iterable, not only dict and set, and
> also hence have one spelling for the same conceptual operation.
> > > On Sat, Sep 14, 2013 at 8:09 AM, Serhiy Storchaka wrote: > >> I propose to add two methods: >> >> dict.getkey(key) returns original key stored in the dict which is equal >> to specified key. E.g. >> >> >>> d = {2: 'a', 5.0: 'b'} >> >>> d.getkey(2.0) >> 2 >> >>> d.getkey(5) >> 5.0 >> >>> d.getkey(17) >> Traceback (most recent call last): >> File "", line 1, in >> KeyError: 17 >> >> set.get(value) returns original value stored in the set which is equal to >> specified value. E.g. >> >> >>> s = {2, 5.0} >> >>> s.get(2.0) >> 2 >> >>> s.get(5) >> 5.0 >> >>> s.get(17) >> Traceback (most recent call last): >> File "", line 1, in >> KeyError: 17 >> >> ______________________________**_________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/**mailman/listinfo/python-ideas >> > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Sat Sep 14 19:27:00 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 14 Sep 2013 18:27:00 +0100 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: <52349C64.5050904@mrabarnett.plus.com> On 14/09/2013 16:09, Serhiy Storchaka wrote: > I propose to add two methods: > > dict.getkey(key) returns original key stored in the dict which is equal > to specified key. E.g. 
> > >>> d = {2: 'a', 5.0: 'b'} > >>> d.getkey(2.0) > 2 > >>> d.getkey(5) > 5.0 > >>> d.getkey(17) > Traceback (most recent call last): > File "", line 1, in > KeyError: 17 > > set.get(value) returns original value stored in the set which is equal > to specified value. E.g. > > >>> s = {2, 5.0} > >>> s.get(2.0) > 2 > >>> s.get(5) > 5.0 > >>> s.get(17) > Traceback (most recent call last): > File "", line 1, in > KeyError: 17 > There's discussion on python-dev about adding TransformDict with a .getitem method. Adding a .getkey method to dicts would not be consistent with that; really they should either both have .getitem or both have .getkey. From tjreedy at udel.edu Sat Sep 14 19:56:25 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Sep 2013 13:56:25 -0400 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: On 9/14/2013 1:04 PM, David Mertz wrote: > Perhaps being pedantic, but there is not necessarily ONE key in the > original collection which is equal to the search value: Sets presume that == is an equivalence relation. When that is not true, as for such FuzzyNumbers (which break transitivity), they should not be used with sets at all, as many operations will be somewhat broken. For one thing, the composition of such 'sets' will depend on the order of addition. > >>> d = {2: 'a', 5.0: 'b', 7.2: 'low', 7.6: 'high'} > >>> class FuzzyNumber(float): > ... def __eq__(self, other): > ... return abs(self-other) < 0.5 Better to call this method 'similar' or 'is_similar', since similarity is not expected to be transitive. > >>> fn = FuzzyNumber(7.5) > >>> getexact(d, fn) # What gets returned here?! > 7.6 > > ANY implementation of this idea would either have to pick the arbitrary > first match, or ... well, something else. Equality isn't actually > transitive across all Python objects... 
Except for NaNs, the non-floats called floats for the benefit of languages with typed operations, I believe equality is (at least as far as possible) transitive for the built-in classes as delivered. We fixed 0.0 == 0 == Decimal(0) != 0.0 because of the problems caused by the non-transitivity. Avoiding breaking transitivity was one of the design constraints of the Enums. > and even my quick example isn't > a completely absurd data type (it would need to be fleshed out better, > but a FuzzyNumber could well have sensible purposes). The only absurd thing is calling similarity 'equality' ;=). -- Terry Jan Reedy From tjreedy at udel.edu Sat Sep 14 20:03:51 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Sep 2013 14:03:51 -0400 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: On 9/14/2013 11:09 AM, Serhiy Storchaka wrote: > I propose to add two methods: > > dict.getkey(key) returns original key stored in the dict which is equal > to specified key. E.g. > > >>> d = {2: 'a', 5.0: 'b'} > >>> d.getkey(2.0) > 2 > >>> d.getkey(5) > 5.0 > >>> d.getkey(17) > Traceback (most recent call last): > File "", line 1, in > KeyError: 17 > > set.get(value) returns original value stored in the set which is equal > to specified value. E.g. > > >>> s = {2, 5.0} > >>> s.get(2.0) > 2 > >>> s.get(5) > 5.0 > >>> s.get(17) > Traceback (most recent call last): > File "", line 1, in > KeyError: 17 If sets had get() as you described, there would be no need for dict.getkey as long as the set-like key view had get(). This would be the appropriate place since get() has nothing to do with the values but only the set of keys. 
-- Terry Jan Reedy From mertz at gnosis.cx Sat Sep 14 20:20:53 2013 From: mertz at gnosis.cx (David Mertz) Date: Sat, 14 Sep 2013 11:20:53 -0700 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: On Sat, Sep 14, 2013 at 10:56 AM, Terry Reedy wrote: > Sets presume that == is an equivalence relation. When that is not true, as > for such FuzzyNumbers (which break transitivity), they should not be used > with sets at all, as many operations will be somewhat broken. For one > thing, the composition of such 'sets' will depend on the order of addition. > I'm not putting any huge weight in my toy class. But notice that it deliberately is NOT hashable, hence cannot make it into a set (or dict key): >>> fn = FuzzyNumber(7.5) >>> d = {fn: "fuzzy"} Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'FuzzyNumber' What the hypothetical class might be useful for, in my passing thought, is for e.g. an imprecise measurement of something. Any value that is "close" is equal within the error of measurement. Hence perhaps we want to be able to say: >>> [x for x in (1,2, 7.2, 7.3, 7.9, 8.0, 9) if x==fn] [7.2, 7.3, 7.9] I do know one *could* spell that like: >>> [x for x in (1,2, 7.2, 7.3, 7.9, 8.0, 9) if ref_val.close_to(x)] But anyway, whether or not my FuzzyNumber class is a *good* idea, it is something that end users *could* do as long as we give them an .__eq__() magic method to play with. Hence a 'getexact()' function or a dict.getkey() method would have to do SOMETHING when presented with such a transitivity-of-equality-breaking object. > > > > >>> d = {2: 'a', 5.0: 'b', 7.2: 'low', 7.6: 'high'} >> >>> class FuzzyNumber(float): >> ... def __eq__(self, other): >> ... return abs(self-other) < 0.5 >> > > Better to call this method 'similar' or 'is_similar', since similarity is > not expected to be transitive. > > > >>> fn = FuzzyNumber(7.5) >> >>> getexact(d, fn) # What gets returned here?! 
>> 7.6 >> >> ANY implementation of this idea would either have to pick the arbitrary >> first match, or ... well, something else. Equality isn't actually >> transitive across all Python objects... >> > > Except for NaNs, the non-floats called floats for the benefit of languages > with typed operations, I believe equality is (at least as far as possible) > transitive for the built-in classes as delivered. We fixed > 0.0 == 0 == Decimal(0) != 0.0 > because of the problems caused by the non-transitivity. Avoiding breaking > transitivity was one of the design constraints of the Enums. > > > > and even my quick example isn't > >> a completely absurd data type (it would need to be fleshed out better, >> but a FuzzyNumber could well have sensible purposes). >> > > The only absurd thing is calling similarity 'equality' ;=). > > -- > Terry Jan Reedy > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidhalter88 at gmail.com Sat Sep 14 21:15:14 2013 From: davidhalter88 at gmail.com (David Halter) Date: Sat, 14 Sep 2013 23:45:14 +0430 Subject: [Python-ideas] Should we improve `dir`? Message-ID: I recently stumbled over `dir()` not working correctly in the case of classes: http://jedidjah.ch/code/2013/9/8/wrong_dir_function/ In short: `dir` doesn't list the `type` methods, which it should in my opinion, because there are very important attributes in there like `__name__` or `__bases__`. This led to some confusion in the past, e.g. 
http://www.gossamer-threads.com/lists/python/python/507363. The long version is in the above link. After discussions, I realized that I should probably bring this up in python-ideas. I think the current implementation can be very confusing for people trying to introspect classes with `dir`, which is IMHO its typical use case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sat Sep 14 21:48:21 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 14 Sep 2013 22:48:21 +0300 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: 14.09.13 21:03, Terry Reedy wrote:

> If sets had get() as you described, there would be no need for
> dict.getkey as long as the set-like key view had get(). This would be
> the appropriate place since get() has nothing to do with the values but
> only the set of keys.

Agree.

From storchaka at gmail.com Sat Sep 14 22:10:35 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 14 Sep 2013 23:10:35 +0300 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: 14.09.13 19:19, Victor Stinner wrote:

> What is the use case of such methods? Why not use dict.keys() and
> tuple(set)?

Scanning dict.keys() or set has linear complexity. dict.getkey() and set.get() can be easily implemented with O(1). I have no good use cases. Perhaps every problem which requires dict.getkey() or set.get() can be solved with an additional synchronized dict which maps key to key. This is also true for TransformDict.
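Serhiy's closing remark — an "additional synchronized dict which maps key to key" — can be made concrete. The class below is a hypothetical sketch (its name and API are invented here), showing that the originally stored key is recoverable in O(1) without adding methods to dict itself:

```python
class GetKeyDict:
    """Sketch of the key -> key workaround: alongside the value
    mapping, keep a dict mapping each key to the exact object first
    stored, so the equivalent of the proposed dict.getkey() is an
    O(1) hash lookup rather than a linear scan."""

    def __init__(self):
        self._values = {}
        self._keys = {}

    def __setitem__(self, key, value):
        self._values[key] = value
        self._keys.setdefault(key, key)  # remember the first-stored key

    def __getitem__(self, key):
        return self._values[key]

    def getkey(self, key):
        return self._keys[key]  # raises KeyError if absent, as proposed


d = GetKeyDict()
d[2] = 'a'
d[5.0] = 'b'
print(d.getkey(2.0))  # 2 (the int originally stored, not 2.0)
print(d.getkey(5))    # 5.0
```

This works because equal keys hash equally (`hash(2) == hash(2.0)`), so the probe key finds the slot holding the originally stored object.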
From mistersheik at gmail.com Sat Sep 14 22:10:00 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 14 Sep 2013 16:10:00 -0400 Subject: [Python-ideas] Replace option set/get methods through the standard library with a ChainMap; add a context manager to ChainMap In-Reply-To: References: <8a0c3260-7c85-46ba-9dae-102e7fceb8f4@googlegroups.com> Message-ID: ChainMap supports the pattern of a "dictionary that supports temporarily overriding items". The method I'm suggesting is as follows:

class ChainMap:
    @contextmanager
    def child_context(self, **kwargs):
        self.add_child(**kwargs)
        try:
            yield
        finally:
            self = self.parents

Then, when updating numpy.printoptions:

with numpy.printoptions.child_context(precision=23):
    ...  # do something

With a regular dict, numpy would end up implementing the necessary context manager once for each set of options instead of factoring that code out into ChainMap.

On Thu, Sep 12, 2013 at 6:24 AM, Oscar Benjamin wrote:

> On 12 September 2013 11:14, Neil Girdhar wrote:
> >
> > Thank you. I will ask there about adding numpy context managers. However,
> > the extra member function to ChainMap to use it as a context manager would
> > be a question for this mailing list, right?
>
> Perhaps you could spell out that part of the idea in more detail then.
> Why in particular would it need to be a ChainMap and not a regular
> dict? Does the method return a new ChainMap instance? What would be
> seen by other code that holds references to the same ChainMap?
>
> Oscar

-------------- next part -------------- An HTML attachment was scrubbed... URL: From jbvsmo at gmail.com Sat Sep 14 22:21:52 2013 From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=) Date: Sat, 14 Sep 2013 17:21:52 -0300 Subject: [Python-ideas] Should we improve `dir`? In-Reply-To: References: Message-ID: 2013/9/14 David Halter

> I recently stumbled over `dir()` not working correctly in the case of
> classes:
>
> http://jedidjah.ch/code/2013/9/8/wrong_dir_function/
>
In-Reply-To:
References:
Message-ID:

2013/9/14 David Halter

> I recently stumbled over `dir()` not working correctly in the case of
> classes:
>
> http://jedidjah.ch/code/2013/9/8/wrong_dir_function/
>
That's expected and the right behavior IMHO. If you need your classes (or metaclasses) to behave differently, set the __dir__ method.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu  Sat Sep 14 22:22:09 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 14 Sep 2013 16:22:09 -0400
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To:
References:
Message-ID:

On 9/14/2013 2:20 PM, David Mertz wrote:
> I'm not putting any huge weight in my toy class. But notice that it
> deliberately is NOT hashable, hence cannot make it into a set (or dict key):
...
> >>> [x for x in (1,2, 7.2, 7.3, 7.9, 8.0, 9) if ref_val.close_to(x)]
...
> But anyway, whether or not my FuzzyNumber class is a *good* idea, it is
> something that end users *could* do as long as we give them an .__eq__()
> magic method to play with. Hence a 'getexact()' function or a
> dict.getkey() method would have to do SOMETHING when presented with such
> a transitivity-of-equality-breaking object.

I would expect the proposed set/dict methods to work by hashing the target. Otherwise, they would not be justified as *methods*; the generic operation should be, as you said, a function.

The generic 'iterate and return the first item matching the target' will either return the first match or do whatever the else: clause dictates. I do not see why you think there is a special problem with this. There is nothing special about equality versus any other match predicate. Transitivity is irrelevant here. On the other hand, symmetry is a concern, as 'item == target' and 'target == item' could be different and the result of the function would depend on which is used.
--
Terry Jan Reedy

From raymond.hettinger at gmail.com  Sat Sep 14 22:54:06 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 14 Sep 2013 13:54:06 -0700
Subject: [Python-ideas] Should we improve `dir`?
In-Reply-To:
References:
Message-ID:

On Sep 14, 2013, at 12:15 PM, David Halter wrote:

> After discussions, I realized that I should probably bring this up in python-ideas, I think the current implementation can be very confusing for people trying to introspect classes with `dir`, which is IMHO its typical use case.

The current behavior of dir() is a bit irritating when I am teaching how Python works.

That said, the irritation is minor and easily overcome.

I would not want to change the behavior and risk breaking existing introspection code (that code tends to be more fragile and implementation-dependent than most other code).

In other words, I just don't think it is worth changing something that has been in place for a very long time. The minor benefit doesn't warrant the downsides that go with API churn.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From raymond.hettinger at gmail.com  Sat Sep 14 23:03:09 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 14 Sep 2013 14:03:09 -0700
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To:
References:
Message-ID:

On Sep 14, 2013, at 1:10 PM, Serhiy Storchaka wrote:

> I have no good use cases.

That should be the end of the story ;-)

Also, we need to have a strong preference to keep the core APIs small. Python is becoming harder and harder to teach -- it no longer "fits in your head". If you look at mapping and set APIs in other languages, you will see that this particular feature creep has usually been deemed unnecessary. The most important thing we can do for Python is to teach how to use the core objects to solve problems rather than trying to add a method for every single idea that has ever occurred to us.
Dictionaries and lists are very flexible tools. We need to teach people to use them to solve simple problems:

canonical = {}

def intern(obj):
    'Return a canonical member of an equivalence class'
    if obj in canonical:
        return canonical[obj]
    canonical[obj] = obj
    return obj

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info  Sun Sep 15 01:07:57 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 15 Sep 2013 09:07:57 +1000
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To:
References:
Message-ID: <20130914230756.GQ16820@ando>

On Sat, Sep 14, 2013 at 09:52:30AM -0700, David Mertz wrote:

> def getexact(m, v):
>     for x in m:
>         if x==v: return x
>     else:
>         raise KeyError(v)

This has the flaw that it is O(N) rather than O(1). It's really quite unfortunate to have dicts and sets able to access keys in (almost) constant time, but not be able to communicate that key back to the caller except by walking the entire dict/set.

Now O(N) is tolerable if all you want to do is retrieve the canonical version of a single key. But if you want to do so for *all* of the keys in the dict, the naive way to do it ends up walking the dict for each key, giving O(N**2) in total. Is there a non-naive way to speed this up? I haven't had breakfast yet so I can't think of one :-)

My feeling here is that for ordinary dicts, needing to retrieve the canonical key is rare enough that they don't need a dedicated method to do so. But for the TransformDict suggested on the python-dev list, it will be a common need, and deserves an O(1) lookup method.

--
Steven

From tjreedy at udel.edu  Sun Sep 15 01:12:08 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 14 Sep 2013 19:12:08 -0400
Subject: [Python-ideas] Should we improve `dir`?
In-Reply-To: References: Message-ID: On 9/14/2013 4:54 PM, Raymond Hettinger wrote: > On Sep 14, 2013, at 12:15 PM, David Halter > > wrote: > >> After discussions, I realized that I should probably bring this up in >> python-ideas, I think the current implementation can be very confusing >> for people trying to introspect classes with `dir`, which is IMHO its >> typical use case. > > The current behavior of dir() is a bit irritating when I am teaching how > Python works. > > That said, the irritation is minor and easily overcome. > > I would not want to change the behavior and risk breaking > existing introspection code (that code is tends to be more > fragile and implementation than most other code). This was the basis for rejecting http://bugs.python.org/issue19002 ``dir`` function does not work correctly with classes. The proposal obviously broke pydoc and inspect modules. > In other words, I just don't think it is worth changing something > that has been in-place for a very long long time. The minor > benefit doesn't want the downsides that goes with API churn. The other point is that people *usually* call dir(cls) in order to find the methods they can call in instances of cls: -- Terry Jan Reedy From steve at pearwood.info Sun Sep 15 01:26:54 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 15 Sep 2013 09:26:54 +1000 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: <20130914232654.GR16820@ando> On Sat, Sep 14, 2013 at 01:56:25PM -0400, Terry Reedy wrote: [...] > > and even my quick example isn't > >a completely absurd data type (it would need to be fleshed out better, > >but a FuzzyNumber could well have sensible purposes). > > The only absurd thing is calling similarity 'equality' ;=). While I agree with the general thrust of your post, I'd like to point out that the creator of APL, Ken Iverson did not agree with you. 
A couple of relevant quotes:

In an early talk Ken was explaining the advantages of tolerant comparison. A member of the audience asked incredulously, "Surely you don't mean that when A=B and B=C, A may not equal C?" Without skipping a beat, Ken replied, "Any carpenter knows that!" and went on to the next question. -- quoted by Paul Berry

The intransitivity of [tolerant] equality is well known in practical situations and can be easily demonstrated by sawing several pieces of wood of equal length. In one case, use the first piece to measure subsequent lengths; in the second case, use the last piece cut to measure the next. Compare the lengths of the two final pieces. -- Richard Lathwell, APL Comparison Tolerance, APL76, 1976

Mathematicians and programmers treat transitivity as far more fundamental than it actually is in real life :-)

--
Steven

From mertz at gnosis.cx  Sun Sep 15 01:49:04 2013
From: mertz at gnosis.cx (David Mertz)
Date: Sat, 14 Sep 2013 16:49:04 -0700
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: <20130914230756.GQ16820@ando>
References: <20130914230756.GQ16820@ando>
Message-ID:

> On Sat, Sep 14, 2013 at 09:52:30AM -0700, David Mertz wrote:
> > def getexact(m, v):
> >     for x in m:
> >         if x==v: return x
> >     else:
> >         raise KeyError(v)
>
> This has the flaw that it is O(N) rather than O(1).

It's true, it is relatively inefficient. But we also don't have a use case where we actually need to do this enough that it matters.

> Now O(N) is tolerable if all you want to do is retrieve the canonical
> version of a single key. But if you want to do so for *all* of the keys
> in the dict, the naive way to do it ends up walking the dict for
> each key,

I thought the naive way to retrieve ALL the keys (in canonical form) was 'mydict.keys()'. :-) I'm sure you can spin other variations on this.
Here's all the keys except one, removed using a non-canonical equivalent value: >>> {1:2, 3:4, 5:6, 7:8}.keys() - {Decimal(1.0)} {3, 5, 7} It's true though that I can't think of an efficient way to get the canonical form of a key from a dictionary in O(1). But also I think it is rare enough not to worry, and TransformDict is a good specialization that will do this in those unusual cases where we care. > giving O(N**2) in total. Is there a non-naive way to speed > this up? I haven't had breakfast yet so I can't think of one :-) > > My feeling here is that for ordinary dicts, needing to retrieve the > canonical key is rare enough that they don't need a dedicated method to > do so. But for the TransformDict suggested on the python-dev list, it > will be a common need, and deserves an O(1) lookup method. > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Sun Sep 15 02:32:48 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 14 Sep 2013 17:32:48 -0700 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: Message-ID: <8D234C67-1B8E-4653-9AA9-86DE1E5A1EC5@gmail.com> On Sep 14, 2013, at 8:09 AM, Serhiy Storchaka wrote: > I propose to add two methods: > > dict.getkey(key) returns original key stored in the dict which is equal to specified key. E.g. 
For what it's worth:

* this idea was proposed and rejected at least once before

* the one obvious way to do it is an interning dictionary that maps values back to themselves

* the use cases for this are somewhat uncommon (i.e. most people don't need it most of the time)

* if you think you really need this functionality, it is possible to write a function that works with all containers as they are already implemented (i.e. you could use it today): http://code.activestate.com/recipes/499299-get_equivalentcontainer-item

* all the participants on this list would be well served to teach some Python classes to get an appreciation of the negative consequences of further expanding the APIs of the core containers. Bigger is not better. Learnability matters.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dreamingforward at gmail.com  Sun Sep 15 03:37:05 2013
From: dreamingforward at gmail.com (Mark Janssen)
Date: Sat, 14 Sep 2013 18:37:05 -0700
Subject: [Python-ideas] Should we improve `dir`?
In-Reply-To:
References:
Message-ID:

> I recently stumbled over `dir()` not working correctly in the case of
> classes:

Not working correctly? That would imply an adequate definition of "correctness". Dir should divulge all method and attribute names of a class -- a "directory", as it were. In my opinion, it should not report __bases__, __name__, __doc__, or __class__ -- all of which are meta-things not meant for the user of a class. If a programmer wants to see more, then the inspect module would presumably be appropriate, or simply calling for help().

--mark

> > http://jedidjah.ch/code/2013/9/8/wrong_dir_function/
>
> In short:
>
> `dir` doesn't list the `type` methods, which it should in my opinion,
> because there are very important attributes in there like `__name__` or
> `__bases__`.
>
> This led to some confusion in the past, e.g.
> http://www.gossamer-threads.com/lists/python/python/507363.
>
> The long version is in the above link.
> > After discussions, I realized that I should probably bring this up in > python-ideas, I think the current implementation can be very confusing for > people trying to introspect classes with `dir`, which is IMHO its typical > use case. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- MarkJ Tacoma, Washington From ncoghlan at gmail.com Sun Sep 15 03:35:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Sep 2013 11:35:37 +1000 Subject: [Python-ideas] Should we improve `dir`? In-Reply-To: References: Message-ID: On 15 Sep 2013 09:13, "Terry Reedy" wrote: > > On 9/14/2013 4:54 PM, Raymond Hettinger wrote: > >> On Sep 14, 2013, at 12:15 PM, David Halter >> > > wrote: >> >>> After discussions, I realized that I should probably bring this up in >>> python-ideas, I think the current implementation can be very confusing >>> for people trying to introspect classes with `dir`, which is IMHO its >>> typical use case. >> >> >> The current behavior of dir() is a bit irritating when I am teaching how >> Python works. >> >> That said, the irritation is minor and easily overcome. >> >> I would not want to change the behavior and risk breaking >> existing introspection code (that code is tends to be more >> fragile and implementation than most other code). > > > This was the basis for rejecting http://bugs.python.org/issue19002 > ``dir`` function does not work correctly with classes. > The proposal obviously broke pydoc and inspect modules. > > >> In other words, I just don't think it is worth changing something >> that has been in-place for a very long long time. The minor >> benefit doesn't want the downsides that goes with API churn. 
> > > The other point is that people *usually* call dir(cls) in order to find the methods they can call in instances of cls: Right, this is one of the behavioural differences between classes and instances, and, as far as I am aware, it isn't an accident. Not only that, but (as Raymond pointed out) even if it was originally an accident it's too late to change it now. If introspection tools want to show all the operations available *on the class*, then they need to include "dir(type(cls))" as well. So there may be a legitimate feature request for a new section in the pydoc output showing "class only" methods and attributes. Cheers, Nick. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Sun Sep 15 03:47:41 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sat, 14 Sep 2013 18:47:41 -0700 Subject: [Python-ideas] Should we improve `dir`? In-Reply-To: References: Message-ID: > Right, this is one of the behavioural differences between classes and > instances, and, as far as I am aware, it isn't an accident. In fact, I'd argue that is a critical distinction. A language definition has to help enforce this distinction, otherwise confusion abounds. > If introspection tools want to show all the operations available *on the > class*, then they need to include "dir(type(cls))" as well. If users want to find the operations available "on the class", they should learn Python. --mark From rosuav at gmail.com Sun Sep 15 04:59:49 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 15 Sep 2013 12:59:49 +1000 Subject: [Python-ideas] Should we improve `dir`? 
In-Reply-To: References: Message-ID: On Sun, Sep 15, 2013 at 11:47 AM, Mark Janssen wrote: >> Right, this is one of the behavioural differences between classes and >> instances, and, as far as I am aware, it isn't an accident. > > In fact, I'd argue that is a critical distinction. A language > definition has to help enforce this distinction, otherwise confusion > abounds. > >> If introspection tools want to show all the operations available *on the >> class*, then they need to include "dir(type(cls))" as well. > > If users want to find the operations available "on the class", they > should learn Python. The other day I was looking for __bases__ but couldn't remember what it was called. I did the obvious thing and used IDLE's tab completion... and it wasn't there. There definitely is value in having those sorts of things be in dir(). ChrisA From anthonyfk at gmail.com Sun Sep 15 05:12:37 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Sat, 14 Sep 2013 21:12:37 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects Message-ID: We tend to do a lot of iterating over dictionaries in our product in some performance critical areas. It occurred to me that allocating a new iterator object every single time seems a little wasteful, especially considering that there's probably only a handful of them alive at any time. Doing a quick test with dictiterobject and 3 free lists (one for Keys, Values and Items) showed about a 4% speedup in this (best) case: python -m timeit -s "a = {'k%d' % i: i for i in xrange($2)}" "[_ for _ in a.iteritems()]" However, this seems like almost too simple of an idea. Has this been tried before? Would it be too little gain? Given that the extra memory used is negligible (3 of each iterator?), how much of a performance gain would be needed to justify it? Thanks, -Kyle -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raymond.hettinger at gmail.com Sun Sep 15 05:28:50 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 14 Sep 2013 20:28:50 -0700 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: References: Message-ID: On Sep 14, 2013, at 8:12 PM, Kyle Fisher wrote: > We tend to do a lot of iterating over dictionaries in our product in some performance critical areas. It occurred to me that allocating a new iterator object every single time seems a little wasteful, especially considering that there's probably only a handful of them alive at any time. Doing a quick test with dictiterobject and 3 free lists (one for Keys, Values and Items) showed about a 4% speedup in this (best) case: It is surprising that you saw any performance gain at all. Python already has a default Python freelist scheme in the _PyObject_Malloc() function in Objects/obmalloc.c. Another thought is that this isn't an inner-loop optimization. The O(1) time for iterator creation is dominated by the O(n) time to actually iterate over the dict keys, values, and items. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthonyfk at gmail.com Sun Sep 15 07:04:53 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Sat, 14 Sep 2013 23:04:53 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: References: Message-ID: On Sat, Sep 14, 2013 at 9:28 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > It is surprising that you saw any performance gain at all. > > Python already has a default Python freelist scheme > in the _PyObject_Malloc() function in Objects/obmalloc.c. > > Another thought is that this isn't an inner-loop optimization. > The O(1) time for iterator creation is dominated by the O(n) > time to actually iterate over the dict keys, values, and items. 
>
> Raymond
>

Hi Raymond,

Taking a look at _PyObject_Malloc in Objects/obmalloc.c, I see that it needs to do some lock and unlock operations. Perhaps it's the avoidance of this overhead that I'm seeing? After all, there must be a reason that dict, tuple and others are keeping their own free lists, right?

I'm curious what the overhead in creating the iterator is compared to the time to iterate. Obviously there's an O(1) / O(n) difference, but perhaps the constant setup time dominates for smaller values of n? In our case, we are often doing something like the following (2.7):

def onNewData(datapoints):
    for dp in datapoints:
        for val in dp.outputs.itervalues():
            pass  # Do things with val
        for status in dp.statuses.itervalues():
            pass  # Do things with status

Where datapoints can have 100000 items and "outputs" and "statuses" tend to be small. So, while creating the iterator obviously isn't the slowest part of the code, it does have some impact.

Cheers,
-Kyle

P.S. - I'm a newbie to the mailing list, so if I'm replying "wrong" sorry about that!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: list_iterator_freelist.patch
Type: application/octet-stream
Size: 2540 bytes
Desc: not available
URL:

From mal at egenix.com  Sun Sep 15 12:56:15 2013
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 15 Sep 2013 12:56:15 +0200
Subject: [Python-ideas] Keep free list of popular iterator objects
In-Reply-To:
References:
Message-ID: <5235924F.50907@egenix.com>

On 15.09.2013 08:26, Kyle Fisher wrote:
> I've realized that my original example is far too complex, so I've
> simplified it:
>
> Status quo:
> ./python -m timeit -r 100 -s "a=[1]" "iter(a)"
> 10000000 loops, best of 100: 0.0662 usec per loop
>
> With patch:
> ./python -m timeit -r 100 -s "a=[1]" "iter(a)"
> 10000000 loops, best of 100: 0.0557 usec per loop
> List iter allocations: 6
> List iter reuse through freelist: 1011111554
> 100.00% reuse rate
>
> Which seems to show a 15% speedup. I'd be curious what others get.

I'd suggest to open a ticket for this and then continue the discussion there.

Given how often iterators are used nowadays in Python, a separate free list may actually make sense (for the same reasons it makes sense to have them around for lists, tuples, etc.).

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 15 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2013-09-11: Released eGenix PyRun 1.3.0 ...       http://egenix.com/go49
2013-09-04: Released eGenix pyOpenSSL 0.13.2 ...  http://egenix.com/go48
2013-09-20: PyCon UK 2013, Coventry, UK ...                 5 days to go

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Sun Sep 15 13:27:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 15 Sep 2013 13:27:58 +0200 Subject: [Python-ideas] Keep free list of popular iterator objects References: Message-ID: <20130915132758.4cd3e697@fsol> On Sat, 14 Sep 2013 23:04:53 -0600 Kyle Fisher wrote: > On Sat, Sep 14, 2013 at 9:28 PM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > > > > > It is surprising that you saw any performance gain at all. > > > > Python already has a default Python freelist scheme > > in the _PyObject_Malloc() function in Objects/obmalloc.c. > > > > Another thought is that this isn't an inner-loop optimization. > > The O(1) time for iterator creation is dominated by the O(n) > > time to actually iterate over the dict keys, values, and items. > > > > Raymond > > > > > Hi Raymond, > > Taking a look at _PyObject_Malloc in Objects/obmalloc.c, I see that it > needs to do some lock and unlock operations. Please read carefully. The lock and unlock "operations" are no-ops. Regards Antoine. From solipsis at pitrou.net Sun Sep 15 13:30:23 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 15 Sep 2013 13:30:23 +0200 Subject: [Python-ideas] Keep free list of popular iterator objects References: <5235924F.50907@egenix.com> Message-ID: <20130915133023.56566d75@fsol> On Sun, 15 Sep 2013 12:56:15 +0200 "M.-A. 
Lemburg" wrote: > On 15.09.2013 08:26, Kyle Fisher wrote: > > I've realized that my original example is far too complex, so I've > > simplified it: > > > > Status quo: > > ./python -m timeit -r 100 -s "a=[1]" "iter(a)" > > 10000000 loops, best of 100: 0.0662 usec per loop > > > > With patch: > > ./python -m timeit -r 100 -s "a=[1]" "iter(a)" > > 10000000 loops, best of 100: 0.0557 usec per loop > > List iter allocations: 6 > > List iter reuse through freelist: 1011111554 > > 100.00% reuse rate > > > > Which seems to show a 15% speedup. I'd be curious what others get. > > I'd suggest to open a ticket for this and then continue > the discussion there. > > Given how often iterators are used nowadays in Python, a separate > free list may actually make sense (for the same reasons it makes > sense to have around for lists, tuples, etc.). I'm -1 on adding freelists everywhere. A best-case 15% improvement on a trivial microbenchmark probably means a 0% improvement on real-world workloads. Furthermore, using specialized freelists will increase memory fragmentation and prevent the main allocator from returning memory to the system. Regards Antoine. From mal at egenix.com Sun Sep 15 13:52:39 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 15 Sep 2013 13:52:39 +0200 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: <20130915133023.56566d75@fsol> References: <5235924F.50907@egenix.com> <20130915133023.56566d75@fsol> Message-ID: <52359F87.2030409@egenix.com> On 15.09.2013 13:30, Antoine Pitrou wrote: > On Sun, 15 Sep 2013 12:56:15 +0200 > "M.-A. 
Lemburg" wrote: >> On 15.09.2013 08:26, Kyle Fisher wrote: >>> I've realized that my original example is far too complex, so I've >>> simplified it: >>> >>> Status quo: >>> ./python -m timeit -r 100 -s "a=[1]" "iter(a)" >>> 10000000 loops, best of 100: 0.0662 usec per loop >>> >>> With patch: >>> ./python -m timeit -r 100 -s "a=[1]" "iter(a)" >>> 10000000 loops, best of 100: 0.0557 usec per loop >>> List iter allocations: 6 >>> List iter reuse through freelist: 1011111554 >>> 100.00% reuse rate >>> >>> Which seems to show a 15% speedup. I'd be curious what others get. >> >> I'd suggest to open a ticket for this and then continue >> the discussion there. >> >> Given how often iterators are used nowadays in Python, a separate >> free list may actually make sense (for the same reasons it makes >> sense to have them around for lists, tuples, etc.). > > I'm -1 on adding freelists everywhere. Not everywhere :-) Just for objects that are often created and freed again. > A best-case 15% improvement on a > trivial microbenchmark probably means a 0% improvement on real-world > workloads. Furthermore, using specialized freelists will increase > memory fragmentation and prevent the main allocator from returning > memory to the system. Keeping e.g. a hundred such objects in a free list shouldn't really affect the memory load of the Python interpreter. A 15% improvement isn't a lot, but such small improvements add up if they are consistent and the net result is an overall performance improvement. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 15 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... 
http://egenix.com/go49
2013-09-04: Released eGenix pyOpenSSL 0.13.2 ...  http://egenix.com/go48
2013-09-20: PyCon UK 2013, Coventry, UK ...                 5 days to go

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From solipsis at pitrou.net  Sun Sep 15 14:09:53 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 15 Sep 2013 14:09:53 +0200
Subject: [Python-ideas] Keep free list of popular iterator objects
References: <5235924F.50907@egenix.com> <20130915133023.56566d75@fsol> <52359F87.2030409@egenix.com>
Message-ID: <20130915140953.2b2a751a@fsol>

On Sun, 15 Sep 2013 13:52:39 +0200
"M.-A. Lemburg" wrote:
> > A best-case 15% improvement on a
> > trivial microbenchmark probably means a 0% improvement on real-world
> > workloads. Furthermore, using specialized freelists will increase
> > memory fragmentation and prevent the main allocator from returning
> > memory to the system.
>
> Keeping e.g. a hundred such objects in a free list shouldn't
> really affect the memory load of the Python interpreter.

Well, it can. The object allocator uses 256KB arenas, so if each of the hundred objects in the free list keeps a different arena alive, we are talking about a 25 MB fragmentation overhead.

Yes, that's a worst-case (and unrealistic for common workloads) overhead, but the 15% improvement is a best-case (and very unrealistic for common workloads) performance gain :-)

> A 15% improvement isn't a lot, but such small improvements
> add up if they are consistent and the net result is an overall
> performance improvement.

I've grown skeptical that such small improvements actually "add up" to something significant.
Anyway, if there's a non-trivial benchmark that can measure the real-world potential of this optimization, it would help the discussion :-)

Regards

Antoine.

From mal at egenix.com  Sun Sep 15 15:50:12 2013
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 15 Sep 2013 15:50:12 +0200
Subject: [Python-ideas] Keep free list of popular iterator objects
In-Reply-To: <20130915140953.2b2a751a@fsol>
References: <5235924F.50907@egenix.com> <20130915133023.56566d75@fsol> <52359F87.2030409@egenix.com> <20130915140953.2b2a751a@fsol>
Message-ID: <5235BB14.8080208@egenix.com>

On 15.09.2013 14:09, Antoine Pitrou wrote:
> On Sun, 15 Sep 2013 13:52:39 +0200
> "M.-A. Lemburg" wrote:
>>> A best-case 15% improvement on a
>>> trivial microbenchmark probably means a 0% improvement on real-world
>>> workloads. Furthermore, using specialized freelists will increase
>>> memory fragmentation and prevent the main allocator from returning
>>> memory to the system.
>>
>> Keeping e.g. a hundred such objects in a free list shouldn't
>> really affect the memory load of the Python interpreter.
>
> Well, it can. The object allocator uses 256KB arenas, so if each of
> the hundred objects in the free list keeps a different arena alive, we
> are talking about a 25 MB fragmentation overhead.
>
> Yes, that's a worse case (and irrealistic for common workloads)
> overhead, but the 15% improvement is a best case (and very irrealistic
> for common workloads) performance gain :-)

The trick here is to preallocate the pool of those 100 iterator objects, so you only use one such arena - hopefully the one that's also used for the other free lists :-)

>> A 15% improvement isn't a lot, but such small improvements
>> add up if they are consistent and the net result is an overall
>> performance improvement.
>
> I've grown skeptical that such small improvements actually "add up" to
> something significant.
> Performance differences between CPython versions
> can generally be attributed to one or two important changes (hopefully
> improvements :-)) such as e.g. PEP 393, the method lookup cache, or
> new-style classes.

For Python 1.5 I had done a whole series of such smaller improvements.
The net effect was a speedup of between 20-30%, so I wouldn't be too
skeptical :-)

What's important about such small enhancements is that they provide
consistent speedups and have a sane ratio between
complexity/maintenance overhead and performance improvement.

> Anyway, if there's a non-trivial benchmark that can measure the
> real-world potential of this optimization, it would help the
> discussion :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 15 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

2013-09-11: Released eGenix PyRun 1.3.0 ...       http://egenix.com/go49
2013-09-04: Released eGenix pyOpenSSL 0.13.2 ...  http://egenix.com/go48
2013-09-20: PyCon UK 2013, Coventry, UK ...                 5 days to go

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From ethan at stoneleaf.us  Sun Sep 15 16:43:40 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 15 Sep 2013 07:43:40 -0700
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: <8D234C67-1B8E-4653-9AA9-86DE1E5A1EC5@gmail.com>
References: <8D234C67-1B8E-4653-9AA9-86DE1E5A1EC5@gmail.com>
Message-ID: <5235C79C.6030907@stoneleaf.us>

On 09/14/2013 05:32 PM, Raymond Hettinger wrote:
>
> Bigger is not better. Learnability matters.
+1

Which may sound strange coming from someone who was/is a proponent of
adding Enum, TransformDict, and stats.

A key distinction is that Python the language is not the same as the
stdlib that ships with Python. Python the language should grow very
slowly. The stdlib should grow in order to provide a consistent,
stable, and sane user experience. This includes criteria such as:

- is it widely re-implemented? (indicating a common need, and probably
  multiple slightly different APIs)
- is it easy to get wrong? (indicating a complex subject)
- is it a quickly changing field? (indicating an often changing code base)

The first two are reasons why inclusion could be a good thing, the last
a reason why inclusion would be a bad thing.

-- 
~Ethan~

From raymond.hettinger at gmail.com  Sun Sep 15 18:38:39 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 15 Sep 2013 09:38:39 -0700
Subject: [Python-ideas] Keep free list of popular iterator objects
In-Reply-To: 
References: 
Message-ID: <0D2417A7-18FB-4520-B22B-7B2385D93603@gmail.com>

On Sep 14, 2013, at 11:26 PM, Kyle Fisher wrote:

> I've realized that my original example is far too complex, so I've
> simplified it:
>
> Status quo:
> ./python -m timeit -r 100 -s "a=[1]" "iter(a)"
> 10000000 loops, best of 100: 0.0662 usec per loop
>
> With patch:
> ./python -m timeit -r 100 -s "a=[1]" "iter(a)"
> 10000000 loops, best of 100: 0.0557 usec per loop
> List iter allocations: 6
> List iter reuse through freelist: 1011111554
> 100.00% reuse rate
>
> Which seems to show a 15% speedup. I'd be curious what others get.

This 15% claim is incredibly deceptive. You're looping over a list of
length one and the "benefits" fall away immediately for anything longer.
It seems like it is intentionally ignoring that you're optimizing an
O(1) setup step in an O(n) operation. And the timing loop does not
exercise the cases where the freelist misses.
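That amortization argument is easy to check with timeit — a quick sketch, where the absolute numbers will vary by machine and build:

```python
import timeit

# The cost of creating an iterator is fixed (O(1)), while the cost of
# consuming it grows with the list (O(n)).  So any speedup to iter()
# matters less and less as the list gets longer.
for n in (1, 10, 100, 1000):
    setup = "a = list(range(%d))" % n
    create = timeit.timeit("iter(a)", setup=setup, number=10000)
    consume = timeit.timeit("for x in a: pass", setup=setup, number=10000)
    print("n=%4d  iter(a): %.4fs  full loop: %.4fs" % (n, create, consume))
```

Already at n=100 the per-element work dwarfs the one-off iterator-creation cost that the patch shaves.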
More important effects are being masked by the tight timing loop that
only exercises the most favorable case. In real programs, your patch
may actually make performance worse. The default freelisting scheme is
heavily used and tends to always be in cache. In contrast, a freelist
for less frequently used objects tends not to be in cache when you need
it. Similar logic applies to branch prediction here as well. (In short,
I believe that the patch serves only to optimize an unrealistic
benchmark and would make actual programs worse-off).

I'm -1 on adding freelisting to iterators. I recently removed the
freelist scheme from Objects/setobject.c because it provided no
incremental benefit over the default freelisting scheme.

Please focus your optimization efforts elsewhere in the code. There are
real improvements to be had for operations that matter. The time to
create an iterator is one of the least important operations in Python.
If this had been a real win, I would have incorporated it into
itertools long ago.

Raymond

P.S. If you want to help benchmark the effects of aligned versus
unaligned memory allocations, that is an area that is likely to bear
fruit (for example, if integer objects were 32-byte aligned, it would
guarantee that the object head and body would be in the same cache
line).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From anthonyfk at gmail.com  Sun Sep 15 20:52:57 2013
From: anthonyfk at gmail.com (Kyle Fisher)
Date: Sun, 15 Sep 2013 12:52:57 -0600
Subject: [Python-ideas] Keep free list of popular iterator objects
In-Reply-To: 
References: <0D2417A7-18FB-4520-B22B-7B2385D93603@gmail.com>
Message-ID: 

"In the case not, we'll just check an iterator and fall back to the
default freelist."

Should of course be:

"In the case not, we'll just check an integer and fall back to the
default freelist."

On Sun, Sep 15, 2013 at 12:50 PM, Kyle Fisher wrote:
> Hi Raymond,
>
> Thanks for taking the time to respond, I appreciate that!
Please note > that I'm not attempting to be deceptive. I'm not looping over a list of > length one; I'm just timing the creation of the iterator object. In doing > this, I've shown that, in a micro-benchmark, creating an iterator object > can be made 15% faster with a freelist. > > Yes, in one loop this is focusing on an O(1) setup of an O(n) operation. > I think the main benefit of this would be for inner loops. (Which is what > my benchmark tested, no?) In our case, we tend to have a large list to > iterate over (n) where each item has a couple containers that we also need > to iterate over (size m). In this case, I'm focusing on the O(n) setup of > the O(n*m) operation where n is large. Surely this isn't completely > wasteful? > > I'm not completely sure what the best way to exercise the case where the > freelist misses. Create one more than the number of freelisted iterators, > perhaps? I'm not sure if that would be reflective of real-world uses > though. Ignoring threads or other spontaneous iterator creation, either a > particular loop is going to have its iterator in the freelist or not. In > the case not, we'll just check an iterator and fall back to the default > freelist. > > In regards to your second paragraph, would a more real-world benchmark > help? I don't want to put too many more resources into a bad idea, but I > know in our app we tend to iterate over things a lot, so my hunch is that > the iterator freelist would be in cache more often than not. Forgive my > ignorance, but is there a macro-benchmark suite I could try this against? > Even if the iterator freelist isn't in first-level cache, I'm almost > certain it would exist within last-level cache. Do you know what the cost > of fetching from this is compared to grabbing a lock like the current > _Py_Malloc does? > > Again, thanks for the time. 
This is literally the first thing I've tried > hacking into Python because it seemed like a cheap, easy (albeit, minor) > improvement to a very common operation. > > Best, > -Kyle > > P.S. I would like to put some effort into aligned memory allocations! > I've been casually browsing the issue on the bug tracker over the last week > or so; I'm somewhat surprised this isn't already the case for the numerical > types! > > > On Sun, Sep 15, 2013 at 10:38 AM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> On Sep 14, 2013, at 11:26 PM, Kyle Fisher wrote: >> >> I've realized that my original example is far too complex, so I've >> simplified it: >> >> Status quo: >> ./python -m timeit -r 100 -s "a=[1]" "iter(a)" >> 10000000 loops, best of 100: 0.0662 usec per loop >> >> With patch: >> ./python -m timeit -r 100 -s "a=[1]" "iter(a)" >> 10000000 loops, best of 100: 0.0557 usec per loop >> List iter allocations: 6 >> List iter reuse through freelist: 1011111554 >> 100.00% reuse rate >> >> Which seems to show a 15% speedup. I'd be curious what others get. >> >> >> This 15% claim is incredibly deceptive. You're looping over a list of >> length one and the "benefits" fall away immediately for anything longer. >> It seems like it is intentionally ignoring that you're optimizing an O(1) >> setup step in an O(n) operation. And the timing loop does not exercise the >> cases where the freelist misses. >> >> More important effects are being masked by the tight timing loop that >> only exercises the most favorable case. In real programs, your patch may >> actually make performance worse. The default freelisting scheme is heavily >> used and tends to always be in cache. In contrast, a freelist for less >> frequently used objects tend to not be in cache when you need them. >> Similar logic applies to branch prediction here as well. (In short, I >> believe that the patch serves only to optimize an unrealistic benchmark and >> would make actual programs worse-off). 
>> >> I'm -1 on adding freelisting to iterators. I recently removed the >> freelist scheme from Objects/setobject.c because it provided no incremental >> benefit over the default freelisting scheme. >> >> Please focus your optimization efforts elsewhere in the code. There are >> real improvements to be had for operations that matter. The time to create >> an iterator is one of the least important operations in Python. If this >> had been a real win, I would have incorporated it into itertools long ago. >> >> >> Raymond >> >> >> P.S. If you want to help benchmark the effects of aligned versus >> unaligned memory allocations, that is an area this likely to bear fruit >> (for example, if integer objects with 32 byte aligned, it would guarantee >> that the object head and body would be in the same cache line). >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthonyfk at gmail.com Sun Sep 15 20:50:42 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Sun, 15 Sep 2013 12:50:42 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: <0D2417A7-18FB-4520-B22B-7B2385D93603@gmail.com> References: <0D2417A7-18FB-4520-B22B-7B2385D93603@gmail.com> Message-ID: Hi Raymond, Thanks for taking the time to respond, I appreciate that! Please note that I'm not attempting to be deceptive. I'm not looping over a list of length one; I'm just timing the creation of the iterator object. In doing this, I've shown that, in a micro-benchmark, creating an iterator object can be made 15% faster with a freelist. Yes, in one loop this is focusing on an O(1) setup of an O(n) operation. I think the main benefit of this would be for inner loops. (Which is what my benchmark tested, no?) In our case, we tend to have a large list to iterate over (n) where each item has a couple containers that we also need to iterate over (size m). 
In this case, I'm focusing on the O(n) setup of the O(n*m) operation
where n is large. Surely this isn't completely wasteful?

I'm not completely sure what the best way is to exercise the case where
the freelist misses. Create one more than the number of freelisted
iterators, perhaps? I'm not sure if that would be reflective of
real-world uses though. Ignoring threads or other spontaneous iterator
creation, either a particular loop is going to have its iterator in the
freelist or not. In the case not, we'll just check an iterator and fall
back to the default freelist.

In regards to your second paragraph, would a more real-world benchmark
help? I don't want to put too many more resources into a bad idea, but
I know in our app we tend to iterate over things a lot, so my hunch is
that the iterator freelist would be in cache more often than not.
Forgive my ignorance, but is there a macro-benchmark suite I could try
this against? Even if the iterator freelist isn't in first-level cache,
I'm almost certain it would exist within last-level cache. Do you know
what the cost of fetching from this is compared to grabbing a lock like
the current _Py_Malloc does?

Again, thanks for the time. This is literally the first thing I've
tried hacking into Python because it seemed like a cheap, easy (albeit
minor) improvement to a very common operation.

Best,
-Kyle

P.S. I would like to put some effort into aligned memory allocations!
I've been casually browsing the issue on the bug tracker over the last
week or so; I'm somewhat surprised this isn't already the case for the
numerical types!
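The workload shape described here — a large outer list whose items each carry a couple of small containers — looks roughly like the following hypothetical sketch (the function and data are invented for illustration):

```python
# Hypothetical sketch of the n*m workload described above: one outer
# iterator plus a fresh inner iterator per item.  A cheaper iter()
# would be paid for n times per pass, once for each inner loop.
def total(items):
    s = 0
    for item in items:        # creates 1 outer iterator
        for part in item:     # creates a new inner iterator each time
            s += part
    return s

print(total([[1, 2], [3, 4], [5]]))  # -> 15
```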
On Sun, Sep 15, 2013 at 10:38 AM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > On Sep 14, 2013, at 11:26 PM, Kyle Fisher wrote: > > I've realized that my original example is far too complex, so I've > simplified it: > > Status quo: > ./python -m timeit -r 100 -s "a=[1]" "iter(a)" > 10000000 loops, best of 100: 0.0662 usec per loop > > With patch: > ./python -m timeit -r 100 -s "a=[1]" "iter(a)" > 10000000 loops, best of 100: 0.0557 usec per loop > List iter allocations: 6 > List iter reuse through freelist: 1011111554 > 100.00% reuse rate > > Which seems to show a 15% speedup. I'd be curious what others get. > > > This 15% claim is incredibly deceptive. You're looping over a list of > length one and the "benefits" fall away immediately for anything longer. > It seems like it is intentionally ignoring that you're optimizing an O(1) > setup step in an O(n) operation. And the timing loop does not exercise the > cases where the freelist misses. > > More important effects are being masked by the tight timing loop that only > exercises the most favorable case. In real programs, your patch may > actually make performance worse. The default freelisting scheme is heavily > used and tends to always be in cache. In contrast, a freelist for less > frequently used objects tend to not be in cache when you need them. > Similar logic applies to branch prediction here as well. (In short, I > believe that the patch serves only to optimize an unrealistic benchmark and > would make actual programs worse-off). > > I'm -1 on adding freelisting to iterators. I recently removed the > freelist scheme from Objects/setobject.c because it provided no incremental > benefit over the default freelisting scheme. > > Please focus your optimization efforts elsewhere in the code. There are > real improvements to be had for operations that matter. The time to create > an iterator is one of the least important operations in Python. 
If this > had been a real win, I would have incorporated it into itertools long ago. > > > Raymond > > > P.S. If you want to help benchmark the effects of aligned versus > unaligned memory allocations, that is an area this likely to bear fruit > (for example, if integer objects with 32 byte aligned, it would guarantee > that the object head and body would be in the same cache line). > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anthonyfk at gmail.com Sun Sep 15 20:57:29 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Sun, 15 Sep 2013 12:57:29 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: <5235924F.50907@egenix.com> References: <5235924F.50907@egenix.com> Message-ID: Hello Marc-Andre, Thanks for the suggestion! I think I'd like to get a better handle on Raymond's concerns before opening a ticket, as he does bring up some good criticisms. Thanks, -Kyle -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 15 21:18:19 2013 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 15 Sep 2013 14:18:19 -0500 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: References: <0D2417A7-18FB-4520-B22B-7B2385D93603@gmail.com> Message-ID: [ Kyle Fisher] > ... > Even if the iterator freelist isn't in first-level cache, I'm almost certain > it would exist within last-level cache. Do you know what the cost of > fetching from this is compared to grabbing a lock like the current > _Py_Malloc does? It's infinitely more expensive than grabbing a lock ;-) As Antoine noted earlier in this thread, while obmalloc.c is sprinkled with LOCK() and UNLOCK() macros, they all expand to "nothing" - obmalloc.c doesn't actually grab any locks (it relies on the GIL to serialize threads). 
For example, LOCK is defined thusly: #define LOCK() SIMPLELOCK_LOCK(_malloc_lock) and above that there's: #define SIMPLELOCK_LOCK(lock) /* acquire released lock */ Just FYI ;-) From anthonyfk at gmail.com Sun Sep 15 21:19:30 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Sun, 15 Sep 2013 13:19:30 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: References: <0D2417A7-18FB-4520-B22B-7B2385D93603@gmail.com> Message-ID: Well, don't I feel the fool. Thanks. :-) -Kyle On 2013-09-15 1:18 PM, "Tim Peters" wrote: > [ Kyle Fisher] > > ... > > Even if the iterator freelist isn't in first-level cache, I'm almost > certain > > it would exist within last-level cache. Do you know what the cost of > > fetching from this is compared to grabbing a lock like the current > > _Py_Malloc does? > > It's infinitely more expensive than grabbing a lock ;-) As Antoine > noted earlier in this thread, while obmalloc.c is sprinkled with > LOCK() and UNLOCK() macros, they all expand to "nothing" - obmalloc.c > doesn't actually grab any locks (it relies on the GIL to serialize > threads). > > For example, LOCK is defined thusly: > > #define LOCK() SIMPLELOCK_LOCK(_malloc_lock) > > and above that there's: > > #define SIMPLELOCK_LOCK(lock) /* acquire released lock */ > > Just FYI ;-) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Sun Sep 15 21:53:55 2013 From: barry at python.org (Barry Warsaw) Date: Sun, 15 Sep 2013 15:53:55 -0400 Subject: [Python-ideas] Style for multi-line generator expressions References: Message-ID: <20130915155355.2a25a1f1@anarchist> On Sep 14, 2013, at 07:49 AM, Georg Brandl wrote: >On 09/14/2013 07:36 AM, Clay Sweetser wrote: >> PEP 8 currently lacks any suggestions for how multi-line generator expressions >> and list comprehensions should be formatted. 
>> In the absence of any official
>> style suggestion (that I can find), I suggest the style used the most in the
>> standard library.
>>
>> [<expression>
>>  for <variable> in <iterable>
>>  if <condition>]
>>
>> Note, lines could still be combined where it makes sense, eg, the first two
>> lines could be combined if they aren't too long.
>
>But that amounts to simply respecting the line length limit and using
>logical breakpoints. I don't think that requires a special mention in
>PEP 8.

Agreed.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From oscar.j.benjamin at gmail.com  Sun Sep 15 22:02:33 2013
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Sun, 15 Sep 2013 21:02:33 +0100
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: <20130914230756.GQ16820@ando>
References: <20130914230756.GQ16820@ando>
Message-ID: 

On 15 September 2013 00:07, Steven D'Aprano wrote:
> On Sat, Sep 14, 2013 at 09:52:30AM -0700, David Mertz wrote:
>
>> def getexact(m, v):
>>     for x in m:
>>         if x==v: return x
>>     else:
>>         raise KeyError(v)
>
> This has the flaw that it is O(N) rather than O(1). It's really quite
> unfortunate to have dicts and sets able to access keys in (almost)
> constant time, but not be able to communicate that key back to the
> caller except by walking the entire dict/set.

I don't know whether this is relying on undefined behaviour but the
following is O(1) and seems to work:

>>> def canonical_key(d, k):
...     k, = {k} & d.keys()
...     return k
...
>>> canonical_key({1:'q', 2.0:'w'}, 1.0)
1
>>> canonical_key({1:'q', 2.0:'w'}, 2)
2.0


Oscar

From techtonik at gmail.com  Sun Sep 15 22:34:24 2013
From: techtonik at gmail.com (anatoly techtonik)
Date: Sun, 15 Sep 2013 23:34:24 +0300
Subject: [Python-ideas] AST Pretty Printer
In-Reply-To: <0759F11C-2B77-475D-9B2D-C71BD5A95582@yahoo.com>
References: <70c75000-23cb-482d-b12c-4610de34e0b1@email.android.com>
	<0759F11C-2B77-475D-9B2D-C71BD5A95582@yahoo.com>
Message-ID: 

On Fri, Sep 13, 2013 at 3:01 AM, Andrew Barnert wrote:
> On Sep 12, 2013, at 16:31, Ryan wrote:
>
>> I always encounter one problem when dealing with Python ASTs: When I
>> print it, it looks like Lisp (aka Lots of Irritated Superfluous
>> Parenthesis).
>
> Why are the parentheses irritated? Have you been taunting them? :)

These look like smiley monsta to my eyes.

Module([ImportFrom('distutils.core', [alias('setup', None)], 0),
Expr(Call(Name('setup', Load()), [], [keyword('name', Str('astdump')),
keyword('version', Str('1.0')), keyword('author', Str('anatoly
techtonik')), keyword('author_email', Str('techtonik at gmail.com')),
keyword('description', Str('Extract information from Python module
without importing it.')), keyword('license', Str('Public Domain')),
keyword('py_modules', List([Str('astdump')], Load()))], None, None))])

>> In short: it's a mess.
>>
>> My idea is an AST pretty printer built on ast.NodeVisitor. If anyone
>> finds this interesting, I can probably have a prototype of the class
>> between later today and sometime tomorrow.
>
> Yes please!
>
> I'll bet most people who play with ASTs want this, build something
> half-assed, never finish it, and lose it by the next time they look at
> ASTs again three years later... So if you finish something, that'll
> save effort for hundreds of people in the future (who have no idea
> they'll want it one day).

My version of half-assed, semi-finished, only one year fresh and code
complete for what it does.
=)

$ hg clone https://bitbucket.org/techtonik/astdump
$ cd astdump
$ ./astdump.py --generate astdump.py > setup.py
$ ./astdump.py --dump setup.py
Module
  ImportFrom
    alias
  Expr
    Call
      Name
        Load
      keyword
        Str
      keyword
        Str
      keyword
        Str
      keyword
        Str
      keyword
        Str
      keyword
        Str
      keyword
        List
          Str
          Load

Source code is in public domain, latest version:
https://bitbucket.org/techtonik/astdump/src/tip/astdump.py?at=default

The API for dumping is:

    TreeDumper().dump(root)

    class TreeDumper(ast.NodeVisitor):
      def dump(self, node, types=[], level=None, callback=None):
        """pretty-print AST tree
           if `types` is set, process only types in the list
           if `level` is set, limit output to the given depth
           `callback` (if set) will be called to process filtered node
        """

To customize, just supply a callback. Example callbacks:

    def printcb(node, level):
      nodename = node.__class__.__name__
      print(' '*level*2 + nodename)

I played with it on Python 2, but it should be runnable on Python 3
with simple print replacements.
-- 
anatoly t.

From dreamingforward at gmail.com  Sun Sep 15 23:29:10 2013
From: dreamingforward at gmail.com (Mark Janssen)
Date: Sun, 15 Sep 2013 14:29:10 -0700
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: <8D234C67-1B8E-4653-9AA9-86DE1E5A1EC5@gmail.com>
References: <8D234C67-1B8E-4653-9AA9-86DE1E5A1EC5@gmail.com>
Message-ID: 

> * all the participants on this list would be well served to teach
> some python classes to get an appreciation of the negative
> consequences of further expanding the APIs of the core containers.
> Bigger is not better. Learnability matters.

+1 on that. --mark

From elazarg at gmail.com  Mon Sep 16 01:39:46 2013
From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=)
Date: Mon, 16 Sep 2013 01:39:46 +0200
Subject: [Python-ideas] Compressing excepthook output
Message-ID: 

I suggest adding an excepthook that prints out a compressed version of
the stack trace. The new excepthook should be the default at least for
interactive mode.
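A minimal sketch of the idea, collapsing only runs of a single repeated traceback entry — all names here are invented, and real cycle detection (multi-frame cycles) needs more work, as discussed below:

```python
import sys
import traceback

def compress_repeats(entries, threshold=3):
    # Collapse any run of identical entries longer than `threshold`
    # into the first entry plus a one-line summary.
    out = []
    i = 0
    while i < len(entries):
        j = i
        while j < len(entries) and entries[j] == entries[i]:
            j += 1
        run = j - i
        if run > threshold:
            out.append(entries[i])
            out.append('  [previous line repeated %d more times]\n' % (run - 1))
        else:
            out.extend(entries[i:j])
        i = j
    return out

def compressing_excepthook(exctype, value, tb):
    # Format the traceback, squeeze repeated entries, print as usual.
    entries = traceback.format_list(traceback.extract_tb(tb))
    sys.stderr.write('Traceback (most recent call last):\n')
    sys.stderr.writelines(compress_repeats(entries))
    sys.stderr.write('%s: %s\n' % (exctype.__name__, value))

sys.excepthook = compressing_excepthook
```

This only handles direct self-recursion (the same frame repeated back to back); the hard part, as the rest of the message explains, is a cycle spanning several distinct frames.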
The use case is this: you are using an interactive interpreter, or
perhaps eclipse's PyDev, experimenting with some code. The code happens
to have an infinite recursion - maybe an erroneous boundary condition,
maybe the recursion itself was an accident. You didn't catch the
RuntimeError so you get a print of the traceback. This is by default
2000 lines of a highly repetitive call chain. Most likely, a single
cycle repeating some 300 times.

The main problem is that in most environments, by default, you have
only a limited amount of lines kept in the window. So you can't just
scroll up and see what the error was in the first place - where the
entry point into the cycle is. You have to reproduce it, and catch
RuntimeError. You can't just use prints for debugging, either, because
you won't see them. And even if you can see it, you have lost much of
your "history" for nothing.

I have tried to implement an alternative for sys.excepthook (see
below), which compresses the last simple cycle in the call graph. Turns
out it's not trivial, since the traceback object is not well documented
(and maybe it shouldn't be, as it is an implementation detail), so it's
non-trivial (if at all possible) to change the trace list in an
existing traceback. I don't think it is reasonable to just send anyone
interested in such a feature to implement it themselves - especially
given that newcomers are its main target - and even if we do, there is
no simple way to make it a default.

Such a compression will not always help, since the call graph may be
arbitrarily complex, so there has to be some threshold below which
there won't be any compression. This threshold should be chosen after
considering the number of lines accessible by default in common
environments (Linux/Windows terminals, eclipse's console, etc.).
Needless to say, the output should be correct in all cases. I am not
sure that my example implementation is.
Another suggestion, related but distinct: `class
RecursionLimitError(RuntimeError)` should be raised instead of a plain
RuntimeError. One should be able to except this specific case, and
"Exception messages are not part of the Python API".

---
Example of the desired result (non-interactive):

Traceback (most recent call last):
  File "/workspace/compress.py", line 48, in <module>
    bar()
  File "/workspace/compress.py", line 46, in bar
    p()
  File "/workspace/compress.py", line 43, in p
    def p(): p0()
  File "/workspace/compress.py", line 41, in p0
    def p0(): p2()
  File "/workspace/compress.py", line 39, in p2
    def p2(): p()
RuntimeError: maximum recursion depth exceeded
332.67 occurrences of cycle of size 3 detected

Code:

import traceback
import sys

def print_exception(name, value, count, size, newtrace):
    # this is ugly and fragile
    sys.stderr.write('Traceback (most recent call last):\n')
    sys.stderr.writelines(traceback.format_list(newtrace))
    sys.stderr.write('{}: {}\n'.format(name, value))
    sys.stderr.write('{} occurrences of cycle of size {} detected\n'.format(count, size))

def analyze_cycles(tb):
    calls = set()
    size = 0
    for i, call in enumerate(reversed(tb)):
        if size == 0:
            calls.add(call)
            if call == tb[-1]:
                size = i
        elif call not in calls:
            length = i
            break
    return size, length

def cycle_detect_excepthook(exctype, value, trace):
    if exctype is RuntimeError:
        tb = traceback.extract_tb(trace)
        # Feels like a hack here
        if len(tb) >= sys.getrecursionlimit()-1:
            size, length = analyze_cycles(tb)
            count = round(length/size, 2)
            if count >= 2:
                print_exception(exctype.__name__, value, count, size,
                                tb[:-length+size])
                return
    # fall back to the default hook (with the original traceback object)
    sys.__excepthook__(exctype, value, trace)

sys.excepthook = cycle_detect_excepthook

if __name__ == '__main__':
    def p2(): p()
    def p0(): p2()
    def p(): p0()
    def bar(): p()
    bar()

Elazar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From anthonyfk at gmail.com Mon Sep 16 01:50:53 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Sun, 15 Sep 2013 17:50:53 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects Message-ID: Hi Antoine, Thanks for taking the time to respond. Sorry I didn't see your comments earlier, I have my mailing list settings to digest and for some reason they weren't showing up in my inbox. Anyway, I agree that a real-world test case would be best. Marc-Andre tossed out "100 objects" for the free list size, but I'd like to point out that it probably doesn't need to be anywhere near that large. How many iterators are active in the interpreter simultaneously? I think we could get away with only a dozen or so. Perhaps it's best for me at this point to try out the patch in our application and see what some real world results would be. It'd also be nice if there was some other macro-benchmark that I could run this against to verify that it doesn't make things worse, which seems to be Raymond's biggest concern. Is there something like this available? Maybe even just the unit test suite? Thanks, -Kyle -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Sep 16 01:56:25 2013 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 15 Sep 2013 18:56:25 -0500 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: References: Message-ID: [Kyle Fisher] > I've realized that my original example is far too complex, so I've > simplified it: > > Status quo: > ./python -m timeit -r 100 -s "a=[1]" "iter(a)" > 10000000 loops, best of 100: 0.0662 usec per loop > > With patch: > ./python -m timeit -r 100 -s "a=[1]" "iter(a)" > 10000000 loops, best of 100: 0.0557 usec per loop > List iter allocations: 6 > List iter reuse through freelist: 1011111554 > 100.00% reuse rate > > Which seems to show a 15% speedup. Nope! More like 19%. 
Sometime early in my career, benchmarks universally changed from
reporting speedups via:

    (old - new) / old * 100

to:

    (old - new) / new * 100

and:

>>> (0.0662 - 0.0557) / 0.0557 * 100
18.850987432675037

Why did they make this change? So that slowdowns were never shown as
worse than -100%, and especially so that there was no upper bound on
reported speedups ;-)

What I'm surprised by is that you didn't get a larger speedup. You
don't just save cycles allocating when using a free list, you also save
cycles when free'ing the object. When free'ing, obmalloc's
Py_ADDRESS_IN_RANGE alone may consume as many cycles as the free list's
combined allocation and deallocation work. While I tried to make the
common paths in obmalloc.c as fast as possible, it's still a mostly
"general purpose" allocator so has to worry about silly things a
dedicated free list can ignore (like: are they asking for 0 bytes?
asking for something small enough that I _can_ use one of my internal
free lists? if so, which one? did I pass out the pointer they're asking
me to free, or do I have to hand it off to someone else?).

I'm mostly with MAL (Marc-Andre) on this one: it's worth doing if and
only if many "real world" programs would benefit. Unfortunately,
there's never been a clear way to decide that :-(

From ncoghlan at gmail.com  Mon Sep 16 02:00:58 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 16 Sep 2013 10:00:58 +1000
Subject: [Python-ideas] Compressing excepthook output
In-Reply-To: 
References: 
Message-ID: 

Better display of recursion errors sounds reasonable to me, as does
giving them a dedicated subclass.

Step one would be coming up with test cases and a solid implementation
and display format for cyclic call detection. Once that is available,
then using it in the default excepthook for the CPython REPL is a
separate question.

Cheers,
Nick.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info Mon Sep 16 02:07:55 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 16 Sep 2013 10:07:55 +1000
Subject: [Python-ideas] Compressing excepthook output
In-Reply-To: References: Message-ID: <20130916000754.GB7914@ando>

On Mon, Sep 16, 2013 at 01:39:46AM +0200, ????? wrote:
> I suggest adding an excepthook that prints out a compressed version of the
> stack trace. The new excepthook should be the default at least for
> interactive mode.
[...]
> I have tried to implement an alternative for sys.excepthook (see below),
> which compresses the last simple cycle in the call graph. Turns out it's
> not trivial, since the traceback object is not well documented (and maybe
> it shouldn't be, as it is an implementation detail) so it's non trivial (if
> at all possible) to change the trace list in an existing traceback. I don't
> think it is reasonable to just send anyone interested in such a feature to
> implement it themselves - especially given that newcomers are its main
> target - and even if we do, there is no simple way to make it a default.

I like where this is going. Tracebacks for recursive function calls are extremely noisy, with the extra lines rarely giving any useful information.

Have a look at the cgitb module in the standard library.

I think you should start off by cleaning up your traceback handler to be less "ugly and fragile" (your words), if possible, and then consider publishing it on ActiveState's website as a Python recipe. That would be the first step in gathering user feedback and experience in the real world, and if it turns out to be useful in practice, at a later date we can look at adding it to the standard library.
http://code.activestate.com/recipes/ -- Steven From ncoghlan at gmail.com Mon Sep 16 02:30:25 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 Sep 2013 10:30:25 +1000 Subject: [Python-ideas] Compressing excepthook output In-Reply-To: <20130916000754.GB7914@ando> References: <20130916000754.GB7914@ando> Message-ID: On 16 September 2013 10:07, Steven D'Aprano wrote: > On Mon, Sep 16, 2013 at 01:39:46AM +0200, ????? wrote: >> I suggest adding an excepthook that prints out a compressed version of the >> stack trace. The new excepthook should be the default at least for >> interactive mode. > [...] >> I have tried to implement an alternative for sys.excepthook (see below), >> which compresses the last simple cycle in the call graph. Turns out it's >> not trivial, since the traceback object is not well documented (and maybe >> it shouldn't be, as it is an implementation detail) so it's non trivial (if >> at all possible) to change the trace list in an existing traceback. I don't >> think it is reasonable to just send anyone interested in such a feature to >> implement it themselves - especially given that newcomers ate its main >> target - and even if we do, there is no simple way to make it a default. > > I like where this is going. Tracebacks for recursive function calls > are extremely noisy, with the extra lines rarely giving any useful > information. > > Have a look at the cgitb module in the standard library. > > I think you should start off by cleaning up your traceback handler to be > less "ugly and fragile" (your words), if possible, and then consider > publishing it on ActiveState's website as a Python recipe. That would be > the first step in gathering user feedback and experience in the real > world, and if it turns out to be useful in practice, at a later > date we can look at adding it to the standard library. 
> > http://code.activestate.com/recipes/ Another couple of potentially useful pointers: - the traceback.py source is a good place to get more details on how traceback objects work (http://hg.python.org/cpython/file/default/Lib/traceback.py) - you may want to try out the updated traceback extraction API proposed in http://bugs.python.org/issue17911 and see if that cleans up your code. If it helps, that would be good validation of the proposed new API, if it doesn't, it may provide hints for further improvement. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Mon Sep 16 02:30:39 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 16 Sep 2013 10:30:39 +1000 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: <20130914230756.GQ16820@ando> Message-ID: <20130916003039.GC7914@ando> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote: > I don't know whether this is relying on undefined behaviour but the > following is O(1) and seems to work: > > >>> def canonical_key(d, k): > ... k, = {k} & d.keys() > ... return k > ... I'm pretty sure that (1) it relies on implementation-specific behaviour, and (2) it's O(N), not O(1). The implementation-specific part is whether & takes the key from the left-hand or right-hand operand when the keys are equal. For example, in Python 3.3: py> {1} & {1.0} {1.0} py> {1} & {1.0, 2.0} {1} And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk all the keys in both operands? I suppose technically & could special case "one of the operands has length 1" and optimize it, but that too would be an implementation detail. 
-- Steven

From abarnert at yahoo.com Mon Sep 16 03:24:59 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 15 Sep 2013 18:24:59 -0700 (PDT)
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: References: <20130914230756.GQ16820@ando>
Message-ID: <1379294699.25404.YahooMailNeo@web184703.mail.ne1.yahoo.com>

From: Oscar Benjamin
Sent: Sunday, September 15, 2013 1:02 PM

> I don't know whether this is relying on undefined behaviour but the
> following is O(1) and seems to work:
>
> >>> def canonical_key(d, k):
> ...     k, = {k} & d.keys()
> ...     return k

I'm pretty sure it's undefined behavior. It does seem to work with the CPython and PyPy 3.x versions I have around, with every test I throw at it, and if you look through the source you can see why - but there isn't any good reason it should.

set.intersection(other) and set & other don't appear to be documented beyond "Return a new set with elements common to the set and all others" (http://docs.python.org/3/library/stdtypes.html#set.intersection). dict_keys doesn't define what its methods do, beyond saying that the type is "set-like" and implements collections.abc.Set (http://docs.python.org/3/library/stdtypes.html#dictionary-view-objects). And in fact, here you're relying on the fact that dict_keys doesn't actually do the same thing as set. {1} & {1.0, 2.0} gives you {1}, but {1} & {1.0: 0, 2.0: 0}.keys() gives you {1.0}.

As a side note, that "k, = " bit is going to give you an ugly "ValueError: need more than 0 values to unpack" instead of a nice "KeyError: 3" if k isn't in d, so you might want to wrap it in a try to convert the exception.

Meanwhile, there is something that seems like it _should_ be guaranteed to work - but it doesn't. intersection_update says "Update the set, keeping only elements found in it and all others", which seems to say you'll keep the elements in the original set. So this ought to work:

    s = set(d.keys())
    s &= {k}
    s, = s
    return s

But it doesn't.
You have to do it the other way around, which seems to be incorrect:

    s = {k}
    s &= d.keys()
    s, = s
    return s

And in fact, that's the only reason your method works. Ultimately, what {k} & d.keys() does is to call dict_keys.nb_and({k}, d.keys()). If you look at the source (http://hg.python.org/cpython/file/7df61fa27f71/Objects/dictobject.c#l3320), this is basically the backward version that works (except that it makes a copy tmp = set(s), and calls intersection_update instead of using &=).

From abarnert at yahoo.com Mon Sep 16 03:34:53 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 15 Sep 2013 18:34:53 -0700 (PDT)
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: <20130916003039.GC7914@ando>
References: <20130914230756.GQ16820@ando> <20130916003039.GC7914@ando>
Message-ID: <1379295293.70475.YahooMailNeo@web184704.mail.ne1.yahoo.com>

From: Steven D'Aprano
Sent: Sunday, September 15, 2013 5:30 PM

> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote:
>
>> I don't know whether this is relying on undefined behaviour but the
>> following is O(1) and seems to work:
>>
>> >>> def canonical_key(d, k):
>> ...     k, = {k} & d.keys()
>> ...     return k
>> ...
>
> I'm pretty sure that (1) it relies on implementation-specific behaviour,
> and (2) it's O(N), not O(1).
>
> The implementation-specific part is whether & takes the key from the
> left-hand or right-hand operand when the keys are equal. For example, in
> Python 3.3:
>
> py> {1} & {1.0}
> {1.0}
> py> {1} & {1.0, 2.0}
> {1}

Actually, this isn't strictly relevant, because he's calling dict_keys.__rand__, not set.__and__. There's no reason they have to do the same thing - and, in fact, they don't, which is why it works in the first place. But the larger point is valid: both methods are implementation-specific - and the fact that they produce opposite results is a nice illustration of that.
> And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk
> all the keys in both operands? I suppose technically & could special
> case "one of the operands has length 1" and optimize it, but that too
> would be an implementation detail.

No it doesn't. In fact, it _can't_ walk all the keys in both operands; unless they were sorted (and they aren't), there's no way to make that work. Instead, it does an O(N) or O(M) walk over one operand, calling the other operand's __contains__ method for each. But again, the larger point is valid: whichever one it returns the values from is obviously the one it's walking. So, in his case, it's calling {k}.__contains__(n) for each key in d.keys(), which is O(N).

From tjreedy at udel.edu Mon Sep 16 03:39:34 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 15 Sep 2013 21:39:34 -0400
Subject: [Python-ideas] Add dict.getkey() and set.get()
In-Reply-To: <20130916003039.GC7914@ando>
References: <20130914230756.GQ16820@ando> <20130916003039.GC7914@ando>
Message-ID:

On 9/15/2013 8:30 PM, Steven D'Aprano wrote:
> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote:
>
>> I don't know whether this is relying on undefined behaviour but the
>> following is O(1) and seems to work:
>>
>>>>> def canonical_key(d, k):
>> ... k, = {k} & d.keys()
>> ... return k
>> ...
>
> I'm pretty sure that (1) it relies on implementation-specific behaviour,
> and (2) it's O(N), not O(1).
>
> The implementation-specific part is whether & takes the key from the
> left-hand or right-hand operand when the keys are equal. For example, in
> Python 3.3:
>
> py> {1} & {1.0}
> {1.0}
> py> {1} & {1.0, 2.0}
> {1}
>
> And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk
> all the keys in both operands?

No, just the keys in one of the operands. That can be chosen to be the smaller of the two, making it O(min(M,N)). I believe that is what is happening above: the operands are switched when the first is smaller.
I know there was a tracker issue that added this optimization. -- Terry Jan Reedy From abarnert at yahoo.com Mon Sep 16 09:25:57 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 16 Sep 2013 00:25:57 -0700 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: References: <20130914230756.GQ16820@ando> <20130916003039.GC7914@ando> Message-ID: <511EE65A-AE97-476E-8796-C0BB9805A6A9@yahoo.com> On Sep 15, 2013, at 18:39, Terry Reedy wrote: > On 9/15/2013 8:30 PM, Steven D'Aprano wrote: >> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote: >> >>> I don't know whether this is relying on undefined behaviour but the >>> following is O(1) and seems to work: >>> >>>>>> def canonical_key(d, k): >>> ... k, = {k} & d.keys() >>> ... return k >>> ... >> >> I'm pretty sure that (1) it relies on implementation-specific behaviour, >> and (2) it's O(N), not O(1). >> >> The implementation-specific part is whether & takes the key from the >> left-hand or right-hand operand when the keys are equal. For example, in >> Python 3.3: >> >> py> {1} & {1.0} >> {1.0} >> py> {1} & {1.0, 2.0} >> {1} >> >> And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk >> all the keys in both operands? > > No, just the keys in one of the operands. That can be chosen to be the smaller of the two, making it O(min(M,N)). I believe that is what is happening above: the operands are switched when the first is smaller. I know there was a tracker issue that added this optimization. But it doesn't happen for dict_keys, because that ends up calling intersection_update on a copy of the left operand, which means it always walks the right operand (in this case the dict_keys itself), instead of calling set_intersection, as it would on two sets. At any rate, whenever this does what's desired, it does a linear walk of all keys in the dict; conversely, any variant that only walks the single key in {k} ends up returning {k} instead of what you were looking for. 
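None of the intersection tricks above are guaranteed behaviour, and all of them end up walking the whole dict. If you control the mapping yourself, a guaranteed-O(1) getkey() can be had today with an auxiliary key-to-key dict. A hypothetical sketch, not the dict.getkey() proposal itself and not code from this thread:

```python
class KeyedDict(dict):
    """dict that can return the *stored* key object for any equal key.

    Mirrors CPython's dict behaviour of keeping the first key object
    inserted, even when the value is later overwritten via an equal key.
    Sketch only: update(), setdefault(), pop(), etc. are not covered.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._canon = {k: k for k in self}  # key -> stored key object

    def __setitem__(self, key, value):
        if key not in self._canon:  # keep the first key object seen
            self._canon[key] = key
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        del self._canon[key]

    def getkey(self, key):
        """Return the stored key equal to `key`; KeyError if absent."""
        return self._canon[key]


d = KeyedDict({1.0: 'a', 2.0: 'b'})
d[1] = 'c'            # overwrites the value; 1.0 stays the stored key
print(d.getkey(1))    # -> 1.0
print(d[1])           # -> c
```

The lookup in getkey() is a single dict access, so it is O(1) with no reliance on set-intersection internals.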
From solipsis at pitrou.net Mon Sep 16 10:17:31 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 16 Sep 2013 10:17:31 +0200
Subject: [Python-ideas] Keep free list of popular iterator objects
References: Message-ID: <20130916101731.60bb20dd@pitrou.net>

Hi,

Le Sun, 15 Sep 2013 17:50:53 -0600, Kyle Fisher a écrit :
> Hi Antoine,
>
> Thanks for taking the time to respond. Sorry I didn't see your
> comments earlier, I have my mailing list settings to digest and for
> some reason they weren't showing up in my inbox. Anyway, I agree
> that a real-world test case would be best. Marc-Andre tossed out
> "100 objects" for the free list size, but I'd like to point out that
> it probably doesn't need to be anywhere near that large.

I agree. There can't be that many dict iterators in flight at a given time :-)

> Perhaps it's best for me at this point to try out the patch in our
> application and see what some real world results would be. It'd also
> be nice if there was some other macro-benchmark that I could run this
> against to verify that it doesn't make things worse, which seems to
> be Raymond's biggest concern. Is there something like this
> available? Maybe even just the unit test suite?

We have a benchmark suite here:

http://hg.python.org/benchmarks/

It spans the range between micro and macro, but not to the point of running wholesale applications.

Regards
Antoine.
From oscar.j.benjamin at gmail.com Mon Sep 16 11:16:10 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Mon, 16 Sep 2013 10:16:10 +0100 Subject: [Python-ideas] Add dict.getkey() and set.get() In-Reply-To: <511EE65A-AE97-476E-8796-C0BB9805A6A9@yahoo.com> References: <20130914230756.GQ16820@ando> <20130916003039.GC7914@ando> <511EE65A-AE97-476E-8796-C0BB9805A6A9@yahoo.com> Message-ID: On 16 September 2013 08:25, Andrew Barnert wrote: > On Sep 15, 2013, at 18:39, Terry Reedy wrote: > >> On 9/15/2013 8:30 PM, Steven D'Aprano wrote: >>> On Sun, Sep 15, 2013 at 09:02:33PM +0100, Oscar Benjamin wrote: >>> >>>> I don't know whether this is relying on undefined behaviour but the >>>> following is O(1) and seems to work: >>>> >>>>>>> def canonical_key(d, k): >>>> ... k, = {k} & d.keys() >>>> ... return k >>>> ... >>> >>> I'm pretty sure that (1) it relies on implementation-specific behaviour, >>> and (2) it's O(N), not O(1). >>> >>> The implementation-specific part is whether & takes the key from the >>> left-hand or right-hand operand when the keys are equal. For example, in >>> Python 3.3: >>> >>> py> {1} & {1.0} >>> {1.0} >>> py> {1} & {1.0, 2.0} >>> {1} >>> >>> And surely it's O(N) -- to be precise, O(M+N) -- because & has to walk >>> all the keys in both operands? >> >> No, just the keys in one of the operands. That can be chosen to be the smaller of the two, making it O(min(M,N)). I believe that is what is happening above: the operands are switched when the first is smaller. I know there was a tracker issue that added this optimization. > > But it doesn't happen for dict_keys, because that ends up calling intersection_update on a copy of the left operand, which means it always walks the right operand (in this case the dict_keys itself), instead of calling set_intersection, as it would on two sets. 
> > At any rate, whenever this does what's desired, it does a linear walk of all keys in the dict; conversely, any variant that only walks the single key in {k} ends up returning {k} instead of what you were looking for. Ah, right you are. That's unfortunate since it means that set intersection semantics are determined by an optimisation and dict_key intersection lacks an obvious optimisation (to iterate over the smaller set). Note that the optimisation is not incompatible with having a defined "take keys from the left (or from the right)" semantic since the set_contains_entry function that performs the lookup can access the matching key from the same lookup used for the containment test: http://hg.python.org/cpython/file/95b3efe3d7b7/Objects/setobject.c#l689 Oscar From anthonyfk at gmail.com Mon Sep 16 14:37:52 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Mon, 16 Sep 2013 06:37:52 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: <20130916101731.60bb20dd@pitrou.net> References: <20130916101731.60bb20dd@pitrou.net> Message-ID: Fantastic, thank you. -Kyle -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Mon Sep 16 20:49:11 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Mon, 16 Sep 2013 14:49:11 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions Message-ID: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> The practice of using OS functions for time handling has its worst effects on Windows, where many functions are unable to process times from before 1970-01-01 even though there is no reason for Python to have such a limitation. It also results in uneven support for strftime specifiers. Some of these functions also suffer from the Year 2038 problem on OSes with a 32-bit time_t type. 
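The 1970 and 2038 limits are properties of the C library, not of the arithmetic: plain datetime arithmetic already models POSIX timestamps on either side of both limits, on any platform. A minimal sketch using only documented datetime APIs (the function names are illustrative, not from any patch):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def portable_timestamp(dt):
    """POSIX-style timestamp (no leap seconds) for an aware datetime;
    works before 1970 and after 2038, independent of the C time_t."""
    return (dt - EPOCH) / timedelta(seconds=1)

def portable_utc(ts):
    """Inverse: aware UTC datetime for any timestamp in datetime's range."""
    return EPOCH + timedelta(seconds=ts)

print(portable_timestamp(datetime(1969, 12, 31, 23, 59, tzinfo=timezone.utc)))  # -> -60.0
print(portable_utc(2**31))  # -> 2038-01-19 03:14:08+00:00
```

The round trip is exact for any date the datetime type can represent (years 1 through 9999), which is far wider than a 32-bit time_t.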
I propose supplying pure-python implementations (in accordance with PEP 399) for the entire datetime module, and additionally the asctime, strftime, strptime, and gmtime functions in the time module, and calendar.timegm. Unfortunately, functions dealing with local time stamps in the system's idea of local time are still dependent on the platform's C library functions (localtime, mktime, ctime) Or, if this is not practical, supplying alternate implementations of the relevant C functions, and calling these instead wherever these are used. If it is practical to do so, these functions should use python integers as the type for timestamps; if not, they should use 64-bit integers in preference to the platform time_t. Is it reasonable to expose the possibility of an epoch other than 1970 (or of timestamps that handle leap seconds in a different manner than POSIX) at a python level? Even if such a platform ever comes to be supported, it could be done so with a layer that hides these differences. From alexander.belopolsky at gmail.com Mon Sep 16 21:02:13 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 16 Sep 2013 15:02:13 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> Message-ID: On Mon, Sep 16, 2013 at 2:49 PM, wrote: > I propose supplying pure-python implementations (in accordance with PEP > 399) for the entire datetime module > We already have that in python 3.x: http://bugs.python.org/issue7989 I believe it still has some platform dependencies through the time module. The idea to provide pure python implementation of the time module was proposed and rejected: http://bugs.python.org/issue9528 If you would like to improve cross-platform compatibility in this area, I would start with re-implementation of strftime(). 
See http://bugs.python.org/issue3173 -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Mon Sep 16 21:01:18 2013 From: phd at phdru.name (Oleg Broytman) Date: Mon, 16 Sep 2013 23:01:18 +0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> Message-ID: <20130916190118.GA12268@iskra.aviel.ru> On Mon, Sep 16, 2013 at 02:49:11PM -0400, random832 at fastmail.us wrote: > I propose supplying pure-python implementations (in accordance with PEP > 399) for the entire datetime module [...] > Or, if this is not practical, supplying alternate implementations of the > relevant C functions There is a well-known module mx.DateTime. It is not a drop-in replacement for module datetime, but it's quite good for its task and has excellent documentation. eGenix provides binaries for all major OSes and Python versions under a liberal open source license. Take a look at: http://www.egenix.com/products/python/mxBase/mxDateTime/ Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From rob.cliffe at btinternet.com Tue Sep 17 00:48:24 2013 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 16 Sep 2013 23:48:24 +0100 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> Message-ID: <52378AB8.8070608@btinternet.com> From the sublime to the, er ... plebeian? Just an idea for Python 4: Is there any good reason to have separate time and datetime modules? I sometimes find myself spinning my wheels converting between a format supported by one and a format supported by the other. 
Rob Cliffe

On 16/09/2013 19:49, random832 at fastmail.us wrote:
> The practice of using OS functions for time handling has its worst
> effects on Windows, where many functions are unable to process times
> from before 1970-01-01 even though there is no reason for Python to have
> such a limitation. It also results in uneven support for strftime
> specifiers. Some of these functions also suffer from the Year 2038
> problem on OSes with a 32-bit time_t type.
>
> I propose supplying pure-python implementations (in accordance with PEP
> 399) for the entire datetime module, and additionally the asctime,
> strftime, strptime, and gmtime functions in the time module, and
> calendar.timegm. Unfortunately, functions dealing with local time stamps
> in the system's idea of local time are still dependent on the platform's
> C library functions (localtime, mktime, ctime)
>
> Or, if this is not practical, supplying alternate implementations of the
> relevant C functions, and calling these instead wherever these are used.
> If it is practical to do so, these functions should use python integers
> as the type for timestamps; if not, they should use 64-bit integers in
> preference to the platform time_t.
>
> Is it reasonable to expose the possibility of an epoch other than 1970
> (or of timestamps that handle leap seconds in a different manner than
> POSIX) at a python level? Even if such a platform ever comes to be
> supported, it could be done so with a layer that hides these
> differences.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
From ben+python at benfinney.id.au Tue Sep 17 01:14:11 2013
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 17 Sep 2013 09:14:11 +1000
Subject: [Python-ideas] Continued support for ‘time’ and ‘datetime’ modules (was: Reduce platform dependence of date and time related functions)
References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <52378AB8.8070608@btinternet.com>
Message-ID: <7wa9jcz7ek.fsf_-_@benfinney.id.au>

Rob Cliffe writes:

> From the sublime to the, er ... plebeian?

When changing the subject of discussion, please change the Subject field accordingly.

> Just an idea for Python 4: Is there any good reason to have separate
> time and datetime modules?

That's how it's been for a long time. There is now a lot of existing Python code that uses those two modules as they are.

This would not be a good reason for *introducing* such a pair of modules with confusingly-different APIs. But that's not the decision we face today, many years after those modules entered the standard library.

Changes to the standard library API, especially for modules that are in long-established use, must be considered conservatively. And that *is* a good reason to continue having ‘time’ and ‘datetime’ modules which both support the existing behaviour.

> I sometimes find myself spinning my wheels converting between a format
> supported by one and a format supported by the other.

That's a different matter, and does not challenge the continued existence of separate ‘time’ and ‘datetime’ modules. The ‘datetime’ module has grown functionality for working with the data types of the ‘time’ module. What conversions are you lacking from the current ‘datetime’?

--
\ “Pinky, are you pondering what I'm pondering?”
“I think so, | `\ Brain, but if the plural of mouse is mice, wouldn't the plural | _o__) of spouse be spice?” -- _Pinky and The Brain_ | Ben Finney

From steve at pearwood.info Tue Sep 17 01:24:44 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 17 Sep 2013 09:24:44 +1000
Subject: [Python-ideas] Continued support for ‘time’ and ‘datetime’ modules (was: Reduce platform dependence of date and time related functions)
In-Reply-To: <7wa9jcz7ek.fsf_-_@benfinney.id.au>
References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <52378AB8.8070608@btinternet.com> <7wa9jcz7ek.fsf_-_@benfinney.id.au>
Message-ID: <20130916232443.GF19939@ando>

On Tue, Sep 17, 2013 at 09:14:11AM +1000, Ben Finney wrote:
> Rob Cliffe writes:
> > Just an idea for Python 4: Is there any good reason to have separate
> > time and datetime modules?
>
> That's how it's been for a long time. There is now a lot of existing
> Python code that uses those two modules as they are.
[...]
> Changes to the standard library API, especially for modules that are in
> long-established use, must be considered conservatively. And that *is* a
> good reason to continue having ‘time’ and ‘datetime’ modules which both
> support the existing behaviour.

Agreed. But I suggest to Rob, or anyone else who likes the idea of merging the two modules and is willing to do the work, to start off by creating an interface module that wraps the two. Call it (for lack of a better name) "mytime". When the "mytime" module is sufficiently mature, which may require publishing it on PyPI for the public to use, it could potentially be added to the standard library as a high level interface to the lower-level time and datetime modules. That doesn't need to wait for Python 4000.

I'm +0 on the general idea. I don't use either module enough to be annoyed by there being two of them. (Three if you include calendar.)
-- Steven From random832 at fastmail.us Tue Sep 17 15:22:14 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 09:22:14 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> Message-ID: <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> I have an addition to this proposal: struct_time should always provide tm_gmtoff and tm_zone, gmtime should populate them with 0 and GMT*, and if the platform does not provide values localtime should populate them with timezone or altzone and values from tzname depending on if isdst is true after calling the platform localtime function. *The current practice of the reference code of the "tz" project and of at least glibc is to use GMT. If anyone has an argument that it should be UTC or some other value on some platforms, please speak up. From victor.stinner at gmail.com Tue Sep 17 15:31:31 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Sep 2013 15:31:31 +0200 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> Message-ID: 2013/9/17 : > I have an addition to this proposal: struct_time should always provide > tm_gmtoff and tm_zone, gmtime should populate them with 0 and GMT*, and > if the platform does not provide values localtime should populate them > with timezone or altzone and values from tzname depending on if isdst is > true after calling the platform localtime function. In Python, "unknown" is usually written None. It's safer than filling the structure with invalid values. 
Victor From random832 at fastmail.us Tue Sep 17 18:01:43 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 12:01:43 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> Message-ID: <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> On Mon, Sep 16, 2013, at 15:02, Alexander Belopolsky wrote: > On Mon, Sep 16, 2013 at 2:49 PM, wrote: > > I propose supplying pure-python implementations (in accordance with PEP > > 399) for the entire datetime module > > We already have that in python 3.x: > > http://bugs.python.org/issue7989 Sorry - it was unclear to me that simply clicking "browse" from http://hg.python.org/cpython/ did not result in browsing the latest source. (What branch is that? It's not "default") > The idea to provide pure python implementation of the time module was > proposed and rejected: > > http://bugs.python.org/issue9528 This is a much more limited scope than that. I was merely proposing a limited set of functions - this could be implemented in the same way as the posix module, with a small pure python module that imports everything from the larger C module. These could simply be implemented in C instead - are we guaranteed to have a 64-bit integer type available? My main concern (for pure python vs C) was whether or not it is possible to work with greater than 32 bit values on a 32 bit system. If necessary we could do some of the work in double - the input is double, anyway, so it won't be outside that range. 
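Whether pure Python is up to the job is easy to check: the broken-down-time arithmetic needs only Python's arbitrary-precision integers, with no time_t anywhere, so the 1970/2038 limits simply never arise. A sketch of a range-unlimited gmtime core, using the standard era-based civil-from-days algorithm (illustrative only, not code from any proposed patch):

```python
def portable_gmtime_fields(ts):
    """Broken-down UTC time from an integer POSIX timestamp, using only
    Python integer arithmetic -- no time_t, hence no 1970/2038 limits.
    Returns a plain (year, month, day, hour, minute, second) tuple,
    not a struct_time. Era-based civil-from-days algorithm."""
    days, rem = divmod(ts, 86400)          # floor division: works for ts < 0
    hour, rem = divmod(rem, 3600)
    minute, second = divmod(rem, 60)
    days += 719468                          # shift epoch from 1970-01-01 to 0000-03-01
    era, doe = divmod(days, 146097)         # 400-year eras of 146097 days
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    y = yoe + era * 400
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153               # month index, March-based
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    if month <= 2:                          # Jan/Feb belong to the next civil year
        y += 1
    return (y, month, day, hour, minute, second)

print(portable_gmtime_fields(0))      # -> (1970, 1, 1, 0, 0, 0)
print(portable_gmtime_fields(-1))     # -> (1969, 12, 31, 23, 59, 59)
print(portable_gmtime_fields(2**31))  # -> (2038, 1, 19, 3, 14, 8)
```

Because Python's divmod floors toward negative infinity, negative timestamps (pre-1970) fall out of the same code path with no special-casing.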
Do you have any thoughts on the rest of the proposal (that gmtime, timegm, and strftime should have unlimited - or at least not limited to low platform-specific limits like 1970 or 2038 - range, that python "epoch timestamps" should be defined as beginning in 1970 and not including leap seconds regardless of hypothetical [I don't believe any currently supported systems actually do, except to the extent that individual Unix sites can use so-called "right" tz data] systems that may have a time_t that behaves otherwise, that tm_gmtoff and tm_zone should always be provided)? One concern for strftime in particular is locale support. It may be difficult to query the relevant locale data in a portable manner. From alexander.belopolsky at gmail.com Tue Sep 17 18:11:46 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 17 Sep 2013 12:11:46 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 12:01 PM, wrote: > > We already have that in python 3.x: > > > > http://bugs.python.org/issue7989 > > Sorry - it was unclear to me that simply clicking "browse" from > http://hg.python.org/cpython/ did not result in browsing the latest > source. (What branch is that? It's not "default") http://hg.python.org/cpython/file/default/Lib/datetime.py -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Tue Sep 17 18:19:11 2013 From: brett at python.org (Brett Cannon) Date: Tue, 17 Sep 2013 12:19:11 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 12:01 PM, wrote: > On Mon, Sep 16, 2013, at 15:02, Alexander Belopolsky wrote: > > On Mon, Sep 16, 2013 at 2:49 PM, wrote: > > > I propose supplying pure-python implementations (in accordance with PEP > > > 399) for the entire datetime module > > > > We already have that in python 3.x: > > > > http://bugs.python.org/issue7989 > > Sorry - it was unclear to me that simply clicking "browse" from > http://hg.python.org/cpython/ did not result in browsing the latest > source. (What branch is that? It's not "default") > Depends on the last commit (it's an hgweb thing; always specify the branch). > > > The idea to provide pure python implementation of the time module was > > proposed and rejected: > > > > http://bugs.python.org/issue9528 > > This is a much more limited scope than that. I was merely proposing a > limited set of functions - this could be implemented in the same way as > the posix module, with a small pure python module that imports > everything from the larger C module. These could simply be implemented > in C instead - are we guaranteed to have a 64-bit integer type > available? My main concern (for pure python vs C) was whether or not it > is possible to work with greater than 32 bit values on a 32 bit system. > If necessary we could do some of the work in double - the input is > double, anyway, so it won't be outside that range. 
> > Do you have any thoughts on the rest of the proposal (that gmtime, > timegm, and strftime should have unlimited - or at least not limited to > low platform-specific limits like 1970 or 2038 - range, that python > "epoch timestamps" should be defined as beginning in 1970 and not > including leap seconds regardless of hypothetical [I don't believe any > currently supported systems actually do, except to the extent that > individual Unix sites can use so-called "right" tz data] systems that > may have a time_t that behaves otherwise, that tm_gmtoff and tm_zone > should always be provided)? > > One concern for strftime in particular is locale support. It may be > difficult to query the relevant locale data in a portable manner. You also have the issue that if you port strftime then you lose the pure Python port of strptime: http://hg.python.org/cpython/file/default/Lib/_strptime.py -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 17 18:23:39 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 17 Sep 2013 12:23:39 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 12:01 PM, wrote: > Do you have any thoughts on the rest of the proposal (that gmtime, > timegm, and strftime should have unlimited - or at least not limited to > low platform-specific limits like 1970 or 2038 - range, that python > "epoch timestamps" should be defined as beginning in 1970 and not > including leap seconds regardless of hypothetical [I don't believe any > currently supported systems actually do, except to the extent that > individual Unix sites can use so-called "right" tz data] systems that > may 
have a time_t that behaves otherwise, that tm_gmtoff and tm_zone > should always be provided)? > You should review what's new in 3.x documents. Many of the features that you ask for have already been implemented. -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Tue Sep 17 18:27:40 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 12:27:40 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> Message-ID: <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> On Tue, Sep 17, 2013, at 12:19, Brett Cannon wrote: > You also have the issue that if you port strftime then you lose the pure > Python port of strptime: > http://hg.python.org/cpython/file/default/Lib/_strptime.py Why would that make you lose that? I'm not sure I understand. From random832 at fastmail.us Tue Sep 17 18:49:23 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 12:49:23 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> Message-ID: <1379436563.9158.23113441.7127505C@webmail.messagingengine.com> On Tue, Sep 17, 2013, at 12:23, Alexander Belopolsky wrote: > You should review what's new in 3.x documents. Many of the features that > you ask for have already been implemented. To what are you referring? 3.4 what's new mentions no changes related to the time module. 3.3 mentions only new functions unrelated to time conversions. The change mentioned in 3.2 does not fix limitations caused by the platform. 32-bit platforms are still limited by the range of time_t for gmtime [and e.g. 
datetime.fromtimestamp], and MSVC, while having a 64-bit time_t, is limited to positive values (and arbitrarily imposes the same limitation on functions that accept a struct tm, rejecting any time that would, interpreted as local time, result in a value before 1970-01-01 00:00:00 GMT) 3.1 and 3.0 mention no changes to the time module. All of the issues I mentioned apply to 3.3 (You may not have noticed the range issue as it may not apply to your platform, and by "should always be provided" i meant _always_, even if the platform doesn't provide them - they can be populated from timezone/altzone and tzname in that case), and the epoch/leap second thing is still clearly present in the 3.4 docs. I personally confirmed every single issue I mentioned except for the one about a pure-python implementation of datetime (which was because I was misled by the web hg browser), and except for the year 2038 limitation that does not apply on this system, on this version: Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32 From brett at python.org Tue Sep 17 19:02:20 2013 From: brett at python.org (Brett Cannon) Date: Tue, 17 Sep 2013 13:02:20 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 12:27 PM, wrote: > On Tue, Sep 17, 2013, at 12:19, Brett Cannon wrote: > > You also have the issue that if you port strftime then you lose the pure > > Python port of strptime: > > http://hg.python.org/cpython/file/default/Lib/_strptime.py > > Why would that make you lose that? I'm not sure I understand. > strptime is implemented using strftime to get the locale information. 
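For reference, the bootstrapping trick is that locale names can be recovered by formatting known dates with strftime and reading the results back - roughly like this sketch (simplified; the real `_strptime.LocaleTime` also handles month names, AM/PM strings, and ambiguous cases):

```python
import time

def discover_locale_day_names():
    # strftime's %a/%A expand tm_wday using the current locale, so
    # formatting each weekday number once recovers the locale's names.
    # (Simplified sketch of the technique _strptime relies on.)
    abbrev, full = [], []
    for wday in range(7):  # 0 = Monday in struct_time
        # 1999-03-15 was a Monday; step through one week so every field
        # in the struct_time stays self-consistent.
        t = time.struct_time((1999, 3, 15 + wday, 12, 0, 0, wday, 74 + wday, 0))
        abbrev.append(time.strftime('%a', t))
        full.append(time.strftime('%A', t))
    return abbrev, full

print(discover_locale_day_names()[0])
```

This is why the two functions are coupled: whichever one is reimplemented in Python still needs the other (or raw locale data) to know what "Tue" is called in the current locale.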
As you pointed out, getting the locale details is essentially not possible in a cross-platform way unless you use strptime or strftime, so you have to choose which is implemented in Python and relies on the other. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 17 19:29:43 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 17 Sep 2013 13:29:43 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379436563.9158.23113441.7127505C@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379436563.9158.23113441.7127505C@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 12:49 PM, wrote: > 32-bit platforms are still limited by the range of > time_t for gmtime [and e.g. datetime.fromtimestamp], > datetime.fromtimestamp() is not the same as gmtime. You should use datetime.utcfromtimestamp() which is only limited by supported date range (years 1-9999). -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Sep 17 19:41:45 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 17 Sep 2013 13:41:45 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 1:02 PM, Brett Cannon wrote: > As you pointed out, getting the locale details is essentially not possible > in a cross-platform way unless you use strptime or strftime, so you have to > choose which is implemented in Python and relies on the other. 
What we can do is to implement "C" locale behavior. In fact, in many uses of strftime() its locale-dependence is a problem. I would much rather have strftime_l()-like function and "C" locale implemented in stdlib. This is somewhat similar to the situation we have with timezone support: include utc timezone and leave it to third parties to supply interfaces to platform tz databases. -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Tue Sep 17 21:08:49 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 15:08:49 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379436563.9158.23113441.7127505C@webmail.messagingengine.com> Message-ID: <1379444929.17672.23175609.4503967D@webmail.messagingengine.com> On Tue, Sep 17, 2013, at 13:29, Alexander Belopolsky wrote: > On Tue, Sep 17, 2013 at 12:49 PM, wrote: > > > 32-bit platforms are still limited by the range of > > time_t for gmtime [and e.g. datetime.fromtimestamp], > > > > datetime.fromtimestamp() is not the same as gmtime. You should use > datetime.utcfromtimestamp() which is only limited by supported date > range > (years 1-9999). fromtimestamp(timestamp, timezone.utc). And anyway, I was listing it as _another example_ of a function in datetime which is limited by the range of time_t, not as one that is somehow "the same as" gmtime. And even if you want to play this game, you are WRONG WRONG WRONG about utcfromtimestamp: Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> from datetime import * >>> datetime.utcfromtimestamp(-100000) # should be 1969-12-30 20:13:20 Traceback (most recent call last): File "<stdin>", line 1, in <module> OSError: [Errno 22] Invalid argument >>> datetime.utcfromtimestamp(2**63) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: timestamp out of range for platform time_t (I don't care, per se, about 300 billion years from now, but I am 99% certain I'd get the same result for the latter with 2**31 on 32-bit Unix. This was to illustrate that it requires it to be in the range of the platform time_t type.) I feel like you're being deliberately obtuse at this point. From alexander.belopolsky at gmail.com Tue Sep 17 21:24:35 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 17 Sep 2013 15:24:35 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379444929.17672.23175609.4503967D@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379436563.9158.23113441.7127505C@webmail.messagingengine.com> <1379444929.17672.23175609.4503967D@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 3:08 PM, wrote: > fromtimestamp(timestamp, timezone.utc). > > And anyway, I was listing it as _another example_ of a function in > datetime which is limited by the range of time_t, not as one that is > somehow "the same as" gmtime. And even if you want to play this game, > you are WRONG WRONG WRONG about utcfromtimestamp: > I would say this is a bug. Is fromtimestamp(timestamp, timezone.utc) similarly affected? Please submit a bug report. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From anthonyfk at gmail.com Tue Sep 17 22:18:32 2013 From: anthonyfk at gmail.com (Kyle Fisher) Date: Tue, 17 Sep 2013 14:18:32 -0600 Subject: [Python-ideas] Keep free list of popular iterator objects In-Reply-To: <20130916101731.60bb20dd@pitrou.net> References: <20130916101731.60bb20dd@pitrou.net> Message-ID: Story time. I was able to make a build at work with freelists enabled for iterators in dictobject.c, listobject.c and iterobject.c. When running this through our application I saw: 1) When loading several datapoints from database: 0.1% improvement (with a wider-but-forgotten standard deviation). So, no improvement but no ruined performance either. Makes sense since this was mostly an I/O bound task. 2) When parsing in-memory data files: 1.5% improvement. This is approximately what I was expecting, so far so good! At this point I decided to run the benchmark suite Antoine pointed me to. I also realized that I had been testing without some optimizations turned on. I made two new builds, both with "-O3 -DNDEBUG -march=native" and profile guided optimizations turned on. I then added a benchmark to explicitly test tight inner loops. I ran the benchmarks and saw... a 1.02x improvement on the benchmark I made and a 1.04x slow down on two others (nbody, slowunpickle). I then ran our application again and confirmed that all initial speed ups I saw were now lost in the noise. So, thank you everyone for letting me entertain this idea, but it looks like Raymond's hunch was right. :) Cheers, -Kyle -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.us Tue Sep 17 22:58:17 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 16:58:17 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> Message-ID: <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> On Tue, Sep 17, 2013, at 9:31, Victor Stinner wrote: > 2013/9/17 : > > I have an addition to this proposal: struct_time should always provide > > tm_gmtoff and tm_zone, gmtime should populate them with 0 and GMT*, and > > if the platform does not provide values localtime should populate them > > with timezone or altzone and values from tzname depending on if isdst is > > true after calling the platform localtime function. > > In Python, "unknown" is usually written None. It's safer than filling > the structure with invalid values. They're not unknown. The values are provided by the system in global variables. If timezone, altzone, and tzname should not be used, then they should not be provided. You can also determine gmtoff empirically by calling timegm and subtracting the original timestamp from the result. Or you could look at the seconds, minutes, hours, year, and yday members after calling both gmtime and localtime in the first place. 
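That empirical computation fits in a couple of lines (a sketch; it assumes, as proposed above, that timestamps count seconds since 1970 with no leap seconds):

```python
import calendar
import time

def empirical_gmtoff(ts):
    # Render the instant as local broken-down time, then ask "what
    # timestamp would this be if it were UTC?"  The difference is the
    # local UTC offset at that instant, with no tm_gmtoff support needed.
    return calendar.timegm(time.localtime(ts)) - int(ts)

print(empirical_gmtoff(int(time.time())))
```

Because it uses the actual localtime result, this also gets the DST-adjusted offset right, unlike reading timezone/altzone directly.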
From alexander.belopolsky at gmail.com Tue Sep 17 23:21:27 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 17 Sep 2013 17:21:27 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> Message-ID: On Tue, Sep 17, 2013 at 4:58 PM, wrote: > > You can also determine gmtoff empirically by calling timegm and > subtracting the original timestamp from the result. Or you could look at > the seconds, minutes, hours, year, and yday members after calling both > gmtime and localtime in the first place. How is this different from what we do in datetime.astimezone()? # Compute UTC offset and compare with the value implied # by tm_isdst. If the values match, use the zone name # implied by tm_isdst. delta = local - datetime(*_time.gmtime(ts)[:6]) dst = _time.daylight and localtm.tm_isdst > 0 gmtoff = -(_time.altzone if dst else _time.timezone) if delta == timedelta(seconds=gmtoff): tz = timezone(delta, _time.tzname[dst]) else: tz = timezone(delta) http://hg.python.org/cpython/file/default/Lib/datetime.py#l1500 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Tue Sep 17 23:15:51 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 17 Sep 2013 14:15:51 -0700 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> Message-ID: <5238C687.7080207@stoneleaf.us> On 09/17/2013 01:58 PM, random832 wrote: > On Tue, Sep 17, 2013, at 9:31, Victor Stinner wrote: >> >> In Python, "unknown" is usually written None. It's safer than filling >> the structure with invalid values. > > You can also determine gmtoff empirically by calling timegm and > subtracting the original timestamp from the result. Or you could look at > the seconds, minutes, hours, year, and yday members after calling both > gmtime and localtime in the first place. Is timegm/gmtime provided and consistent across all Python platforms? -- ~Ethan~ From random832 at fastmail.us Wed Sep 18 00:30:39 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 18:30:39 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> Message-ID: <1379457039.10549.23252137.42075987@webmail.messagingengine.com> On Tue, Sep 17, 2013, at 17:21, Alexander Belopolsky wrote: > On Tue, Sep 17, 2013 at 4:58 PM, wrote: > > > > You can also determine gmtoff empirically by calling timegm and > > subtracting the original timestamp from the result. 
Or you could look at > > the seconds, minutes, hours, year, and yday members after calling both > > gmtime and localtime in the first place. > > > How is this different from what we do in datetime.astimezone()? Not very different at all, except for the fact where I want the functionality in struct_time to populate tm_gmtoff and tm_zone where it's not available. My goal is to normalize the functionality available on all platforms, to the extent that it's possible, so that people are less likely to write non-portable code and encounter example code that doesn't work. From random832 at fastmail.us Wed Sep 18 03:30:10 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 17 Sep 2013 21:30:10 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <5238C687.7080207@stoneleaf.us> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> Message-ID: <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> On Tue, Sep 17, 2013, at 17:15, Ethan Furman wrote: > Is timegm/gmtime provided and consistent across all Python platforms? Part of what I was proposing was _to_ provide a consistent implementation - there's no reason (if we define timestamps as being objectively based in 1970 and having no leap seconds) that it couldn't be provided in python itself instead of using the system's version. 
From ncoghlan at gmail.com Wed Sep 18 03:37:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Sep 2013 11:37:05 +1000 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> Message-ID: On 18 September 2013 11:30, wrote: > On Tue, Sep 17, 2013, at 17:15, Ethan Furman wrote: >> Is timegm/gmtime provided and consistent across all Python platforms? > > Part of what I was proposing was _to_ provide a consistent > implementation - there's no reason (if we define timestamps as being > objectively based in 1970 and having no leap seconds) that it couldn't > be provided in python itself instead of using the system's version. Yeah, this is a similar change to the one that was made for math.c years ago - stepping up from merely relying on the system libraries to ensuring a consistent cross-platform experience. It's just a concern with initial development and long term maintenance effort, rather than a fundamental desire to expose the raw platform behaviour (there are *some* modules where we want to let developers have access to the underlying platform specific behaviour, but the datetime APIs aren't really one of them) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mal at egenix.com Wed Sep 18 09:42:18 2013 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 18 Sep 2013 09:42:18 +0200 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> Message-ID: <5239595A.7020208@egenix.com> On 18.09.2013 03:37, Nick Coghlan wrote: > On 18 September 2013 11:30, wrote: >> On Tue, Sep 17, 2013, at 17:15, Ethan Furman wrote: >>> Is timegm/gmtime provided and consistent across all Python platforms? >> >> Part of what I was proposing was _to_ provide a consistent >> implementation - there's no reason (if we define timestamps as being >> objectively based in 1970 and having no leap seconds) that it couldn't >> be provided in python itself instead of using the system's version. > > Yeah, this is a similar change to the one that was made for math.c > years ago - stepping up from merely relying on the system libraries to > ensuring a consistent cross-platform experience. It's just a concern > with initial development and long term maintenance effort, rather than > a fundamental desire to expose the raw platform behaviour (there are > *some* modules where we want to let developers have access to the > underlying platform specific behaviour, but the datetime APIs aren't > really one of them) I wonder why you'd want to use Unix ticks (what datetime calls a timestamp) as basis for cross-platform date/time calculations. If you really need a time_t representation of date/time values, you're stuck with the platform dependent limitations anyway. The time C functions are useful to tap into the OS's time zone library, but time zone data changes regularly, so predictions that go even only a few years into the future are bound to fail for some zones. 
You can only reliably use UTC/GMT for absolute future date/time values. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 18 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-20: PyCon UK 2013, Coventry, UK ... 2 days to go 2013-09-28: PyDDF Sprint ... 10 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From random832 at fastmail.us Wed Sep 18 15:25:08 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 18 Sep 2013 09:25:08 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <5239595A.7020208@egenix.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> <5239595A.7020208@egenix.com> Message-ID: <1379510708.3739.23500557.2C09DC9D@webmail.messagingengine.com> On Wed, Sep 18, 2013, at 3:42, M.-A. Lemburg wrote: > I wonder why you'd want to use Unix ticks (what datetime calls a > timestamp) as basis for cross-platform date/time calculations. Because we've already got half a dozen APIs that use them. And there's no particular reason to consider it _worse_ than any other scalar time representation. 
If we were defining the library from scratch today, we could argue the merits of using days vs seconds vs microseconds as the unit, of 1970 vs 1904 vs 1600 vs 0000 for the epoch, and whether leap seconds should be supported. But we've already got APIs that use time_t (and all supported platforms define time_t as seconds since 1970) From mal at egenix.com Wed Sep 18 15:34:31 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 18 Sep 2013 15:34:31 +0200 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379510708.3739.23500557.2C09DC9D@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> <5239595A.7020208@egenix.com> <1379510708.3739.23500557.2C09DC9D@webmail.messagingengine.com> Message-ID: <5239ABE7.1040005@egenix.com> On 18.09.2013 15:25, random832 at fastmail.us wrote: > On Wed, Sep 18, 2013, at 3:42, M.-A. Lemburg wrote: >> I wonder why you'd want to use Unix ticks (what datetime calls a >> timestamp) as basis for cross-platform date/time calculations. > > Because we've already got half a dozen APIs that use them. And there's > no particular reason to consider it _worse_ than any other scalar time > representation. > > If we were defining the library from scratch today, we could argue the > merits of using days vs seconds vs microseconds as the unit, of 1970 vs > 1904 vs 1600 vs 0000 for the epoch, and whether leap seconds should be > supported. But we've already got APIs that use time_t (and all supported > platforms define time_t as seconds since 1970) Right, but those APIs are all limited to what the platforms defines as t_time and like you say: those values are often limited to certain ranges. 
If you want platform independent representations, use one of the available conversion routines to turn the time_t values into e.g. datetime objects and ideally convert the values to UTC to avoid time zone issues. Then use those objects for date/time calculations. time_t values are really not a good basis for doing date/time calculations. Ideally, they should only be used and regarded as containers holding a platform dependent date/time value, nothing more. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 18 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-20: PyCon UK 2013, Coventry, UK ... 2 days to go 2013-09-28: PyDDF Sprint ... 10 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From mistersheik at gmail.com Wed Sep 18 18:21:37 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 18 Sep 2013 09:21:37 -0700 (PDT) Subject: [Python-ideas] pickle does not work properly with cooperative multiple inheritance. Propose: "__getnewkwargs__". Message-ID: My understanding of cooperative multiple inheritance is that a class often doesn't know how your parent classes want to be constructed, pickled, etc. and so it delegates to its parents using super. In general, constructors can accept keyword arguments, and forward their unused arguments to parents effortlessly: class A(B): def __init__(self, x, **kwargs): super().__init__(**kwargs) will extract x and forward kwargs. 
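The forwarding pattern described there runs as-is today; a runnable version with hypothetical class names:

```python
class Base:
    def __init__(self, **kwargs):
        # End of the cooperative chain; forwards leftovers to object.
        super().__init__(**kwargs)

class B(Base):
    def __init__(self, y, **kwargs):
        super().__init__(**kwargs)  # forward everything B doesn't consume
        self.y = y

class A(B):
    def __init__(self, x, **kwargs):
        super().__init__(**kwargs)  # A extracts x, B extracts y
        self.x = x

a = A(x=1, y=2)
print(a.x, a.y)  # 1 2
```

Each class consumes only its own keyword arguments and passes the rest up the MRO, which is the behavior the pickling proposal wants to mirror.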
Unfortunately, the same mechanism is not easily available for pickling because __getnewargs__ returns only a tuple. If there were a __getnewkwargs__ method, then we could have class A: def __getnewkwargs__(self): return {**super().__getnewkwargs__(), 'a': self.a, 'b': self.b} # (new unpacking from PEP 448) Note how additional kwargs are added to the dict of kwargs specified by the parent objects. Best, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Wed Sep 18 19:20:43 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 18 Sep 2013 13:20:43 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <5239ABE7.1040005@egenix.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> <5239595A.7020208@egenix.com> <1379510708.3739.23500557.2C09DC9D@webmail.messagingengine.com> <5239ABE7.1040005@egenix.com> Message-ID: <1379524843.28775.23598689.7002A138@webmail.messagingengine.com> On Wed, Sep 18, 2013, at 9:34, M.-A. Lemburg wrote: > Right, but those APIs are all limited to what the platforms > defines as t_time and like you say: those values are often > limited to certain ranges. We're going around in circles. I'm proposing _removing_ those limitations, so that for example code written for Unix systems (that assumes it can use negative values before 1970) will work on Windows, and code written for 64-bit systems will work on systems whose native time_t is 32 bits. It occurs to me that you might have misunderstood me. 
By "APIs" I was not referring to the platform functions themselves (which, obviously, are limited to what the platform's type can represent, and sometimes impose arbitrary limits on top of that), I was talking about datetime.fromtimestamp, the various functions in the time module, calendar.timegm, os.stat, and so on. There's no reason _those_ should be limited to what the platform defines. Just because "seconds since 1970" was invented by a platform does not mean it should be considered to be a platform-dependent representation. There's nothing _wrong_ with it as a representation of UTC, except for the fact that it can't represent leap seconds, and I suspect a lot of other things break in the presence of leap seconds anyway. The fact that timedelta is defined as a days/seconds combination, for example. In the presence of leap seconds, it shouldn't be possible to normalize them any more than if there were a months or years field. > If you want platform independent representations, use one of the > available conversion routines to turn the time_t values into > e.g. datetime objects and ideally convert the values to UTC > to avoid time zone issues. Then use those objects for date/time > calculations. > > time_t values are really not a good basis for doing date/time > calculations. Ideally, they should only be used and regarded > as containers holding a platform dependent date/time value, > nothing more. That ship sailed long ago. This isn't a Python 4000 thread; we're talking about the API we have, not the one we want. 
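(To make the timedelta point concrete -- a small sketch; normalization into days only works because every day is assumed to be exactly 86400 seconds, which is exactly what leap seconds would break:)

```python
from datetime import timedelta

# 86461 seconds normalizes to 1 day + 61 seconds -- an assumption
# that would be unsound if leap seconds were representable.
d = timedelta(seconds=86461)
assert (d.days, d.seconds) == (1, 61)
```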
From alexander.belopolsky at gmail.com Wed Sep 18 19:37:53 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 18 Sep 2013 13:37:53 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379524843.28775.23598689.7002A138@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> <5239595A.7020208@egenix.com> <1379510708.3739.23500557.2C09DC9D@webmail.messagingengine.com> <5239ABE7.1040005@egenix.com> <1379524843.28775.23598689.7002A138@webmail.messagingengine.com> Message-ID: On Wed, Sep 18, 2013 at 1:20 PM, wrote: > We're going around in circles. I'm proposing _removing_ those > limitations, so that for example code written for Unix systems (that > assumes it can use negative values before 1970) will work on Windows, > and code written for 64-bit systems will work on systems whose native > time_t is 32 bits. > That's a sign that this discussion should move to the tracker where a concrete patch can be proposed and discussed. There is at least one proposal that seems to be controversial: remove platform-dependent code from datetime.utcfromtimestamp(). The change is trivial: def utcfromtimestamp(seconds): return datetime(1970, 1, 1) + timedelta(seconds=seconds) I will gladly apply such patch once it is complete with tests and C code. The case for changing time.gmtime() is weaker. We would have to add additional dependency of time module on datetime or move or duplicate a sizable chunk of C code. If someone wants to undertake this project, I would like to see an attempt to remove circular dependency between time and datetime modules rather than couple the two modules even more tightly. 
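(For reference, the proposed pure-Python definition sidesteps the platform time_t entirely, so out-of-range values just work everywhere; the sample timestamps below are my own:)

```python
from datetime import datetime, timedelta

def utcfromtimestamp(seconds):
    # No call into the C library: negative and post-2038 timestamps
    # are handled uniformly on every platform.
    return datetime(1970, 1, 1) + timedelta(seconds=seconds)

before_epoch = utcfromtimestamp(-86400)  # one day before the epoch
after_2038 = utcfromtimestamp(2**31)     # past the 32-bit time_t rollover
```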
-------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Wed Sep 18 20:38:59 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 18 Sep 2013 11:38:59 -0700 (PDT) Subject: [Python-ideas] Add PassedArgSpec class to types and expose mapping given an ArgSpec. Message-ID: <3a434bfd-6136-424d-9bea-498b637d0cc8@googlegroups.com> As far as I know, the way that arguments are mapped to a parameter specification is not exposed to the programmer. I suggest adding a PassedArgSpec class having two members: args and kwargs. Then, inspect.ArgSpec can take an argument specification and decode the PassedArgSpec (putting the right things in the right places) and return a dictionary with everything in its right place. I can only think of one use for now, which is replacing "arguments" in the returned tuple of __reduce__ and maybe allowing it to be returned by "__getnewargs__". It might also be nice to store such argument specifications instead of the pair (args, kwargs) when storing them in lists. Best, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Wed Sep 18 21:12:13 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Wed, 18 Sep 2013 12:12:13 -0700 (PDT) Subject: [Python-ideas] pickle does not work properly with cooperative multiple inheritance. Propose: "__getnewkwargs__". In-Reply-To: References: Message-ID: <849ba4fa-aa67-4beb-95c9-b2494c3d907d@googlegroups.com> An alternative is to allow __getnewargs__ to return a "PassedArgSpec" as I described in another idea. On Wednesday, September 18, 2013 12:21:37 PM UTC-4, Neil Girdhar wrote: > > My understanding of cooperative multiple inheritance is that a class often > doesn't know how its parent classes want to be constructed, pickled, etc. > and so it delegates to its parents using super.
> > In general, constructors can accept keyword arguments, and forward their > unused arguments to parents effortlessly:
>
> class A(B):
>     def __init__(self, x, **kwargs):
>         super().__init__(**kwargs)
>
> will extract x and forward kwargs.
>
> Unfortunately, the same mechanism is not easily available for pickling > because __getnewargs__ returns only a tuple. If there were a > __getnewkwargs__ method, then we could have
>
> class A:
>     def __getnewkwargs__(self):
>         # (new dict unpacking from PEP 448)
>         return {**super().__getnewkwargs__(), 'a': self.a, 'b': self.b}
>
> Note how additional kwargs are added to the dict of kwargs specified by > the parent objects.
>
> Best,
>
> Neil
> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Sep 19 02:23:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Sep 2013 10:23:09 +1000 Subject: [Python-ideas] Add PassedArgSpec class to types and expose mapping given an ArgSpec. In-Reply-To: <3a434bfd-6136-424d-9bea-498b637d0cc8@googlegroups.com> References: <3a434bfd-6136-424d-9bea-498b637d0cc8@googlegroups.com> Message-ID: (Extra copy to the list, since Google Groups breaks the recipient list :P) inspect.Signature.bind() supports this in Python 3.3+. For earlier versions, Aaron Iles backported the functionality on PyPI as "funcsigs". You can also just define an appropriate function, call it as f(*args, **kwds) and return the resulting locals() namespace. Cheers, Nick. On 19 Sep 2013 04:39, "Neil Girdhar" wrote: > As far as I know, the way that arguments are mapped to a parameter > specification is not exposed to the programmer. I suggest adding a > PassedArgSpec class having two members: args and kwargs. Then, > inspect.ArgSpec can take an argument specification and decode the > PassedArgSpec (putting the right things in the right places) and return a > dictionary with everything in its right place.
> > I can only think of one use for now, which is replacing "arguments" in the > returned tuple of __reduce__ and maybe allowing it to be returned by > "__getnewargs__". It might also be nice to store such argument > specifications instead of the pair args, kwargs when storing them in lists. > > Best, > > Neil > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 19 10:23:22 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 01:23:22 -0700 (PDT) Subject: [Python-ideas] Introduce collections.Reiterable Message-ID: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> This is an idea I've wanted for a while: When I call functions that accept an iterable, I often have to check whether the function iterates over the iterable once or more than once. If it iterates more than once, I must not pass a generator, but rather cast to a list. Otherwise, the second iteration through the generator will be empty as the first has exhausted it completely. It would be nice to introduce an abstract base class in collections (docs) between Iterable and Sequence. Right now, Sequence inherits from Iterable. I propose having Sequence inherit from Reiterable, which in turn, inherits from Iterable. All sequences are reiterable, whereas generators are not. However, views in sets and dictionaries, and numpy arrays are examples of Reiterables that are not Sequences. Having such an abstract base class would be useful for debugging in its own right. Also, functions that iterate twice over an iterable can check to make sure the iterable is "re-iterable" using isinstance (the standard approach as per pep 3119 ). But, better yet, itertools could add two functions: auto_tee, which takes an iterable "I" as its parameter, and an integer n. 
If it is not a reiterable, it calls tee and returns n iterables independently capable of iterating "I". If it is reiterable, it returns [I] * n. This way, the client code can do whatever is easiest and the target code can call auto_tee if necessary. auto_list could do the same sort of thing, but omitting the copy that would normally be incurred if a list were passed in. Maybe this is less useful than I once thought since I've gotten by without it, but I just wanted to throw the idea out there in case it clicks for someone else. Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Thu Sep 19 10:28:15 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 19 Sep 2013 10:28:15 +0200 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379424134.25980.23022817.0FD01849@webmail.messagingengine.com> <1379451497.3210.23219769.5E1521E5@webmail.messagingengine.com> <5238C687.7080207@stoneleaf.us> <1379467810.8907.23301397.3E8EF78A@webmail.messagingengine.com> <5239595A.7020208@egenix.com> <1379510708.3739.23500557.2C09DC9D@webmail.messagingengine.com> <5239ABE7.1040005@egenix.com> <1379524843.28775.23598689.7002A138@webmail.messagingengine.com> Message-ID: <523AB59F.6050509@egenix.com> On 18.09.2013 19:37, Alexander Belopolsky wrote: > On Wed, Sep 18, 2013 at 1:20 PM, wrote: > >> We're going around in circles. I'm proposing _removing_ those >> limitations, so that for example code written for Unix systems (that >> assumes it can use negative values before 1970) will work on Windows, >> and code written for 64-bit systems will work on systems whose native >> time_t is 32 bits. >> > > That's a sign that this discussion should move to the tracker where a > concrete patch can be proposed and discussed. 
There is at least one > proposal that seems to be controversial: remove platform-dependent code > from datetime.utcfromtimestamp(). > > The change is trivial: > > def utcfromtimestamp(seconds): > return datetime(1970, 1, 1) + timedelta(seconds=seconds) > > I will gladly apply such patch once it is complete with tests and C code. If you do apply this change, you will have to clearly state that the datetime module's understanding of a timestamp may differ from the platform definition of Unix ticks. > The case for changing time.gmtime() is weaker. We would have to add > additional dependency of time module on datetime or move or duplicate a > sizable chunk of C code. If someone wants to undertake this project, I > would like to see an attempt to remove circular dependency between time and > datetime modules rather than couple the two modules even more tightly. -1 on changing the time module APIs. People expect those to be wrappers of the C APIs and thus also expect these APIs to implement the platform specific behavior, e.g. supporting leap seconds with gmtime(). POSIX called for not supporting leap seconds in e.g. gmtime(), but they are part of the definition of GMT/UTC and it's possible to enable support for them: http://en.wikipedia.org/wiki/Leap_second Platform comparison: http://k5wiki.kerberos.org/wiki/Leap_second_handling That said, it's very rare to find a system that actually does not implement POSIX gmtime(). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 19 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-20: PyCon UK 2013, Coventry, UK ... tomorrow 2013-09-28: PyDDF Sprint ... 
9 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Thu Sep 19 10:32:52 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Sep 2013 18:32:52 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: (Grr, why is Google Groups so broken? :P) My question would be, does the new class add anything that isn't already covered by: isinstance(c, Iterable) and not isinstance(c, Iterator) Cheers, Nick. On 19 September 2013 18:23, Neil Girdhar wrote: > This is an idea I've wanted for a while: > > When I call functions that accept an iterable, I often have to check whether > the function iterates over the iterable once or more than once. If it > iterates more than once, I must not pass a generator, but rather cast to a > list. Otherwise, the second iteration through the generator will be empty > as the first has exhausted it completely. > > It would be nice to introduce an abstract base class in collections (docs) > between Iterable and Sequence. Right now, Sequence inherits from Iterable. > I propose having Sequence inherit from Reiterable, which in turn, inherits > from Iterable. All sequences are reiterable, whereas generators are not. > However, views in sets and dictionaries, and numpy arrays are examples of > Reiterables that are not Sequences. Having such an abstract base class would > be useful for debugging in its own right. > > Also, functions that iterate twice over an iterable can check to make sure > the iterable is "re-iterable" using isinstance (the standard approach as per > pep 3119). 
But, better yet, itertools could add two functions: auto_tee, > which takes an iterable "I" as its parameter, and an integer n. If it is > not a reiterable, it calls tee and returns n iterables independently capable > of iterating "I". If it is reiterable, it returns [I] * n. This way, the > client code can do whatever is easiest and the target code can call auto_tee > if necessary. auto_list could do the same sort of thing, but omitting the > copy that would normally be incurred if a list were passed in. > > Maybe this is less useful than I once thought since I've gotten by without > it, but I just wanted to throw the idea out there in case it clicks for > someone else. > > Neil > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mistersheik at gmail.com Thu Sep 19 10:59:35 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 04:59:35 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: Well, generators are iterable, but if you write a function like: def f(s): for x in s: do_something(x) for x in s: do_something_else(x) x should not be a generator. I am proposing adding a function to itertools like auto_reiterable that would take s and give you an reiterable in the most efficient way possible. On Thu, Sep 19, 2013 at 4:32 AM, Nick Coghlan wrote: > My question would be, does the new class add anything that isn't > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Thu Sep 19 11:12:26 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Sep 2013 19:12:26 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: On 19 Sep 2013 18:59, "Neil Girdhar" wrote: > > Well, generators are iterable, but if you write a function like: > > def f(s): > for x in s: > do_something(x) > for x in s: > do_something_else(x) > > x should not be a generator. I am proposing adding a function to itertools like auto_reiterable that would take s and give you an reiterable in the most efficient way possible. Generators *are* iterators, though, so they fail the second half of the check. Hence my question - is there any obvious case where "iterable but not an iterator" gives the wrong answer? Cheers, Nick. > > > On Thu, Sep 19, 2013 at 4:32 AM, Nick Coghlan wrote: >> >> My question would be, does the new class add anything that isn't > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 19 11:14:16 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 05:14:16 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: I am proposing a new class "Reiterable" that is a subclass of Iterable. For example, a dictionary view is a reiterable. It would be fine to pass such an object to the function f. Best, Neil On Thu, Sep 19, 2013 at 5:12 AM, Nick Coghlan wrote: > > On 19 Sep 2013 18:59, "Neil Girdhar" wrote: > > > > Well, generators are iterable, but if you write a function like: > > > > def f(s): > > for x in s: > > do_something(x) > > for x in s: > > do_something_else(x) > > > > x should not be a generator. 
I am proposing adding a function to > itertools like auto_reiterable that would take s and give you an reiterable > in the most efficient way possible. > > Generators *are* iterators, though, so they fail the second half of the > check. Hence my question - is there any obvious case where "iterable but > not an iterator" gives the wrong answer? > > Cheers, > Nick. > > > > > > > On Thu, Sep 19, 2013 at 4:32 AM, Nick Coghlan > wrote: > >> > >> My question would be, does the new class add anything that isn't > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Sep 19 11:20:28 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Sep 2013 19:20:28 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: On 19 Sep 2013 19:14, "Neil Girdhar" wrote: > > I am proposing a new class "Reiterable" that is a subclass of Iterable. For example, a dictionary view is a reiterable. It would be fine to pass such an object to the function f. I'm afraid simply repeating your proposal still doesn't answer my question. You have indicated that you are trying to identify things that are iterable, but not iterators. That is already possible using a second isinstance check to exclude iterators. So, what is the value you see in adding a new ABC to further simplify an already simple check? Cheers, Nick. > > Best, > Neil > > > On Thu, Sep 19, 2013 at 5:12 AM, Nick Coghlan wrote: >> >> >> On 19 Sep 2013 18:59, "Neil Girdhar" wrote: >> > >> > Well, generators are iterable, but if you write a function like: >> > >> > def f(s): >> > for x in s: >> > do_something(x) >> > for x in s: >> > do_something_else(x) >> > >> > x should not be a generator. I am proposing adding a function to itertools like auto_reiterable that would take s and give you an reiterable in the most efficient way possible. 
>> >> Generators *are* iterators, though, so they fail the second half of the check. Hence my question - is there any obvious case where "iterable but not an iterator" gives the wrong answer? >> >> Cheers, >> Nick. >> >> > >> > >> > On Thu, Sep 19, 2013 at 4:32 AM, Nick Coghlan wrote: >> >> >> >> My question would be, does the new class add anything that isn't >> > >> > >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 19 11:23:04 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 05:23:04 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: No, not things which are iterable but not iterators, things which are *reiterable* rather than merely iterable. That is, things that can be iterated multiple times without the generated elements disappearing. On Thu, Sep 19, 2013 at 5:20 AM, Nick Coghlan wrote: > > On 19 Sep 2013 19:14, "Neil Girdhar" wrote: > > > > I am proposing a new class "Reiterable" that is a subclass of Iterable. > For example, a dictionary view is a reiterable. It would be fine to pass > such an object to the function f. > > I'm afraid simply repeating your proposal still doesn't answer my > question. You have indicated that you are trying to identify things that > are iterable, but not iterators. That is already possible using a second > isinstance check to exclude iterators. > > So, what is the value you see in adding a new ABC to further simplify an > already simple check? > > Cheers, > Nick. 
> > > > > Best, > > Neil > > > > > > On Thu, Sep 19, 2013 at 5:12 AM, Nick Coghlan > wrote: > >> > >> > >> On 19 Sep 2013 18:59, "Neil Girdhar" wrote: > >> > > >> > Well, generators are iterable, but if you write a function like: > >> > > >> > def f(s): > >> > for x in s: > >> > do_something(x) > >> > for x in s: > >> > do_something_else(x) > >> > > >> > x should not be a generator. I am proposing adding a function to > itertools like auto_reiterable that would take s and give you an reiterable > in the most efficient way possible. > >> > >> Generators *are* iterators, though, so they fail the second half of the > check. Hence my question - is there any obvious case where "iterable but > not an iterator" gives the wrong answer? > >> > >> Cheers, > >> Nick. > >> > >> > > >> > > >> > On Thu, Sep 19, 2013 at 4:32 AM, Nick Coghlan > wrote: > >> >> > >> >> My question would be, does the new class add anything that isn't > >> > > >> > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 19 11:25:08 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 05:25:08 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: Note that neither a generator expression, nor a dictionary view, nor a list are Iterators. They are all Iterable. The list is also a Sequence. The Reiterable category applies to the latter two, since they can be iterated over multiple times without being consumed. On Thu, Sep 19, 2013 at 5:23 AM, Neil Girdhar wrote: > No, not things which are iterable but not iterators, things which are > *reiterable* rather than merely iterable. That is, things that can be > iterated multiple times without the generated elements disappearing.
> > > On Thu, Sep 19, 2013 at 5:20 AM, Nick Coghlan wrote: > >> >> On 19 Sep 2013 19:14, "Neil Girdhar" wrote: >> > >> > I am proposing a new class "Reiterable" that is a subclass of Iterable. >> For example, a dictionary view is a reiterable. It would be fine to pass >> such an object to the function f. >> >> I'm afraid simply repeating your proposal still doesn't answer my >> question. You have indicated that you are trying to identify things that >> are iterable, but not iterators. That is already possible using a second >> isinstance check to exclude iterators. >> >> So, what is the value you see in adding a new ABC to further simplify an >> already simple check? >> >> Cheers, >> Nick. >> >> > >> > Best, >> > Neil >> > >> > >> > On Thu, Sep 19, 2013 at 5:12 AM, Nick Coghlan >> wrote: >> >> >> >> >> >> On 19 Sep 2013 18:59, "Neil Girdhar" wrote: >> >> > >> >> > Well, generators are iterable, but if you write a function like: >> >> > >> >> > def f(s): >> >> > for x in s: >> >> > do_something(x) >> >> > for x in s: >> >> > do_something_else(x) >> >> > >> >> > x should not be a generator. I am proposing adding a function to >> itertools like auto_reiterable that would take s and give you an reiterable >> in the most efficient way possible. >> >> >> >> Generators *are* iterators, though, so they fail the second half of >> the check. Hence my question - is there any obvious case where "iterable >> but not an iterator" gives the wrong answer? >> >> >> >> Cheers, >> >> Nick. >> >> >> >> > >> >> > >> >> > On Thu, Sep 19, 2013 at 4:32 AM, Nick Coghlan >> wrote: >> >> >> >> >> >> My question would be, does the new class add anything that isn't >> >> > >> >> > >> >> > >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Thu Sep 19 11:30:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 19 Sep 2013 11:30:47 +0200 Subject: [Python-ideas] Introduce collections.Reiterable References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: <20130919113047.325ca3d3@pitrou.net> Le Thu, 19 Sep 2013 04:59:35 -0400, Neil Girdhar a écrit : > Well, generators are iterable, but if you write a function like: > > def f(s): > for x in s: > do_something(x) > for x in s: > do_something_else(x) > > x should not be a generator. I am proposing adding a function to > itertools like auto_reiterable that would take s and give you an > reiterable in the most efficient way possible. Try the following:

import collections
import itertools


class Reiterable:

    def __init__(self, it):
        self.need_cloning = isinstance(it, collections.Iterator)
        assert self.need_cloning or isinstance(it, collections.Iterable)
        self.master = it

    def __iter__(self):
        if self.need_cloning:
            self.master, it = itertools.tee(self.master)
            return it
        else:
            return iter(self.master)

def gen():
    yield from "ghi"

for arg in ("abc", iter("def"), gen()):
    it = Reiterable(arg)
    print(list(it))
    print(list(it))
    print(list(it))

I don't know if that would be useful as part of the stdlib. Regards Antoine. From masklinn at masklinn.net Thu Sep 19 11:38:35 2013 From: masklinn at masklinn.net (Masklinn) Date: Thu, 19 Sep 2013 11:38:35 +0200 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: <84F931B7-80D2-4EC4-B748-892D0A1E4CE1@masklinn.net> On 2013-09-19, at 11:23 , Neil Girdhar wrote: > No, not things which are iterable but not iterators, things which are > *reiterable* rather than merely iterable. That is, things that can be > iterated multiple times without the generated elements disappearing.
The point Nick is trying to bring across is that "iterable but not an iterator" seems to do *exactly* what you ask for: you can get multiple independent iterators out of it, and thus you can iterate it multiple times without the generated elements disappearing. From mistersheik at gmail.com Thu Sep 19 11:38:30 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 05:38:30 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130919113047.325ca3d3@pitrou.net> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919113047.325ca3d3@pitrou.net> Message-ID: First of all, that's amazing and exactly what I was looking for. Second, sorry Nick, I guess we were talking past each other and I didn't understand what you were getting at. From the collections.abc documentation, I imagined that subclasses are more restricted and therefore can do more than their superclasses. However, as you were trying to tell me things that are "Iterators" (and thus also Iterable) can do *less* than things that are merely Iterable. The former cannot be iterated over twice. If I'm understanding this correctly, would it be nice if the documentation then made this promise (as I don't believe it does)? Best, Neil On Thu, Sep 19, 2013 at 5:30 AM, Antoine Pitrou wrote: > Le Thu, 19 Sep 2013 04:59:35 -0400, > Neil Girdhar a > ?crit : > > Well, generators are iterable, but if you write a function like: > > > > def f(s): > > for x in s: > > do_something(x) > > for x in s: > > do_something_else(x) > > > > x should not be a generator. I am proposing adding a function to > > itertools like auto_reiterable that would take s and give you an > > reiterable in the most efficient way possible. 
> > Try the following: > > > import collections > import itertools > > > class Reiterable: > > def __init__(self, it): > self.need_cloning = isinstance(it, collections.Iterator) > assert self.need_cloning or isinstance(it, collections.Iterable) > self.master = it > > def __iter__(self): > if self.need_cloning: > self.master, it = itertools.tee(self.master) > return it > else: > return iter(self.master) > > def gen(): > yield from "ghi" > > for arg in ("abc", iter("def"), gen()): > it = Reiterable(arg) > print(list(it)) > print(list(it)) > print(list(it)) > > > I don't know if that would be useful as part of the stdlib. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 19 11:40:26 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 05:40:26 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <84F931B7-80D2-4EC4-B748-892D0A1E4CE1@masklinn.net> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <84F931B7-80D2-4EC4-B748-892D0A1E4CE1@masklinn.net> Message-ID: Yes, I see that now (with Antoine's code as well). Sorry that it wasn't clear to me earlier. 
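(For anyone else who was confused by the same thing, the distinction is easy to check directly:)

```python
from collections.abc import Iterable, Iterator

# A list is iterable but not an iterator: each iter() call starts over.
assert isinstance([1, 2], Iterable) and not isinstance([1, 2], Iterator)

# A generator is an iterator, so a single pass consumes it.
g = (x for x in [1, 2])
assert isinstance(g, Iterator)
first, second = list(g), list(g)  # the second pass finds nothing left
```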
On Thu, Sep 19, 2013 at 5:38 AM, Masklinn wrote: > On 2013-09-19, at 11:23 , Neil Girdhar wrote: > > > No, not things which are iterable but not iterators, things which are > > *reiterable* rather than merely iterable. That is, things that can be > > iterated multiple times without the generated elements disappearing. > > The point Nick is trying to bring across is that "iterable but not an > iterator" seems to do *exactly* what you ask for: you can get multiple > independent iterators out of it, and thus you can iterate it multiple > times without the generated elements disappearing. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Thu Sep 19 11:58:38 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 05:58:38 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919113047.325ca3d3@pitrou.net> Message-ID: Sorry, never mind, the documentation is clear. I got caught up with collections.abc page and didn't click through. At least I have Antoine's code now to use in my project whether it gets added to the standard library or not :) Best, Neil On Thu, Sep 19, 2013 at 5:38 AM, Neil Girdhar wrote: > First of all, that's amazing and exactly what I was looking for. 
> > Second, sorry Nick, I guess we were talking past each other and I didn't > understand what you were getting at. From the collections.abc > documentation, I imagined that subclasses are more restricted and therefore > can do more than their superclasses. However, as you were trying to tell > me things that are "Iterators" (and thus also Iterable) can do *less* than > things that are merely Iterable. The former cannot be iterated over twice. > If I'm understanding this correctly, would it be nice if the > documentation then made this promise (as I don't believe it does)? > > Best, > > Neil > > > On Thu, Sep 19, 2013 at 5:30 AM, Antoine Pitrou wrote: > >> Le Thu, 19 Sep 2013 04:59:35 -0400, >> Neil Girdhar a >> écrit : >> > Well, generators are iterable, but if you write a function like: >> > >> > def f(s): >> > for x in s: >> > do_something(x) >> > for x in s: >> > do_something_else(x) >> >> x should not be a generator. I am proposing adding a function to >> > itertools like auto_reiterable that would take s and give you an >> > reiterable in the most efficient way possible. >> >> Try the following: >> >> >> import collections >> import itertools >> >> >> class Reiterable: >> >> def __init__(self, it): >> self.need_cloning = isinstance(it, collections.Iterator) >> assert self.need_cloning or isinstance(it, collections.Iterable) >> self.master = it >> >> def __iter__(self): >> if self.need_cloning: >> self.master, it = itertools.tee(self.master) >> return it >> else: >> return iter(self.master) >> >> def gen(): >> yield from "ghi" >> >> for arg in ("abc", iter("def"), gen()): >> it = Reiterable(arg) >> print(list(it)) >> print(list(it)) >> print(list(it)) >> >> >> I don't know if that would be useful as part of the stdlib. >> >> Regards >> >> Antoine.
From tjreedy at udel.edu Thu Sep 19 12:21:25 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 Sep 2013 06:21:25 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130919113047.325ca3d3@pitrou.net> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919113047.325ca3d3@pitrou.net> Message-ID: On 9/19/2013 5:30 AM, Antoine Pitrou wrote: >> x should not be a generator. I am proposing adding a function to >> itertools like auto_reiterable that would take s and give you an >> reiterable in the most efficient way possible. > > Try the following: > > > import collections > import itertools > > > class Reiterable: > > def __init__(self, it): > self.need_cloning = isinstance(it, collections.Iterator) > assert self.need_cloning or isinstance(it, collections.Iterable) > self.master = it > > def __iter__(self): > if self.need_cloning: > self.master, it = itertools.tee(self.master) > return it > else: > return iter(self.master) > > def gen(): > yield from "ghi" > > for arg in ("abc", iter("def"), gen()): > it = Reiterable(arg) > print(list(it)) > print(list(it)) > print(list(it)) > > > I don't know if that would be useful as part of the stdlib. A slight problem is that there is no guarantee that a non-iterator iterable is re-iterable.
-- Terry Jan Reedy From tjreedy at udel.edu Thu Sep 19 12:28:19 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 Sep 2013 06:28:19 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: On 9/19/2013 4:32 AM, Nick Coghlan wrote: > (Grr, why is Google Groups so broken? :P) > > My question would be, does the new class add anything that isn't > already covered by: > > isinstance(c, Iterable) and not isinstance(c, Iterator) Not everything in that category is necessarily re-iterable. Or if it is serially reiterable, it may not be parallel iterable, as needed for nested loops. -- Terry Jan Reedy From solipsis at pitrou.net Thu Sep 19 12:26:30 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 19 Sep 2013 12:26:30 +0200 Subject: [Python-ideas] Introduce collections.Reiterable References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919113047.325ca3d3@pitrou.net> Message-ID: <20130919122630.681b0a86@pitrou.net> Le Thu, 19 Sep 2013 06:21:25 -0400, Terry Reedy a écrit : > On 9/19/2013 5:30 AM, Antoine Pitrou wrote: > > >> x should not be a generator. I am proposing adding a function to > >> itertools like auto_reiterable that would take s and give you an > >> reiterable in the most efficient way possible.
> > > > Try the following: > > > > > > import collections > > import itertools > > > > > > class Reiterable: > > > > def __init__(self, it): > > self.need_cloning = isinstance(it, collections.Iterator) > > assert self.need_cloning or isinstance(it, > > collections.Iterable) self.master = it > > > > def __iter__(self): > > if self.need_cloning: > > self.master, it = itertools.tee(self.master) > > return it > > else: > > return iter(self.master) > > > > def gen(): > > yield from "ghi" > > > > for arg in ("abc", iter("def"), gen()): > > it = Reiterable(arg) > > print(list(it)) > > print(list(it)) > > print(list(it)) > > > > > > I don't know if that would be useful as part of the stdlib. > > A slight problem is that there is no guaranteed that a non-iterator > iterable is re-iterable. Any useful examples? Regards Antoine. From tjreedy at udel.edu Thu Sep 19 12:31:12 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 Sep 2013 06:31:12 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: On 9/19/2013 4:59 AM, Neil Girdhar wrote: > Well, generators are iterable, but if you write a function like: > > def f(s): > for x in s: > do_something(x) > for x in s: > do_something_else(x) This strikes me as bad design. It should perhaps a) be two functions or b) take two iterable arguments or c) jam the two loops together. -- Terry Jan Reedy From mistersheik at gmail.com Thu Sep 19 12:39:35 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 06:39:35 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: That was just for illustration. 
Here's the code I just fixed to use Reiterable: class Network: def update(self, nodes): nodes = Reiterable(nodes) super().update(self, nodes) for node in nodes: node.setParent(self) node.propertyValuesChanged.connect(self.modelPropertiesChanged) self.modelNodesAddedRemoved.emit() On Thu, Sep 19, 2013 at 6:31 AM, Terry Reedy wrote: > On 9/19/2013 4:59 AM, Neil Girdhar wrote: >> Well, generators are iterable, but if you write a function like: >> >> def f(s): >> for x in s: >> do_something(x) >> for x in s: >> do_something_else(x) >> > > This strikes me as bad design. It should perhaps a) be two functions or b) > take two iterable arguments or c) jam the two loops together. > > -- > Terry Jan Reedy From joshua at landau.ws Thu Sep 19 13:37:17 2013 From: joshua at landau.ws (Joshua Landau) Date: Thu, 19 Sep 2013 12:37:17 +0100 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: On 19 September 2013 11:28, Terry Reedy wrote: > On 9/19/2013 4:32 AM, Nick Coghlan wrote: >> >> (Grr, why is Google Groups so broken?
:P) >> >> My question would be, does the new class add anything that isn't >> already covered by: >> >> isinstance(c, Iterable) and not isinstance(c, Iterator) > > > Not everything in that category is necessarily re-iterable. I cannot think of a non-pathological case where it is not; if it is not re-iterable it should be changed to an iterator if it isn't already. > Or if it is serially reiterable, it may not be parallel iterable, as needed > for nested loops. What do you mean? From rosuav at gmail.com Thu Sep 19 13:52:02 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 19 Sep 2013 21:52:02 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: On Thu, Sep 19, 2013 at 8:39 PM, Neil Girdhar wrote: > That was just for illustration. Here's the code I just fixed to use > Reiterable: > > class Network: > def update(self, nodes): > nodes = Reiterable(nodes) > super().update(self, nodes) > for node in nodes: > node.setParent(self) > node.propertyValuesChanged.connect(self.modelPropertiesChanged) > self.modelNodesAddedRemoved.emit() Hmm. As an alternative to reiterable, can you rejig the design something like this? class Network(...): def update(self,nodes): for node in nodes: self._update(node) self.modelNodesAddedRemoved.emit() def _update(self,node): super()._update(self,node) node.setParent(self) node.propertyValuesChanged.connect(self.modelPropertiesChanged) You put update() into the highest appropriate place in the class hierarchy, and then each subclass simply overrides _update to do the work. That way, you iterate over nodes exactly once, and every point in the hierarchy gets to do its own _update. 
ChrisA From steve at pearwood.info Thu Sep 19 14:18:29 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Sep 2013 22:18:29 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: <20130919121828.GK19939@ando> On Thu, Sep 19, 2013 at 07:12:26PM +1000, Nick Coghlan wrote: > is there any obvious case where "iterable but > not an iterator" gives the wrong answer? I'm not sure if it counts as "obvious", but one can write an iterator that is re-iterable. A trivial example: class Reiter: def __init__(self): self.i = 0 def __next__(self): i = self.i if i < 10: self.i += 1 return i self.i = 0 raise StopIteration def __iter__(self): return self I know that according to the iterator protocol, such a re-iterator counts as "broken": [quote] The intention of the protocol is that once an iterator's next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken. (This constraint was added in Python 2.3; in Python 2.2, various iterators are broken according to this rule.) http://docs.python.org/2/library/stdtypes.html#iterator-types but clearly there is a use-case for re-iterable "things", such as dict views, which can be re-iterated over. We just don't call them iterators. So maybe there should be a way to distinguish between "oops this iterator is broken" and "yes, this object can be iterated over repeatedly, it's all good". At the moment, dict views aren't directly iterable (you can't call next() on them). But in principle they could have been designed as re-iterable iterators. Another example might be iterators with a reset or restart method, or similar. E.g. file objects and seek(0). File objects are officially "broken" iterators, since you can seek back to the beginning of the file. I don't think that's a bad thing.
But nor am I sure that it requires a special Reiterable class so we can test for it. -- Steven From steve at pearwood.info Thu Sep 19 14:28:30 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Sep 2013 22:28:30 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> Message-ID: <20130919122829.GL19939@ando> On Thu, Sep 19, 2013 at 06:31:12AM -0400, Terry Reedy wrote: > On 9/19/2013 4:59 AM, Neil Girdhar wrote: > >Well, generators are iterable, but if you write a function like: > > > >def f(s): > > for x in s: > > do_something(x) > > for x in s: > > do_something_else(x) > > This strikes me as bad design. It should perhaps a) be two functions or > b) take two iterable arguments or c) jam the two loops together. Perhaps, but sometimes there are hidden loops. Here's an example near and dear to my heart... *wink* def variance(data): # Don't do this. sumx = sum(data) sumx2 = sum(x**2 for x in data) ss = sumx2 - (sumx**2)/n return ss/(n-1) Ignore the fact that this algorithm is numerically unstable. It fails for iterator arguments, because sum(data) consumes the iterator and leaves sumx2 always equal to zero. -- Steven From ncoghlan at gmail.com Thu Sep 19 15:02:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Sep 2013 23:02:57 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130919121828.GK19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: On 19 September 2013 22:18, Steven D'Aprano wrote: > The intention of the protocol is that once an iterator?s next() method > raises StopIteration, it will continue to do so on subsequent calls. > Implementations that do not obey this property are deemed broken. (This > constraint was added in Python 2.3; in Python 2.2, various iterators are > broken according to this rule.) 
> > http://docs.python.org/2/library/stdtypes.html#iterator-types > > > but clearly there is a use-case for re-iterable "things", such as dict > views, which can be re-iterated over. We just don't call them iterators. > So maybe there should be a way to distinguish between "oops this > iterator is broken" and "yes, this object can be iterated over > repeatedly, it's all good". > > At the moment, dict views aren't directly iterable (you can't call > next() on them). But in principle they could have been designed as > re-iterable iterators. That's not what iterable means. The iterable/iterator distinction is well defined and reflected in the collections ABCs: * iterables are objects that return iterators from __iter__. * iterators are the subset of iterables that return "self" from __iter__, and expose a next (2.x) or __next__ (3.x) method That "iterators return self from __iter__" is important, since almost everywhere Python iterates over something, it calls "_itr = iter(obj)" first. So, my question is a genuine one. While, *in theory*, an object can define a stateful __iter__ method that (e.g.) only works the first time it is called, or returns a separate object that still stores its "current position" information on the original container, I simply can't think of a non-pathological case where "isinstance(obj, Iterable) and not isinstance(obj, Iterator)" would give the wrong answer. In theory, yes, an object could obviously pass that test and still not be Reiterable, but I'm interested in what's true in *practice*. Cheers, Nick. P.S. Generator-iterators are a further subset of iterators that expose send and throw and are integrated with the interpreter eval loop in various ways that other objects can't yet match. Although I think Mark Shannon has some ideas about refactoring that API to let other objects plug into it.
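[Editor's sketch, not part of the original thread: the iterable/iterator distinction Nick describes, spelled with the collections.abc names available since Python 3.3.]

```python
# A plain iterable can be iterated repeatedly; an iterator is itself
# consumed by iteration and returns self from __iter__.
from collections.abc import Iterable, Iterator

data = [1, 2, 3]        # an iterable that is not an iterator
it = iter(data)         # an iterator over it

assert isinstance(data, Iterable) and not isinstance(data, Iterator)
assert isinstance(it, Iterator)   # every Iterator is also an Iterable
assert iter(it) is it             # iterators return self from __iter__

assert list(data) == [1, 2, 3] == list(data)  # reiterable
assert list(it) == [1, 2, 3]
assert list(it) == []             # the iterator is now exhausted
```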
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From abarnert at yahoo.com Thu Sep 19 18:07:40 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 19 Sep 2013 09:07:40 -0700 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> Message-ID: <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> On Sep 17, 2013, at 10:41, Alexander Belopolsky wrote: > On Tue, Sep 17, 2013 at 1:02 PM, Brett Cannon wrote: >> As you pointed out, getting the locale details is essentially not possible in a cross-platform way unless you use strptime or strftime, so you have to choose which is implemented in Python and relies on the other. > > What we can do is to implement "C" locale behavior. In fact, in many uses of strftime() its locale-dependence is a problem. But in many cases it's useful. And the platform doesn't give us any way to get enough information about the locale to implement it ourselves. It's the same reason we have naive local times--local times are useful, the platform doesn't give us enough information about the local timezone, so we have to use what it gives us. > I would much rather have strftime_l()-like function and "C" locale implemented in stdlib. I agree that having both would be useful. If you're suggesting renaming platform-dependent locale-handling strftime to strftime_l, and adding a new "C"-locale-only strftime, I don't like the naming. The function that acts just like the POSIX function strftime, and like the Python function in every version up to now, should be called strftime; give the new function a different name instead. Otherwise, I can't see a problem. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abarnert at yahoo.com Thu Sep 19 18:26:53 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 19 Sep 2013 09:26:53 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130919121828.GK19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: <89A6482E-0700-4414-9E09-9AEB143F30E1@yahoo.com> On Sep 19, 2013, at 5:18, Steven D'Aprano wrote: > On Thu, Sep 19, 2013 at 07:12:26PM +1000, Nick Coghlan wrote: > >> is there any obvious case where "iterable but >> not an iterator" gives the wrong answer? > > I'm not sure if it counts as "obvious", but one can write an iterator > that is re-iterable. A trivial example: > > class Reiter: > def __init__(self): > self.i = 0 > def __next__(self): > i = self.i > if i < 10: > self.i += 1 > return i > self.i = 0 > raise StopIteration > def __iter__(self): > return self > > > I know that according to the iterator protocol, such a re-iterator > counts as "broken": It also wouldn't break the OP's code, or any other reasonable code that cares about the distinction; at worst, it would cause it to unnecessarily make an extra list copy. The only thing that would break the code is something that isn't an iterator, is an iterable, and can only be iterated once. Of course you could build that as well, but it would be even more pathological than your example. For example: class OneShot: def __init__(self, it): self.it = iter(it) def __iter__(self): return self.it Besides being something no one should ever write, it's also something no code could ever guard against. If we had the Reiterable ABC, I could just register OneShot as Reiterable. > Another example might be iterators with a reset or restart method, or > similar. E.g. file objects and seek(0). File objects are officially > "broken" iterators, since you can seek back to the beginning of the > file. I don't think that's a bad thing. 
They can't be reiterated if you just treat them as iterators. You have to treat them as files--e.g., call seek(0)--if you want to reiterate them. So there's no way this could be a problem in any real code. > But nor am I sure that it requires a special Reiterable class so we can > test for it. Unless you added a "__reiter__" method, or some other way to get a new, reset-to-the-start, iterator from the iterable, such a class wouldn't help anyway. And even if we had that method, for loops and yield from and so on would all have to try __reiter__ first and fall back to __iter__. Otherwise, you still wouldn't be able to pass a file to the OP's code, or any other code that distinguishes on Reiterable to decide whether to copy or tee or use a one-pass algorithm instead of multi-pass or whatever. From alexander.belopolsky at gmail.com Thu Sep 19 19:57:08 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 19 Sep 2013 13:57:08 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> Message-ID: On Thu, Sep 19, 2013 at 12:07 PM, Andrew Barnert wrote: > If you're suggesting renaming platform-dependent locale-handling strftime > to strftime_l, ... I was thinking of changing datetime.strftime(fmt) signature to strftime(fmt, locale=None) with default behavior being the same as now and d.strftime(fmt, "C") invoking new internal C-locale implementation. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Thu Sep 19 23:25:25 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 Sep 2013 17:25:25 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130919122829.GL19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919122829.GL19939@ando> Message-ID: On 9/19/2013 8:28 AM, Steven D'Aprano wrote: > On Thu, Sep 19, 2013 at 06:31:12AM -0400, Terry Reedy wrote: >> On 9/19/2013 4:59 AM, Neil Girdhar wrote: >>> Well, generators are iterable, but if you write a function like: >>> >>> def f(s): >>> for x in s: >>> do_something(x) >>> for x in s: >>> do_something_else(x) >> >> This strikes me as bad design. It should perhaps a) be two functions or >> b) take two iterable arguments or c) jam the two loops together. > > Perhaps, but sometimes there are hidden loops. Here's an example near > and dear to my heart... *wink* > > def variance(data): > # Don't do this. > sumx = sum(data) > sumx2 = sum(x**2 for x in data) > ss = sumx2 - (sumx**2)/n > return ss/(n-1) > > > Ignore the fact that this algorithm is numerically unstable. Lets not ;-) > It fails > for iterator arguments, because sum(data) consumes the iterator and > leaves sumx2 always equal to zero. This is doubly bad design because the two 'hidden' loops are trivially jammed together in one explicit loop, while use of Reiterable would not remove the numerical instability. While it may seem that a numerically stable solution needs two loops (the second to sum (x-sumx)**2), the two loops can still be jammed together with the Method of Provisional Means. 
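[Editor's sketch, not from the thread: the "jammed together" single pass Terry alludes to, via the Method of Provisional Means (Welford's online algorithm). Because the data is traversed exactly once, a one-shot iterator suffices and the stability problem goes away.]

```python
# Single-pass sample variance using running (provisional) means.
def variance(data):
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)

# Works even when `data` is an iterator that cannot be re-iterated.
assert variance(iter([1.0, 2.0, 3.0, 4.0, 5.0])) == 2.5
```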
http://www.stat.wisc.edu/~larget/math496/mean-var.html http://www.statistical-solutions-software.com/BMDP-documents/BMDP-Formula1.pdf Also called 'online algorithm' and 'Weighted incremental algorithm' in https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance This was invented and used back when re-iteration of large datasets (on cards or tape) was possible but very slow (1970s or before). (Restack or rewind and reread might triple the (expensive) run time.) -- Terry Jan Reedy From tjreedy at udel.edu Thu Sep 19 23:40:20 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 Sep 2013 17:40:20 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130919121828.GK19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: On 9/19/2013 8:18 AM, Steven D'Aprano wrote: > On Thu, Sep 19, 2013 at 07:12:26PM +1000, Nick Coghlan wrote: > >> is there any obvious case where "iterable but >> not an iterator" gives the wrong answer? > > I'm not sure if it counts as "obvious", but one can write an iterator > that is re-iterable. A trivial example: > > class Reiter: > def __init__(self): > self.i = 0 > def __next__(self): > i = self.i > if i < 10: > self.i += 1 > return i > self.i = 0 This, I agree, is bad. > raise StopIteration > def __iter__(self): > return self > > > I know that according to the iterator protocol, such a re-iterator > counts as "broken": > > [quote] > The intention of the protocol is that once an iterator's next() method > raises StopIteration, it will continue to do so on subsequent calls. I would add 'unless and until iter() or another reset method is called'. Once one pokes at an iterator with another mutation method, all bets are off.
I would consider Reiter less broken or not at all if the reset in __next__ were removed, since then it would continue to raise until explicity reset with __iter__ > Implementations that do not obey this property are deemed broken. (This > constraint was added in Python 2.3; in Python 2.2, various iterators are > broken according to this rule.) > > http://docs.python.org/2/library/stdtypes.html#iterator-types > > but clearly there is a use-case for re-iterable "things", such as dict > views, which can be re-iterated over. We just don't call them iterators. > So maybe there should be a way to distinguish between "oops this > iterator is broken" and "yes, this object can be iterated over > repeatedly, it's all good". > > At the moment, dict views aren't directly iterable (you can't call > next() on them). But in principle they could have been designed as > re-iterable iterators. > > Another example might be iterators with a reset or restart method, or > similar. E.g. file objects and seek(0). File objects are officially > "broken" iterators, since you can seek back to the beginning of the > file. I don't think that's a bad thing. > > But nor am I sure that it requires a special Reiterable class so we can > test for it. > > -- Terry Jan Reedy From abarnert at yahoo.com Fri Sep 20 00:00:27 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 19 Sep 2013 15:00:27 -0700 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> Message-ID: On Sep 19, 2013, at 10:57, Alexander Belopolsky wrote: > > On Thu, Sep 19, 2013 at 12:07 PM, Andrew Barnert wrote: >> If you're suggesting renaming platform-dependent locale-handling strftime to strftime_l, ... 
> > I was thinking of changing datetime.strftime(fmt) signature to strftime(fmt, locale=None) with default behavior being the same as now and d.strftime(fmt, "C") invoking new internal C-locale implementation. But that API implies that you could call, e.g., d.strftime(fmt, "pt_BR"), which I assume isn't something anyone is planning on implementing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Fri Sep 20 00:22:29 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 18:22:29 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: Why not do it the way Antoine suggested, but instead of self.need_cloning = isinstance(it, collections.Iterator) have self.need_cloning = isinstance(it, collections.Reiterable) Then, mark the appropriate classes as subclasses of collections.Reiterable where collections.Sequence < collections.Reiterable < collections.Iterable? Best, Neil On Thu, Sep 19, 2013 at 5:40 PM, Terry Reedy wrote: > On 9/19/2013 8:18 AM, Steven D'Aprano wrote: > >> On Thu, Sep 19, 2013 at 07:12:26PM +1000, Nick Coghlan wrote: >> >> is there any obvious case where "iterable but >>> not an iterator" gives the wrong answer? >>> >> >> I'm not sure if it counts as "obvious", but one can write an iterator >> that is re-iterable. A trivial example: >> >> class Reiter: >> def __init__(self): >> self.i = 0 >> def __next__(self): >> i = self.i >> if i < 10: >> self.i += 1 >> return i >> self.i = 0 >> > > This, I agree, is bad. > > > raise StopIteration >> def __iter__(self): >> return self >> >> >> I know that according to the iterator protocol, such a re-iterator >> counts as "broken": >> >> [quote] >> The intention of the protocol is that once an iterator?s next() method >> raises StopIteration, it will continue to do so on subsequent calls. 
>> > > I would add 'unless and until iter() or another reset method is called. > Once one pokes at a iterator with another mutation method, all bets are > off. I would consider Reiter less broken or not at all if the reset in > __next__ were removed, since then it would continue to raise until > explicity reset with __iter__ > > > Implementations that do not obey this property are deemed broken. (This >> constraint was added in Python 2.3; in Python 2.2, various iterators are >> broken according to this rule.) >> >> http://docs.python.org/2/**library/stdtypes.html#**iterator-types >> >> but clearly there is a use-case for re-iterable "things", such as dict >> views, which can be re-iterated over. We just don't call them iterators. >> So maybe there should be a way to distinguish between "oops this >> iterator is broken" and "yes, this object can be iterated over >> repeatedly, it's all good". >> >> At the moment, dict views aren't directly iterable (you can't call >> next() on them). But in principle they could have been designed as >> re-iterable iterators. >> >> Another example might be iterators with a reset or restart method, or >> similar. E.g. file objects and seek(0). File objects are officially >> "broken" iterators, since you can seek back to the beginning of the >> file. I don't think that's a bad thing. >> >> But nor am I sure that it requires a special Reiterable class so we can >> test for it. >> >> >> > > -- > Terry Jan Reedy From tjreedy at udel.edu Fri Sep 20 03:28:04 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 19 Sep 2013 21:28:04 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: I am going to answer three people in one response. In no particular order... On 9/19/2013 9:02 AM, Nick Coghlan wrote: > So, my question is a genuine one. While, *in theory*, an object can > define a stateful __iter__ method that (e.g.) only works the first > time it is called, or returns a separate object that still stores it's > "current position" information on the original container, I simply > can't think of a non-pathological case where "isinstance(obj, > Iterable) and not isinstance(obj, Iterator)" would give the wrong > answer. > In theory, yes, an object could obviously pass that test and still not > be Reiterable, but I'm interested in what's true in *practice*.
To back up a bit: When dev writes a function, dev is responsible for specifying acceptable inputs. Neither the language nor custom requires dev to test that inputs meet the specification. Looking before leaping may not always work; I believe this to be true when inputs are iterables. When user calls a function, user is responsible for providing arguments that meet the specification, and accepts the consequences either way. When dev specifies an 'iterable' argument, he is (or should be) saying that the argument will be iterated at most once and probably will be iterated eventually. If user passes an iterator, user should (except possibly in rare cases) not use it otherwise.

The first problem, which impinges on both specification and reiteration, is that an iterable may be finite, or not, or 'in between', depending on the hardware and user needs. I think we should take 'iterable' to mean 'finite iterable' unless dev explicitly relaxes that by saying 'possibly infinite iterable'. (To be clear, infinite iterables are extremely useful.) An additional complication, including for reiteration, is that 'practically' finite may differ for time and space. For instance, 'for i in range(10000000000): pass  # 10 billion iterations' would take about 5 minutes on my machine, while list(range(10000000000)) would fail. (The opposite situation is possible, but less relevant to this issue.)

Currently, if dev needs to iterate an input more than once, the specification should say so. If the user wants to pass an iterator, the user can instead pass list(iter). The reason to have user rather than dev make this call is that user is in a better position than dev to know whether iter is effectively finite.

Now to the varieties of reiteration:

A. Serial: iterate the input (typically to exhaustion) and then reiterate (typically to exhaustion). In the typical case, the iterable must be finite. Given a finite iterator iter, list(iter) is probably more efficient than tee(iter).
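Terry's serial case can be sketched as follows; the generator and the names here are illustrative, not from the thread:

```python
from itertools import tee

def squares(n):
    # A one-shot iterator: a single pass exhausts it.
    for i in range(n):
        yield i * i

# Option 1: materialize once with list(), then iterate as often as needed.
items = list(squares(4))
first_pass = sum(items)   # 0 + 1 + 4 + 9
second_pass = sum(items)  # the list can be walked again

# Option 2: tee() yields independent iterators over one underlying source.
a, b = tee(squares(4))
assert sum(a) == sum(b) == first_pass == second_pass == 14
```

For a finite iterator consumed to exhaustion both ways work; list() pays the memory up front, tee() buffers only as far as the slower iterator lags.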
But let user decide if either is sensible.

B. Parallel: iterate the input with two iterators that march along more or less in parallel. The degenerate extreme 'for a,b in zip(iter,iter):' would be better written 'for a in iter: b = a'. If the two iterators are mostly in sync, then the second iterator is only really needed when they diverge. In any case, parallel iteration is best handled internally, invisible to the caller, with tee or with two or more indexes. (Indexes into a concrete collection are nice because it is so easy to sync one to the other -- 'i = j' or 'j = i'.) While re does this with finite strings, the underlying iterable for such functions does not, in general, need to be finite.

C. Crossed: iterate different dimensions in 'crossed' fashion: "for i in row: for j in column". For this to involve reiteration, case one is square arrays iterated by index. But then it is not an issue, as that will be done with a reiterable range. Case two is multiple iterator inputs, with cross products as one example:

def cross(itera, iterb):
    for a in itera:
        for b in iterb:
            yield a, b

The doc should specify that itera and iterb must be independent iterables. Note that the outermost iterator does not have to be finite.

Useful example and determinism: generator functions are callable but not iterable. For the simple iterate-once situation, one calls the function and passes the resulting generator. For reiteration, the following may work:

class GenfIt:
    def __init__(self, genf, *args):
        self.genf = genf
        self.args = args
    def __iter__(self):
        return self.genf(*self.args)  # self.args, not the bare name args

However, another hidden assumption in this thread has been that non-iterator iterables are deterministic, in the sense that re-calling iter(it) returns an iterator that yields the same sequence of items before raising StopIteration. Some very useful iterator-producing functions do not do that (ones returning iterators based on pseudo-random or external inputs). So we need to add 'deterministic' to the notion of 'reiterable'.
And that cannot be mechanically determined. (Other possible complications: a resource can only be accessed by one connection at a time. Or it limits the frequency of connections.) In summary: A. There are multiple iterable and iteration use cases. B. We cannot really get away from documenting the requirements for iterable inputs and keeping some responsibility for meeting them in the hands of callers. -- Terry Jan Reedy From abarnert at yahoo.com Fri Sep 20 04:19:02 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 19 Sep 2013 19:19:02 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> On Sep 19, 2013, at 15:22, Neil Girdhar wrote: > Why not do it the way Antoine suggested, but instead of > > self.need_cloning = isinstance(it, collections.Iterator) > > have > > self.need_cloning = isinstance(it, collections.Reiterable) Because we already have Iterator today, and we don't have Reiterable, and nobody has yet come up with a useful case where the latter would do the right thing and the former wouldn't. (Also because the second one is the exact opposite of what you meant... But I assume that's a simple typo.) > > Then, mark the appropriate classes as subclasses of collections.Reiterable where collections.Sequence < collections.Reiterable < collections.Iterable? > > Best, > > Neil > > > On Thu, Sep 19, 2013 at 5:40 PM, Terry Reedy wrote: >> On 9/19/2013 8:18 AM, Steven D'Aprano wrote: >>> On Thu, Sep 19, 2013 at 07:12:26PM +1000, Nick Coghlan wrote: >>> >>>> is there any obvious case where "iterable but >>>> not an iterator" gives the wrong answer? >>> >>> I'm not sure if it counts as "obvious", but one can write an iterator >>> that is re-iterable. 
A trivial example:

>>> class Reiter:
>>>     def __init__(self):
>>>         self.i = 0
>>>     def __next__(self):
>>>         i = self.i
>>>         if i < 10:
>>>             self.i += 1
>>>             return i
>>>         self.i = 0

>> This, I agree, is bad.

>>>         raise StopIteration
>>>     def __iter__(self):
>>>         return self

>>> I know that according to the iterator protocol, such a re-iterator >>> counts as "broken": >>> >>> [quote] >>> The intention of the protocol is that once an iterator's next() method >>> raises StopIteration, it will continue to do so on subsequent calls. >> >> I would add 'unless and until iter() or another reset method is called'. Once one pokes at an iterator with another mutation method, all bets are off. I would consider Reiter less broken, or not broken at all, if the reset in __next__ were removed, since then it would continue to raise until explicitly reset with __iter__. >> >> >>> Implementations that do not obey this property are deemed broken. (This >>> constraint was added in Python 2.3; in Python 2.2, various iterators are >>> broken according to this rule.) >>> >>> http://docs.python.org/2/library/stdtypes.html#iterator-types >>> >>> but clearly there is a use-case for re-iterable "things", such as dict >>> views, which can be re-iterated over. We just don't call them iterators. >>> So maybe there should be a way to distinguish between "oops this >>> iterator is broken" and "yes, this object can be iterated over >>> repeatedly, it's all good". >>> >>> At the moment, dict views aren't directly iterable (you can't call >>> next() on them). But in principle they could have been designed as >>> re-iterable iterators. >>> >>> Another example might be iterators with a reset or restart method, or >>> similar. E.g. file objects and seek(0). File objects are officially >>> "broken" iterators, since you can seek back to the beginning of the >>> file. I don't think that's a bad thing. >>> >>> But nor am I sure that it requires a special Reiterable class so we can >>> test for it.
>> >> >> -- >> Terry Jan Reedy >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> -- >> >> --- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/groups/opt_out. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Fri Sep 20 04:34:26 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 19 Sep 2013 19:34:26 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: On Sep 19, 2013, at 18:28, Terry Reedy wrote: > B. Parallel: iterate the input with two iterators that march along more or less in parallel. The degenerate extreme 'for a,b in zip(iter,iter):' would be better written 'for a in iter: b = a'. If the two iterators are mostly in sync, then the second iterator is only really needed when they diverge But this does something totally different for iterators and other iterables. For an iterator, zip(iter, iter) will get you items (0, 1), then (2, 3), then (4, 5), etc. For anything else, it'll get you items (0, 0), then (1, 1), etc. And the same basic difference holds for less extreme versions, except more obviously. 
So, even if there is any useful distinction between reiterable iterators and non-reiterable here (and I don't think I see one), it pales next to the distinction between iterables that return themselves on iter and those that don't. From mistersheik at gmail.com Fri Sep 20 05:18:54 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Thu, 19 Sep 2013 23:18:54 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> Message-ID: You're right, but first I'm not sure that I agree with Terry's comment that "if dev needs to iterate an input more than once, the specification should say so. If the user wants to pass an iterator, the user can instead pass list(iter). The reason to have user rather than dev make this call is that user is in a better position than dev to know whether iter is effectively finite." The problem with this is that it's exhausting to keep checking whether a function needs a reiterable or not, and it's noisy for the user of the function to have to cast things to list. I've had silent breakages when I changed the return value of a function from a list to a generator, not realizing that it was passed somewhere to another function that wanted a reiterable. Most importantly, there's no sure *and* easy way to assert that the input to a function is reiterable, and so the silent breakages are hard to discover. Even if the user should be the one deciding what to do, the dev has to be able to assert that the right thing was done. Therefore, I feel there should be a definitive test for reiterability. Either:

* The documentation should promise that "Iterable and not Iterator" implies reiterable, or
* Reiterable should be added to collections.abc, or
* some other definitive test that hasn't been brought up yet.
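As a rough sketch of what the second option might look like — this Reiterable ABC is hypothetical, it is not in collections.abc, and the subclass hook is just the "Iterable and not Iterator" heuristic from earlier in the thread dressed up as an ABC:

```python
from abc import ABCMeta
from collections.abc import Iterator

class Reiterable(metaclass=ABCMeta):
    """Hypothetical ABC: iterables whose __iter__ calls are independent."""
    @classmethod
    def __subclasshook__(cls, C):
        if cls is Reiterable:
            # Crude heuristic: defines __iter__ somewhere in the MRO,
            # but is not itself an iterator (no __next__).
            has_iter = any("__iter__" in B.__dict__ for B in C.__mro__)
            if has_iter and not issubclass(C, Iterator):
                return True
        return NotImplemented

assert isinstance([1, 2, 3], Reiterable)            # lists reiterate fine
assert isinstance({}.keys(), Reiterable)            # so do dict views
assert not isinstance(iter([1, 2, 3]), Reiterable)  # iterators do not
```

Note the hook cannot catch a pathological one-shot iterable that merely looks reiterable, which is exactly the objection raised elsewhere in the thread.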
Best, Neil On Thu, Sep 19, 2013 at 10:19 PM, Andrew Barnert wrote: > On Sep 19, 2013, at 15:22, Neil Girdhar wrote: > > Why not do it the way Antoine suggested, but instead of > > self.need_cloning = isinstance(it, collections.Iterator) > > have > > self.need_cloning = isinstance(it, collections.Reiterable) > > > Because we already have Iterator today, and we don't have Reiterable, and > nobody has yet come up with a useful case where the latter would do the > right thing and the former wouldn't. > > (Also because the second one is the exact opposite of what you meant... > But I assume that's a simple typo.) > > > Then, mark the appropriate classes as subclasses of collections.Reiterable > where collections.Sequence < collections.Reiterable < collections.Iterable? > > Best, > > Neil > > > On Thu, Sep 19, 2013 at 5:40 PM, Terry Reedy wrote: > >> On 9/19/2013 8:18 AM, Steven D'Aprano wrote: >> >>> On Thu, Sep 19, 2013 at 07:12:26PM +1000, Nick Coghlan wrote: >>> >>>> is there any obvious case where "iterable but >>>> not an iterator" gives the wrong answer? >>> >>> I'm not sure if it counts as "obvious", but one can write an iterator >>> that is re-iterable. A trivial example:

>>> class Reiter:
>>>     def __init__(self):
>>>         self.i = 0
>>>     def __next__(self):
>>>         i = self.i
>>>         if i < 10:
>>>             self.i += 1
>>>             return i
>>>         self.i = 0

>> This, I agree, is bad.

>>>         raise StopIteration
>>>     def __iter__(self):
>>>         return self

>>> I know that according to the iterator protocol, such a re-iterator >>> counts as "broken": >>> >>> [quote] >>> The intention of the protocol is that once an iterator's next() method >>> raises StopIteration, it will continue to do so on subsequent calls. >> >> I would add 'unless and until iter() or another reset method is called. >> Once one pokes at an iterator with another mutation method, all bets are >> off.
I would consider Reiter less broken or not at all if the reset in >> __next__ were removed, since then it would continue to raise until >> explicitly reset with __iter__ >> >> >>> Implementations that do not obey this property are deemed broken. (This >>> constraint was added in Python 2.3; in Python 2.2, various iterators are >>> broken according to this rule.) >>> >>> http://docs.python.org/2/library/stdtypes.html#iterator-types >>> >>> but clearly there is a use-case for re-iterable "things", such as dict >>> views, which can be re-iterated over. We just don't call them iterators. >>> So maybe there should be a way to distinguish between "oops this >>> iterator is broken" and "yes, this object can be iterated over >>> repeatedly, it's all good". >>> >>> At the moment, dict views aren't directly iterable (you can't call >>> next() on them). But in principle they could have been designed as >>> re-iterable iterators. >>> >>> Another example might be iterators with a reset or restart method, or >>> similar. E.g. file objects and seek(0). File objects are officially >>> "broken" iterators, since you can seek back to the beginning of the >>> file. I don't think that's a bad thing. >>> >>> But nor am I sure that it requires a special Reiterable class so we can >>> test for it. >>> >>> >> >> -- >> Terry Jan Reedy >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> -- >> >> --- You received this message because you are subscribed to a topic in >> the Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to >> python-ideas+unsubscribe@googlegroups.com. >> For more options, visit https://groups.google.com/groups/opt_out.
>> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri Sep 20 07:15:31 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 20 Sep 2013 14:15:31 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> Message-ID: <874n9g3wgc.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > Most importantly, there's no sure *and* easy way to assert that the > input a function is reiterable, and so the silent breakages are > hard to discover. But you haven't defined "reiterable" yet, except "it fixes the breakage I've experienced". "Reiterable" could mean that the same object can be passed to iteration contexts freely, and in each one it will start from the beginning and the context will receive the same sequence of objects (in the same order). Or order might not be guaranteed. It might mean that the same object once exhausted can be passed to another iteration context and it will restart. Or it might mean that the object supports a rewind method that must be explicitly called, but can be called even if the object hasn't been exhausted. Or it might mean that the object is clonable, and functions that iterate objects passed into them must clone them unless they know that the object will never be reiterated. All of the above also have concurrent variations: in a threading context, multiple threads have access to each object and might be iterating with arbitrary timing. (Eg, if a program is rewritten to use threads, a sequential reiteration could easily become a parallel/concurrent reiteration.) Oh, another: AFAIK even non-iterator iterables may change their content when iterated. 
Eg, weak containers: I forget if there are any iterables that allow > deletions and insertions in underlying containers, but in the case of > a weak ref deleting a ref elsewhere may cause the ref itself to > disappear. Should "reiterable" provide any guarantees there? > Even if the user should be the one deciding what to do, the dev has > to be able to assert that the right thing was done. But there's no such need *between user and dev*. Assertions protect a dev from *herself*. Users do what they do, and devs either protect themselves from user vagaries, or they don't. If the dev wants to protect herself from undesirable user choices, cloning an iterator in Python should be cheap. If it isn't, let's fix that. Assertions are useful, indeed. But in this case, where the assertion itself is based on an undefined term as far as I can tell (I suspect this is because different use cases actually want different definitions), rather than an assertion the dev should treat herself (as a writer of other modules) as a user. Ie, she should protect herself from herself in the same way by cloning the iterator (for efficiency converting the iterable to an iterator then cloning). Regards, From abarnert at yahoo.com Fri Sep 20 09:51:03 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 20 Sep 2013 00:51:03 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> Message-ID: <4BD119E4-2ED9-47B4-8AB5-3A7756C926E8@yahoo.com> On Sep 19, 2013, at 20:18, Neil Girdhar wrote: > Therefore, I feel there should be a definitive test for reiterability. Either: > * The documentation should promise that Iterable and not Iterator is reiterable, or > * Reiterable should be added to collections.abc, or > * some other definitive test that hasn't been brought up yet.
Everyone agrees that, in theory, someone could create a non-iterator Iterable that can only be iterated once. The question is: has anyone ever done such a thing? (Intentionally, that is. Anyone who accidentally created a broken iterable wouldn't be helped by a new ABC--they can't protect against unintentionally broken semantics--and even less so by a documentation change.) Can you think of any good reason anyone might ever want to do such a thing? If not, what are you hoping to protect against, and how do you hope this change to help you do so? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Fri Sep 20 10:10:35 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 20 Sep 2013 04:10:35 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <874n9g3wgc.fsf@uwakimon.sk.tsukuba.ac.jp> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> <874n9g3wgc.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Fri, Sep 20, 2013 at 1:15 AM, Stephen J. Turnbull wrote: > Neil Girdhar writes: > > > Most importantly, there's no sure *and* easy way to assert that the > > input a function is reiterable, and so the silent breakages are > > hard to discover. > > But you haven't defined "reiterable" yet, except "it fixes the > breakage I've experienced". > > "Reiterable" could mean that the same object can be passed to > iteration contexts freely, and in each one it will start from the > beginning and the context will receive the same sequence of objects > (in the same order). Or order might not be guaranteed. It might mean > that the same object once exhausted can be passed to another iteration > context and it will restart. Or it might mean that the object > supports a rewind method that must be explicitly called, but can be > called even if the object hasn't been exhausted. 
Or it might mean > that the object is clonable, and functions that iterate objects passed > into them must clone them unless they know that the object will never > be reiterated. > > Many different solutions would fix the problems I've seen. My suggestion is that Reiterable should be defined as an iterable for which calling the __iter__ method yields the same elements in the same order irrespective of whether __iter__ is called while a previously returned iterator is still iterating. That way Antoine's above code would turn any non-reiterable into a reiterable of this strong definition. Correct me if I'm wrong, but views on dicts are reiterable. All of the above also have concurrent variations: in a threading > context, multiple threads have access to each object and might be > iterating with arbitrary timing. (Eg, if a program is rewritten to > use threads, a sequential reiteration could easily become a > parallel/concurrent reiteration.) Oh, another: AFAIK even > non-iterator iterables may change their content when iterated. Eg, > weak containers: I forget if there are any iterables that allow > deletions and insertions in underlying containers, but in the case of > a weak ref deleting a ref elsewhere may cause the ref itself to > disappear. Should "reiterable" provide any guarantees there? >

I think it shouldn't, because Sequence doesn't guarantee that

x = len(a)
f(a)
a[x-1] = 5

won't throw if, e.g., f does something to a.
In my case, I'm both the dev and the user. I think a better way to word is that I personally want to encapsulate the double-iteration in the member function. I don't want to have to know how my iterable is going to be used as a caller. Your second point that the method should be able to cheaply clone an iterator cheaply is precisely what I'd like to achieve with a "Reiterator" class like Antoine's. Its problem is that it makes an assumption that non-iterator iterables are reiterable, which is not promised. For that class to work, that should either be promised or another mechanism should be provided to satisfy its initial check. > > Assertions are useful, indeed. But in this case, where the assertion > itself is based on an undefined term as far as I can tell (I suspect > this is because different use cases actually want different > definitions), rather than an assertion the dev should treat herself > (as a writer of other modules) as a user. Ie, she should protect > herself from herself in the same way by cloning the iterator (for > efficiency converting the iterable to an iterator then cloning). > It sounds like you're saying put the items into a list no matter what. That's what I was doing before this thread. I just thought it would be more efficient if the object were a view, a list, a tuple, or a numpy array, for the code to elide the list construction. This could be achieved as described above. Best, Neil > Regards, > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Fri Sep 20 10:33:05 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 20 Sep 2013 10:33:05 +0200 Subject: [Python-ideas] Introduce collections.Reiterable References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: <20130920103305.0d20539e@pitrou.net> Le Thu, 19 Sep 2013 21:28:04 -0400, Terry Reedy a écrit : > The first problem, which impinges on both specification and > reiteration, is that an iterable may be either finite, or not, or 'in > between' depending on the hardware and user needs. This isn't a problem. itertools.tee() will deal with it fine. > An additional complication, including for reiteration, is that > 'practically' finite may be different for time and space. This is a strawman, since the "complication" applies to all kinds of iterables. > Currently, if dev needs to iterate an input more than once, the > specification should say so. If the user wants to pass an iterator, > the user can instead pass list(iter). Not if the user really wants, or needs, the iterator to be consumed lazily. This can matter if the iterator is infinite, or if consuming it has resource-consuming side effects such as doing I/O, etc. list(iter) is a limited solution to the problem. And the thing is, using a Reiterable helper doesn't preclude the caller from calling list() as well, so it's a strawman here. > Now to the varieties of reiteration: > > A. Serial: [...] > > B. Parallel: [...] > > C: Crossed: [...] Nice discussion, but unrelated. If the iterable doesn't work in those situations, it is purely a bug in the iterable, and it's not related to "reiteration".
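Antoine's laziness point can be sketched with an infinite iterator, where list() can never be used but tee() works fine:

```python
from itertools import count, islice, tee

naturals = count()    # infinite iterator: list(naturals) would never return
a, b = tee(naturals)  # two independent, lazy views of the same stream

head_a = list(islice(a, 5))
head_b = list(islice(b, 5))  # b replays the items a already consumed
assert head_a == head_b == [0, 1, 2, 3, 4]
```

tee() only buffers the items that one branch has consumed and the other has not, so it stays lazy even over an unbounded source.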
In other words, if an API returns something that cannot be iterated an arbitrary number of times, it should return an iterator, not an iterable ;-) > However, another hidden assumption in this thread has been that > non-iterator iterables are deterministic, in the sense that > re-calling iter(it) returns an iterator that yields the same sequence > of items before raising StopIteration. Some very useful > iterator-producing functions do not do that (ones returning iterators > based on pseudo-random or external inputs). Well, I hardly ever use non-deterministic iterables, and I can't remember passing a "pseudo-random iterator" to a function expecting a generic iterable. YMMV. > So we need to add > 'deterministic' to the notion of 'reiterable'. And that cannot be > mechanically determined. Many things are not mechanically determined that still make sense to specify in an API. "Mechanically determined" is a rather silly criterion when designing APIs, especially in a dynamic language where nothing can ever be taken for granted. (in other words, if you want "mechanically determined" API guarantees, perhaps you should try Haskell or Rust :-)) > (Other possible complications: a resource can only be accessed by one > connection at a time. Or it limits the frequency of connections.) That's true, but the caller can still call list() regardless of how the callee is implemented. Regards Antoine. From steve at pearwood.info Fri Sep 20 11:48:58 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 20 Sep 2013 19:48:58 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> Message-ID: <20130920094854.GO19939@ando> On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote: > On 19 September 2013 22:18, Steven D'Aprano wrote: [...] > > At the moment, dict views aren't directly iterable (you can't call > > next() on them).
But in principle they could have been designed as > > re-iterable iterators. > > That's not what iterable means. The iterable/iterator distinction is > well defined and reflected in the collections ABCs: Actually, I think the collections ABC gets it wrong, according to both common practice and the definition given in the glossary: http://docs.python.org/3.4/glossary.html More on this below. As for my comment above, dict views don't obey the iterator protocol themselves, as they have no __next__ method, nor do they obey the sequence protocol, as they are not indexable. Hence they are not *directly* iterable, but they are *indirectly* iterable, since they have an __iter__ method which returns an iterator. I don't think this is a critical distinction. I think it is fine to call views "iterable", since they can be iterated over. On the rare occasion that it matters, we can just do what I did above, and talk about objects which are directly iterable (e.g. iterators, sequences, generator objects) and those which are indirectly iterable (e.g. dict views). > * iterables are objects that return iterators from __iter__. That definition is incomplete, because iterable objects include those that obey the sequence protocol. This is not only by long-standing tradition (pre-dating the introduction of iterators, if I remember correctly), but also as per the definition in the glossary. Alas, collections.Iterable gets this wrong:

py> class Seq:
...     def __getitem__(self, index):
...         if 0 <= index < 5: return index+1000
...         raise IndexError
...
py> s = Seq()
py> isinstance(s, Iterable)
False
py> list(s)  # definitely iterable
[1000, 1001, 1002, 1003, 1004]

(Note that although Seq obeys the sequence protocol, and can be iterated over, it is not a fully-fledged Sequence since it has no __len__.) I think this is a bug in the Iterable ABC, but I'm not sure how one might fix it.
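One conceivable way to "fix it", sketched purely as an illustration (this is not what collections.abc actually does, and the class name here is made up), is a subclass hook that also accepts the legacy __getitem__ protocol:

```python
from abc import ABCMeta

class IterableToo(metaclass=ABCMeta):
    """Hypothetical ABC: accepts __iter__ or the legacy sequence protocol."""
    @classmethod
    def __subclasshook__(cls, C):
        if cls is IterableToo:
            # Accept either hook anywhere in the MRO.
            if any("__iter__" in B.__dict__ or "__getitem__" in B.__dict__
                   for B in C.__mro__):
                return True
        return NotImplemented

class Seq:
    def __getitem__(self, index):
        if 0 <= index < 5:
            return index + 1000
        raise IndexError

assert isinstance(Seq(), IterableToo)                # accepted via __getitem__
assert list(Seq()) == [1000, 1001, 1002, 1003, 1004]
assert not isinstance(42, IterableToo)               # int defines neither
```

The heuristic is over-broad — any class with a __getitem__ would pass, whether or not indexing from 0 actually works — which hints at why the real ABC does not attempt it.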
> * iterators are the subset of iterables that return "self" from > __iter__, and expose a next (2.x) or __next__ (3.x) method That is certainly correct. All iterators are iterables, but not all iterables are iterators. > That "iterators return self from __iter__" is important, since almost > everywhere Python iterates over something, it calls "_itr = iter(obj)" > first. And then falls back on the sequence protocol. > So, my question is a genuine one. While, *in theory*, an object can > define a stateful __iter__ method that (e.g.) only works the first > time it is called, or returns a separate object that still stores its > "current position" information on the original container, I simply > can't think of a non-pathological case where "isinstance(obj, > Iterable) and not isinstance(obj, Iterator)" would give the wrong > answer. > > In theory, yes, an object could obviously pass that test and still not > be Reiterable, but I'm interested in what's true in *practice*. I don't think you and I are actually in disagreement here. This is Python, and one could write an iterator class that is reiterable, or an iterable object (as determined by isinstance) which cannot be iterated over, but I think we can dismiss them as pathological cases. Even if such unusual objects are useful, it is the caller's responsibility, not the callee's, to use them safely and appropriately with functions that are expecting them.
-- Steven From p.f.moore at gmail.com Fri Sep 20 12:03:17 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 20 Sep 2013 11:03:17 +0100 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130920094854.GO19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: On 20 September 2013 10:48, Steven D'Aprano wrote: > Actually, I think the collections ABC gets it wrong, according to both > common practice and the definition given in the glossary: > > http://docs.python.org/3.4/glossary.html > > More on this below. > > As for my comment above, dict views don't obey the iterator protocol > themselves, as they have no __next__ method, nor do they obey the > sequence protocol, as they are not indexable. Hence they are not > *directly* iterable, but they are *indirectly* iterable, since they have > an __iter__ method which returns an iterator. > > I don't think this is a critical distinction. I think it is fine to call > views "iterable", since they can be iterated over. On the rare occasion > that it matters, we can just do what I did above, and talk about objects > which are directly iterable (e.g. iterators, sequences, generator > objects) and those which are indirectly iterable (e.g. dict views). An iterable is an object that returns an iterator when passed to iter(). It's *iterators* that have to have __next__, not iterables. 
An iterable has to have __iter__, which as far as I know dict views do: >>> {}.keys().__iter__ Paul From mistersheik at gmail.com Fri Sep 20 12:18:47 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 20 Sep 2013 06:18:47 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130920094854.GO19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: On Fri, Sep 20, 2013 at 5:48 AM, Steven D'Aprano wrote: > On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote: > > On 19 September 2013 22:18, Steven D'Aprano wrote: > [...] > > > At the moment, dict views aren't directly iterable (you can't call > > > next() on them). But in principle they could have been designed as > > > re-iterable iterators. > > > > That's not what iterable means. The iterable/iterator distinction is > > well defined and reflected in the collections ABCs: > > Actually, I think the collections ABC gets it wrong, according to both > common practice and the definition given in the glossary: > > http://docs.python.org/3.4/glossary.html Where does the glossary disagree with collections.abc? > > > More on this below. > > As for my comment above, dict views don't obey the iterator protocol > themselves, as they have no __next__ method, nor do they obey the > sequence protocol, as they are not indexable. Hence they are not > *directly* iterable, but they are *indirectly* iterable, since they have > an __iter__ method which returns an iterator. > What you're calling "indirectly iterable" is what the docs call "Iterable" > and what collections.abc calls Iterable, right? > > I don't think this is a critical distinction. I think it is fine to call > views "iterable", since they can be iterated over. On the rare occasion > that it matters, we can just do what I did above, and talk about objects > which are directly iterable (e.g.
iterators, sequences, generator > objects) and those which are indirectly iterable (e.g. dict views). > > > > * iterables are objects that return iterators from __iter__. > > That definition is incomplete, because iterable objects include those > that obey the sequence protocol. This is not only by long-standing > tradition (pre-dating the introduction of iterators, if I remember > correctly), but also as per the definition in the glossary. Alas, > collections.Iterable gets this wrong: > > py> class Seq: > ... def __getitem__(self, index): > ... if 0 <= index < 5: return index+1000 > ... raise IndexError > ... > py> s = Seq() > py> isinstance(s, Iterable) > False > py> list(s) # definitely iterable > [1000, 1001, 1002, 1003, 1004] > PEP 3119 makes it clear that isinstance( collections.Sequence) is the de facto way of checking whether something is a sequence. Casting to list is not the de facto way. Therefore, Seq is neither Iterable nor a Sequence according to collections.abc. If you inherit from collections.Sequence (you'll need to implement __len__) you'll get the Iterable stuff for free as desired: Sequence subclasses Iterable. > > > (Note that although Seq obeys the sequence protocol, and can be > iterated over, it is not a fully-fledged Sequence since it has no > __len__.) > I guess we disagree that Seq obeys the sequence protocol. > > I think this is a bug in the Iterable ABC, but I'm not sure how one > might fix it. > > > > > * iterators are the subset of iterables that return "self" from > > __iter__, and expose a next (2.x) or __next__ (3.x) method > > That is certainly correct. All iterators are iterables, but not all > iterables are iterators. > > > > That "iterators return self from __iter__" is important, since almost > everywhere Python iterates over something, it calls "_itr = iter(obj)" > first. > > And then falls back on the sequence protocol. > > > > So, my question is a genuine one.
While, *in theory*, an object can > > define a stateful __iter__ method that (e.g.) only works the first > > time it is called, or returns a separate object that still stores its > > "current position" information on the original container, I simply > > can't think of a non-pathological case where "isinstance(obj, > > Iterable) and not isinstance(obj, Iterator)" would give the wrong > > answer. > > > > In theory, yes, an object could obviously pass that test and still not > > be Reiterable, but I'm interested in what's true in *practice*. > > I don't think you and I are actually in disagreement here. This is > Python, and one could write an iterator class that is reiterable, or an > iterable object (as determined by isinstance) which cannot be iterated > over, but I think we can dismiss them as pathological cases. Even if > such unusual objects are useful, it is the caller's responsibility, not > the callee's, to use them safely and appropriately with functions that > are expecting them. > Is it possible to minimize the mental load on the caller by encapsulating the distinction between parameters that accept iterables and reiterables? One of the big problems with C++ for example is the great care that must be taken, e.g. to not write past the ends of arrays. A small mistake can take a week to track down. One does become more careful with years of experience, but it is much simpler if the language prevents such catastrophes. For me, Python has been this language in many ways. Reiterables would be another such defensively motivated distinction. Of course, you could just ask callers to "be more careful", but I don't see the problem with fixing the language specification so that Antoine's Reiterable adaptor works properly.
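(Antoine's actual adaptor is not reproduced in this thread; the following is a hypothetical sketch of what such a wrapper might look like, wrapping a zero-argument factory so that every iteration gets a fresh iterator. The class and its interface are illustrative assumptions, not an agreed design.)

```python
from collections.abc import Iterator

class Reiterable:
    """Hypothetical sketch: wrap a zero-argument factory so that each
    call to __iter__ produces a fresh, independent iterator."""

    def __init__(self, factory):
        self._factory = factory

    def __iter__(self):
        it = iter(self._factory())
        if not isinstance(it, Iterator):
            raise TypeError("factory did not produce an iterator")
        return it

# A generator expression is one-shot, but wrapping its *construction*
# makes the result safe to iterate any number of times.
squares = Reiterable(lambda: (x * x for x in range(4)))
assert list(squares) == [0, 1, 4, 9]
assert list(squares) == [0, 1, 4, 9]   # second pass works too
```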
Cheers, Neil > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From oscar.j.benjamin at gmail.com Fri Sep 20 12:45:33 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 20 Sep 2013 11:45:33 +0100 Subject: [Python-ideas] Numerical instability was: Re: Introduce collections.Reiterable Message-ID: On 19 September 2013 22:25, Terry Reedy wrote: > On 9/19/2013 8:28 AM, Steven D'Aprano wrote: >> >> def variance(data): >> # Don't do this. >> sumx = sum(data) >> sumx2 = sum(x**2 for x in data) >> ss = sumx2 - (sumx**2)/n >> return ss/(n-1) >> >> Ignore the fact that this algorithm is numerically unstable. > > Lets not ;-) > >> It fails >> for iterator arguments, because sum(data) consumes the iterator and >> leaves sumx2 always equal to zero. > > This is doubly bad design because the two 'hidden' loops are trivially > jammed together in one explicit loop, while use of Reiterable would not > remove the numerical instability. While it may seem that a numerically > stable solution needs two loops (the second to sum (x-sumx)**2), the two > loops can still be jammed together with the Method of Provisional Means.
> > http://www.stat.wisc.edu/~larget/math496/mean-var.html > http://www.statistical-solutions-software.com/BMDP-documents/BMDP-Formula1.pdf > > Also called 'online algorithm' and 'Weighted incremental algorithm' in > https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance > > This was invented and used back when re-iteration of large datasets (on > cards or tape) was possible but very slow (1970s or before). (Restack or > rewind and reread might triple the (expensive) run time.) I'm never quite sure what exactly is meant by "numerical instability" in most contexts because I'm mainly familiar with the use of the term in ODE solvers where an unstable solver will literally diverge from the true solution as if it were an unstable equilibrium. However in that context the error is truncation error rather than rounding error and would occur even with infinite precision arithmetic: http://en.wikipedia.org/wiki/Stiff_equation I'm going to assume that numerical instability is just a way of saying that a method is inaccurate in some cases. Although the incremental algorithm is much better than the naive approach Steven (knowingly) showed above I don't think it's true that constraining yourself to a single pass doesn't limit the possible accuracy. Another point of relevance here is that the incremental formula cannot be as efficiently implemented in Python since you don't get to take advantage of the super fast math.fsum function which is also more accurate than a naive Kahan algorithm. The script at the bottom of this post tests a few methods on a deliberately awkward set of random numbers and typical output is: $ ./stats.py exact: 0.989661716301 naive -> error = -21476.0408922 incremental -> error = -1.0770901604e-07 two_pass -> error = 1.29118937764e-13 three_pass -> error = 0.0 For these numbers the three_pass method usually has an error of 0 but otherwise 1ulp (1e-16). (It can actually be collapsed into a two pass method but then we couldn't use fsum.) 
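(For reference, the "Method of Provisional Means" quoted above, also known as Welford's online algorithm, maintains a running mean and a running sum of squared deviations, so a single pass suffices. A minimal sketch; it is the same recurrence as variance_incremental in the script below.)

```python
def online_variance(data):
    # Welford's update: one pass, no cancellation of large sums.
    n = 0
    mean = 0.0
    M2 = 0.0   # running sum of squared deviations from the running mean
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n            # the "provisional mean"
        M2 += delta * (x - mean)     # uses both the old and new mean
    return M2 / (n - 1)              # sample variance

# Sample variance of [2, 4, 6] is 4: mean 4, squared deviations 4+0+4 = 8, 8/2.
assert abs(online_variance([2.0, 4.0, 6.0]) - 4.0) < 1e-12
```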
If you know of a one-pass algorithm (or a way to improve the implementation I showed) that is as accurate as either the two_pass or three_pass methods I'd be very interested to see it (I'm sure Steven would be as well). Oscar

$ cat stats.py
#!/usr/bin/env python

from __future__ import print_function

from random import gauss
from math import fsum
from fractions import Fraction

# Return the exact result as a Fraction. Nothing wrong
# with using the computational formula for variance here.
def variance_exact(data):
    data = [Fraction(x) for x in data]
    n = len(data)
    sumx = sum(data)
    sumx2 = sum(x**2 for x in data)
    ss = sumx2 - (sumx**2)/n
    return ss/(n-1)

# Although this is the most efficient formula when using
# exact computation it fails under fixed precision
# floating point since it ends up subtracting two large
# almost equal numbers leading to a catastrophic loss of
# precision.
def variance_naive(data):
    n = len(data)
    sumx = fsum(data)
    sumx2 = fsum(x**2 for x in data)
    ss = sumx2 - (sumx**2)/n
    return ss/(n-1)

# Incremental variance calculation from Wikipedia. If
# the above uses fsum then a fair comparison should
# use some compensated summation here also. However
# it's not clear (to me) how to incorporate compensated
# summation here.
# http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Compensated_variant
def variance_incremental(data):
    n = 0
    mean = 0
    M2 = 0
    for x in data:
        n = n + 1
        delta = x - mean
        mean = mean + delta/n
        M2 = M2 + delta*(x - mean)
    variance = M2/(n - 1)
    return variance

# This is to me the obvious formula since I think of
# this as the definition of the variance.
def variance_twopass(data):
    n = len(data)
    mean = fsum(data) / n
    sumdev2 = fsum((x - mean)**2 for x in data)
    variance = sumdev2 / (n - 1)
    return variance

# This is the three-pass algorithm used in Steven's
# statistics module. It's not one I had seen before but
# AFAICT it's very accurate. In fact the 2nd and 3rd passes
# can be merged as in variance_incremental but then we
# wouldn't be able to take advantage of fsum.
def variance_threepass(data):
    n = len(data)
    mean = fsum(data) / n
    sumdev2 = fsum((x-mean)**2 for x in data)
    # The following sum should mathematically equal zero, but due to rounding
    # error may not.
    sumdev2 -= fsum((x-mean) for x in data)**2 / n
    return sumdev2 / (n - 1)

methods = [
    ('naive', variance_naive),
    ('incremental', variance_incremental),
    ('two_pass', variance_twopass),
    ('three_pass', variance_threepass),
]

# Test numbers with large mean and small standard deviation.
# This is the case that causes trouble for the naive formula.
N = 100000
testnums = [gauss(mu=10**10, sigma=1) for n in range(N)]

# First compute the exact result
exact = variance_incremental([Fraction(num) for num in testnums])
print('exact:', float(exact))

# Compare each with the exact result
for name, var in methods:
    print(name, '-> error =', var(testnums) - exact)

From steve at pearwood.info Fri Sep 20 13:10:00 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 20 Sep 2013 21:10:00 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: <20130920111000.GQ19939@ando> On Fri, Sep 20, 2013 at 11:03:17AM +0100, Paul Moore wrote: > An iterable is an object that returns an iterator when passed to > iter(). It's *iterators* that have to have __next__, not iterables. An > iterable has to have __iter__, which as far as I know dict views do It is not correct to say that iterables have to have an __iter__ method, by both common usage of the term, and by the definition in the glossary. Sorry to repeat myself, but iterables can also be objects which obey the sequence protocol.
I already showed an example of a non-pathological object which can be iterated over but where isinstance(obj, Iterable) returns the wrong result. See my previous post, or the end of this one, for that example. Here's the glossary entry in full: iterable An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator. http://docs.python.org/3.4/glossary.html I think we can all agree that dict views are iterables. You can iterate over them, and they have an __iter__ method. We can also agree that views aren't sequences, nor are they iterators themselves: py> keys = {}.keys() py> iter(keys) is keys False and isinstance gives the correct result for views: py> isinstance(keys, Sequence) False py> isinstance(keys, Iterator) False py> isinstance(keys, Iterable) True None of this is in dispute! But (and this was really a very minor point in my original post, seemingly blown all out of proportion) you can't iterate over a view directly. Or perhaps, for the avoidance of doubt, I should say you can't iterate over a view *manually* without creating an intermediate iterator object.
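The "intermediate iterator object" point can be shown in script form: the view itself has no __next__, so next() on it fails, but the object its __iter__ returns can be stepped manually.

```python
d = {'a': 1, 'b': 2}
keys = d.keys()

# The view is not an iterator: next() on it raises TypeError...
try:
    next(keys)
    stepped = True
except TypeError:
    stepped = False
assert not stepped

# ...but iter() returns a dict_keyiterator, which can be stepped manually.
it = iter(keys)
assert next(it) in d           # manual iteration works on the intermediate object
assert iter(it) is it          # the intermediate object IS an iterator
assert iter(keys) is not keys  # the view itself is not
```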
Iteration in Python is implemented by two protocols: 1) the iterator protocol, which repeatedly calls __next__ until StopIteration is raised; and 2) the sequence protocol, which repeatedly calls __getitem__(0), __getitem__(1), __getitem__(2), ... until IndexError is raised. Dict views don't obey either of these, as they have no __next__ or __getitem__ method. That is all I mean when I say that dict views aren't "directly [manually] iterable". Instead, they have an __iter__ method which returns an object which is directly iterable, a dict_keyiterator object. This really was a very minor point, I've already spent far more words on this than it deserves. But the important point seems to have been missed, namely that the Iterable ABC gives the wrong result for some objects which are iterable. Here it is again: py> class Seq: ... def __getitem__(self, index): ... if 0 <= index < 5: return index+1000 ... raise IndexError ... py> s = Seq() py> isinstance(s, Iterable) # The ABC claims Seq is not iterable. False py> for x in s: # But it actually is. ... print(x) ... 1000 1001 1002 1003 1004 Can anyone convince me this is not a bug in the Iterable ABC? -- Steven From steve at pearwood.info Fri Sep 20 14:45:05 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 20 Sep 2013 22:45:05 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: <20130920124505.GT19939@ando> On Fri, Sep 20, 2013 at 06:18:47AM -0400, Neil Girdhar wrote: > On Fri, Sep 20, 2013 at 5:48 AM, Steven D'Aprano wrote: > > > On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote: > > > On 19 September 2013 22:18, Steven D'Aprano wrote: > > [...] > > > > At the moment, dict views aren't directly iterable (you can't call > > > > next() on them). But in principle they could have been designed as > > > > re-iterable iterators.
> > > > > > That's not what iterable means. The iterable/iterator distinction is > > > well defined and reflected in the collections ABCs: > > > > Actually, I think the collections ABC gets it wrong, according to both > > common practice and the definition given in the glossary: > > > > http://docs.python.org/3.4/glossary.html > > > Where does the glossary disagree with collections.abc? I show below a class that is iterable, yet is not an instance of collections.Iterable. By the glossary definition it is iterable (it has a __getitem__ method that raises IndexError when there are no more items to be returned). [...] > What you're calling "indirectly iterable" is what the docs call "Iterable" > and what collections.abc call Iterable, right? I've explained this further in my reply to Paul Moore. What I should have said was *manually* iterable, in the sense of directly calling __next__ or __getitem__ on the view. Here's an example of an iterable class that collections.Iterable claims is not an iterable: > > py> class Seq: > > ... def __getitem__(self, index): > > ... if 0 <= index < 5: return index+1000 > > ... raise IndexError > > ... > > py> s = Seq() > > py> isinstance(s, Iterable) > > False > > py> list(s) # definitely iterable > > [1000, 1001, 1002, 1003, 1004] > > > > PEP 3119 makes it clear that isinstance( collections.Sequence) is the de > facto way of checking whether something is a sequence. I'm not testing whether it is a sequence. I explicitly stated it isn't a sequence, since it doesn't implement __len__. The Sequence ABC gets this right. > Casting to list is not the de facto way. No, but casting to list demonstrates that it can be iterated over. In my reply to Paul, I explicitly used it in a for-loop. > Therefore, Seq is neither Iterable nor a Sequence > according to collections.abc. I'm not concerned by Sequence. It's not a Sequence. No dispute there. But it is an iterable, since it obeys the sequence protocol and can be iterated over. 
(Which is not the same as being a sequence.) > > (Note that although Seq obeys the sequence protocol, and can be > iterated over, it is not a fully-fledged Sequence since it has no > __len__.) > > > > I guess we disagree that Seq obeys the sequence protocol. I'm not sure why you think it doesn't obey the sequence protocol. It is demonstrably true that it does. If it wasn't obvious from the source code, it should be obvious from a few seconds' experimentation at the interactive interpreter: py> s = Seq() py> s[0] 1000 py> s[1] 1001 [...cut s[2], s[3], s[4] for brevity...] py> s[5] Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 4, in __getitem__ IndexError That's all there is to the sequence protocol, and it's enough to make Seq objects iterable. -- Steven From p.f.moore at gmail.com Fri Sep 20 15:24:50 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 20 Sep 2013 14:24:50 +0100 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130920111000.GQ19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: On 20 September 2013 12:10, Steven D'Aprano wrote: > This really was a very minor point, I've already spent far more words > on this than it deserves. But the important point seems to have been > missed, namely that the Iterable ABC gives the wrong result for some > objects which are iterable. Here it is again: > > py> class Seq: > ... def __getitem__(self, index): > ... if 0 <= index < 5: return index+1000 > ... raise IndexError > ... > py> s = Seq() > py> isinstance(s, Iterable) # The ABC claims Seq is not iterable. > False > py> for x in s: # But it actually is. > ... print(x) > ... > 1000 > 1001 > 1002 > 1003 > 1004 > > > Can anyone convince me this is not a bug in the Iterable ABC? Ah, I see. I misread your point and got it backwards. My apologies.
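iter() itself performs the fallback Steven describes: given an object with only __getitem__, it manufactures an iterator that tries indexes 0, 1, 2, ... until IndexError. A sketch in script form, reusing the Seq class from above:

```python
class Seq:
    def __getitem__(self, index):
        if 0 <= index < 5:
            return index + 1000
        raise IndexError

s = Seq()
it = iter(s)            # iter() falls back to the sequence protocol
assert next(it) == 1000 # the manufactured iterator steps through index 0, 1, ...
assert next(it) == 1001

# A fresh call to iter() restarts from index 0, so Seq is also reiterable.
assert list(iter(s)) == [1000, 1001, 1002, 1003, 1004]
assert list(iter(s)) == [1000, 1001, 1002, 1003, 1004]
```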
As regards whether it is a bug, the best I can do is to refer to the definition of collections.abc.Iterable: class collections.abc.Iterable ABC for classes that provide the __iter__() method. See also the definition of iterable. Clearly the behaviour is as defined (there is no __iter__). And quite possibly the full definition of iterable (... or it has a __getitem__ *that behaves correctly when passed the integers 1, 2, 3...*) is not computable, so it's not possible to define a completely accurate spec for "what an iterable is". The ABC appears therefore to be taking a conservative approach of accepting a few false negatives for the sake of avoiding false positives. I can accept that trade-off, although I concede that it's unfortunate. But the messages I take from this are: 1. There's no way of defining an iterable ABC that covers 100% of the things that are commonly referred to as "iterables". 2. ABCs and LBYL-style coding have their own set of risks, and once again "Easier to ask for forgiveness" appears to be the approach to take :-) Paul From random832 at fastmail.us Fri Sep 20 16:46:11 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 20 Sep 2013 10:46:11 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> Message-ID: <1379688371.25215.24411669.1642F9A6@webmail.messagingengine.com> On Thu, Sep 19, 2013, at 18:00, Andrew Barnert wrote: > But that API implies that you could call, e.g., d.strftime(fmt, "pt_BR"), > which I assume isn't something anyone is planning on implementing.
Well, you could implement it by acquiring the GIL, setting the locale (putenv + setlocale), calling the platform strftime, and then resetting the locale afterward - all while locked, to prevent exposing the temporary strftime change to other code. (This also suggests a way to implement a tzinfo object in terms of native timezones) Long-term it would be nice to have python ship its own locale data, and/or to acquire platform-specific locale data via GetLocaleInfo[Ex] on windows and nl_langinfo on POSIX OSes where it is provided. (Note that the latter still would require stopping everything and setting the global locale to acquire the data, but since you've got to translate a locale name to a handle to use GetLocaleInfo or xlocale, it'd make sense to encapsulate this in a locale object which does all this upon being created. With platform strftime as a fallback. The issue with using platform strftime to populate things in advance is that %O is difficult and %E may be intractable. From abarnert at yahoo.com Fri Sep 20 17:59:28 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 20 Sep 2013 08:59:28 -0700 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <1379688371.25215.24411669.1642F9A6@webmail.messagingengine.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> <1379688371.25215.24411669.1642F9A6@webmail.messagingengine.com> Message-ID: <6DFE1EDA-7BEC-4CEA-8BCB-585EEA49FB2A@yahoo.com> On Sep 20, 2013, at 7:46, random832 at fastmail.us wrote: > On Thu, Sep 19, 2013, at 18:00, Andrew Barnert wrote: >> But that API implies that you could call, e.g., d.strftime(fmt, "pt_BR"), >> which I assume isn't something anyone is planning on implementing. 
> > Well, you could implement it by acquiring the GIL, setting the locale > (putenv + setlocale), calling the platform strftime, and then resetting > the locale afterward - all while locked, to prevent exposing the > temporary strftime change to other code. (This also suggests a way to > implement a tzinfo object in terms of native timezones) OK, yes, you could do that, but are you actually proposing that the stdlib should do so? If not, it's a misleading API. If so, it's a much larger proposal than what we initially started with. And I think providing C-locale str[fp]time with very wide, platform-independent limits is a useful idea even without this much more radical idea. > Long-term it would be nice to have python ship its own locale data, > and/or to acquire platform-specific locale data via GetLocaleInfo[Ex] on > windows and nl_langinfo on POSIX OSes where it is provided. IIRC, OS X has a different set of (CoreFoundation-based?) APIs that take the system preferences into account as well as the locale setting, which might be worth using if you're designing the ultimate locale handling system; otherwise your apps won't act like native Cocoa apps. For that matter, both Windows and OS X have more than one notion of the local date format (long vs. short names, etc.); do you want to expose that as well, or just stick to the POSIX-like subset of each platform's capabilities? > (Note that > the latter still would require stopping everything and setting the > global locale to acquire the data, but since you've got to translate a > locale name to a handle to use GetLocaleInfo or xlocale, it'd make sense > to encapsulate this in a locale object which does all this upon being > created. With platform strftime as a fallback. The issue with using > platform strftime to populate things in advance is that %O is difficult > and %E may be intractable. 
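The set-and-restore dance described above might be sketched as a context manager. This is a hypothetical illustration (the helper name temporary_locale is invented); note that setlocale changes process-global state, so real code would also need the locking discussed above to avoid exposing the temporary locale to other threads.

```python
import locale
import time
from contextlib import contextmanager

@contextmanager
def temporary_locale(category, name):
    # Hypothetical helper: setlocale is process-global, so without a
    # lock other threads would briefly see the temporary locale.
    saved = locale.setlocale(category)   # query the current setting
    try:
        locale.setlocale(category, name)
        yield
    finally:
        locale.setlocale(category, saved)

# Usage: format a date under the C locale regardless of the current one.
with temporary_locale(locale.LC_TIME, 'C'):
    s = time.strftime('%a %d %b %Y', time.gmtime(0))
assert s == 'Thu 01 Jan 1970'
```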
From abarnert at yahoo.com Fri Sep 20 18:21:43 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 20 Sep 2013 09:21:43 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130920124505.GT19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920124505.GT19939@ando> Message-ID: On Sep 20, 2013, at 5:45, Steven D'Aprano wrote: > On Fri, Sep 20, 2013 at 06:18:47AM -0400, Neil Girdhar wrote: >> On Fri, Sep 20, 2013 at 5:48 AM, Steven D'Aprano wrote: >> >>> On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote: >>>> On 19 September 2013 22:18, Steven D'Aprano wrote: >>> [...] >>>>> At the moment, dict views aren't directly iterable (you can't call >>>>> next() on them). But in principle they could have been designed as >>>>> re-iterable iterators. >>>> >>>> That's not what iterable means. The iterable/iterator distinction is >>>> well defined and reflected in the collections ABCs: >>> >>> Actually, I think the collections ABC gets it wrong, according to both >>> common practice and the definition given in the glossary: >>> >>> http://docs.python.org/3.4/glossary.html >> >> >> Where does the glossary disagree with collections.abc? > > I show below a class that is iterable, yet is not an instance of > collections.Iterable. By the glossary definition it is iterable (it has > a __getitem__ method that raises IndexError when there are no more items > to be returned). > > > [...] >> What you're calling "indirectly iterable" is what the docs call "Iterable" >> and what collections.abc call Iterable, right? > > I've explained this further in my reply to Paul Moore. What I should > have said was *manually* iterable, in the sense of directly calling > __next__ or __getitem__ on the view. Being able to call __next__ on something is not a property of an iterable. It's only a property of an iterator. 
(In fact, I think python could have defined an iterator as "an iterable with __next__" just as profitably as "an iterable whose __iter__() returns itself", and ended up with the exact same categories as today. But that's not important.) So, I'm not sure what your "manually iterable" is supposed to represent. Iterators and sequences but not other iterables? What does this distinction buy you? It seems as useful as inventing a word for all wedge-headed cats plus tabby apple-headed cats but no other apple-headed cats: a perfectly definable category, but one of no value. And that makes me think you're still confusing iterables and iterators. Except that you've pointed out a valid distinction--making something indexable (by an initial sequence of natural numbers? or does a 1-based array or an otherwise-not-iterable mapping-like object count as an empty iterator?) makes it work with the iterable protocol, but not the Iterable ABC, so clearly you know what you're talking about. And that makes me think that I (and the people who have been responding to you before me) have missed something important in this "manually iterable" or "directly iterable" idea. So, maybe you should try explaining it a different way? From stephen at xemacs.org Fri Sep 20 18:53:02 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 21 Sep 2013 01:53:02 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com> <874n9g3wgc.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <8738oz4eq9.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > Many different solutions would fix the problems I've seen.
My > suggestion is that Reiterable should be defined as an iterable for > which calling the __iter__ method yields the same elements in the > same order irrespective of whether __iter__ is called while a > previously returned iterator is still iterating. > Correct me if I'm wrong, but views on dicts are reiterable. For the same reason that sequences are: a view is not an iterator, so every time you iterate it, it gets passed to iter, and you get a new iterator, which then iterates. This is *why* Nick says that "isinstance(x, Iterable) and not isinstance(x, Iterator)" is the test you want. I can't speak for Nick on Steven A's example of an object with a __getitem__ taking a numeric argument that isn't an Iterable but is iterable, but I think that falls under "consenting adults" aka "if you're afraid it will hurt, don't". > Your second point that the method should be able to cheaply clone > an iterator is precisely what I'd like to achieve with a > "Reiterator" class like Antoine's. Well, I've kinda convinced myself that it isn't going to be easy to do that, without changing the type. The problem is that __next__ is (abstractly) a closure, and there's no way I know of to copy a function (copy.copy just returns the function object unchanged). So you'd need to expose the hidden state in the closure, and that is a change of type. > It sounds like you're saying put the items into a list no matter > what.
From random832 at fastmail.us Fri Sep 20 18:55:29 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 20 Sep 2013 12:55:29 -0400 Subject: [Python-ideas] Reduce platform dependence of date and time related functions In-Reply-To: <6DFE1EDA-7BEC-4CEA-8BCB-585EEA49FB2A@yahoo.com> References: <1379357351.28759.22678441.285099D6@webmail.messagingengine.com> <1379433703.14501.23084249.72F8456E@webmail.messagingengine.com> <1379435260.28764.23112997.115F6763@webmail.messagingengine.com> <67B7C352-5251-46DE-BED3-7D1FF38A8A78@yahoo.com> <1379688371.25215.24411669.1642F9A6@webmail.messagingengine.com> <6DFE1EDA-7BEC-4CEA-8BCB-585EEA49FB2A@yahoo.com> Message-ID: <1379696129.22168.24469709.2BA72C5C@webmail.messagingengine.com> On Fri, Sep 20, 2013, at 11:59, Andrew Barnert wrote: > OK, yes, you could do that, but are you actually proposing that the > stdlib should do so? If not, it's a misleading API. If so, it's a much > larger proposal than what we initially started with. And I think > providing C-locale str[fp]time with very wide, platform-independent > limits is a useful idea even without this much more radical idea. We've basically got five "kinds" of locale we are talking about: "C" locale - this is the easiest one to implement in a platform-independent way, but probably the least useful (if you're not intending locale-specific display, you should probably be using numeric values) Current platform locale, including all the subtleties like user preferences you mentioned, when available. This is what we support now. Specified platform locale (e.g. pt_BR, and we may still want to translate from a single format rather than needing to specify 0x0416 or "PTB" on Windows) Platform-independent version of a specified locale, using e.g. CLDR. This is the second-easiest to implement in a platform-independent way. Platform-independent version of user's current locale. 
There are limits to what can be achieved with this, for example Windows (and maybe Mac OS - I know the pre-OSX versions did) lets you set certain things individually. For example, I have my short date format set to yyyy-MM-dd, but otherwise I'm in the en-US locale. Anyway, this should be separate from the discussion of removing the limitations of the platform code. Locale-specific data can be acquired by calling the platform's strftime for a platform-independent strftime just as it's done for strptime now - and we'd need it as a fallback anyway. You can reduce the impact of platform's range limitations and incompatible repertoire of format specifiers by doing them individually, with a "safe" value for the year if needed, rather than throwing the whole format string to the platform function. For local time on windows, incidentally, we could extend the usable range by calling SystemTimeToTzSpecificLocalTime, but that loses the ability to use MSVCRT's version of the POSIX TZ variable. From stephen at xemacs.org Fri Sep 20 19:34:58 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 21 Sep 2013 02:34:58 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920124505.GT19939@ando> Message-ID: <871u4j4csd.fsf@uwakimon.sk.tsukuba.ac.jp> Andrew Barnert writes: > And that makes me think that I (and the people who have been > responding to you before me) have missed something important in > this "manually iterable" or "directly iterable" idea. So, maybe you > should try explaining it a different way? How about "'Iterable' is a terrible name for an ABC that excludes an important class of iterables"? 
You see, the library manual lies (section 4.5 "Iterator types"):

    One method needs to be defined for container objects to provide
    iteration support:

    container.__iter__()

But in fact this is contradicted (section 2 "Built-in functions"):

    iter(object[, sentinel])

    Return an iterator object. The first argument is interpreted very
    differently depending on the presence of the second argument.
    Without a second argument, object must be a collection object
    which supports the iteration protocol (the __iter__() method), or
    it must support the sequence protocol (the __getitem__() method
    with integer arguments starting at 0).

I suggest that section 4.5 be corrected to

    Container objects provide iteration support when either of the
    methods

    container.__iter__()     # iteration protocol
    container.__getitem__()  # sequence protocol

    is defined. In the latter case, __getitem__() must accept integer
    arguments starting at 0.

Curiously, all of the built-in sequences support both protocols. I
suppose this section ought to say which is preferred.

The net result is that I guess Nick's test needs to be refined to

    def isIterable(o):
        try:
            iter(o)
            return True
        except TypeError:
            return False

    def isReiterable(o):
        return isIterable(o) and not isinstance(o, collections.abc.Iterator)

From tjreedy at udel.edu  Fri Sep 20 22:01:28 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 20 Sep 2013 16:01:28 -0400
Subject: [Python-ideas] Numerical instability was: Re: Introduce
	collections.Reiterable
In-Reply-To: 
References: 
Message-ID: 

On 9/20/2013 6:45 AM, Oscar Benjamin wrote:

> I'm going to assume that numerical instability is just a way of saying
> that a method is inaccurate in some cases.

Good enough for me.

> Although the incremental algorithm is much better than the naive
> approach Steven (knowingly) showed above I don't think it's true that
> constraining yourself to a single pass doesn't limit the possible
> accuracy.

That may be one difference between integer and float arithmetic.
The order of operations makes a difference.

> Another point of relevance here is that the incremental
> formula cannot be as efficiently implemented in Python since you don't
> get to take advantage of the super fast math.fsum function which is
> also more accurate than a naive Kahan algorithm.

Yes. One of the differences between 'theoretical' algorithms and
practical algorithms coded in CPython is the bias toward using
functions already coded in C.

> The script at the bottom of this post tests a few methods on a
> deliberately awkward set of random numbers and typical output is:

Thanks for doing this.

> $ ./stats.py
> exact: 0.989661716301
> naive -> error = -21476.0408922
> incremental -> error = -1.0770901604e-07
> two_pass -> error = 1.29118937764e-13
> three_pass -> error = 0.0

The incremental method is good enough for data measured to 3
significant figures, as is typical in at least parts of some sciences,
and the data I worked with for a decade. But it is not good enough for
substantially more accurate data. The Python statistics module should
cater to the latter. The doc should just say that it requires a
serially re-iterable input. (A person with data too large to fit in
memory could write an iterable that opens a file and returns an
iterator that reads blocks of values and yields them one at a time.)

The incremental method is useful for returning running means and
deviations for data collected sporadically and indefinitely, without
needing to store the cumulative data. It is a nice, non-obvious
example of the principle that it is sometimes possible to summarize
cumulative data with a relatively small and fixed set of sufficient
statistics.

> For these numbers the three_pass method usually has an error of 0 but
> otherwise 1ulp (1e-16). (It can actually be collapsed into a two pass
> method but then we couldn't use fsum.)
>
> If you know of a one-pass algorithm (or a way to improve the
> implementation I showed) that is as accurate as either the two_pass or
> three_pass methods I'd be very interested to see it (I'm sure Steven
> would be as well).

If I were trying to improve the incremental variance algorithm, I
would study the fsum method until I really understood it and then see
if I could apply the same ideas.

>
> Oscar
>
> $ cat stats.py
> #!/usr/bin/env python
>
> from __future__ import print_function
>
> from random import gauss
> from math import fsum
> from fractions import Fraction
>
> # Return the exact result as a Fraction. Nothing wrong
> # with using the computational formula for variance here.
> def variance_exact(data):
>     data = [Fraction(x) for x in data]
>     n = len(data)
>     sumx = sum(data)
>     sumx2 = sum(x**2 for x in data)
>     ss = sumx2 - (sumx**2)/n
>     return ss/(n-1)
>
> # Although this is the most efficient formula when using
> # exact computation it fails under fixed precision
> # floating point since it ends up subtracting two large
> # almost equal numbers leading to a catastrophic loss of
> # precision.
> def variance_naive(data):
>     n = len(data)
>     sumx = fsum(data)
>     sumx2 = fsum(x**2 for x in data)
>     ss = sumx2 - (sumx**2)/n
>     return ss/(n-1)
>
> # Incremental variance calculation from Wikipedia. If
> # the above uses fsum then a fair comparison should
> # use some compensated summation here also. However
> # it's not clear (to me) how to incorporate compensated
> # summation here.
> # http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Compensated_variant
> def variance_incremental(data):
>     n = 0
>     mean = 0
>     M2 = 0
>
>     for x in data:
>         n = n + 1
>         delta = x - mean
>         mean = mean + delta/n
>         M2 = M2 + delta*(x - mean)
>
>     variance = M2/(n - 1)
>     return variance
>
> # This is to me the obvious formula since I think of
> # this as the definition of the variance.
> def variance_twopass(data):
>     n = len(data)
>     mean = fsum(data) / n
>     sumdev2 = fsum((x - mean)**2 for x in data)
>     variance = sumdev2 / (n - 1)
>     return variance
>
>
> # This is the three-pass algorithm used in Steven's
> # statistics module. It's not one I had seen before but
> # AFAICT it's very accurate. In fact the 2nd and 3rd passes
> # can be merged as in variance_incremental but then we
> # wouldn't be able to take advantage of fsum.
> def variance_threepass(data):
>     n = len(data)
>     mean = fsum(data) / n
>     sumdev2 = fsum((x-mean)**2 for x in data)
>     # The following sum should mathematically equal zero, but due to rounding
>     # error may not.
>     sumdev2 -= fsum((x-mean) for x in data)**2 / n
>     return sumdev2 / (n - 1)
>
> methods = [
>     ('naive', variance_naive),
>     ('incremental', variance_incremental),
>     ('two_pass', variance_twopass),
>     ('three_pass', variance_threepass),
> ]
>
> # Test numbers with large mean and small standard deviation.
> # This is the case that causes trouble for the naive formula.
> N = 100000
> testnums = [gauss(mu=10**10, sigma=1) for n in range(N)]
>
> # First compute the exact result
> exact = variance_incremental([Fraction(num) for num in testnums])
> print('exact:', float(exact))
>
> # Compare each with the exact result
> for name, var in methods:
>     print(name, '-> error =', var(testnums) - exact)

-- 
Terry Jan Reedy

From tim.peters at gmail.com  Fri Sep 20 22:25:07 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 20 Sep 2013 15:25:07 -0500
Subject: [Python-ideas] Numerical instability was: Re: Introduce
	collections.Reiterable
In-Reply-To: 
References: 
Message-ID: 

[Terry Reedy]
> ...
> If I were trying to improve the incremental variance algorithm, I would
> study the fsum method until I really understood it and then see if I could
> apply the same ideas.
There are a number of ways to do floating "as if with infinite
precision" addition, implemented in pure Python, here:

http://code.activestate.com/recipes/393090-binary-floating-point-summation-accurate-to-full-p/

Not saying they're applicable here, just saying that if anyone wants
to fully understand this, it's a lot easier to read Python code ;-)

`msum` there is closest to Python's math.fsum().

From tjreedy at udel.edu  Fri Sep 20 23:15:05 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 20 Sep 2013 17:15:05 -0400
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: <20130920094854.GO19939@ando>
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<20130920094854.GO19939@ando>
Message-ID: 

On 9/20/2013 5:48 AM, Steven D'Aprano wrote:

> py> class Seq:
> ...     def __getitem__(self, index):
> ...         if 0 <= index < 5: return index+1000
> ...         raise IndexError
> ...
> py> s = Seq()
> py> isinstance(s, Iterable)
> False
> py> list(s)  # definitely iterable
> [1000, 1001, 1002, 1003, 1004]

I tested and iter() recognizes Seqs as iterables:

for i in iter(Seq()): print(i)

It does, however, wrap them in an adaptor iterator class

>>> type(iter(Seq()))
<class 'iterator'>

(which I was not really aware of before ;-) with proper __iter__ and
__next__ methods

>>> si = iter(Seq())
>>> si is iter(si)
True
>>> next(si)
1000

So I agree that collections.Iterable is limited relative to the
glossary and Python definition. The glossary might say that the older
__getitem__ protocol is semi-deprecated (it is no longer used
directly) but is adapted for back compatibility.

The problem with the protocol is that an iteration __getitem__ may be
a fake __getitem__ in that it ignores *index* (because it calculates
the next item from stored data). A fake-getitem iterable, if it also
had __len__, would look like a Sequence even though it really is not,
because it cannot be properly indexed. Such iterables are likely to
not be reiterable.
-- 
Terry Jan Reedy

From raymond.hettinger at gmail.com  Fri Sep 20 23:48:45 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 20 Sep 2013 14:48:45 -0700
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: 
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<20130920094854.GO19939@ando>
Message-ID: <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>

On Sep 20, 2013, at 2:15 PM, Terry Reedy wrote:

> . The glossary might say that the older __getitem__ protocol is
> semi-deprecated (it is no longer used directly) but is adapted for
> back compatibility.

It is NOT deprecated. People use and rely on this behavior. It is a
guaranteed behavior. Please don't use the glossary as a place to
introduce changes to the language.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Fri Sep 20 23:53:43 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 21 Sep 2013 00:53:43 +0300
Subject: [Python-ideas] Numerical instability was: Re: Introduce
	collections.Reiterable
In-Reply-To: 
References: 
Message-ID: 

20.09.13 13:45, Oscar Benjamin wrote:
> If you know of a one-pass algorithm (or a way to improve the
> implementation I showed) that is as accurate as either the two_pass or
> three_pass methods I'd be very interested to see it (I'm sure Steven
> would be as well).
import sys

fmin = sys.float_info.min
finvmin = int(1 / sys.float_info.min)

def i2f1(i):
    return i // finvmin + (i % finvmin) * fmin

def i2f2(i):
    return i2f1(i // finvmin) + (i % finvmin) * fmin * fmin

def variance_incremental_exact(data):
    n = 0
    sumx = 0
    sumx2 = 0
    for x in data:
        i = int(x)
        x = i * finvmin + int(round((x - i) * finvmin))
        n += 1
        sumx += x
        sumx2 += x * x
    ss = sumx2 * n - sumx * sumx
    d = n * (n - 1)
    return i2f2(ss // d) + i2f2(ss % d) / float(d)

From mistersheik at gmail.com  Fri Sep 20 23:56:01 2013
From: mistersheik at gmail.com (Neil Girdhar)
Date: Fri, 20 Sep 2013 17:56:01 -0400
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<20130920094854.GO19939@ando>
	<84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>
Message-ID: 

On Fri, Sep 20, 2013 at 5:48 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

> On Sep 20, 2013, at 2:15 PM, Terry Reedy wrote:
>
> > . The glossary might say that the older __getitem__ protocol is
> > semi-deprecated (it is no longer used directly) but is adapted for back
> > compatibility.
>
> It is NOT deprecated. People use and rely on this behavior. It is a
> guaranteed behavior. Please don't use the glossary as a place to introduce
> changes to the language.
>

Just curious, but who uses __getitem__ to implement an iterable that's
not a sequence?

> Raymond
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
>
> --
>
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "python-ideas" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> python-ideas+unsubscribe at googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From timothy.c.delaney at gmail.com  Sat Sep 21 00:00:30 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sat, 21 Sep 2013 08:00:30 +1000
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: <20130920111000.GQ19939@ando>
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<20130920094854.GO19939@ando>
	<20130920111000.GQ19939@ando>
Message-ID: 

On 20 September 2013 21:10, Steven D'Aprano wrote:

>
> py> class Seq:
> ...     def __getitem__(self, index):
> ...         if 0 <= index < 5: return index+1000
> ...         raise IndexError
> ...
> py> s = Seq()
> py> isinstance(s, Iterable)  # The ABC claims Seq is not iterable.
> False
> py> for x in s:  # But it actually is.
> ...     print(x)
> ...
> 1000
> 1001
> 1002
> 1003
> 1004
>
>
> Can anyone convince me this is not a bug in the Iterable ABC?
>

I think there is a distinction here between collections.Iterable (as a
defined ABC) and something that is "iterable" (lowercase "i"). As
you've noted, an "iterable" is "An object capable of returning its
members one at a time".

So I think a valid definition of reiterable (barring pathological
cases) is:

obj is not iter(obj)

(assuming of course that obj is iterable).

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600
64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class Seq:
...     def __getitem__(self, index):
...         if 0 <= index < 5:
...             return index+1000
...         raise IndexError
...
>>> s = Seq()
>>> s is iter(s)
False
>>> i = iter(s)
>>> i is iter(i)
True
>>> t = ()
>>> t is iter(t)
False
>>> i = iter(t)
>>> i is iter(i)
True
>>>

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mistersheik at gmail.com  Sat Sep 21 00:02:34 2013
From: mistersheik at gmail.com (Neil Girdhar)
Date: Fri, 20 Sep 2013 18:02:34 -0400
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: <8738oz4eq9.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<023C2EA0-9716-4FE1-888D-C556B8917FA1@yahoo.com>
	<874n9g3wgc.fsf@uwakimon.sk.tsukuba.ac.jp>
	<8738oz4eq9.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Fri, Sep 20, 2013 at 12:53 PM, Stephen J. Turnbull wrote:

> Neil Girdhar writes:
>
> > Many different solutions would fix the problems I've seen. My
> > suggestion is that Reiterable should be defined as an iterable for
> > which calling the __iter__ method yields the same elements in the
> > same order irrespective of whether __iter__ is called while a
> > previously returned iterator is still iterating.
>
> > Correct me if I'm wrong, but views on dicts are reiterable.
>
> For the same reason that sequences are: a view is not an iterator, so
> every time you iterate it, it gets passed to iter, and you get a new
> iterator, which then iterates.
>
> This is *why* Nick says that "isinstance(x, Iterable) and not
> isinstance(x, Iterator)" is the test you want. I can't speak for Nick
> on Steven A's example of an object with a __getitem__ taking a numeric
> argument that isn't an Iterable but is iterable, but I think that
> falls under "consenting adults" aka "if you're afraid it will hurt,
> don't".
>

I want that test if the documentation will promise that that test is
supposed to be right.
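The test under discussion can be exercised directly; a quick sketch
(the helper name here is invented for illustration, and in the Python
3.3 of this thread the ABCs were also reachable as
collections.Iterable and collections.Iterator):

```python
from collections.abc import Iterable, Iterator

def looks_reiterable(x):
    # Nick's test: an iterable that is not itself an iterator.
    return isinstance(x, Iterable) and not isinstance(x, Iterator)

print(looks_reiterable([1, 2, 3]))            # True  -- a list is reiterable
print(looks_reiterable(iter([1, 2, 3])))      # False -- its iterator is one-shot
print(looks_reiterable({'a': 1}.items()))     # True  -- dict views are reiterable
print(looks_reiterable(x for x in range(3)))  # False -- a generator is an iterator
```

As discussed above, a sequence-protocol-only object like Seq still
fails this test even though it is in fact reiterable, since the
Iterable ABC only checks for __iter__.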
> > Your second point that the method should be able to cheaply clone > > an iterator cheaply is precisely what I'd like to achieve with a > > "Reiterator" class like Antoine's. > > Well, I've kinda convinced myself that it isn't going to be easy to do > that, without changing the type. The problem is that __next__ is > (abstractly) a closure, and there's no way I know of to copy a > function (copy.copy just returns the function object unchanged). So > you'd need to expose the hidden state in the closure, and that is a > change of type. > > > It sounds like you're saying put the items into a list no matter > > what. > > No, I'm saying if you don't know if you may consume the iterable, you > should convert to iterator, clone the iterator, and iterate the > clone. But that probably requires a change of type, at which point > you may as well call it "Reiterable". > > > okay, so we're on the same page it sounds like. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Sep 21 01:48:48 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 20 Sep 2013 19:48:48 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: On 9/20/2013 6:00 PM, Tim Delaney wrote: > I think there is a distinction here between collections.Iterable (as a > defined ABC) and something that is "iterable" (lowercase "i"). As you've > noted, an "iterable" is "An object capable of returning its members one > at a time". > > So I think a valid definition of reiterable (barring pathological cases) is: > > obj is not iter(obj) If obj has a fake __getitem__, that will not work. 
class Cnt:
    def __init__(self, maxn):
        self.n = 0
        self.maxn = maxn
    def __getitem__(self, dummy):
        n = self.n + 1
        if n <= self.maxn:
            self.n = n
            return n
        else:
            raise IndexError

c3 = Cnt(3)
print(c3 is not iter(c3), list(c3), list(c3))
>>>
True [1, 2, 3] []

Dismissing legal code as 'pathological', as more than one person has,
does not cut it as a design principle.

-- 
Terry Jan Reedy

From tjreedy at udel.edu  Sat Sep 21 01:59:25 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 20 Sep 2013 19:59:25 -0400
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<20130920094854.GO19939@ando>
	<84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>
Message-ID: 

On 9/20/2013 5:48 PM, Raymond Hettinger wrote:
>
> On Sep 20, 2013, at 2:15 PM, Terry Reedy wrote:
>
>> . The glossary might say that the older __getitem__ protocol is
>> semi-deprecated (it is no longer used directly) but is adapted for
>> back compatibility.
>
> It is NOT deprecated.

And I did not suggest that it was. It is, however, not fully supported
in that collections.Iterable does not recognize __getitem__ iterables
and the same will be true of code that uses Iterable.

> People use and rely on this behavior.

Are people still writing fake __getitem__ methods? (that are really
next methods rather than random access methods). I believe that usage
of the protocol is informally deprecated in favor of __iter__ and
__next__.
-- Terry Jan Reedy From timothy.c.delaney at gmail.com Sat Sep 21 02:05:35 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sat, 21 Sep 2013 10:05:35 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: On 21 September 2013 09:48, Terry Reedy wrote: > On 9/20/2013 6:00 PM, Tim Delaney wrote: > > I think there is a distinction here between collections.Iterable (as a >> defined ABC) and something that is "iterable" (lowercase "i"). As you've >> noted, an "iterable" is "An object capable of returning its members one >> at a time". >> >> So I think a valid definition of reiterable (barring pathological cases) >> is: >> >> obj is not iter(obj) >> > > If obj has a fake __getitem__, that will not work. > > class Cnt: > def __init__(self, maxn): > self.n = 0 > self.maxn = maxn > def __getitem__(self, dummy): > n = self.n + 1 > if n <= self.maxn: > self.n = n > return n > else: > raise IndexError > > c3 = Cnt(3) > print(c3 is not iter(c3), list(c3), list(c3)) > >>> > True [1, 2, 3] [] > > Dismissing legal code as 'pathological', as more than one person has, does > not cut it as a design principle. To me, that is a reiterable. It might not give the same results each time through, but you can iterate, it stops, then you can iterate over it again - it won't raise an exception trying to do so. So not what I would consider a pathological case - though definitely an unusual case and one that obviously wouldn't work in many situations that require reiterables to return the same values in the same order each time through. 
So we've got two classes of reiterables here - anything that can be iterated through, and then iterated through again, for which obj is not iter(obj) will work in all but what I consider to be pathological cases; - iterables that can be iterated through multiple times, returning the same objects in the same order each time through, for which I don't think a test is possible. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Sat Sep 21 02:18:26 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sat, 21 Sep 2013 10:18:26 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: On 21 September 2013 10:05, Tim Delaney wrote: > >> Dismissing legal code as 'pathological', as more than one person has, >> does not cut it as a design principle. > > > To me, that is a reiterable. It might not give the same results each time > through, but you can iterate, it stops, then you can iterate over it again > - it won't raise an exception trying to do so. So not what I would consider > a pathological case - though definitely an unusual case and one that > obviously wouldn't work in many situations that require reiterables to > return the same values in the same order each time through. > > So we've got two classes of reiterables here > > - anything that can be iterated through, and then iterated through again, > for which obj is not iter(obj) will work in all but what I consider to be > pathological cases; > > - iterables that can be iterated through multiple times, returning the > same objects in the same order each time through, for which I don't think a > test is possible. > Also, pathological is probably not the best term to use. Instead, substitute "deliberately breaks a well-established protocol". 
It may make sense to do so in certain circumstances, but you can't
expect anyone else to play nice with you if you don't play nice with
them.

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From raymond.hettinger at gmail.com  Sat Sep 21 02:34:22 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 20 Sep 2013 17:34:22 -0700
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: 
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
	<20130919121828.GK19939@ando>
	<20130920094854.GO19939@ando>
	<84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>
Message-ID: <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com>

On Sep 20, 2013, at 4:59 PM, Terry Reedy wrote:

>>> . The glossary might say that the older __getitem__ protocol is
>>> semi-deprecated (it is no longer used directly) but is adapted for
>>> back compatibility.
>>
>> It is NOT deprecated.
>
> And I did not suggest that it was. It is, however, not fully supported
> in that collections.Iterable does not recognize __getitem__ iterables
> and the same will be true of code that uses Iterable.

The collections ABCs are all just a subset of things real collections
do. For example, there is no slicing support. This was intentional.

To some degree, the only test of whether something is iterable is to
call iter() on it and see what happens. With Python's __getattr__
method, the only way to test for many behaviors is to attempt to
invoke them to see what happens. That is why hasattr() has to invoke
getattr().

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mistersheik at gmail.com Sat Sep 21 02:40:23 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Fri, 20 Sep 2013 20:40:23 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> Message-ID: On Fri, Sep 20, 2013 at 8:34 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > On Sep 20, 2013, at 4:59 PM, Terry Reedy wrote: > > . The glossary might say that the older __getitem__ protocol is > semi-deprecated (it is no longer used directly) but is adapted for > back compatibility. > > > It is NOT deprecated. > > > And I did not suggest that is was. It is, however, not fully supported in > that collections. Iterable does not recognize __getitem__ iterables and the > same will be true of code that uses Iterable. > > > The collections ABCs are all just a subset of things real collections do. > For example, there is no slicing support. This was intentional. > > To some degree, the only test of whether something is iterable is to call > iter() on it and see what happens. With Python's __getattr__ method, the > only way to test for many behaviors is to attempt to call invoke them to > see what happens. That is why hasattr() has to invoke getattr(). > > Is that how you see PEP 3119? It states that the "standardized test" if something is iterable is precisely to use isinstance(x, collections.Iterable), which is how I read these paragraphs: On the other hand, one of the criticisms of inspection by classic OOP theorists is the lack of formalisms and the ad hoc nature of what is being inspected. 
In a language such as Python, in which almost any aspect of an object can be reflected and directly accessed by external code, there are many different ways to test whether an object conforms to a particular protocol or not. For example, if asking 'is this object a mutable sequence container?', one can look for a base class of 'list', or one can look for a method named '__getitem__'. But note that although these tests may seem obvious, neither of them are correct, as one generates false negatives, and the other false positives. The generally agreed-upon remedy is to standardize the tests, and group them into a formal arrangement. This is most easily done by associating with each class a set of standard testable properties, either via the inheritance mechanism or some other means. Each test carries with it a set of promises: it contains a promise about the general behavior of the class, and a promise as to what other class methods will be available. This PEP proposes a particular strategy for organizing these tests known as Abstract Base Classes, or ABC. ABCs are simply Python classes that are added into an object's inheritance tree to signal certain features of that object to an external inspector. Tests are done using isinstance(), and the presence of a particular ABC means that the test has passed. > > Raymond > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. 
> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Sep 21 02:52:21 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 21 Sep 2013 10:52:21 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: <20130921005221.GV19939@ando> On Fri, Sep 20, 2013 at 07:48:48PM -0400, Terry Reedy wrote: > On 9/20/2013 6:00 PM, Tim Delaney wrote: > > >I think there is a distinction here between collections.Iterable (as a > >defined ABC) and something that is "iterable" (lowercase "i"). As you've > >noted, an "iterable" is "An object capable of returning its members one > >at a time". > > > >So I think a valid definition of reiterable (barring pathological cases) > >is: > > > > obj is not iter(obj) > > If obj has a fake __getitem__, that will not work. I don't understand what is "fake" about the following example. It is a *calculated* __getitem__, but that is perfectly legitimate. I suspect it is a buggy calculation, since obj[0] == obj[0] returns False, but that's another story. To me, a "fake __getitem__" would be something like "__getitem__ = None", there only to fool hasattr() tests but not actually doing anything. So I'm not actually sure what you are getting at to call this "fake".

> class Cnt:
>     def __init__(self, maxn):
>         self.n = 0
>         self.maxn = maxn
>     def __getitem__(self, dummy):
>         n = self.n + 1
>         if n <= self.maxn:
>             self.n = n
>             return n
>         else:
>             raise IndexError
>
> c3 = Cnt(3)
> print(c3 is not iter(c3), list(c3), list(c3))
> >>>
> True [1, 2, 3] []
>
> Dismissing legal code as 'pathological', as more than one person has, > does not cut it as a design principle.

When I call something "pathological", I don't necessarily mean it is bad code.
I mean it in the mathematical sense of being *either* bad/harmful or unexpected/unintuitive: https://en.wikipedia.org/wiki/Pathological_%28mathematics%29 Perhaps I should use the term "exceptional" rather than "pathological", but that carries its own baggage. For instance, infinite iterators are (in my usage) pathological. You can't pass them to list(), but they are very useful in practice and shouldn't be dismissed as necessarily harmful. The point is, I don't expect general-purpose Python functions to *necessarily* deal with every pathological/exceptional case. It is no fault of list() that it cannot convert an infinite iterator to a list, nor should list() include special code to detect and avoid infinite iterators, even if it could, which it cannot. Cycles in lists are another example of pathology, but in this case, list repr *should* (and does) deal with it correctly:

py> L = []
py> L.append(L)
py> print(L)
[[...]]

-- Steven From stephen at xemacs.org Sat Sep 21 04:12:12 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 21 Sep 2013 11:12:12 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > Dismissing legal code as 'pathological', as more than one person has, > does not cut it as a design principle. But you don't even need to write a class with __getitem__() to get that behavior.

>>> l = [11, 12, 13]
>>> for i in l:
...     print(i)
...     if i%2 == 0:
...         l.remove(i)
...
11
12
>>> l
[11, 13]
>>>

Of course the iteration itself is probably buggy (i.e., the writer didn't mean to skip printing '13'), but in general iterables can change themselves. Neil himself seems to be of two minds about such cases. On the one hand, he said the above behavior is built in to list, so it's acceptable to him.
(I think that's inconsistent: I would say the property of being completely consumed is built in to iterator, so it should be acceptable, too.) On the other hand, he's defined a reiterable as a collection that when iterated produces the same objects in the same order. Maybe what we really want is for copy.deepcopy to do the right thing with iterables. Then code that doesn't want to consume consumable iterables can do a deepcopy (including replication of the closed-over state of __next__() for iterators) before iterating. Or perhaps the right thing is a copy.itercopy that creates a new composite object as a shallow copy of everything except that it clones the state of __next__() in case the object was an iterator to start with. From stephen at xemacs.org Sat Sep 21 04:41:10 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 21 Sep 2013 11:41:10 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> Message-ID: <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> Tim Delaney writes: > Also, pathological is probably not the best term to use. Instead, > substitute "deliberately breaks a well-established protocol". Note that in Neil's use case (the OP) it's not deliberate. His function receives an iterable, it naively iterates it and (if an iterator) consumes it, and then some other function loses. Silently. Also, as long as __getitem__(0) succeeds, this *is* the "sequence protocol". (A Sequence also has a __len__() method, but iterability doesn't depend on that.) I don't see why Python would deprecate this. For example, consider the sequence of factors of integers: [(1,2), (1,3), (1,2,2,4), (1,5), (1,2,3,6), ...]. Factorization being in general a fairly expensive operation, you might want to define this in terms of __getitem__() but __len__() is infinite. 
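A hedged sketch of such a __getitem__-only class (using plain divisors rather than the factor lists with multiplicity above; `Factors` is an invented name):

```python
import itertools

class Factors:
    # An "infinite sequence" defined via __getitem__ alone: item n is
    # the tuple of divisors of n + 2. There is no __iter__ and no
    # __len__; iteration works only through the legacy sequence protocol.
    def __getitem__(self, n):
        m = n + 2
        return tuple(d for d in range(1, m + 1) if m % d == 0)

f = Factors()
print(f[0], f[1], f[2])  # (1, 2) (1, 3) (1, 2, 4)
# iter() still accepts it, so lazy slicing works:
print(list(itertools.islice(f, 4)))
```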
I admit this is a somewhat artificial example (I don't know of non-academic applications for this sequence, although factorization itself is very useful in applications like crypto). From ethan at stoneleaf.us Sat Sep 21 05:01:08 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 20 Sep 2013 20:01:08 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> Message-ID: <523D0BF4.5020404@stoneleaf.us> On 09/20/2013 05:40 PM, Neil Girdhar wrote: > > [...] Each test carries with it a set of promises: it contains a promise about the general behavior of > the class, and a promise as to what other class methods will be available. > > [...] Tests are done using isinstance(), and the presence of a particular ABC means that the test has passed. So if the test passes, you know you're good. Those paragraphs said nothing about the meaning of a failing test. -- ~Ethan~ From mistersheik at gmail.com Sat Sep 21 06:00:14 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 00:00:14 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <523D0BF4.5020404@stoneleaf.us> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> Message-ID: If someone allows their class to fail the standardized test for Iterable/Reiterable/Sequence, that class doesn't deserve to be treated as one. (Anyone can register their class as a subclass of the ABCs, or more simply inherit from one.) On Fri, Sep 20, 2013 at 11:01 PM, Ethan Furman wrote: > On 09/20/2013 05:40 PM, Neil Girdhar wrote: > >> >> [...] 
Each test carries with it a set of promises: it contains a promise >> about the general behavior of >> >> the class, and a promise as to what other class methods will be available. >> >> [...] Tests are done using isinstance(), and the presence of a >> particular ABC means that the test has passed. >> > > So if the test passes, you know you're good. Those paragraphs said > nothing about the meaning of a failing test. > > -- > ~Ethan~ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe > . > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@googlegroups.com > . > For more options, visit https://groups.google.com/groups/opt_out > . > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Sep 21 06:09:04 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 21 Sep 2013 14:09:04 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> Message-ID: <20130921040904.GW19939@ando> On Sat, Sep 21, 2013 at 12:00:14AM -0400, Neil Girdhar wrote: > If someone allows their class to fail the standardized test for > Iterable/Reiterable/Sequence, that class doesn't deserve to be treated as > one. (Anyone can register their class as a subclass of the ABCs, or more > simply inherit from one.) This is Python, and duck-typing rules, not Java-like type checking.
If you want a language with strict type checking designed by theorists, try Haskell. The ultimate test in Python of whether something is iterable or not is to try iterating over it, and see if it succeeds or not. If it iterates like a duck, that's good enough to be treated as a duck. -- Steven From mistersheik at gmail.com Sat Sep 21 06:16:41 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 00:16:41 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130921040904.GW19939@ando> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> Message-ID: You're right that you should go ahead and use something however you want to. However, there are plenty of times where you can't do that, e.g., you want to know if something is callable before calling it, and similarly if something is reiterable before iterating it and exhausting. That is the purpose of collections.abc, and that's what I thought we were discussing. Could you make mistakes trying to look ahead like this? Sure. An object could appear callable only to raise NotImplementedError on calling it. Looking ahead does not have to be foolproof. This is Python, and of course (almost) *any test* can be fooled. That doesn't just go for reiterability, it goes for callability as well. Best, Neil On Sat, Sep 21, 2013 at 12:09 AM, Steven D'Aprano wrote: > On Sat, Sep 21, 2013 at 12:00:14AM -0400, Neil Girdhar wrote: > > If someone allows their class to fail the standardized test for > > Iterable/Reiterable/Sequence, that class doesn't deserve to be treated as > > one. (Anyone can register their class as a subclass of the ABCs, or more > > simply inherit from one.) > > This is Python, and duck-typing rules, not Java-like type checking. 
If > you want a language with strict type checking designed by theorists, try > Haskell. > > The ultimate test in Python of whether something is iterable or not is > to try iterating over it, and see if it succeeds or not. If it iterates > like a duck, that's good enough to be treated as a duck. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Sep 21 06:20:39 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 00:20:39 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I can humbly suggest why Python would deprecate the sequence protocol: there "should be one obvious way" to answer iter(), and in my opinion that's the __iter__() method. I considered infinite iterators, and if you happen to have __getitem__ written, you can trivially write an __iter__ function as follows: def __iter__(self): return (self.__getitem__(x) for x in itertools.count()) Now your class will be Iterable in the abc sense, and no longer relies on the sequence protocol Best, Neil On Fri, Sep 20, 2013 at 10:41 PM, Stephen J. 
Turnbull wrote: > Tim Delaney writes: > > > Also, pathological is probably not the best term to use. Instead, > > substitute "deliberately breaks a well-established protocol". > > Note that in Neil's use case (the OP) it's not deliberate. His > function receives an iterable, it naively iterates it and (if an > iterator) consumes it, and then some other function loses. Silently. > > Also, as long as __getitem__(0) succeeds, this *is* the "sequence > protocol". (A Sequence also has a __len__() method, but iterability > doesn't depend on that.) > > I don't see why Python would deprecate this. For example, consider > the sequence of factors of integers: [(1,2), (1,3), (1,2,2,4), (1,5), > (1,2,3,6), ...]. Factorization being in general a fairly expensive > operation, you might want to define this in terms of __getitem__() but > __len__() is infinite. I admit this is a somewhat artificial example > (I don't know of non-academic applications for this sequence, although > factorization itself is very useful in applications like crypto). > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abarnert at yahoo.com Sat Sep 21 06:18:29 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 20 Sep 2013 21:18:29 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: <204C6E80-AF20-4D5C-B313-0B009E924859@yahoo.com> On Sep 20, 2013, at 14:15, Terry Reedy wrote: >> False >> py> list(s) # definitely iterable >> [1000, 1001, 1002, 1003, 1004] > > I tested and iter() recognizes Seqs as iterables: > > for i in iter(Seq()): print(i) > > > It does, however, wrap them in an adaptor iterator class What else did you expect? Sequences are iterables, but they aren't iterators. So calling iter on one can't return the sequence itself. >>>> type(iter(Seq())) > > (which I was not really aware of before ;-) with proper __iter__ and __next__ methods >>>> si is iter(si) > True >>>> next(si) > 1000 Having an __iter__ that returns itself and a __next__ is the definition of what an iterator is. And returning an iterator is the whole point of the iter function. So what else could it do in this case? Think about how you'd implement iter in pure python. You'd try to return its __iter__(), and on AttributeError, you'd return a generator. So the C implementation does the same thing, but, as usual, substitutes a custom C iterator for a generator. From mistersheik at gmail.com Sat Sep 21 06:23:54 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 00:23:54 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I appreciate the discussion illuminating various aspects of this I hadn't considered. 
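Andrew's description of how you'd implement iter() in pure Python can be sketched roughly like this (an approximation of, not a substitute for, CPython's actual logic, which works on type slots; `iter_sketch` is an invented name):

```python
def iter_sketch(obj):
    # Rough pure-Python sketch of one-argument iter(): prefer __iter__,
    # else fall back to the old __getitem__ sequence protocol via a
    # generator. Lookups go through the type, as the real iter() does.
    cls = type(obj)
    if hasattr(cls, "__iter__"):
        return cls.__iter__(obj)
    if hasattr(cls, "__getitem__"):
        def seq_gen():
            i = 0
            while True:
                try:
                    yield obj[i]
                except IndexError:
                    return  # end of the "sequence"
                i += 1
        return seq_gen()
    raise TypeError(f"{cls.__name__!r} object is not iterable")

print(list(iter_sketch([1, 2, 3])))  # [1, 2, 3]
```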
Finally, what I think I want is for * all sequences * all views * numpy arrays to answer yes to reiterable, and * all generators to answer no to reiterable. Best, Neil On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull wrote: > Terry Reedy writes: > > > Dismissing legal code as 'pathological', as more than one person has, > > does not cut it as a design principle. > > But you don't even need to write a class with __getitem__() to get > that behavior. > > >>> l = [11, 12, 13] > >>> for i in l: > ... print(i) > ... if i%2 == 0: > ... l.remove(i) > ... > 11 > 12 > >>> l > [11, 13] > >>> > > Of course the iteration itself is probably buggy (ie, the writer > didn't mean to skip printing '13'), but in general iterables can > change themselves. > > Neil himself seems to be of two minds about such cases. On the one > hand, he said the above behavior is built in to list, so it's > acceptable to him. (I think that's inconsistent: I would say the > property of being completely consumed is built in to iterator, so it > should be acceptable, too.) On the other hand, he's defined a > reiterable as a collection that when iterated produces the same > objects in the same order. > > Maybe what we really want is for copy.deepcopy to do the right thing > with iterables. Then code that doesn't want to consume consumable > iterables can do a deepcopy (including replication of the closed-over > state of __next__() for iterators) before iterating. > > Or perhaps the right thing is a copy.itercopy that creates a new > composite object as a shallow copy of everything except that it clones > the state of __next__() in case the object was an iterator to start > with. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. 
> To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Sep 21 06:43:26 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 20 Sep 2013 21:43:26 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> Message-ID: On Sep 20, 2013, at 21:16, Neil Girdhar wrote: > You're right that you should go ahead and use something however you want to. However, there are plenty of times where you can't do that, e.g., you want to know if something is callable before calling it, Why? What's the harm in just calling it and handling the exception? And surely, if you really need to LBYL here, you need to know that it's callable with the argument you plan to pass it. There are good uses for checking if something is callable, but this isn't a good example. And it's very different from your other example. > and similarly if something is reiterable before iterating it and exhausting. This one is different. You can't just handle failure, because (a) there's no unambiguous sign of failure, and (b) it's too late to deal with it if you've already exhausted the iterator. However, if you just turn the test around, it _is_ syntactically checkable: if "isinstance(it, Iterator)", or "iter(it) is it" or "hasattr(it, '__next__')" or "next(it)" doesn't raise... then you have to do
Either Reiterable is just Iterable and not Iterator (barring any flaws in the definition of Iterable, which is a separate problem), or it's not an abstract type. And if it's just Iterable and not Iterator, besides being complicated to implement (you can't inherit from the negation of a class), it's also more complicated to use. The obvious use case is: If you get an Iterator, you have to tee, make a list, use a one-pass algorithm instead of two-pass, whatever. Rewriting that instead as if you get an Iterable but it's not a Reiterable buys you nothing but verbosity. Turning it around so if you get a Reiterable you can skip the fallback just means a double negative that's harder to process. > That is the purpose of collections.abc, and that's what I thought we were discussing. Could you make mistakes trying to look ahead like this? Sure. An object could appear callable only to raise NotImplementedError on calling it. Looking ahead does not have to be foolproof. This is Python, and of course (almost) *any test* can be fooled. That doesn't just go for reiterability, it goes for callability as well. > > Best, > Neil > > > On Sat, Sep 21, 2013 at 12:09 AM, Steven D'Aprano wrote: >> On Sat, Sep 21, 2013 at 12:00:14AM -0400, Neil Girdhar wrote: >> > If someone allows their class to fail the standardized test for >> > Iterable/Reiterable/Sequence, that class doesn't deserve to be treated as >> > one. (Anyone can register their class as a subclass of the ABCs, or more >> > simply inherit from one.) >> >> This is Python, and duck-typing rules, not Java-like type checking. If >> you want a language with strict type checking designed by theorists, try >> Haskell. >> >> The ultimate test in Python of whether something is iterable or not is >> to try iterating over it, and see if it succeeds or not. If it iterates >> like a duck, that's good enough to be treated as a duck. 
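A minimal sketch of that turned-around test in practice (the function name is invented, and it assumes a list() fallback is acceptable):

```python
from collections.abc import Iterator

def stable_sum_and_max(it):
    # If we were handed a one-shot iterator, materialize it once;
    # anything else is assumed to be safely re-iterable.
    if isinstance(it, Iterator):
        it = list(it)
    return sum(it), max(it)  # two passes over `it`

print(stable_sum_and_max(iter([3, 1, 2])))  # (6, 3)
print(stable_sum_and_max([3, 1, 2]))        # (6, 3)
```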
>> >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> -- >> >> --- >> You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/groups/opt_out. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Sep 21 06:52:29 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 00:52:29 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: We discussed this upthread: I only want "not iterator" if not iterator promises reiterability. Right now, we have what may be a happy accident that can easily be violated by someone else. Best, Neil On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert wrote: > On Sep 20, 2013, at 21:23, Neil Girdhar wrote: > > I appreciate the discussion illuminating various aspects of this I hadn't > considered. Finally, what I think I want is for > * all sequences > * all views > * numpy arrays > to answer yes to reiterable, and > * all generators > to answer no to reiterable. 
> > > All sequences, views, and numpy arrays answer no to iterator (and so do > sets, mappings, etc.), and all generators answer yes (and so do the > iterators you get back from calling iter on a sequence, map, filter, your > favorite itertools function, etc.) > > So you just want "not iterator". Even Haskell doesn't attempt to provide > negative types like that. (And you can very easily show that it's iterator > that's the normal type: it's syntactically checkable in various ways--e.g., > it.hasattr('__next__'), but the only positive way to check reiterable is > not just semantic, but destructive.) > > Best, Neil > > On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull wrote: > >> Terry Reedy writes: >> >> > Dismissing legal code as 'pathological', as more than one person has, >> > does not cut it as a design principle. >> >> But you don't even need to write a class with __getitem__() to get >> that behavior. >> >> >>> l = [11, 12, 13] >> >>> for i in l: >> ... print(i) >> ... if i%2 == 0: >> ... l.remove(i) >> ... >> 11 >> 12 >> >>> l >> [11, 13] >> >>> >> >> Of course the iteration itself is probably buggy (ie, the writer >> didn't mean to skip printing '13'), but in general iterables can >> change themselves. >> >> Neil himself seems to be of two minds about such cases. On the one >> hand, he said the above behavior is built in to list, so it's >> acceptable to him. (I think that's inconsistent: I would say the >> property of being completely consumed is built in to iterator, so it >> should be acceptable, too.) On the other hand, he's defined a >> reiterable as a collection that when iterated produces the same >> objects in the same order. >> >> Maybe what we really want is for copy.deepcopy to do the right thing >> with iterables. Then code that doesn't want to consume consumable >> iterables can do a deepcopy (including replication of the closed-over >> state of __next__() for iterators) before iterating. 
>> >> Or perhaps the right thing is a copy.itercopy that creates a new >> composite object as a shallow copy of everything except that it clones >> the state of __next__() in case the object was an iterator to start >> with. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> -- >> >> --- >> You received this message because you are subscribed to a topic in the >> Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit >> https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to >> python-ideas+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/groups/opt_out. >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Sep 21 06:50:25 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 20 Sep 2013 21:50:25 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sep 20, 2013, at 21:23, Neil Girdhar wrote: > I appreciate the discussion illuminating various aspects of this I hadn't considered. Finally, what I think I want is for > * all sequences > * all views > * numpy arrays > to answer yes to reiterable, and > * all generators > to answer no to reiterable. 
All sequences, views, and numpy arrays answer no to iterator (and so do sets, mappings, etc.), and all generators answer yes (and so do the iterators you get back from calling iter on a sequence, map, filter, your favorite itertools function, etc.) So you just want "not iterator". Even Haskell doesn't attempt to provide negative types like that. (And you can very easily show that it's iterator that's the normal type: it's syntactically checkable in various ways--e.g., hasattr(it, '__next__'), but the only positive way to check reiterable is not just semantic, but destructive.) > Best, Neil > > On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull wrote: >> Terry Reedy writes: >> >> > Dismissing legal code as 'pathological', as more than one person has, >> > does not cut it as a design principle. >> >> But you don't even need to write a class with __getitem__() to get >> that behavior. >> >> >>> l = [11, 12, 13] >> >>> for i in l: >> ... print(i) >> ... if i%2 == 0: >> ... l.remove(i) >> ... >> 11 >> 12 >> >>> l >> [11, 13] >> >>> >> >> Of course the iteration itself is probably buggy (ie, the writer >> didn't mean to skip printing '13'), but in general iterables can >> change themselves. >> >> Neil himself seems to be of two minds about such cases. On the one >> hand, he said the above behavior is built in to list, so it's >> acceptable to him. (I think that's inconsistent: I would say the >> property of being completely consumed is built in to iterator, so it >> should be acceptable, too.) On the other hand, he's defined a >> reiterable as a collection that when iterated produces the same >> objects in the same order. >> >> Maybe what we really want is for copy.deepcopy to do the right thing >> with iterables. Then code that doesn't want to consume consumable >> iterables can do a deepcopy (including replication of the closed-over >> state of __next__() for iterators) before iterating.
>> >> Or perhaps the right thing is a copy.itercopy that creates a new >> composite object as a shallow copy of everything except that it clones >> the state of __next__() in case the object was an iterator to start >> with. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> -- >> >> --- >> You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/groups/opt_out. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Sat Sep 21 06:54:50 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Sat, 21 Sep 2013 14:54:50 +1000 Subject: [Python-ideas] Introduce collections.Reiterable References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> Message-ID: <7wpps2wz8l.fsf@benfinney.id.au> Neil Girdhar writes: > However, there are plenty of times where you can't do that, e.g., you > want to know if something is callable before calling it What is a concrete example of *needing* to know whether an object is callable? Why not just use the object *as if it is* callable, and the TypeError will propagate back to whoever fed you the object if it's not? > and similarly if something is reiterable before iterating it and > exhausting. 
I have somewhat more sympathy for this desire; duck typing doesn't
work so well for this, because by the time the iterable is exhausted
it's too late to deal with its inability to re-start.

Still, though, this is the kind of division of responsibility that
makes a good program: tell the user of your code (in the docstring of
your class or function) that you require a sequence or some other
re-iterable object. If you try something that fails on the object
you've been given, that's the responsibility of the code that gave it
to you. You can be nice by ensuring it'll fail in such a way that the
caller gets a meaningful exception.

-- 
 \     "The number of UNIX installations has grown to 10, with more |
  `\   expected." --Unix Programmer's Manual, 2nd Ed., 1972-06-12 |
_o__) |
Ben Finney

From mistersheik at gmail.com  Sat Sep 21 06:54:36 2013
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sat, 21 Sep 2013 00:54:36 -0400
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: 
References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando>
 <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com>
 <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com>
 <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando>
Message-ID: 

I check for callable when accepting callbacks because I will call them
much later, and raising the error then is harder to track down. Like I
said in the other mail, your alternative ways of checking
reiterability carry no corresponding guarantee that they should work.
Checking the other ABCs is supposed to work according to PEP 3119.

Neil

On Sat, Sep 21, 2013 at 12:43 AM, Andrew Barnert wrote:
> On Sep 20, 2013, at 21:16, Neil Girdhar wrote:
> > You're right that you should go ahead and use something however you want
> > to. However, there are plenty of times where you can't do that, e.g., you
> > want to know if something is callable before calling it,
>
> Why? What's the harm in just calling it and handling the exception?
And > surely, if you really need to LBYL here, you need to know that it's > callable with the argument you plan to pass it. > > There are good uses for checking if something is callable, but this isn't > a good example. And it's very different from your other example. > > and similarly if something is reiterable before iterating it and > exhausting. > > > This one is different. You can't just handle failure, because (a) there's > no unambiguous sign of failure, and (b) it's too late to deal with it if > you've already exhausted the iterator. > > However, if you just turn the test around, it _is_ syntactically > checkable: if "isinstance(it, Iterator)", or "iter(it) is it" or > "hasattr(it, __next__)" or "next(it)" doesn't raise... then you have to do > a single-pass algorithm or tee the values or make a list or whatever. > > Either Reiterable is just Iterable and not Iterator (barring any flaws in > the definition of Iterable, which is a separate problem), or it's not an > abstract type. > > And if it's just Iterable and not Iterator, besides being complicated to > implement (you can't inherit from the negation of a class), it's also more > complicated to use. The obvious use case is: If you get an Iterator, you > have to tee, make a list, use a one-pass algorithm instead of two-pass, > whatever. Rewriting that instead as if you get an Iterable but it's not a > Reiterable buys you nothing but verbosity. Turning it around so if you get > a Reiterable you can skip the fallback just means a double negative that's > harder to process. > > That is the purpose of collections.abc, and that's what I thought we were > discussing. Could you make mistakes trying to look ahead like this? Sure. > An object could appear callable only to raise NotImplementedError on > calling it. Looking ahead does not have to be foolproof. This is Python, > and of course (almost) *any test* can be fooled. That doesn't just go for > reiterability, it goes for callability as well. 
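Andrew's fallback pattern (materialize only when handed a one-shot iterator, then run the two-pass algorithm freely) can be sketched as follows; the function name is illustrative:

```python
from collections.abc import Iterator

def mean_and_deviations(data):
    """Two-pass algorithm that tolerates one-shot iterators."""
    if isinstance(data, Iterator):
        data = list(data)  # fall back: materialize so we can iterate twice
    total = 0
    count = 0
    for x in data:         # first pass
        total += x
        count += 1
    mean = total / count
    return [x - mean for x in data]  # second pass over the same data

print(mean_and_deviations([1, 2, 3]))        # [-1.0, 0.0, 1.0]
print(mean_and_deviations(iter([1, 2, 3])))  # same result from an iterator
```

Written this way, the common case (a list, a view, a range) pays nothing, and only genuine one-shot inputs get copied.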
> > Best, > Neil > > > On Sat, Sep 21, 2013 at 12:09 AM, Steven D'Aprano wrote: > >> On Sat, Sep 21, 2013 at 12:00:14AM -0400, Neil Girdhar wrote: >> > If someone allows their class to fail the standardized test for >> > Iterable/Reiterable/Sequence, that class doesn't deserve to be treated >> as >> > one. (Anyone can register their class as a subclass of the ABCs, or >> more >> > simply inherit from one.) >> >> This is Python, and duck-typing rules, not Java-like type checking. If >> you want a language with strict type checking designed by theorists, try >> Haskell. >> >> The ultimate test in Python of whether something is iterable or not is >> to try iterating over it, and see if it succeeds or not. If it iterates >> like a duck, that's good enough to be treated as a duck. >> >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> -- >> >> --- >> You received this message because you are subscribed to a topic in the >> Google Groups "python-ideas" group. >> To unsubscribe from this topic, visit >> https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >> To unsubscribe from this group and all its topics, send an email to >> python-ideas+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/groups/opt_out. >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... 
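Steven's "ultimate test", just try iterating and see, is ordinary EAFP; a minimal sketch (the function name is illustrative):

```python
def consume_quacks(obj):
    """Treat obj as iterable; let the failure speak for itself."""
    try:
        it = iter(obj)
    except TypeError:
        # Re-raise with a clearer message; the duck simply didn't quack.
        raise TypeError("expected an iterable, got %r"
                        % type(obj).__name__) from None
    return [item for item in it]

print(consume_quacks("abc"))     # ['a', 'b', 'c']
print(consume_quacks(range(3)))  # [0, 1, 2]
try:
    consume_quacks(42)
except TypeError as exc:
    print(exc)
```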
URL: From ben+python at benfinney.id.au Sat Sep 21 07:04:30 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Sat, 21 Sep 2013 15:04:30 +1000 Subject: [Python-ideas] Introduce collections.Reiterable References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> Message-ID: <7wli2qwysh.fsf@benfinney.id.au> Neil Girdhar writes: > I check for callable when accepting callbacks because I will call them > much later and raising the error then is harder to track down. Why is it harder to track down? That sounds like the problem to be fixed. -- \ ?Simplicity is prerequisite for reliability.? ?Edsger W. | `\ Dijkstra | _o__) | Ben Finney From mistersheik at gmail.com Sat Sep 21 07:08:28 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 01:08:28 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <7wli2qwysh.fsf@benfinney.id.au> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wli2qwysh.fsf@benfinney.id.au> Message-ID: Because the caller who sent the bad callable is no longer in the stack trace. On Sat, Sep 21, 2013 at 1:04 AM, Ben Finney wrote: > Neil Girdhar > writes: > > > I check for callable when accepting callbacks because I will call them > > much later and raising the error then is harder to track down. > > Why is it harder to track down? That sounds like the problem to be > fixed. > > -- > \ ?Simplicity is prerequisite for reliability.? ?Edsger W. 
| > `\ Dijkstra | > _o__) | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Sep 21 07:37:11 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 21 Sep 2013 07:37:11 +0200 Subject: [Python-ideas] Introduce collections.Reiterable References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wli2qwysh.fsf@benfinney.id.au> Message-ID: <20130921073711.6313f9e1@fsol> On Sat, 21 Sep 2013 15:04:30 +1000 Ben Finney wrote: > Neil Girdhar > writes: > > > I check for callable when accepting callbacks because I will call them > > much later and raising the error then is harder to track down. > > Why is it harder to track down? That sounds like the problem to be > fixed. Well, there is no need to try and rehash this mantra. callable() was revived for a reason. I will suggest anyone wanting a LBYL vs. EAFP discussion to go discuss it on python-list, really ;-) python-ideas is not the place for the same old platonic language design discussions that everyone's been having for 10+ years. Regards Antoine. 
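For reference, the registration-time check Neil is defending (fail while the offending caller is still on the stack, not when the event fires much later) might look like this; EventSource is a hypothetical name:

```python
class EventSource:
    def __init__(self):
        self._callbacks = []

    def register(self, fn):
        # LBYL on purpose: if we waited until fire(), the caller that
        # handed us the bad object would be long gone from the traceback.
        if not callable(fn):
            raise TypeError("callback must be callable, got %r"
                            % type(fn).__name__)
        self._callbacks.append(fn)

    def fire(self):
        return [fn() for fn in self._callbacks]

src = EventSource()
src.register(lambda: "pong")
print(src.fire())  # ['pong']
```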
From ethan at stoneleaf.us Sat Sep 21 07:54:30 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 20 Sep 2013 22:54:30 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <523D3496.9050602@stoneleaf.us> On 09/20/2013 09:20 PM, Neil Girdhar wrote: > I can humbly suggest why Python would deprecate the sequence protocol: there "should be one obvious way" to answer > iter(), and in my opinion that's the __iter__() method. I considered infinite iterators, and if you happen to have > __getitem__ written, you can trivially write an __iter__ function as follows: 1) One Obvious Way != Only One Way (we can have both) 2) Deprecating (and removing) __getitem__ will break lots of code. It's not going to happen. -- ~Ethan~ From mistersheik at gmail.com Sat Sep 21 08:21:21 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 02:21:21 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <523D3496.9050602@stoneleaf.us> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> <523D3496.9050602@stoneleaf.us> Message-ID: No one suggested removing __getitem__. Some people have suggested deprecating (without removing) the sequence protocol. Do you know of any object that relies on the sequence protocol? That is, that implements __getitem__ without implementing __iter__ (or using a mixin like collections.Sequence to provide __iter__)? 
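An answer to Neil's question: an object that relies on the sequence protocol is easy to construct, and iter() accepts it even though it defines no __iter__. Squares is an illustrative class:

```python
from collections.abc import Iterable

class Squares:
    """Iterable only via the legacy sequence protocol."""
    def __getitem__(self, i):
        if i >= 5:
            raise IndexError  # how iter() learns the sequence has ended
        return i * i

print(list(Squares()))                     # [0, 1, 4, 9, 16]
print(hasattr(Squares, '__iter__'))        # False, yet iteration worked
print(isinstance(Squares(), Iterable))     # also False
```

The last line shows part of the awkwardness under discussion: the Iterable ABC's hook only looks for __iter__, so an object that iterates perfectly well via the fallback protocol is invisible to ABC-based checks.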
On Sat, Sep 21, 2013 at 1:54 AM, Ethan Furman wrote: > On 09/20/2013 09:20 PM, Neil Girdhar wrote: > >> I can humbly suggest why Python would deprecate the sequence protocol: >> there "should be one obvious way" to answer >> iter(), and in my opinion that's the __iter__() method. I considered >> infinite iterators, and if you happen to have >> __getitem__ written, you can trivially write an __iter__ function as >> follows: >> > > 1) One Obvious Way != Only One Way (we can have both) > > 2) Deprecating (and removing) __getitem__ will break lots of code. It's > not going to happen. > > -- > ~Ethan~ > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/** > topic/python-ideas/**OumiLGDwRWA/unsubscribe > . > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@**googlegroups.com > . > For more options, visit https://groups.google.com/**groups/opt_out > . > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sat Sep 21 08:25:08 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 21 Sep 2013 15:25:08 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> Message-ID: <87siwy3d4r.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > You're right that you should go ahead and use something however you > want to. 
> However, there are plenty of times where you can't do that,
> e.g., you want to know if something is callable before calling it,
> and similarly if something is reiterable before iterating it and
> exhausting. That is the purpose of collections.abc,

I don't think so. It's documented that way:

    This module provides abstract base classes that can be used to
    test whether a class provides a particular interface; for example,
    whether it is hashable or whether it is a mapping.

But I wouldn't do explicit testing with isinstance, but rather use
implicit assertions (at instantiation time) by deriving from the ABC.
I don't see how Reiterable could be adapted to this style of
programming because the API of iterables is basically fixed (support
__iter__ or __getitem__).

> and that's what I thought we were discussing.

You were, I agree. But you proposed a new API, which pretty well
guarantees many discussants will take a more global view, like "do the
use cases justify this addition?"

Another such question is "what exactly is the specification?" Tim
Delany, for example, AIUI doesn't have a problem with saying that any
iterable is reiterable, because it won't raise an exception if the
program iterates it after exhaustion. It simply does nothing, but in
some cases that's perfectly acceptable. I know you disagree, and I
don't think that's a useful definition. Still it demonstrates the
wide range of opinions on what "reiterable" can or should guarantee.
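Stephen's "implicit assertion at instantiation time" refers to ABCMeta refusing to instantiate a class that leaves an abstract method unfilled; a small sketch:

```python
from collections import abc

class MyPairs(abc.Iterable):
    def __iter__(self):           # satisfies the ABC's requirement
        yield (1, 'a')
        yield (2, 'b')

class Broken(abc.Iterable):
    pass                          # forgot to define __iter__

print(list(MyPairs()))            # [(1, 'a'), (2, 'b')]
try:
    Broken()
except TypeError as exc:
    print(exc)                    # can't instantiate abstract class Broken...
```

Deriving from the ABC thus moves the check from "somewhere deep in iteration" to the point where the object is created.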
From mistersheik at gmail.com Sat Sep 21 08:56:29 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 02:56:29 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87siwy3d4r.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <87siwy3d4r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sat, Sep 21, 2013 at 2:25 AM, Stephen J. Turnbull wrote: > Neil Girdhar writes: > > > You're right that you should go ahead and use something however you > > want to. However, there are plenty of times where you can't do that, > > e.g., you want to know if something is callable before calling it, > > and similarly if something is reiterable before iterating it and > > exhausting. That is the purpose of collections.abc, > > I don't think so. It's documented that way: > > This module provides abstract base classes that can be used to > test whether a class provides a particular interface; for example, > whether it is hashable or whether it is a mapping. > > But I wouldn't do explicit testing with isinstance, but rather use > implicit assertions (at instantiation time) by deriving from the ABC. > I don't see how Reiterable could be adapted to this style of > programming because the API of iterables is basically fixed (support > __iter__ or __getitem__). > Wouldn't you need to define a new ABC to do this? 
Here's one possibility with Tim's "iterable and not iterator":

class Reiterable(collections.Iterable, metaclass=collections.abc.ABCMeta):
    @classmethod
    def __subclasshook__(cls, subclass):
        # "is True" matters: Iterable's hook can return NotImplemented,
        # which is truthy and would wrongly pass a plain "if".
        if (collections.Iterable.__subclasshook__(subclass) is True
                and not issubclass(subclass, collections.Iterator)):
            return True
        return NotImplemented

for obj in [list(), tuple(), dict(), set(), range(4),
            (x * x for x in range(4))]:
    print(type(obj), isinstance(obj, Reiterable))

Another possibility would be to explicitly register Views and so on
using Reiterable.register(...)

> > and that's what I thought we were discussing.
>
> You were, I agree. But you proposed a new API, which pretty well
> guarantees many discussants will take a more global view, like "do the
> use cases justify this addition?"

It's a good point. I think if I'm the only one with this problem, then
the answer is clearly no. I will just cast to list, and so what if
it's a little bit slower in some cases. How could I know that I was
the only one with this problem?

> Another such question is "what exactly is the specification?" Tim
> Delany, for example, AIUI doesn't have a problem with saying that any
> iterable is reiterable, because it won't raise an exception if the
> program iterates it after exhaustion. It simply does nothing, but in
> some cases that's perfectly acceptable. I know you disagree, and I
> don't think that's a useful definition. Still it demonstrates the
> wide range of opinions on what "reiterable" can or should guarantee.

Yes, agreed that there are a wide range of reasonable opinions.

Best,
Neil
-------------- next part --------------
An HTML attachment was scrubbed...
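For what it's worth, the dividing line this hook tries to draw can be checked against today's builtins directly: everything on Neil's "reiterable" list fails the Iterator test, and every one-shot passes it:

```python
from collections.abc import Iterator

reiterables = [[1, 2], (1, 2), {1: 2}, {1, 2}, range(3), {1: 2}.keys()]
one_shots = [iter([1, 2]), (x for x in 'ab'), map(str, [1]), filter(None, [1])]

assert not any(isinstance(obj, Iterator) for obj in reiterables)
assert all(isinstance(obj, Iterator) for obj in one_shots)

# Equivalent spelling of the same test, without the ABC: an iterator's
# __iter__ must return the iterator itself.
assert all(iter(obj) is obj for obj in one_shots)
print("all checks pass")
```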
URL: From ethan at stoneleaf.us Sat Sep 21 08:36:57 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 20 Sep 2013 23:36:57 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> <523D3496.9050602@stoneleaf.us> Message-ID: <523D3E89.7090406@stoneleaf.us> On 09/20/2013 11:21 PM, Neil Girdhar wrote: > No one suggested removing __getitem__. Some people have suggested deprecating (without removing) the sequence protocol. > Do you know of any object that relies on the sequence protocol? That is, that implements __getitem__ without > implementing __iter__ (or using a mixin like collections.Sequence to provide __iter__)? The goal of deprecation is removal. Any item that supports index access, such as lists, tuples, and dictionaries, needs __getitem__. Iteration is not the only way to access an iterable object. -- ~Ethan~ From stephen at xemacs.org Sat Sep 21 09:02:35 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 21 Sep 2013 16:02:35 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87r4ci3bec.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > I can humbly suggest why Python would deprecate the sequence > protocol: there "should be one obvious way" to answer iter(), and > in my opinion that's the ?__iter__() ?method. ?I considered > infinite iterators, and if you happen to have ?__getitem__ written, > you can trivially write an __iter__ function Better yet, Python can do it for me. That's *why* it makes sense for iter() to accept an object with a __getitem__ method. 
I wonder if it would be possible for Iterable to provide an __iter__
method at instantiation if and only if __iter__ is not defined in the
derived class and __getitem__ is. Then

>>> class GoodIterable1(Iterable):
...     def __iter__(self):
...         return iter([])
...
>>> gi1 = GoodIterable1()
>>> dir(gi1)
[..., '__iter__', ...]
>>> class GoodIterable2(Iterable):
...     def __getitem__(self, i):
...         return [][0]
...
>>> gi2 = GoodIterable2()
>>> dir(gi2)
[..., '__getitem__', '__iter__', ...]   # it's magic!
>>> class BadIterable(Iterable):
...     pass
...
>>> bi = BadIterable()   # ordinary mixin __iter__ wouldn't raise,
...                      # but the magic one does
TypeError: can't instantiate abstract class BadIterable with abstract methods __iter__
>>>

Although I guess the ordinary mixin will raise anyway when it tries to
call __getitem__.

From mistersheik at gmail.com  Sat Sep 21 09:03:35 2013
From: mistersheik at gmail.com (Neil Girdhar)
Date: Sat, 21 Sep 2013 03:03:35 -0400
Subject: [Python-ideas] Introduce collections.Reiterable
In-Reply-To: <523D3E89.7090406@stoneleaf.us>
References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com>
 <20130919121828.GK19939@ando> <20130920094854.GO19939@ando>
 <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp>
 <523D3496.9050602@stoneleaf.us> <523D3E89.7090406@stoneleaf.us>
Message-ID: 

We're not talking about deprecating __getitem__. We're talking about
deprecating the "sequence protocol" whereby iter(obj) falls back to
calling __getitem__ when an object doesn't have __iter__. No one is
talking about removing __getitem__!

Neil

On Sat, Sep 21, 2013 at 2:36 AM, Ethan Furman wrote:
> On 09/20/2013 11:21 PM, Neil Girdhar wrote:
>> No one suggested removing __getitem__. Some people have suggested
>> deprecating (without removing) the sequence protocol.
>> Do you know of any object that relies on the sequence protocol? That
>> is, that implements __getitem__ without implementing __iter__ (or
>> using a mixin like collections.Sequence to provide __iter__)?
>> > > The goal of deprecation is removal. > > Any item that supports index access, such as lists, tuples, and > dictionaries, needs __getitem__. Iteration is not the only way to access > an iterable object. > > > -- > ~Ethan~ > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/** > topic/python-ideas/**OumiLGDwRWA/unsubscribe > . > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@**googlegroups.com > . > For more options, visit https://groups.google.com/**groups/opt_out > . > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Sep 21 09:04:50 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 21 Sep 2013 00:04:50 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sep 20, 2013, at 21:52, Neil Girdhar wrote: > We discussed this upthread: I only want "not iterator" if not iterator promises reiterability. Right now, we have what may be a happy accident that can easily be violated by someone else. And if you define your new ABC, it can be just as easily violated by someone else. In fact, it will be violated in the exact _same_ cases. There's no check you can do besides the reverse of the checks done by iterator. More importantly, it's not just "a happy accident". 
I've asked repeatedly if anyone can come up with a single example of a non-iterator, non-reiterable iterator, or even imagine what one would look like, and nobody's come up with one. And it's not like iterators are some new feature nobody's had time to explore yet. So, in order to solve a problem that doesn't exist, you want to add a new feature that wouldn't solve it any better than what we have today. > Best, > Neil > > > On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert wrote: >> On Sep 20, 2013, at 21:23, Neil Girdhar wrote: >> >>> I appreciate the discussion illuminating various aspects of this I hadn't considered. Finally, what I think I want is for >>> * all sequences >>> * all views >>> * numpy arrays >>> to answer yes to reiterable, and >>> * all generators >>> to answer no to reiterable. >> >> All sequences, views, and numpy arrays answer no to iterator (and so do sets, mappings, etc.), and all generators answer yes (and so do the iterators you get back from calling iter on a sequence, map, filter, your favorite itertools function, etc.) >> >> So you just want "not iterator". Even Haskell doesn't attempt to provide negative types like that. (And you can very easily show that it's iterator that's the normal type: it's syntactically checkable in various ways--e.g., it.hasattr('__next__'), but the only positive way to check reiterable is not just semantic, but destructive.) >> >>> Best, Neil >>> >>> On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull wrote: >>>> Terry Reedy writes: >>>> >>>> > Dismissing legal code as 'pathological', as more than one person has, >>>> > does not cut it as a design principle. >>>> >>>> But you don't even need to write a class with __getitem__() to get >>>> that behavior. >>>> >>>> >>> l = [11, 12, 13] >>>> >>> for i in l: >>>> ... print(i) >>>> ... if i%2 == 0: >>>> ... l.remove(i) >>>> ... 
>>>> 11 >>>> 12 >>>> >>> l >>>> [11, 13] >>>> >>> >>>> >>>> Of course the iteration itself is probably buggy (ie, the writer >>>> didn't mean to skip printing '13'), but in general iterables can >>>> change themselves. >>>> >>>> Neil himself seems to be of two minds about such cases. On the one >>>> hand, he said the above behavior is built in to list, so it's >>>> acceptable to him. (I think that's inconsistent: I would say the >>>> property of being completely consumed is built in to iterator, so it >>>> should be acceptable, too.) On the other hand, he's defined a >>>> reiterable as a collection that when iterated produces the same >>>> objects in the same order. >>>> >>>> Maybe what we really want is for copy.deepcopy to do the right thing >>>> with iterables. Then code that doesn't want to consume consumable >>>> iterables can do a deepcopy (including replication of the closed-over >>>> state of __next__() for iterators) before iterating. >>>> >>>> Or perhaps the right thing is a copy.itercopy that creates a new >>>> composite object as a shallow copy of everything except that it clones >>>> the state of __next__() in case the object was an iterator to start >>>> with. >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> >>>> -- >>>> >>>> --- >>>> You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. >>>> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >>>> To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. >>>> For more options, visit https://groups.google.com/groups/opt_out. 
>>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Sep 21 09:08:05 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 21 Sep 2013 00:08:05 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <87siwy3d4r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <9997B519-F903-4E16-A3FD-E830AEB0BCE0@yahoo.com> On Sep 20, 2013, at 23:56, Neil Girdhar wrote: >> You were, I agree. But you proposed a new API, which pretty well >> guarantees many discussants will take a more global view, like "do the >> use cases justify this addition?" > > It's a good point. I think if I'm the only with this problem, then the answer is clearly no. I will just cast to list and so what if it's a little bit slower in some cases. How could I know that I was the only one with this problem? That seems more than a little stubborn. Today, you can create a list iff you're given an iterator. You'd prefer to write that in terms of creating a list iff you're given a non-reiterable iterable. And, if you can't have that, screw all your users, you'll just always make a list? And again, if you have an actual problem that iterator doesn't solve, I'd love to see it. -------------- next part -------------- An HTML attachment was scrubbed... 
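Andrew's "create a list iff you're given an iterator" recipe, with itertools.tee as a lazier alternative in the spirit of the itercopy idea quoted earlier, might be sketched as:

```python
import itertools
from collections.abc import Iterator

def ensure_reiterable(obj):
    """Return something safe to iterate more than once."""
    return list(obj) if isinstance(obj, Iterator) else obj

data = ensure_reiterable(x * x for x in range(4))
print(sum(data), max(data))  # 14 9 -- two passes, no values lost

# tee clones the remaining stream instead of materializing everything
# up front (it buffers items until both clones have consumed them):
a, b = itertools.tee(iter([1, 2, 3]))
print(list(a), list(b))  # [1, 2, 3] [1, 2, 3]
```

Note that ensure_reiterable passes lists, views, ranges, and so on straight through unchanged, so only genuine one-shot inputs pay the copying cost.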
URL: From mistersheik at gmail.com Sat Sep 21 09:21:24 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 03:21:24 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I'm happy with iterable and not iterator if it comes with a promise. Then my first ABC is what what I probably want. If not, then I think it's better to do something lke class Reiterable(collections.Iterable, metaclass=collections.abc.ABCMeta): @classmethod def __subclasshook__(cls, subclass): if (issubclass(subclass, collections.MappingView) or issubclass(subclass, collections.Sequence) or issubclass(subclass, collections.Set) or issubclass(subclass, collections.Mapping)): return True return NotImplemented Other classes can be added with register. On Sat, Sep 21, 2013 at 3:04 AM, Andrew Barnert wrote: > On Sep 20, 2013, at 21:52, Neil Girdhar wrote: > > We discussed this upthread: I only want "not iterator" if not iterator > promises reiterability. Right now, we have what may be a happy accident > that can easily be violated by someone else. > > > And if you define your new ABC, it can be just as easily violated by > someone else. In fact, it will be violated in the exact _same_ > cases. There's no check you can do besides the reverse of the checks done > by iterator. > > More importantly, it's not just "a happy accident". I've asked repeatedly > if anyone can come up with a single example of a non-iterator, > non-reiterable iterator, or even imagine what one would look like, and > nobody's come up with one. And it's not like iterators are some new feature > nobody's had time to explore yet. > > So, in order to solve a problem that doesn't exist, you want to add a new > feature that wouldn't solve it any better than what we have today. 
> > Best, > Neil > > > On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert wrote: > >> On Sep 20, 2013, at 21:23, Neil Girdhar wrote: >> >> I appreciate the discussion illuminating various aspects of this I hadn't >> considered. Finally, what I think I want is for >> * all sequences >> * all views >> * numpy arrays >> to answer yes to reiterable, and >> * all generators >> to answer no to reiterable. >> >> >> All sequences, views, and numpy arrays answer no to iterator (and so do >> sets, mappings, etc.), and all generators answer yes (and so do the >> iterators you get back from calling iter on a sequence, map, filter, your >> favorite itertools function, etc.) >> >> So you just want "not iterator". Even Haskell doesn't attempt to provide >> negative types like that. (And you can very easily show that it's iterator >> that's the normal type: it's syntactically checkable in various ways--e.g., >> it.hasattr('__next__'), but the only positive way to check reiterable is >> not just semantic, but destructive.) >> >> Best, Neil >> >> On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull > > wrote: >> >>> Terry Reedy writes: >>> >>> > Dismissing legal code as 'pathological', as more than one person has, >>> > does not cut it as a design principle. >>> >>> But you don't even need to write a class with __getitem__() to get >>> that behavior. >>> >>> >>> l = [11, 12, 13] >>> >>> for i in l: >>> ... print(i) >>> ... if i%2 == 0: >>> ... l.remove(i) >>> ... >>> 11 >>> 12 >>> >>> l >>> [11, 13] >>> >>> >>> >>> Of course the iteration itself is probably buggy (ie, the writer >>> didn't mean to skip printing '13'), but in general iterables can >>> change themselves. >>> >>> Neil himself seems to be of two minds about such cases. On the one >>> hand, he said the above behavior is built in to list, so it's >>> acceptable to him. (I think that's inconsistent: I would say the >>> property of being completely consumed is built in to iterator, so it >>> should be acceptable, too.) 
On the other hand, he's defined a >>> reiterable as a collection that when iterated produces the same >>> objects in the same order. >>> >>> Maybe what we really want is for copy.deepcopy to do the right thing >>> with iterables. Then code that doesn't want to consume consumable >>> iterables can do a deepcopy (including replication of the closed-over >>> state of __next__() for iterators) before iterating. >>> >>> Or perhaps the right thing is a copy.itercopy that creates a new >>> composite object as a shallow copy of everything except that it clones >>> the state of __next__() in case the object was an iterator to start >>> with. >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> >>> -- >>> >>> --- >>> You received this message because you are subscribed to a topic in the >>> Google Groups "python-ideas" group. >>> To unsubscribe from this topic, visit >>> https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >>> To unsubscribe from this group and all its topics, send an email to >>> python-ideas+unsubscribe at googlegroups.com. >>> For more options, visit https://groups.google.com/groups/opt_out. >>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Sep 21 09:21:55 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 03:21:55 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Whoa! I'm not trying to be stubborn!! 
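[Archive editor's note: the `copy.itercopy` idea quoted above never landed, but `itertools.tee` — in the standard library since 2.4 — is the closest existing tool: it buffers a one-shot iterator so that two independent passes see the same items.]

```python
import itertools

consumable = iter([1, 2, 3])            # a one-shot iterator
first, second = itertools.tee(consumable)

# Each tee'd iterator replays the same items independently.
print(list(first))   # [1, 2, 3]
print(list(second))  # [1, 2, 3]

# Unlike a true itercopy, though, the original iterator must not be
# advanced once tee() has wrapped it, and the replay buffer costs memory.
```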
I'm just suggesting that what we have today is fine if the problem is isolated. On Sat, Sep 21, 2013 at 3:21 AM, Neil Girdhar wrote: > I'm happy with iterable and not iterator if it comes with a promise. Then > my first ABC is what what I probably want. If not, then I think it's > better to do something lke > > class Reiterable(collections.Iterable, > metaclass=collections.abc.ABCMeta): > @classmethod > def __subclasshook__(cls, subclass): > if (issubclass(subclass, collections.MappingView) > or issubclass(subclass, collections.Sequence) > or issubclass(subclass, collections.Set) > or issubclass(subclass, collections.Mapping)): > return True > return NotImplemented > > Other classes can be added with register. > > > On Sat, Sep 21, 2013 at 3:04 AM, Andrew Barnert wrote: > >> On Sep 20, 2013, at 21:52, Neil Girdhar wrote: >> >> We discussed this upthread: I only want "not iterator" if not iterator >> promises reiterability. Right now, we have what may be a happy accident >> that can easily be violated by someone else. >> >> >> And if you define your new ABC, it can be just as easily violated by >> someone else. In fact, it will be violated in the exact _same_ >> cases. There's no check you can do besides the reverse of the checks done >> by iterator. >> >> More importantly, it's not just "a happy accident". I've asked repeatedly >> if anyone can come up with a single example of a non-iterator, >> non-reiterable iterator, or even imagine what one would look like, and >> nobody's come up with one. And it's not like iterators are some new feature >> nobody's had time to explore yet. >> >> So, in order to solve a problem that doesn't exist, you want to add a new >> feature that wouldn't solve it any better than what we have today. >> >> Best, >> Neil >> >> >> On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert wrote: >> >>> On Sep 20, 2013, at 21:23, Neil Girdhar wrote: >>> >>> I appreciate the discussion illuminating various aspects of this I >>> hadn't considered. 
Finally, what I think I want is for >>> * all sequences >>> * all views >>> * numpy arrays >>> to answer yes to reiterable, and >>> * all generators >>> to answer no to reiterable. >>> >>> >>> All sequences, views, and numpy arrays answer no to iterator (and so do >>> sets, mappings, etc.), and all generators answer yes (and so do the >>> iterators you get back from calling iter on a sequence, map, filter, your >>> favorite itertools function, etc.) >>> >>> So you just want "not iterator". Even Haskell doesn't attempt to provide >>> negative types like that. (And you can very easily show that it's iterator >>> that's the normal type: it's syntactically checkable in various ways--e.g., >>> it.hasattr('__next__'), but the only positive way to check reiterable is >>> not just semantic, but destructive.) >>> >>> Best, Neil >>> >>> On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull < >>> stephen at xemacs.org> wrote: >>> >>>> Terry Reedy writes: >>>> >>>> > Dismissing legal code as 'pathological', as more than one person has, >>>> > does not cut it as a design principle. >>>> >>>> But you don't even need to write a class with __getitem__() to get >>>> that behavior. >>>> >>>> >>> l = [11, 12, 13] >>>> >>> for i in l: >>>> ... print(i) >>>> ... if i%2 == 0: >>>> ... l.remove(i) >>>> ... >>>> 11 >>>> 12 >>>> >>> l >>>> [11, 13] >>>> >>> >>>> >>>> Of course the iteration itself is probably buggy (ie, the writer >>>> didn't mean to skip printing '13'), but in general iterables can >>>> change themselves. >>>> >>>> Neil himself seems to be of two minds about such cases. On the one >>>> hand, he said the above behavior is built in to list, so it's >>>> acceptable to him. (I think that's inconsistent: I would say the >>>> property of being completely consumed is built in to iterator, so it >>>> should be acceptable, too.) On the other hand, he's defined a >>>> reiterable as a collection that when iterated produces the same >>>> objects in the same order. 
>>>> >>>> Maybe what we really want is for copy.deepcopy to do the right thing >>>> with iterables. Then code that doesn't want to consume consumable >>>> iterables can do a deepcopy (including replication of the closed-over >>>> state of __next__() for iterators) before iterating. >>>> >>>> Or perhaps the right thing is a copy.itercopy that creates a new >>>> composite object as a shallow copy of everything except that it clones >>>> the state of __next__() in case the object was an iterator to start >>>> with. >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> >>>> -- >>>> >>>> --- >>>> You received this message because you are subscribed to a topic in the >>>> Google Groups "python-ideas" group. >>>> To unsubscribe from this topic, visit >>>> https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >>>> To unsubscribe from this group and all its topics, send an email to >>>> python-ideas+unsubscribe at googlegroups.com. >>>> For more options, visit https://groups.google.com/groups/opt_out. >>>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Sep 21 10:02:17 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 21 Sep 2013 18:02:17 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130920111000.GQ19939@ando> <87y56q3ni1.fsf@uwakimon.sk.tsukuba.ac.jp> <523D3496.9050602@stoneleaf.us> Message-ID: <20130921080217.GX19939@ando> On Sat, Sep 21, 2013 at 02:21:21AM -0400, Neil Girdhar wrote: > No one suggested removing __getitem__. Some people have suggested > deprecating (without removing) the sequence protocol. 
Some people -- that would be you, I believe. What's the point of deprecating something if you have no intention of removing it? What's the point of deprecating something which works? It isn't like it's doing any harm. You'll just cause unnecessary code-churn in other people's working code. Removing broken, unfixable code -- sure. Removing working code just because it annoys some people's idea of purity? I'm against that. > Do you know of any > object that relies on the sequence protocol? That is, that implements > __getitem__ without implementing __iter__ (or using a mixin like > collections.Sequence to provide __iter__)? Not off the top of my head, but that doesn't mean there aren't masses of code that does so. Off the top of my head, I don't know of any code that relies on exception tracebacks being printed to stderr rather than stdout, but that doesn't mean we should feel free to change that on a whim. -- Steven From steve at pearwood.info Sat Sep 21 10:04:06 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 21 Sep 2013 18:04:06 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130921080406.GY19939@ando> On Sat, Sep 21, 2013 at 12:23:54AM -0400, Neil Girdhar wrote: > I appreciate the discussion illuminating various aspects of this I hadn't > considered. Finally, what I think I want is for > * all sequences > * all views > * numpy arrays > to answer yes to reiterable, and > * all generators > to answer no to reiterable. Which brings us full circle to:

if isinstance(obj, Iterable) and not isinstance(obj, Iterator):
    print("Is re-iterable")
else:
    print("Is not re-iterable")

which I believe satisfies your requirement. Can you show any standard, non-pathological type where this test fails to give the correct answer? 
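[Archive editor's note: for concreteness, here is Steven's one-line test wrapped in a helper and run against the types from Neil's list — `range` and a dict view standing in for arbitrary re-iterable views.]

```python
from collections.abc import Iterable, Iterator

def is_reiterable(obj):
    # Iterable, but not a one-shot iterator.
    return isinstance(obj, Iterable) and not isinstance(obj, Iterator)

print(is_reiterable([1, 2, 3]))          # True  (sequence)
print(is_reiterable({1: "a"}.items()))   # True  (view)
print(is_reiterable(range(10)))          # True
print(is_reiterable(x for x in "abc"))   # False (generator)
print(is_reiterable(iter([1, 2, 3])))    # False (iterator)
```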
If not, what exactly is the problem with just using that test? As far as I am concerned, not every one-line test needs an ABC. -- Steven From rymg19 at gmail.com Sat Sep 21 20:15:58 2013 From: rymg19 at gmail.com (Ryan) Date: Sat, 21 Sep 2013 13:15:58 -0500 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <7wpps2wz8l.fsf@benfinney.id.au> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wpps2wz8l.fsf@benfinney.id.au> Message-ID: <56a4d514-2f52-4dbe-a861-5298f4970101@email.android.com> I can still see why checking if it's callable is a good idea in some cases. Say you call the callback in line 1024 of module mymod:

self.call[item]()

And someone hands over a string:

TypeError: 'str' object is not callable

And this is what's probably going through the person's head: Stupid Python! What did I do wrong now? Checking if it's callable works better:

if not callable(self.call[item]):
    raise CallbackError('given callback %s must be callable' % str(item))

Now the user says: Ohhhhh....so that's what I did wrong!!! Ben Finney wrote: >Neil Girdhar >writes: > >> However, there are plenty of times where you can't do that, e.g., you >> want to know if something is callable before calling it > >What is a concrete example of *needing* to know whether an object is >callable? Why not just use the object *as if it is* callable, and the >TypeError will propagate back to whoever fed you the object if it's >not? > >> and similarly if something is reiterable before iterating it and >> exhausting. > >I have somewhat more sympathy for this desire; duck typing doesn't work >so well for this, because by the time the iterable is exhausted it's >too >late to deal with its inability to re-start. 
> >Still, though, this is the kind of division of responsibility that >makes >a good program: tell the user of your code (in the docstring of your >class or function) that you require a sequence or some other >re-iterable >object. If you try something that fails on what object you've been >given, that's the responsibility of the code that gave it to you. You >can be nice by ensuring it'll fail in such a way the caller gets a >meaningful exception. > >-- >\ "The number of UNIX installations has grown to 10, with more | > `\ expected." --Unix Programmer's Manual, 2nd Ed., 1972-06-12 | >_o__) >| >Ben Finney > >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sat Sep 21 21:04:39 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 22 Sep 2013 04:04:39 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <56a4d514-2f52-4dbe-a861-5298f4970101@email.android.com> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wpps2wz8l.fsf@benfinney.id.au> <56a4d514-2f52-4dbe-a861-5298f4970101@email.android.com> Message-ID: <87mwn62dyw.fsf@uwakimon.sk.tsukuba.ac.jp> Ryan writes: > I can still see why checking if its callable is a good idea in some > cases. You can always rewrite the LBYL in EAFP form:

try:
    self.call[item]()
except TypeError as e:
    raise CallbackError(...) from e

This is definitely preferred if you can enclose a whole suite in a try and expect the CallbackError to be infrequent, eg:

try:
    while item in input:
        self.call[item]()
except TypeError as e:
    raise CallbackError(...) 
from e

However, what you probably really want to do (which is a better argument for LBYL, anyway) is

if callable(callback):
    self.call[item] = callback
else:
    what_would_jruser_do()

From rymg19 at gmail.com Sat Sep 21 22:02:36 2013 From: rymg19 at gmail.com (Ryan) Date: Sat, 21 Sep 2013 15:02:36 -0500 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87mwn62dyw.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wpps2wz8l.fsf@benfinney.id.au> <56a4d514-2f52-4dbe-a861-5298f4970101@email.android.com> <87mwn62dyw.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: The problem with the try...except statement is that it'll catch errors that occur inside the function. Say the function calls zip and accidentally gives an int instead of a list. It'll raise a TypeError, which will be caught and a CallbackError will be raised. But, in some programs, that behavior doesn't work. "Stephen J. Turnbull" wrote: >Ryan writes: > > > I can still see why checking if its callable is a good idea in some > > cases. > >You can always rewrite the LBYL in EAFP form: > > try: > self.call[item]() > except TypeError as e: > raise CallbackError(...) from e > >This is definitely preferred if you can enclose a whole suite in a try >and expect the CallbackError to be infrequent, eg: > > try: > while item in input: > self.call[item]() > except TypeError as e: > raise CallbackError(...) from e > >However, what you probably really want to do (which is a better >argument for LBYL, anyway) is > > if callable(callback): > self.call[item] = callback > else: > what_would_jruser_do() -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... 
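[Archive editor's note: Stephen's check-at-registration variant sidesteps Ryan's objection entirely — `callable()` is tested once, when the callback is stored, so nothing raised inside the callback later can be mistaken for a bad registration. A self-contained sketch; the `Dispatcher` and `CallbackError` names are illustrative, not from any library.]

```python
class CallbackError(Exception):
    pass

class Dispatcher:
    def __init__(self):
        self.call = {}

    def register(self, item, callback):
        # LBYL once, up front: reject non-callables at registration
        # time, long before invocation time.
        if not callable(callback):
            raise CallbackError(
                'given callback %s must be callable' % str(item))
        self.call[item] = callback

d = Dispatcher()
d.register('greet', lambda: 'hello')
print(d.call['greet']())  # hello

try:
    d.register('oops', 42)
except CallbackError as e:
    print(e)  # given callback oops must be callable
```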
URL: From abarnert at yahoo.com Sat Sep 21 23:08:06 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 21 Sep 2013 14:08:06 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> On Sep 21, 2013, at 0:21, Neil Girdhar wrote: > I'm happy with iterable and not iterator if it comes with a promise. Then my first ABC is what what I probably want. If not, then I think it's better to do something lke > > class Reiterable(collections.Iterable, > metaclass=collections.abc.ABCMeta): > @classmethod > def __subclasshook__(cls, subclass): > if (issubclass(subclass, collections.MappingView) > or issubclass(subclass, collections.Sequence) > or issubclass(subclass, collections.Set) > or issubclass(subclass, collections.Mapping)): > return True > return NotImplemented Which leaves out numpy arrays, most sorted list and dict classes from PyPI, ElementTree and similar element/node/etc. types, ScriptingBridge/appscript collections, win32com IWhateverCollections, and all kinds of other types that can be reiterated, which are correctly diagnosed by Iterable and not Iterator. I haven't tested all of them, so some could fail to register as Iterable (especially given the possibility that Iterable may be incorrect, as mentioned elsewhere on this thread). But getting false negatives on a few types and having to deal with them by fixing a bug is surely better than getting false negatives on all types and having to deal with them by adding new, otherwise-unnecessary code. > Other classes can be added with register. 
So anyone who wants to use your module with numpy or appscript or ElementTree has to find all of the iterable types the class exposes (some of which aren't part of the public API--in some the case of appscript or win32com the may even be built dynamically as needed) and register all of them? You're putting the burden in the wrong place. Because you're worried that some class could theoretically be a non-reiterable non-iterator iterable, even though neither you nor anyone else can think of a sensible example of such a thing, you're requiring the user to certify that every iterable single class he uses is not pathological. That's not LBYL, that's perform a comprehensive survey and environmental impact report on the entire region and file papers in triplicate before you leap. If you're really worried about this unlikely possibility making it hard to debug the use of your code with some as-yet-unknown type, there are easier ways to verify things. For example, if the iterable works the first time, but is empty the second, the user has given you a non-reiterable, and you can assert or raise appropriately, which will make the code error just as easy to debug as having forgotten to register with Reiterable--and far easier to debug than having mistakenly registered with Reiterable when they shouldn't have. Plus, this lets you test for exactly what you want, not just a rough approximation. You could just as easily verify that the first element of each iteration matches, to ensure that it's not a random-reiterable type like Terry discussed that would ruin your particular two-pass algorithm. Or whatever is appropriate. > On Sat, Sep 21, 2013 at 3:04 AM, Andrew Barnert wrote: >> On Sep 20, 2013, at 21:52, Neil Girdhar wrote: >> >>> We discussed this upthread: I only want "not iterator" if not iterator promises reiterability. Right now, we have what may be a happy accident that can easily be violated by someone else. 
>> >> And if you define your new ABC, it can be just as easily violated by someone else. In fact, it will be violated in the exact _same_ cases. There's no check you can do besides the reverse of the checks done by iterator. >> >> More importantly, it's not just "a happy accident". I've asked repeatedly if anyone can come up with a single example of a non-iterator, non-reiterable iterator, or even imagine what one would look like, and nobody's come up with one. And it's not like iterators are some new feature nobody's had time to explore yet. >> >> So, in order to solve a problem that doesn't exist, you want to add a new feature that wouldn't solve it any better than what we have today. >> >>> Best, >>> Neil >>> >>> >>> On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert wrote: >>>> On Sep 20, 2013, at 21:23, Neil Girdhar wrote: >>>> >>>>> I appreciate the discussion illuminating various aspects of this I hadn't considered. Finally, what I think I want is for >>>>> * all sequences >>>>> * all views >>>>> * numpy arrays >>>>> to answer yes to reiterable, and >>>>> * all generators >>>>> to answer no to reiterable. >>>> >>>> All sequences, views, and numpy arrays answer no to iterator (and so do sets, mappings, etc.), and all generators answer yes (and so do the iterators you get back from calling iter on a sequence, map, filter, your favorite itertools function, etc.) >>>> >>>> So you just want "not iterator". Even Haskell doesn't attempt to provide negative types like that. (And you can very easily show that it's iterator that's the normal type: it's syntactically checkable in various ways--e.g., it.hasattr('__next__'), but the only positive way to check reiterable is not just semantic, but destructive.) >>>> >>>>> Best, Neil >>>>> >>>>> On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull wrote: >>>>>> Terry Reedy writes: >>>>>> >>>>>> > Dismissing legal code as 'pathological', as more than one person has, >>>>>> > does not cut it as a design principle. 
>>>>>> >>>>>> But you don't even need to write a class with __getitem__() to get >>>>>> that behavior. >>>>>> >>>>>> >>> l = [11, 12, 13] >>>>>> >>> for i in l: >>>>>> ... print(i) >>>>>> ... if i%2 == 0: >>>>>> ... l.remove(i) >>>>>> ... >>>>>> 11 >>>>>> 12 >>>>>> >>> l >>>>>> [11, 13] >>>>>> >>> >>>>>> >>>>>> Of course the iteration itself is probably buggy (ie, the writer >>>>>> didn't mean to skip printing '13'), but in general iterables can >>>>>> change themselves. >>>>>> >>>>>> Neil himself seems to be of two minds about such cases. On the one >>>>>> hand, he said the above behavior is built in to list, so it's >>>>>> acceptable to him. (I think that's inconsistent: I would say the >>>>>> property of being completely consumed is built in to iterator, so it >>>>>> should be acceptable, too.) On the other hand, he's defined a >>>>>> reiterable as a collection that when iterated produces the same >>>>>> objects in the same order. >>>>>> >>>>>> Maybe what we really want is for copy.deepcopy to do the right thing >>>>>> with iterables. Then code that doesn't want to consume consumable >>>>>> iterables can do a deepcopy (including replication of the closed-over >>>>>> state of __next__() for iterators) before iterating. >>>>>> >>>>>> Or perhaps the right thing is a copy.itercopy that creates a new >>>>>> composite object as a shallow copy of everything except that it clones >>>>>> the state of __next__() in case the object was an iterator to start >>>>>> with. >>>>>> _______________________________________________ >>>>>> Python-ideas mailing list >>>>>> Python-ideas at python.org >>>>>> https://mail.python.org/mailman/listinfo/python-ideas >>>>>> >>>>>> -- >>>>>> >>>>>> --- >>>>>> You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. >>>>>> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. 
>>>>>> To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe at googlegroups.com. >>>>>> For more options, visit https://groups.google.com/groups/opt_out. >>>>> >>>>> _______________________________________________ >>>>> Python-ideas mailing list >>>>> Python-ideas at python.org >>>>> https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mistersheik at gmail.com Sat Sep 21 23:14:56 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 17:14:56 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> Message-ID: On Sat, Sep 21, 2013 at 5:08 PM, Andrew Barnert wrote: > On Sep 21, 2013, at 0:21, Neil Girdhar wrote: > > I'm happy with iterable and not iterator if it comes with a promise. Then > my first ABC is what what I probably want. If not, then I think it's > better to do something lke > > class Reiterable(collections.Iterable, > metaclass=collections.abc.ABCMeta): > @classmethod > def __subclasshook__(cls, subclass): > if (issubclass(subclass, collections.MappingView) > or issubclass(subclass, collections.Sequence) > or issubclass(subclass, collections.Set) > or issubclass(subclass, collections.Mapping)): > return True > return NotImplemented > > > Which leaves out numpy arrays, most sorted list and dict classes from > PyPI, ElementTree and similar element/node/etc. types, > ScriptingBridge/appscript collections, win32com IWhateverCollections, and > all kinds of other types that can be reiterated, which are correctly > diagnosed by Iterable and not Iterator. 
> > I haven't tested all of them, so some could fail to register as Iterable > (especially given the possibility that Iterable may be incorrect, as > mentioned elsewhere on this thread). But getting false negatives on a few > types and having to deal with them by fixing a bug is surely better than > getting false negatives on all types and having to deal with them by adding > new, otherwise-unnecessary code. > > Other classes can be added with register. > > > So anyone who wants to use your module with numpy or appscript or > ElementTree has to find all of the iterable types the class exposes (some > of which aren't part of the public API--in some the case of appscript or > win32com the may even be built dynamically as needed) and register all of > them? > > You're putting the burden in the wrong place. Because you're worried that > some class could theoretically be a non-reiterable non-iterator iterable, > even though neither you nor anyone else can think of a sensible example of > such a thing, you're requiring the user to certify that every iterable > single class he uses is not pathological. That's not LBYL, that's perform a > comprehensive survey and environmental impact report on the entire region > and file papers in triplicate before you leap. > If you really think that there will never be a non-reiterable non-iterator iterable, then the standard should promise that and we're in total agreement. > > If you're really worried about this unlikely possibility making it hard to > debug the use of your code with some as-yet-unknown type, there are easier > ways to verify things. For example, if the iterable works the first time, > but is empty the second, the user has given you a non-reiterable, and you > can assert or raise appropriately, which will make the code error just as > easy to debug as having forgotten to register with Reiterable--and far > easier to debug than having mistakenly registered with Reiterable when they > shouldn't have. 
Plus, this lets you test for exactly what you want, not > just a rough approximation. You could just as easily verify that the first > element of each iteration matches, to ensure that it's not a > random-reiterable type like Terry discussed that would ruin your particular > two-pass algorithm. Or whatever is appropriate. > I think "asserting on" the iterator that was passed in is a much worse solution than "casting it to iterable". Don't annoy the user with implementation details is a good rule to follow. Best, Neil > > On Sat, Sep 21, 2013 at 3:04 AM, Andrew Barnert wrote: > >> On Sep 20, 2013, at 21:52, Neil Girdhar wrote: >> >> We discussed this upthread: I only want "not iterator" if not iterator >> promises reiterability. Right now, we have what may be a happy accident >> that can easily be violated by someone else. >> >> >> And if you define your new ABC, it can be just as easily violated by >> someone else. In fact, it will be violated in the exact _same_ >> cases. There's no check you can do besides the reverse of the checks done >> by iterator. >> >> More importantly, it's not just "a happy accident". I've asked repeatedly >> if anyone can come up with a single example of a non-iterator, >> non-reiterable iterator, or even imagine what one would look like, and >> nobody's come up with one. And it's not like iterators are some new feature >> nobody's had time to explore yet. >> >> So, in order to solve a problem that doesn't exist, you want to add a new >> feature that wouldn't solve it any better than what we have today. >> >> Best, >> Neil >> >> >> On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert wrote: >> >>> On Sep 20, 2013, at 21:23, Neil Girdhar wrote: >>> >>> I appreciate the discussion illuminating various aspects of this I >>> hadn't considered. Finally, what I think I want is for >>> * all sequences >>> * all views >>> * numpy arrays >>> to answer yes to reiterable, and >>> * all generators >>> to answer no to reiterable. 
>>> >>> >>> All sequences, views, and numpy arrays answer no to iterator (and so do >>> sets, mappings, etc.), and all generators answer yes (and so do the >>> iterators you get back from calling iter on a sequence, map, filter, your >>> favorite itertools function, etc.) >>> >>> So you just want "not iterator". Even Haskell doesn't attempt to provide >>> negative types like that. (And you can very easily show that it's iterator >>> that's the normal type: it's syntactically checkable in various ways--e.g., >>> it.hasattr('__next__'), but the only positive way to check reiterable is >>> not just semantic, but destructive.) >>> >>> Best, Neil >>> >>> On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull < >>> stephen at xemacs.org> wrote: >>> >>>> Terry Reedy writes: >>>> >>>> > Dismissing legal code as 'pathological', as more than one person has, >>>> > does not cut it as a design principle. >>>> >>>> But you don't even need to write a class with __getitem__() to get >>>> that behavior. >>>> >>>> >>> l = [11, 12, 13] >>>> >>> for i in l: >>>> ... print(i) >>>> ... if i%2 == 0: >>>> ... l.remove(i) >>>> ... >>>> 11 >>>> 12 >>>> >>> l >>>> [11, 13] >>>> >>> >>>> >>>> Of course the iteration itself is probably buggy (ie, the writer >>>> didn't mean to skip printing '13'), but in general iterables can >>>> change themselves. >>>> >>>> Neil himself seems to be of two minds about such cases. On the one >>>> hand, he said the above behavior is built in to list, so it's >>>> acceptable to him. (I think that's inconsistent: I would say the >>>> property of being completely consumed is built in to iterator, so it >>>> should be acceptable, too.) On the other hand, he's defined a >>>> reiterable as a collection that when iterated produces the same >>>> objects in the same order. >>>> >>>> Maybe what we really want is for copy.deepcopy to do the right thing >>>> with iterables. 
Then code that doesn't want to consume consumable >>>> iterables can do a deepcopy (including replication of the closed-over >>>> state of __next__() for iterators) before iterating. >>>> >>>> Or perhaps the right thing is a copy.itercopy that creates a new >>>> composite object as a shallow copy of everything except that it clones >>>> the state of __next__() in case the object was an iterator to start >>>> with. >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> https://mail.python.org/mailman/listinfo/python-ideas >>>> >>>> -- >>>> >>>> --- >>>> You received this message because you are subscribed to a topic in the >>>> Google Groups "python-ideas" group. >>>> To unsubscribe from this topic, visit >>>> https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. >>>> To unsubscribe from this group and all its topics, send an email to >>>> python-ideas+unsubscribe at googlegroups.com. >>>> For more options, visit https://groups.google.com/groups/opt_out. >>>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
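[Archive editor's note: the destructive runtime check Andrew alludes to — "works the first time, but is empty the second" — can be written down, though as Neil objects it consumes a one-shot iterator in the act of rejecting it. A sketch, not a recommendation; the helper name is illustrative.]

```python
def check_reiterable(iterable):
    """Raise TypeError if two passes over `iterable` disagree.
    Destructive for one-shot iterators -- that is the point.
    (An already-empty iterator slips through: 0 == 0.)"""
    first = sum(1 for _ in iterable)
    second = sum(1 for _ in iterable)
    if first != second:
        raise TypeError("not reiterable: first pass yielded %d items, "
                        "second pass %d" % (first, second))

check_reiterable([1, 2, 3])                # fine: lists replay
try:
    check_reiterable(iter([1, 2, 3]))      # consumed on the first pass
except TypeError as e:
    print(e)  # not reiterable: first pass yielded 3 items, second pass 0
```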
URL: From abarnert at yahoo.com Sat Sep 21 23:17:19 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 21 Sep 2013 14:17:19 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wpps2wz8l.fsf@benfinney.id.au> <56a4d514-2f52-4dbe-a861-5298f4970101@email.android.com> <87mwn62dyw.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sep 21, 2013, at 13:02, Ryan wrote: > The problem with the try...except statement is that it'll catch errors that occur inside the function. > Say the function calls zip and accidentally gives an int instead of their list. It'll raise a TypeError, which will be caught and a CallbackError will be raised. You can always use EAFP with a bit of "look back after you leaped" to help diagnose the error: try: self.call[item]() except TypeError as e: if not callable(self.call[item]): raise CallbackError(...) from e raise I'm not sure that would be appropriate in this case, but similar code is very common when dealing with, e.g., the filesystem (especially in 2.x, where you had to distinguish errors on errno... but even in 3.x it's often worth telling the user that his error was because the folder he specified doesn't exist, as opposed to just the filename being wrong). But anyway, I think this is way off topic for this thread. We already have both EAFP and LBYL mechanisms for dealing with iteration; the argument isn't which one you should use, but whether the existing ABCs are sufficient or leave an important gap if you choose to use them for LBYL. > But, in some programs, that behavior doesn't work. > > "Stephen J. Turnbull" wrote: >> >> Ryan writes: >> >>> I can still see why checking if its callable is a good idea in some >>> cases. 
>> >> You can always rewrite the LBYL in EAFP form: >> >> try: >> self.call[item]() >> except TypeError as e: >> raise CallbackError(...) from e >> >> This is definitely preferred if you can enclose a whole suite in a try >> and expect the CallbackError to be infrequent, eg: >> >> try: >> while item in input: >> self.call[item]() >> except TypeError as e: >> raise CallbackError(...) from e >> >> However, what you probably really want to do (which is a better >> argument for LBYL, anyway) is >> >> if callable(callback): >> self.call[item] = callback >> else: >> what_would_jruser_do() > > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sun Sep 22 00:31:50 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 21 Sep 2013 18:31:50 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> Message-ID: On 9/21/2013 5:14 PM, Neil Girdhar wrote: > If you really think that there will never be a non-reiterable > non-iterator iterable, I already posted a sensible non-iterator iterable that is no more reiterable than an iterator. I expect that there are examples in the wild. If nothing else, there are probably some written before the new iterator protocol was added. These are explicitly supported. 
-- Terry Jan Reedy From mistersheik at gmail.com Sun Sep 22 00:33:36 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 21 Sep 2013 18:33:36 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> Message-ID: "new iterator protocol" :) Is it still new? On Sat, Sep 21, 2013 at 6:31 PM, Terry Reedy wrote: > On 9/21/2013 5:14 PM, Neil Girdhar wrote: > > If you really think that there will never be a non-reiterable >> non-iterator iterable, >> > > I already posted a sensible non-iterator iterable that is no more > reiterable than an iterator. I expect that there are examples in the wild. > If nothing else, there are probably some written before the new iterator > protocol was added. These are explicitly supported. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe > . > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@googlegroups.com > . > For more options, visit https://groups.google.com/groups/opt_out > . > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rymg19 at gmail.com Sun Sep 22 03:11:14 2013 From: rymg19 at gmail.com (Ryan) Date: Sat, 21 Sep 2013 20:11:14 -0500 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <84A21A05-BA4E-4324-98DB-6ADE7EB98D1D@gmail.com> <5354F5F4-8052-451E-BAFB-BED214484AF8@gmail.com> <523D0BF4.5020404@stoneleaf.us> <20130921040904.GW19939@ando> <7wpps2wz8l.fsf@benfinney.id.au> <56a4d514-2f52-4dbe-a861-5298f4970101@email.android.com> <87mwn62dyw.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: At that rate, why not just check for callability(?) in the first place? Andrew Barnert wrote: >On Sep 21, 2013, at 13:02, Ryan wrote: > >> The problem with the try...except statement is that it'll catch >errors that occur inside the function. >> Say the function calls zip and accidentally gives an int instead of >their list. It'll raise a TypeError, which will be caught and a >CallbackError will be raised. > >You can always use EAFP with a bit of "look back after you leaped" to >help diagnose the error: > >try: > self.call[item]() >except TypeError as e: > if not callable(self.call[item]): > raise CallbackError(...) from e > raise > >I'm not sure that would be appropriate in this case, but similar code >is very common when dealing with, e.g., the filesystem (especially in >2.x, where you had to distinguish errors on errno... but even in 3.x >it's often worth telling the user that his error was because the folder >be specified doesn't exist, as opposed to just the filename being >wrong). > >But anyway, I think this is way off topic for this thread. We already >have both EAFP and LBYL mechanisms for dealing with iteration; the >argument isn't which one you should use, but whether the existing ABCs >are sufficient or leave an important gap if you choose to use them for >LBYL. > >> But, in some programs, that behavior doesn't work. >> >> "Stephen J. 
Turnbull" wrote: >>> >>> Ryan writes: >>> >>>> I can still see why checking if its callable is a good idea in some >>>> cases. >>> >>> You can always rewrite the LBYL in EAFP form: >>> >>> try: >>> self.call[item]() >>> except TypeError as e: >>> raise CallbackError(...) from e >>> >>> This is definitely preferred if you can enclose a whole suite in a >try >>> and expect the CallbackError to be infrequent, eg: >>> >>> try: >>> while item in input: >>> self.call[item]() >>> except TypeError as e: >>> raise CallbackError(...) from e >>> >>> However, what you probably really want to do (which is a better >>> argument for LBYL, anyway) is >>> >>> if callable(callback): >>> self.call[item] = callback >>> else: >>> what_would_jruser_do() >> >> -- >> Sent from my Android phone with K-9 Mail. Please excuse my brevity. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Sep 22 03:33:19 2013 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 21 Sep 2013 20:33:19 -0500 Subject: [Python-ideas] Numerical instability was: Re: Introduce collections.Reiterable In-Reply-To: References: Message-ID: [Oscar Benjamin ] > ... > If you know of a one-pass algorithm (or a way to improve the > implementation I showed) that is as accurate as either the two_pass or > three_pass methods I'd be very interested to see it (I'm sure Steven > would be as well). This looks interesting: ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf It give detailed error analyses of the methods already on the table (although without use of `fsum()`), and invents some new ones. 
It gets hairy ;-) From abarnert at yahoo.com Sun Sep 22 04:05:06 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 21 Sep 2013 19:05:06 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> Message-ID: <05C362B4-4491-4745-94AE-4EBFC0AA5DDC@yahoo.com> On Sep 21, 2013, at 15:31, Terry Reedy wrote: > On 9/21/2013 5:14 PM, Neil Girdhar wrote: > >> If you really think that there will never be a non-reiterable >> non-iterator iterable, > > I already posted a sensible non-iterator iterable that is no more reiterable than an iterator. You posted a long discussion of different ways in which "reiterable" could be defined, and gave vague examples of things that are reiterable in one sense but not in another. Accepting all of that at face value, there's no way Neil's Reiterable ABC would help that problem, because it would obviously only cover one of the possible senses. Beyond that, I've looked through your posts on that thread, and I can't find anything that looks like a sensible non-iterator non-reiterable (in Neil's intended sense) iterable. Did I miss something? > I expect that there are examples in the wild. If nothing else, there are probably some written before the new iterator protocol was added. These are explicitly supported. 
> > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From steve at pearwood.info Sun Sep 22 04:56:43 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 22 Sep 2013 12:56:43 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> Message-ID: <20130922025643.GD19939@ando> On Sat, Sep 21, 2013 at 05:14:56PM -0400, Neil Girdhar wrote: > If you really think that there will never be a non-reiterable non-iterator > iterable, then the standard should promise that and we're in total > agreement. Which standard are you referring to? It would help if you specified a concrete place in the documentation that you would like to see changed, and a concrete suggestion for what change you would like to see. You should start with a definition of what precisely you mean by a Reiterable, and an example of what does, and what doesn't, count under that defintion. Even if you've already done so, this thread has become too big and too lumbering for me to keep track of everything discussed in it, and I'm sure I'm not the only one. -- Steven From ncoghlan at gmail.com Sun Sep 22 06:56:52 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Sep 2013 14:56:52 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130920094854.GO19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: On 20 Sep 2013 19:49, "Steven D'Aprano" wrote: > > On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote: > > On 19 September 2013 22:18, Steven D'Aprano wrote: > [...] > > > At the moment, dict views aren't directly iterable (you can't call > > > next() on them). 
But in principle they could have been designed as > > > re-iterable iterators. > > > > That's not what iterable means. The iterable/iterator distinction is > > well defined and reflected in the collections ABCs: > > Actually, I think the collections ABC gets it wrong, according to both > common practice and the definition given in the glossary: > > http://docs.python.org/3.4/glossary.html > > More on this below. > > As for my comment above, dict views don't obey the iterator protocol > themselves, as they have no __next__ method, nor do they obey the > sequence protocol, as they are not indexable. Hence they are not > *directly* iterable, but they are *indirectly* iterable, since they have > an __iter__ method which returns an iterator. Um, no. Everywhere Python iterates over anything, we call iter(obj) first. If there is anywhere we don't do that, it's a bug. > I don't think this is a critical distinction. I think it is fine to call > views "iterable", since they can be iterated over. On the rare occasion > that it matters, we can just do what I did above, and talk about objects > which are directly iterable (e.g. iterators, sequences, generator > objects) and those which are indirectly iterable (e.g. dict views). Or you could just use the existing terminology and talk about iterables vs iterators instead of inventing your own terms. > > * iterables are objects that return iterators from __iter__. > > That definition is incomplete, because iterable objects include those > that obey the sequence protocol. This is not only by long-standing > tradition (pre-dating the introduction of iterators, if I remember > correctly), but also as per the definition in the glossary. Alas, > collections.Iterable gets this wrong: > > py> class Seq: > ... def __getitem__(self, index): > ... if 0 <= index < 5: return index+1000 > ... raise IndexError > ... 
> py> s = Seq() > py> isinstance(s, Iterable) > False > py> list(s) # definitely iterable > [1000, 1001, 1002, 1003, 1004] > > > (Note that although Seq obeys the sequence protocol, and can be > iterated over, it is not a fully-fledged Sequence since it has no > __len__.) > > I think this is a bug in the Iterable ABC, but I'm not sure how one > might fix it. The ducktyping check could technically be expanded to use the same fallback iter() does (i.e. __len__ and __getitem__). However, that would reintroduce the Sequence/Mapping ambiguity that ABCs were expressly designed to eliminate, so we don't want to do that: >>> class BadFallback: ... def __len__(self): ... return 1 ... def __getitem__(self, key): ... if key != "the_one": raise KeyError(key) ... return "the_value" ... >>> c = BadFallback() >>> c["the_one"] 'the_value' >>> iter(c) <iterator object at 0x...> >>> next(iter(c)) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 5, in __getitem__ KeyError: 0 In cases like this, the default behaviour is actually correct. Since the fallback iterator only supports sequences rather than arbitrary mappings, merely implementing __len__ and __getitem__ isn't considered a reliable enough indication that an object is actually iterable. Fortunately, we also designed the ABC system to make it trivial for people to notify Python that their container is an iterable sequence when the automatic ducktyping fails: they can just call register on Iterable or one of its subclasses, and the interpreter will believe them. >>> from collections.abc import Iterable, Mapping >>> isinstance(c, Iterable) False >>> isinstance(c, Mapping) False >>> Mapping.register(BadFallback) <class '__main__.BadFallback'> >>> isinstance(c, Iterable) True >>> isinstance(c, Mapping) True In this case, it's a bad registration, since the object in question *doesn't* implement those interfaces properly, but it's easy to define a type where it's more accurate: >>> from collections import Sequence >>> @Sequence.register ... class GoodFallback: ... 
def __len__(self): ... return 1 ... def __getitem__(self, idx): ... if idx != 0: raise IndexError(idx) ... return "the_entry" ... >>> c2 = GoodFallback() >>> list(c2) ['the_entry'] >>> isinstance(c2, Iterable) True Even "GoodFallback" doesn't implement the full Sequence API, but it's likely to provide enough of it for many use cases. This is why type checks on ABCs are vastly different to those on concrete classes - ABCs still leave full control in the hands of the application integrator (through explicit registrations), whereas strict interface checks in a language like Java demand *full* interface compliance to pass the check, even if you really only need a fraction of it. > > That "iterators return self from __iter__" is important, since almost > > everywhere Python iterates over something, it call "_itr = iter(obj)" > > first. > > And then falls back on the sequence protocol. And that final fallback *won't work properly* if the object in question isn't actually a sequence. Cheers, Nick. From g.brandl at gmx.net Sun Sep 22 10:58:40 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 22 Sep 2013 10:58:40 +0200 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: On 09/22/2013 06:56 AM, Nick Coghlan wrote: > On 20 Sep 2013 19:49, "Steven D'Aprano" wrote: >> >> On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote: >> > On 19 September 2013 22:18, Steven D'Aprano wrote: >> [...] >> > > At the moment, dict views aren't directly iterable (you can't call >> > > next() on them). But in principle they could have been designed as >> > > re-iterable iterators. >> > >> > That's not what iterable means. 
The iterable/iterator distinction is >> > well defined and reflected in the collections ABCs: >> >> Actually, I think the collections ABC gets it wrong, according to both >> common practice and the definition given in the glossary: >> >> http://docs.python.org/3.4/glossary.html >> >> More on this below. >> >> As for my comment above, dict views don't obey the iterator protocol >> themselves, as they have no __next__ method, nor do they obey the >> sequence protocol, as they are not indexable. Hence they are not >> *directly* iterable, but they are *indirectly* iterable, since they have >> an __iter__ method which returns an iterator. > > Um, no. Everywhere Python iterates over anything, we call iter(obj) > first. If there is anywhere we don't do that, it's a bug. > >> I don't think this is a critical distinction. I think it is fine to call >> views "iterable", since they can be iterated over. On the rare occasion >> that it matters, we can just do what I did above, and talk about objects >> which are directly iterable (e.g. iterators, sequences, generator >> objects) and those which are indirectly iterable (e.g. dict views). > > Or you could just use the existing terminology and talk about > iterables vs iterators instead of inventing your own terms. Ack. Please don't create new terms, rather suggest an improvement to the glossary definition if you think it's inadequate. Georg From ncoghlan at gmail.com Sun Sep 22 12:30:43 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Sep 2013 20:30:43 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: On 22 September 2013 18:58, Georg Brandl wrote: > On 09/22/2013 06:56 AM, Nick Coghlan wrote: >> On 20 Sep 2013 19:49, "Steven D'Aprano" wrote: >>> I don't think this is a critical distinction. 
I think it is fine to call >>> views "iterable", since they can be iterated over. On the rare occasion >>> that it matters, we can just do what I did above, and talk about objects >>> which are directly iterable (e.g. iterators, sequences, generator >>> objects) and those which are indirectly iterable (e.g. dict views). >> >> Or you could just use the existing terminology and talk about >> iterables vs iterators instead of inventing your own terms. > > Ack. Please don't create new terms, rather suggest an improvement to the > glossary definition if you think it's inadequate. As near as I can tell, Steven's observation is that, for backwards compatibility reasons, iter() tolerates sequences that define __len__ and __getitem__ without defining __iter__, whereas the collections ABCs require an __iter__ method for their ducktyping to trigger. This means that there are a small number of legacy cases where "isinstance(c, collections.abc.Iterable)" can be False, while calling "iter(c)" would still give you a working iterator. My take on it is that when Guido formalised the container model in PEP 3119, he was *deliberately* relegating those "iterable without defining __iter__" cases to be purely a backwards compatibility hack without forming part of the formal object model. The class definitions that aren't defining the full Sequence ABC (including __iter__) aren't really proper sequences in Python 3, even though they'll still mostly work (thanks to the prevalence of ducktyping). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sun Sep 22 12:55:58 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 22 Sep 2013 20:55:58 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: <20130922105558.GF19939@ando> On Sun, Sep 22, 2013 at 10:58:40AM +0200, Georg Brandl wrote: > > Or you could just use the existing terminology and talk about > > iterables vs iterators instead of inventing your own terms. > > Ack. Please don't create new terms, rather suggest an improvement to the > glossary definition if you think it's inadequate. I'm not inventing new terminology. I'm using the plain English meanings of "directly" and "indirectly", and the standard meaning of "iterate", "iterator", "iterable" as used by Python and described in the glossary. As the glossary says, "The for statement [calls iter] for you, creating a TEMPORARY UNNAMED VARIABLE to hold the iterator for the duration of the loop." [emphasis added] All I am doing is distinguishing between the iterable object that the for-loop calls iter() on, which need not have a __next__ method, and the iterable object that the for-loop calls __next__ on. They're not always the same object. But as I've already said, the distinction usually doesn't matter. I've already forgotten the context of why I thought it mattered when I first raised it *wink* -- Steven From techtonik at gmail.com Sun Sep 22 13:21:07 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 22 Sep 2013 14:21:07 +0300 Subject: [Python-ideas] +1 button/counter for bugs.python.org Message-ID: Does anybody think it is a good idea to personally approve good issues and messages on bugs.python.org? If yes, should it be a Google's +1 (easier to add), or a pythonic solution for Roundup? -- anatoly t. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sun Sep 22 13:57:03 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 22 Sep 2013 04:57:03 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130922105558.GF19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922105558.GF19939@ando> Message-ID: <1EEA7DC3-1D24-4295-8BB8-B4099DE86A8A@yahoo.com> On Sep 22, 2013, at 3:55, Steven D'Aprano wrote: > On Sun, Sep 22, 2013 at 10:58:40AM +0200, Georg Brandl wrote: > >>> Or you could just use the existing terminology and talk about >>> iterables vs iterators instead of inventing your own terms. >> >> Ack. Please don't create new terms, rather suggest an improvement to the >> glossary definition if you think it's inadequate. > > I'm not inventing new terminology. I'm using the plain English meanings > of "directly" and "indirectly", and the standard meaning of "iterate", > "iterator", "iterable" as used by Python and described in the glossary. No you aren't. You're using iterable as a synonym for iterator, which is not how it's used by Python or described in the glossary. > As the glossary says, "The for statement [calls iter] for you, creating > a TEMPORARY UNNAMED VARIABLE to hold the iterator for the duration of > the loop." [emphasis added] All I am doing is distinguishing between the > iterable object that the for-loop calls iter() on, which need not have a > __next__ method, and the iterable object that the for-loop calls > __next__ on. They're not always the same object. This is precisely the distinction between iterables and iterators. The object that the for loop calls iter on is an iterable. The object that the for loop gets back from that iter call, binds a temporary unnamed variable to, and calls __next__ on is an iterator. 
> But as I've already said, the distinction usually doesn't matter. Yes it does. This is a very important distinction. That's why Python already has separate terminology. And separate ABCs. From solipsis at pitrou.net Sun Sep 22 14:03:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 22 Sep 2013 14:03:47 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org References: Message-ID: <20130922140347.2c54318d@fsol> On Sun, 22 Sep 2013 14:21:07 +0300 anatoly techtonik wrote: > Does anybody think it is a good idea to personally approve good issues and > messages on bugs.python.org? Bug voting would be ok (not for "good issues", but "issues people care about"), but -1 on voting on messages. This isn't a popularity contest. From steve at pearwood.info Sun Sep 22 14:21:49 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 22 Sep 2013 22:21:49 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> Message-ID: <20130922122149.GH19939@ando> On Sun, Sep 22, 2013 at 08:30:43PM +1000, Nick Coghlan wrote: > As near as I can tell, Steven's observation is that, for backwards > compatibility reasons, iter() tolerates sequences that define __len__ > and __getitem__ without defining __iter__, whereas the collections > ABCs require an __iter__ method for their ducktyping to trigger. The sequence protocol doesn't require a __len__ method, it only requires a __getitem__ method that takes consecutive ints 0, 1, 2, ... and raises IndexError when there are no more items to get. But apart from that, yes, that's correct. There are iterables that fail the Iterable ABC test. > This > means that there are a small number of legacy cases where > "isinstance(c, collections.abc.Iterable)" can be False, while calling > "iter(c)" would still give you a working iterator. 
I'm sure you realise this, but just to be clear, there's no need to explicitly call iter(c). More to my point, you can simply iterate over c using a for-loop: for element in c: ... thus proving that c is iterable, since you've just iterated over it. -- Steven From g.brandl at gmx.net Sun Sep 22 14:43:05 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 22 Sep 2013 14:43:05 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: <20130922140347.2c54318d@fsol> References: <20130922140347.2c54318d@fsol> Message-ID: On 09/22/2013 02:03 PM, Antoine Pitrou wrote: > On Sun, 22 Sep 2013 14:21:07 +0300 > anatoly techtonik > wrote: >> Does anybody think it is a good idea to personally approve good issues and >> messages on bugs.python.org? > > Bug voting would be ok (not for "good issues", but "issues people care > about"), but -1 on voting on messages. This isn't a popularity contest. As long as you can also vote -1 on messages :) Seriously, I agree; if somebody implements it, voting "I want to see this fixed/implemented" seems fine to me. Georg From mbuttu at oa-cagliari.inaf.it Sun Sep 22 15:33:48 2013 From: mbuttu at oa-cagliari.inaf.it (Marco Buttu) Date: Sun, 22 Sep 2013 15:33:48 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: <20130922140347.2c54318d@fsol> References: <20130922140347.2c54318d@fsol> Message-ID: <523EF1BC.7050306@oa-cagliari.inaf.it> On 09/22/2013 02:03 PM, Antoine Pitrou wrote: > On Sun, 22 Sep 2013 14:21:07 +0300 > anatoly techtonik > wrote: >> >Does anybody think it is a good idea to personally approve good issues and >> >messages on bugs.python.org? > Bug voting would be ok (not for "good issues", but "issues people care > about"), + 1 for bug voting :) -- Marco Buttu INAF Osservatorio Astronomico di Cagliari Loc. 
Poggio dei Pini, Strada 54 - 09012 Capoterra (CA) - Italy Phone: +39 070 71180255 Email: mbuttu at oa-cagliari.inaf.it From ncoghlan at gmail.com Sun Sep 22 16:22:19 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Sep 2013 00:22:19 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130922122149.GH19939@ando> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> Message-ID: On 22 September 2013 22:21, Steven D'Aprano wrote: >> This >> means that there are a small number of legacy cases where >> "isinstance(c, collections.abc.Iterable)" can be False, while calling >> "iter(c)" would still give you a working iterator. > > I'm sure you realise this, but just to be clear, there's no need to > explicitly call iter(c). More to my point, you can simply iterate over c > using a for-loop: > > for element in c: > ... > > > thus proving that c is iterable, since you've just iterated over it. It's still the implicit call to iter() inside the for loop that converts the iterable to an iterator though. And these are exactly the cases that I am saying *deliberately* fail the more formal check instituted in PEP 3119. The __getitem__ fallback is a backwards compatibility hack, not part of the formal definition of an iterable. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tshepang at gmail.com Sun Sep 22 17:52:58 2013 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Sun, 22 Sep 2013 17:52:58 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: References: Message-ID: On Sun, Sep 22, 2013 at 1:21 PM, anatoly techtonik wrote: > Does anybody think it is a good idea to personally approve good issues and > messages on bugs.python.org? > > If yes, should it be a Google's +1 (easier to add), or a pythonic solution > for Roundup? 
Is it not enough that one can subscribe to the bug? It's very easy (click the '+' button, then hit subscribe). That way, one can also keep track of where the conversation is going, instead of a mere vote-n-forget. From stephen at xemacs.org Sun Sep 22 18:25:32 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 23 Sep 2013 01:25:32 +0900 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: References: Message-ID: <87li2o3jsz.fsf@uwakimon.sk.tsukuba.ac.jp> Tshepang Lekhonkhobe writes: > Is it not enough that one can subscribe to the bug? It's very easy > (click the '+' button, then hit subscribe). That way, one can also > keep track of where the conversation is going, instead of a mere > vote-n-forget. More important in the context of this thread, it says you care enough to accept mail, which is a much stronger endorsement than clicking a +1 button. I think it might be useful to add a "most subscribed open issues" table in the weekly report, but I hope that Guido, Antoine, Barry, Benjamin, Brett, Georg, Nick, Raymond, Tim, ... *ignore* any "cheap talk" voting mechanism and go on picking issues on their intuitions about what's important to make Python beautiful. After all, don't *all* issues deserve enough attention to close them (if only as "wontfix")? Put me down for +1 on everything! 
From tjreedy at udel.edu Sun Sep 22 18:28:01 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 22 Sep 2013 12:28:01 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <05C362B4-4491-4745-94AE-4EBFC0AA5DDC@yahoo.com> References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> <05C362B4-4491-4745-94AE-4EBFC0AA5DDC@yahoo.com> Message-ID: On 9/21/2013 10:05 PM, Andrew Barnert wrote: > On Sep 21, 2013, at 15:31, Terry Reedy > wrote: > >> On 9/21/2013 5:14 PM, Neil Girdhar wrote: >> >>> If you really think that there will never be a non-reiterable >>> non-iterator iterable, >> >> I already posted a sensible non-iterator iterable that is no more >> reiterable than an iterator.

class Cnt:
    def __init__(self, maxn):
        self.n = 0
        self.maxn = maxn
    def __getitem__(self, dummy):
        n = self.n + 1
        if n <= self.maxn:
            self.n = n
            return n
        else:
            raise IndexError

c3 = Cnt(3)
print(c3 is not iter(c3), list(c3), list(c3))
>>>
True [1, 2, 3] []

The only difference between this and an equivalent iterator is that True would instead be False. I would not call this reiterable, unless one says that all iterables, including exhausted iterators, are reiterable, because you can always call iter on them again and do a null iteration. While I sympathize with the desire to classify, there is a reason why the inventors of the newer protocol left boundedness and reiterability to negotiation between writers and users of functions taking iterable args. They were not unaware of the issues and problems that have been discussed in this thread. > You posted a long discussion of different ways in which "reiterable" > could be defined, and gave vague examples of things that are > reiterable in one sense but not in another.
Accepting all of that at > face value, there's no way Neil's Reiterable ABC would help that > problem, because it would obviously only cover one of the possible > senses. We agree on the last sentence. > Beyond that, I've looked through your posts on that thread, and I > can't find anything that looks like a sensible non-iterator > non-reiterable (in Neil's intended sense) iterable. Did I miss > something? See above, though I do not know what Neil's intended sense is, or if indeed he has exactly one intended sense. >> I expect that there are examples in the wild. If nothing else, >> there are probably some written before the new iterator protocol >> was added. These are explicitly supported. /new/newer/ -- Terry Jan Reedy From stephen at xemacs.org Sun Sep 22 18:30:40 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 23 Sep 2013 01:30:40 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> Message-ID: <87k3i83jkf.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > And these are exactly the cases that I am saying *deliberately* > fail the more formal check instituted in PEP 3119. The __getitem__ > fallback is a backwards compatibility hack, not part of the formal > definition of an iterable. I think that resolves my issue that __getitem__ is polymorphic, too. That is, item access by integer index doesn't care about order (could be a table of Goedel numbers) any more than item access by arbitrary hashable does, and __iter__ takes care of the cases where the programmer does care about order.
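Stephen's observation that __getitem__ is polymorphic is easy to demonstrate: the legacy fallback blindly requests obj[0], obj[1], ..., which only makes sense for integer-indexed containers. A hypothetical mapping-style class (the name nods to Stephen's Goedel-number table; it is not from the thread) passes the fallback's structural requirements but fails on the very first step:

```python
class GoedelTable:
    """Mapping-style __getitem__: keys are names, not positions."""
    def __init__(self):
        self._data = {"zero": 0, "succ": 1}
    def __getitem__(self, key):
        return self._data[key]

t = GoedelTable()
it = iter(t)   # the legacy fallback happily wraps t in an iterator...
try:
    next(it)   # ...but the first step does t[0], which raises KeyError
except KeyError as exc:
    print("iteration failed with KeyError:", exc)
```

The fallback only terminates cleanly on IndexError, so a mapping's KeyError propagates out of the loop — one reason defining __iter__ explicitly is the safer signal of intent.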
From tjreedy at udel.edu Sun Sep 22 18:37:52 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 22 Sep 2013 12:37:52 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> Message-ID: On 9/22/2013 10:22 AM, Nick Coghlan wrote: > The __getitem__ fallback is a backwards > compatibility hack, not part of the formal definition of an iterable. When I suggested that, by suggesting that the fallback *perhaps* could be called 'semi-deprecated, but kept for back compatibility' in the glossary entry, Raymond screamed at me and accused me of trying to change the language. He considers it an intended language feature that one can write a sequence class and not bother with __iter__. I guess we do not all agree ;-). -- Terry Jan Reedy From mistersheik at gmail.com Sun Sep 22 21:04:46 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 22 Sep 2013 15:04:46 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> Message-ID: I'm with you on this. If you want an Iterable and you wrote __getitem__, then it's not too much to ask that you either write a trivial __iter__: def __iter__(self): return (self.__getitem__(i) for i in itertools.count()) or you write __len__ and inherit from collections.Sequence. We should deprecate the sequence protocol. Neil On Sun, Sep 22, 2013 at 12:37 PM, Terry Reedy wrote: > On 9/22/2013 10:22 AM, Nick Coghlan wrote: > > The __getitem__ fallback is a backwards >> compatibility hack, not part of the formal definition of an iterable. 
>> > When I suggested that, by suggesting that the fallback *perhaps* could be > called 'semi-deprecated, but kept for back compatibility' in the glossary > entry, Raymond screamed at me and accused me of trying to change the > language. He considers it an intended language feature that one can write a > sequence class and not bother with __iter__. I guess we do not all agree > ;-). > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Sep 22 21:23:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 22 Sep 2013 21:23:26 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org References: <87li2o3jsz.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130922212326.1402d7b2@fsol> On Mon, 23 Sep 2013 01:25:32 +0900 "Stephen J. Turnbull" wrote: > Tshepang Lekhonkhobe writes: > > > Is it not enough that one can subscribe to the bug? It's very easy > > (click the '+' button, then hit subscribe). That way, one can also > > keep track of where the conversation is going, instead of a mere > > vote-n-forget. > > More important in the context of this thread, it says you care enough > to accept mail, which is a much stronger endorsement than clicking a > +1 button.
> > I think it might be useful to add a "most subscribed open issues" > table in the weekly report, but I hope that Guido, Antoine, Barry, > Benjamin, Brett, Georg, Nick, Raymond, Tim, ... *ignore* any "cheap > talk" voting mechanism and go on picking issues on their intuitions > about what's important to make Python beautiful. Well, intuition and personal taste are of course a major factor, but sometimes it can be useful to know that a particular problem affects a lot of people (especially when it's the kind of very un-sexy problem, e.g. distutils). Of course any request that core developers tackle the most voted issues in strict order would be silly. Regards Antoine. From abarnert at yahoo.com Sun Sep 22 21:23:43 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 22 Sep 2013 12:23:43 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130920111000.GQ19939@ando> <87zjr63oub.fsf@uwakimon.sk.tsukuba.ac.jp> <1060C40C-4255-456C-9B2F-49DAC7B9FB03@yahoo.com> <05C362B4-4491-4745-94AE-4EBFC0AA5DDC@yahoo.com> Message-ID: On Sep 22, 2013, at 9:28, Terry Reedy wrote: > On 9/21/2013 10:05 PM, Andrew Barnert wrote: >> On Sep 21, 2013, at 15:31, Terry Reedy >> wrote: >> >>> On 9/21/2013 5:14 PM, Neil Girdhar wrote: >>> >>>> If you really think that there will never be a non-reiterable >>>> non-iterator iterable, >>> >>> I already posted a sensible non-iterator iterable that is no more >>> reiterable than an iterator. > > class Cnt: > def __init__(self, maxn): > self.n = 0 > self.maxn = maxn > def __getitem__(self, dummy): > n = self.n + 1 > if n <= self.maxn: > self.n = n > return n > else: > raise IndexError But this is a silly class, not a reasonable one. Why would you ever write this class, except to deceive users of it? 
It's more complicated and more verbose than an equivalent iterator, or the equivalent sequence (which you'd spell "range(n)"), and the only "benefit" is that it pretends not to be an iterator. You can just as easily write something that claims to be a sequence but iterates its elements in random order; that wouldn't prove that sequences are unordered, just that the ABCs don't test for all possible incorrect semantics.

> c3 = Cnt(3)
> print(c3 is not iter(c3), list(c3), list(c3))
> >>>
> True [1, 2, 3] []

> > The only difference between this and an equivalent iterator is that True would instead be False. I would not call this reiterable, unless one says that all iterables, including exhausted iterators, are reiterable, because you can always call iter on them again and do a null iteration. > > While I sympathize with the desire to classify, there is a reason why the inventors of the newer protocol left boundedness and reiterability to negotiation between writers and users of functions taking iterable args. They were not unaware of the issues and problems that have been discussed in this thread. > >> You posted a long discussion of different ways in which "reiterable" >> could be defined, and gave vague examples of things that are >> reiterable in one sense but not in another. Accepting all of that at >> face value, there's no way Neil's Reiterable ABC would help that >> problem, because it would obviously only cover one of the possible >> senses. > > We agree on the last sentence. > >> Beyond that, I've looked through your posts on that thread, and I >> can't find anything that looks like a sensible non-iterator >> non-reiterable (in Neil's intended sense) iterable. Did I miss >> something? > > See above, though I do not know what Neil's intended sense is, or if indeed he has exactly one intended sense. > >>> I expect that there are examples in the wild. If nothing else, >>> there are probably some written before the new iterator protocol >>> was added.
These are explicitly supported. > > /new/newer/ > > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From brian at python.org Mon Sep 23 00:57:53 2013 From: brian at python.org (Brian Curtin) Date: Sun, 22 Sep 2013 17:57:53 -0500 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: <20130922140347.2c54318d@fsol> References: <20130922140347.2c54318d@fsol> Message-ID: On Sun, Sep 22, 2013 at 7:03 AM, Antoine Pitrou wrote: > On Sun, 22 Sep 2013 14:21:07 +0300 > anatoly techtonik > wrote: >> Does anybody think it is a good idea to personally approve good issues and >> messages on bugs.python.org? > > Bug voting would be ok (not for "good issues", but "issues people care > about"), but -1 on voting on messages. This isn't a popularity contest. Adding a +1 on issues seems fine, but I doubt it'll change anything. I think, for the most part, we're pretty aware of issues people care about that need to be fixed. We'll probably need to document the feature to set expectations for what those votes actually mean, which is probably close to nothing. I can mostly just see this being abused. "Remove the GIL" will get submitted, then posted to reddit, then we'll have 5,000 votes to remove the GIL and zero attempts to do it. From timothy.c.delaney at gmail.com Mon Sep 23 01:43:17 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Mon, 23 Sep 2013 09:43:17 +1000 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: References: <20130922140347.2c54318d@fsol> Message-ID: On 23 September 2013 08:57, Brian Curtin wrote: > On Sun, Sep 22, 2013 at 7:03 AM, Antoine Pitrou > wrote: > > On Sun, 22 Sep 2013 14:21:07 +0300 > > anatoly techtonik > > wrote: > >> Does anybody think it is a good idea to personally approve good issues > and > >> messages on bugs.python.org? 
> > > > Bug voting would be ok (not for "good issues", but "issues people care > > about"), but -1 on voting on messages. This isn't a popularity contest. > > Adding a +1 on issues seems fine, but I doubt it'll change anything. I > think, for the most part, we're pretty aware of issues people care > about that need to be fixed. We'll probably need to document the > feature to set expectations for what those votes actually mean, which > is probably close to nothing. > > I can mostly just see this being abused. "Remove the GIL" will get > submitted, then posted to reddit, then we'll have 5,000 votes to > remove the GIL and zero attempts to do it. That's why I agree that the number of subscribers to the bug is a more useful figure. If someone cares enough about a bug to be notified when it's modified, that's someone who's really interested in the bug being fixed. Unfortunately, the only subscription option (that I'm aware of) is nosy. I think people might be willing to subscribe to be notified when a bug is closed (or possibly even any time its status changes), but not want to receive all the notifications you get when you're on the nosy list. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Sep 23 01:46:37 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 23 Sep 2013 09:46:37 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> Message-ID: <20130922234637.GK19939@ando> On Sun, Sep 22, 2013 at 12:37:52PM -0400, Terry Reedy wrote: > On 9/22/2013 10:22 AM, Nick Coghlan wrote: > > >The __getitem__ fallback is a backwards > >compatibility hack, not part of the formal definition of an iterable. 
> > When I suggested that, by suggesting that the fallback *perhaps* could > be called 'semi-deprecated, but kept for back compatibility' in the > glossary entry, Raymond screamed at me and accused me of trying to > change the language. He considers it an intended language feature that > one can write a sequence class and not bother with __iter__. I guess we > do not all agree ;-). Raymond did not "scream", he wrote *one* word in uppercase for emphasis. I quote:

    It is NOT deprecated. People use and rely on this behavior. It is
    a guaranteed behavior. Please don't use the glossary as a place to
    introduce changes to the language.

I agree, and I disagree with Nick's characterization of the sequence protocol as a "backwards-compatibility hack". It is an elegant protocol for implementing iteration of sequences, an old and venerable one that predates iterators, and just as much of Python's defined iterable behaviour as the business with calling next with no argument until it raises StopIteration. If it were considered *merely* for backward compatibility with Python 1.5 code, there was plenty of opportunity to drop it when Python 3 came out. The sequence protocol allows one to write a lazily generated, potentially infinite sequence that still allows random access to items. Here's a toy example:

py> class Squares:
...     def __getitem__(self, index):
...         return index**2
...
py> for sq in Squares():
...     if sq > 9: break
...     print(sq)
...
0
1
4
9

Because it's infinite, there's no value that __len__ can return, and no need for a __len__. Because it supports random access to items, writing this as an iterator with __next__ is inappropriate. Writing *both* is unnecessary, and complicates the class for no benefit. As written, Squares is naturally thread-safe -- two threads can iterate over the same Squares object without interfering.
-- Steven From steve at pearwood.info Mon Sep 23 02:12:39 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 23 Sep 2013 10:12:39 +1000 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: References: Message-ID: <20130923001239.GN19939@ando> On Sun, Sep 22, 2013 at 05:52:58PM +0200, Tshepang Lekhonkhobe wrote: > On Sun, Sep 22, 2013 at 1:21 PM, anatoly techtonik wrote: > > Does anybody think it is a good idea to personally approve good issues and > > messages on bugs.python.org? > > > > If yes, should it be a Google's +1 (easier to add), or a pythonic solution > > for Roundup? > > Is it not enough that one can subscribe to the bug? It's very easy > (click the '+' button, then hit subscribe). That way, one can also > keep track of where the conversation is going, instead of a mere > vote-n-forget. Exactly. I think that masses of +1 votes from people who care so little about an issue that they can't be bothered to add themselves to the Nosy list is next to worthless. -- Steven From mistersheik at gmail.com Mon Sep 23 02:24:05 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Sun, 22 Sep 2013 20:24:05 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130922234637.GK19939@ando> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> Message-ID: Why not just add one line? def __iter__(self): return (self.__getitem__(i) for i in itertools.count()) On Sun, Sep 22, 2013 at 7:46 PM, Steven D'Aprano wrote: > On Sun, Sep 22, 2013 at 12:37:52PM -0400, Terry Reedy wrote: > > On 9/22/2013 10:22 AM, Nick Coghlan wrote: > > > > >The __getitem__ fallback is a backwards > > >compatibility hack, not part of the formal definition of an iterable. 
> > > > When I suggested that, by suggesting that the fallback *perhaps* could > > be called 'semi-deprecated, but kept for back compatibility' in the > > glossary entry, Raymond screamed at me and accused me of trying to > > change the language. He considers it an intended language feature that > > one can write a sequence class and not bother with __iter__. I guess we > > do not all agree ;-). > > Raymond did not "scream", he wrote *one* word in uppercase for emphasis. > I quote: > > It is NOT deprecated. People use and rely on this behavior. It is > a guaranteed behavior. Please don't use the glossary as a place to > introduce changes to the language. > > > I agree, and I disagree with Nick's characterization of the sequence > protocol as a "backwards-compatibility hack". It is an elegant protocol > for implementing iteration of sequences, an old and venerable one that > predates iterators, and just as much of Python's defined iterable > behaviour as the business with calling next with no argument until it > raises StopIteration. If it were considered *merely* for backward > compatibility with Python 1.5 code, there was plenty of opportunity to > drop it when Python 3 came out. > > The sequence protocol allows one to write a lazily generated, > potentially infinite sequence that still allows random access to items. > Here's a toy example: > > > py> class Squares: > ... def __getitem__(self, index): > ... return index**2 > ... > py> for sq in Squares(): > ... if sq > 9: break > ... print(sq) > ... > 0 > 1 > 4 > 9 > > > Because it's infinite, there's no value that __len__ can return, and no > need for a __len__. Because it supports random access to items, writing > this as an iterator with __next__ is inappropriate. Writing *both* is > unnecessary, and complicates the class for no benefit. As written, > Squares is naturally thread-safe -- two threads can iterate over the > same Squares object without interfering. 
> > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Sep 23 01:55:49 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 22 Sep 2013 16:55:49 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130922234637.GK19939@ando> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> Message-ID: <523F8385.9040903@stoneleaf.us> On 09/22/2013 04:46 PM, Steven D'Aprano wrote: > > The sequence protocol allows one to write a lazily generated, > potentially infinite sequence that still allows random access to items. > Here's a toy example: > > > py> class Squares: > ... def __getitem__(self, index): > ... return index**2 > ... > py> for sq in Squares(): > ... if sq > 9: break > ... print(sq) > ... > 0 > 1 > 4 > 9 > > > Because it's infinite, there's no value that __len__ can return, and no > need for a __len__. Because it supports random access to items, writing > this as an iterator with __next__ is inappropriate. Writing *both* is > unnecessary, and complicates the class for no benefit. As written, > Squares is naturally thread-safe -- two threads can iterate over the > same Squares object without interfering. Nice example. 
:) -- ~Ethan~ From ethan at stoneleaf.us Mon Sep 23 02:48:52 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 22 Sep 2013 17:48:52 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> Message-ID: <523F8FF4.8070505@stoneleaf.us> On 09/22/2013 05:24 PM, Neil Girdhar wrote: > Why not just add one line? > > def __iter__(self): return (self.__getitem__(i) for i in itertools.count()) Why should he? Python treats his class just fine the way it is. -- ~Ethan~ From ethan at stoneleaf.us Mon Sep 23 03:41:53 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 22 Sep 2013 18:41:53 -0700 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <575f4071-1b5c-4a16-b36c-b5f925cdd2f7@googlegroups.com> <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> Message-ID: <523F9C61.2020506@stoneleaf.us> On 09/22/2013 12:04 PM, Neil Girdhar wrote: > I'm with you on this. > > If you want an Iterable and you wrote __getitem__, then it's not too much to ask that you either write a trivial __iter__: If you want to be in full compliance, sure. But one of the nice things about Python is you aren't forced to write more than you need. If I have objects that I want to have be equal to each other I can just write __eq__ -- I don't have to write __lt__, __gt__, __le__, nor __ge__. And if I want to not equal to be the opposite of equal (as opposed to something weird) I don't even need to write __ne__ any more. > or you write __len__ and inherit from collections.Sequence. I don't like inheriting from any of the abc's (maybe I just haven't written a large enough framework yet). And I'm not writing __len__ unless I plan on supporting len(). > We should deprecate the sequence protocol. No, we shouldn't. 
-- ~Ethan~ From anikom15 at gmail.com Mon Sep 23 05:55:58 2013 From: anikom15 at gmail.com (Westley Martínez) Date: Sun, 22 Sep 2013 20:55:58 -0700 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: References: <20130922140347.2c54318d@fsol> Message-ID: <001801ceb810$d0e63b60$72b2b220$@gmail.com> > -----Original Message----- > From: Python-ideas [mailto:python-ideas-bounces+anikom15=gmail.com at python.org] > On Behalf Of Brian Curtin > Sent: Sunday, September 22, 2013 3:58 PM > To: Antoine Pitrou > Cc: python-ideas > Subject: Re: [Python-ideas] +1 button/counter for bugs.python.org > > On Sun, Sep 22, 2013 at 7:03 AM, Antoine Pitrou wrote: > > On Sun, 22 Sep 2013 14:21:07 +0300 > > anatoly techtonik > > wrote: > >> Does anybody think it is a good idea to personally approve good issues and > >> messages on bugs.python.org? > > > > Bug voting would be ok (not for "good issues", but "issues people care > > about"), but -1 on voting on messages. This isn't a popularity contest. > > Adding a +1 on issues seems fine, but I doubt it'll change anything. I > think, for the most part, we're pretty aware of issues people care > about that need to be fixed. We'll probably need to document the > feature to set expectations for what those votes actually mean, which > is probably close to nothing. > > I can mostly just see this being abused. "Remove the GIL" will get > submitted, then posted to reddit, then we'll have 5,000 votes to > remove the GIL and zero attempts to do it. +1 I don't think this can work without having some sort of karma system which does not need to happen. Python is not a democracy. It's a tyrannical dictatorship.
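Ethan's aside earlier about comparison methods holds in Python 3: defining only __eq__ is enough, because != falls back to negating == unless __ne__ is overridden. A minimal sketch (the Point class is illustrative, not from the thread):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

print(Point(1, 2) == Point(1, 2))  # True
print(Point(1, 2) != Point(1, 2))  # False: __ne__ is derived from __eq__
print(Point(1, 2) != Point(3, 4))  # True
```

(In Python 2 this fallback did not exist, which is why older code routinely defined both methods.)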
From ncoghlan at gmail.com Mon Sep 23 06:58:08 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Sep 2013 14:58:08 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130922234637.GK19939@ando> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> Message-ID: On 23 Sep 2013 09:47, "Steven D'Aprano" wrote: > > On Sun, Sep 22, 2013 at 12:37:52PM -0400, Terry Reedy wrote: > > On 9/22/2013 10:22 AM, Nick Coghlan wrote: > > > > >The __getitem__ fallback is a backwards > > >compatibility hack, not part of the formal definition of an iterable. > > > > When I suggested that, by suggesting that the fallback *perhaps* could > > be called 'semi-deprecated, but kept for back compatibility' in the > > glossary entry, Raymond screamed at me and accused me of trying to > > change the language. He considers it an intended language feature that > > one can write a sequence class and not bother with __iter__. I guess we > > do not all agree ;-). > > Raymond did not "scream", he wrote *one* word in uppercase for emphasis. > I quote: > > It is NOT deprecated. People use and rely on this behavior. It is > a guaranteed behavior. Please don't use the glossary as a place to > introduce changes to the language. > > > I agree, and I disagree with Nick's characterization of the sequence > protocol as a "backwards-compatibility hack". It is an elegant protocol > for implementing iteration of sequences, an old and venerable one that > predates iterators, and just as much of Python's defined iterable > behaviour as the business with calling next with no argument until it > raises StopIteration. If it were considered *merely* for backward > compatibility with Python 1.5 code, there was plenty of opportunity to > drop it when Python 3 came out. > > The sequence protocol allows one to write a lazily generated, > potentially infinite sequence that still allows random access to items. 
> Here's a toy example: > > > py> class Squares: > ... def __getitem__(self, index): > ... return index**2 > ... > py> for sq in Squares(): > ... if sq > 9: break > ... print(sq) > ... > 0 > 1 > 4 > 9 > > > Because it's infinite, there's no value that __len__ can return, and no > need for a __len__. Because it supports random access to items, writing > this as an iterator with __next__ is inappropriate. Writing *both* is > unnecessary, and complicates the class for no benefit. As written, > Squares is naturally thread-safe -- two threads can iterate over the > same Squares object without interfering. And PEP 3119 means you have to decorate it with "@Iterable.register" for Python to *formally* consider it an iterable (or a third party can do the registration later). Merely defining __getitem__ is considered insufficient, since it is possible to define that *without* intending to create an iterable. Cheers, Nick. > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon Sep 23 10:04:10 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 23 Sep 2013 17:04:10 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130922234637.GK19939@ando> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> Message-ID: <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Executive summary: The ability to create a quick iterable with just a simple __getitem__ is cool and not a "hack" (ie, no need whatsoever to deprecate it), but it is clearly a "consenting adults" construction (which includes "knowing where your children are at 10pm"). 
Steven D'Aprano writes: > I agree, and I disagree with Nick's characterization of the > sequence protocol as a "backwards-compatibility hack". It is an > elegant protocol Gotta disagree with you there (except I agree there's no need for a word like "hack"). Because __getitem__ is polymorphic (at the abstract level of duck-typing), this protocol is ugly. The "must accept 0" clause is a wart. > The sequence protocol allows one to write a lazily generated, > potentially infinite sequence that still allows random access to items. Sure, but it's not fully general. One may not *want* to write __next__ using __getitem__. A somewhat pathological example is the case of Goedel numbering of syntactically correct programs. programs.__getitem__ can be implemented directly by arithmetic, while programs.__next__ is best implemented by "unrolling" the grammar. Of course it makes sense to use an already written __getitem__ to implement __next__ when the numerical indices provide a semantically useful order. But that's already done by the Sequence ABC:

class Squares(Sequence):   # implies mixin Iterable
    def __getitem__(self, n):
        return n*n
    # __iter__ is provided as a mixin method using __getitem__
    # by Iterable

The problem is that Sequence requires a __len__ method. OK, so

# put this in your toolbox
class UndefinedLengthError(TypeError):
    pass

class InfiniteSequence(Sequence):
    def __len__(self):
        raise UndefinedLengthError

# in programs
from toolbox import InfiniteSequence

class Squares(InfiniteSequence):
    def __getitem__(self, i):
        return i*i

> Because it's infinite, there's no value that __len__ can return, > and no need for a __len__. Well, it *could* return an infinite value or None, but list() isn't prepared for that. list() isn't even prepared for

class Squares(object):
    def __init__(self, n):
        self.listsize = n
    def __getitem__(self, i):
        return i*i
    def __len__(self):
        return self.listsize

(It doesn't return in a sane amount of time.
I guess it goes ahead and attempts to construct an infinite list with l = [] for x in squares: l.append(x) Perhaps it's a shame it doesn't detect that there's a __len__ and use it to truncate the sequence, but most of the time it would just be overhead, I guess.) A lot of other functions are also going to be upset when they get a Squares object. This discussion is relevant because these are the kinds of things that bothered the OP. > Because it supports random access to items, writing this as an > iterator with __next__ is inappropriate. Writing *both* is > unnecessary, Incorrect, as written. In order to iterate over a sequence (small "s"), "somebody" has to write __next__. It's just that the function is generic, already written, and the compiler automatically binds it (actually, a closure using it) to the __next__ attribute of the automatically created iterator. This makes it unnecessary for the application programmer to write it. That is indeed elegant. > and complicates the class for no benefit. As written, Squares is > naturally thread-safe -- two threads can iterate over the same > Squares object without interfering. The obvious way of writing this as a generator would also be naturally thread-safe: class Squares(object): def __iter__(self): n = 0 while True: yield n*n n = n + 1 AFAICS this is faster (less function-call overhead). In this application it doesn't matter, but it could. And anything where a bit of state is useful (eg, the Fibonacci sequence) would be a lot faster with a hand-written __iter__. 
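The fallback protocol both messages rely on is easy to see in isolation: iter() accepts an object that defines only __getitem__ and drives it with 0, 1, 2, ... until IndexError. A minimal runnable sketch (the class name is illustrative, not from the thread):

```python
# iter() falls back to the legacy sequence protocol: an object with only
# __getitem__ is iterated by calling it with 0, 1, 2, ... until IndexError.
class FirstSquares:
    """Finite squares 0..n-1; defines neither __iter__ nor __len__."""
    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)  # terminates the fallback iteration
        return index ** 2

print(list(FirstSquares(5)))  # -> [0, 1, 4, 9, 16]
```

Raising IndexError is what makes the finite case terminate; Steven's infinite Squares simply never raises it, which is why list() on it never returns.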
From solipsis at pitrou.net Mon Sep 23 10:14:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 23 Sep 2013 10:14:51 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org References: <20130923001239.GN19939@ando> Message-ID: <20130923101451.691dfa7b@pitrou.net> Le Mon, 23 Sep 2013 10:12:39 +1000, Steven D'Aprano a écrit : > On Sun, Sep 22, 2013 at 05:52:58PM +0200, Tshepang Lekhonkhobe wrote: > > On Sun, Sep 22, 2013 at 1:21 PM, anatoly techtonik > > wrote: > > > Does anybody think it is a good idea to personally approve good > > > issues and messages on bugs.python.org? > > > > > > If yes, should it be a Google's +1 (easier to add), or a pythonic > > > solution for Roundup? > > > > Is it not enough that one can subscribe to the bug? It's very easy > > (click the '+' button, then hit subscribe). That way, one can also > > keep track of where the conversation is going, instead of a mere > > vote-n-forget. > > Exactly. > > I think that masses of +1 votes from people who care so little about > an issue that they can't be bothered to add themselves to the Nosy > list is next to worthless. I don't know about you, but I don't add myself to the Nosy list of every bug that irks me on third-party software. There's no reason to subscribe to an issue's messages when you are a mere end-user. That doesn't mean the bug isn't affecting you. Regards Antoine. From ncoghlan at gmail.com Mon Sep 23 10:44:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Sep 2013 18:44:12 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 23 September 2013 18:04, Stephen J. 
Turnbull wrote: > Executive summary: > > The ability to create a quick iterable with just a simple __getitem__ > is cool and not a "hack" (ie, no need whatsoever to deprecate it), but > it is clearly a "consenting adults" construction (which includes > "knowing where your children are at 10pm"). > > Steven D'Aprano writes: > > > I agree, and I disagree with Nick's characterization of the > > sequence protocol as a "backwards-compatibility hack". It is an > > elegant protocol > > Gotta disagree with you there (except I agree there's no need for a > word like "hack"). Because __getitem__ is polymorphic (at the > abstract level of duck-typing), this protocol is ugly. The "must > accept 0" clause is a wart. I think others object to the word "hack" more than I do (or give it additional implications like "in danger of being deprecated"). To me it's just a shorthand for saying "this is a case where practicality beat purity". Just because something is a hack doesn't mean it isn't useful and isn't a good idea. I consider functools.wraps to be a hack that managed to preserve introspectability of most decorated functions with minimal development effort. runpy and the -m switch took quite a while to evolve into something that wasn't a hack (although they still have some hacky parts due to limitations of the import protocol). The code that makes objects that override __eq__ without overriding __hash__ non-hashable (and the associated "__hash__ = None") trick is a hack. Python 3's new super is incredibly nice and easy to use, but there's also a lot of hackery lurking behind it. 
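The "__hash__ = None" trick Nick lists among his examples is easy to observe directly. A minimal sketch (the class is illustrative): in Python 3, defining __eq__ without __hash__ sets __hash__ to None, making instances unhashable.

```python
# Python 3 sets __hash__ to None when a class defines __eq__ without
# also defining __hash__, so instances cannot be used as dict keys.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

print(Point.__hash__)    # -> None
try:
    hash(Point(1, 2))
except TypeError as exc:
    print(exc)           # -> unhashable type: 'Point'
```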
The fact Python 3 lets you create ranges you can't directly take the length of is a bit of a hack, too (because of the pain involved in defining an alternative __len__ protocol that didn't funnel everything through an ssize_t value): >>> x = range(10**100) >>> len(x) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: Python int too large to convert to C ssize_t >>> (x.stop - x.start) // x.step 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 To create a properly defined iterable in modern Python, you must either implement __iter__ or implement an iter() compatible __getitem__ and explicitly register with Iterable (to indicate that your __getitem__ *is* compatible with the fallback protocol in iter()). Steven's right that I left out that second alternative when stating what it takes for an item to be considered an iterable, but I still consider the __getitem__ fallback to be just a neat backwards compatibility hack for sequences that were defined before the iterator protocol existed and before the Iterable ABC provided a way to explicitly declare that your __getitem__ implementation was compatible with the sequence-iterator protocol. I was also wrong about iter() checking for __len__ - that's part of the sequence API fallback in reversed(), rather than the one in iter(): >>> class InfiniteIter: ... def __getitem__(self, idx): ... return idx ... >>> reversed(InfiniteIter()) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: object of type 'InfiniteIter' has no len() Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From oscar.j.benjamin at gmail.com Mon Sep 23 10:55:19 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Mon, 23 Sep 2013 09:55:19 +0100 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 23 September 2013 09:44, Nick Coghlan wrote: > > The fact Python 3 lets you create ranges you can't directly take the > length of is a bit of a hack, too (because of the pain involved in > defining an alternative __len__ protocol that didn't funnel everything > through an ssize_t value): > >>>> x = range(10**100) >>>> len(x) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > OverflowError: Python int too large to convert to C ssize_t >>>> (x.stop - x.start) // x.step > 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 I wouldn't call that a "hack". It's clearly a bug. 
Oscar From ncoghlan at gmail.com Mon Sep 23 12:43:22 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Sep 2013 20:43:22 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 23 September 2013 18:55, Oscar Benjamin wrote: > On 23 September 2013 09:44, Nick Coghlan wrote: >> >> The fact Python 3 lets you create ranges you can't directly take the >> length of is a bit of a hack, too (because of the pain involved in >> defining an alternative __len__ protocol that didn't funnel everything >> through an ssize_t value): >> >>>>> x = range(10**100) >>>>> len(x) >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> OverflowError: Python int too large to convert to C ssize_t >>>>> (x.stop - x.start) // x.step >> 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 > > I wouldn't call that a "hack". It's clearly a bug. The hack is that we were able to make those large ranges possible *without* fixing the limitation that len() (or, more accurately, the CPython tp_len slot) only supports 64-bit containers. Solving the latter is a *much* harder problem that would require a PEP to add a new type slot, and it's hard to justify doing all that work for such a niche use case, especially when the workaround is relatively simple. If we'd taken the purist approach, then the result would more likely have been that ranges would have remained limited to lengths that fit in 64 bits rather than that the 64-bit limitation would have been removed. Cheers, Nick. 
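The workaround Nick describes can be wrapped in a small helper. This is a sketch: the helper name is my own, and the ceiling-division arithmetic for arbitrary steps is my generalisation — the thread only shows the exact-division case `(x.stop - x.start) // x.step`.

```python
def range_length(r):
    """len() of a range, tolerating lengths beyond C ssize_t.

    Falls back to pure-Python arithmetic when len() overflows,
    using ceiling division so non-unit steps come out right.
    """
    try:
        return len(r)
    except OverflowError:
        adjust = 1 if r.step > 0 else -1
        return max(0, (r.stop - r.start + r.step - adjust) // r.step)

print(range_length(range(10)))                      # -> 10
print(range_length(range(10 ** 100)) == 10 ** 100)  # -> True
```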
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From oscar.j.benjamin at gmail.com Mon Sep 23 12:53:44 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Mon, 23 Sep 2013 11:53:44 +0100 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 23 September 2013 11:43, Nick Coghlan wrote: > On 23 September 2013 18:55, Oscar Benjamin wrote: >> On 23 September 2013 09:44, Nick Coghlan wrote: >>> >>> The fact Python 3 lets you create ranges you can't directly take the >>> length of is a bit of a hack, too (because of the pain involved in >>> defining an alternative __len__ protocol that didn't funnel everything >>> through an ssize_t value): >>> >>>>>> x = range(10**100) >>>>>> len(x) >>> Traceback (most recent call last): >>> File "<stdin>", line 1, in <module> >>> OverflowError: Python int too large to convert to C ssize_t >>>>>> (x.stop - x.start) // x.step >>> 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 >> >> I wouldn't call that a "hack". It's clearly a bug. > The hack is that we were able to make those large ranges possible > *without* fixing the limitation that len() (or, more accurately, the > CPython tp_len slot) only supports 64-bit containers. Solving the > latter is a *much* harder problem that would require a PEP to add a > new type slot, and it's hard to justify doing all that work for such a > niche use case, especially when the workaround is relatively simple. > > If we'd taken the purist approach, then the result would more likely > have been that ranges would have remained limited to lengths that fit > in 64 bits rather than that the 64-bit limitation would have been > removed. It may not be worth fixing but I still consider it a bug. 
It also doesn't work for Python classes: $ python3 Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> class myrange: ... def __len__(self): return 10 ** 1000 ... >>> len(myrange()) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: cannot fit 'int' into an index-sized integer Oscar From ncoghlan at gmail.com Mon Sep 23 12:57:19 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Sep 2013 20:57:19 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130919121828.GK19939@ando> <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 23 September 2013 20:53, Oscar Benjamin wrote: > It may not be worth fixing but I still consider it a bug. len() being limited to 64-bit values is indeed a bug in CPython. That's not what I was citing as an example of what I consider a neat hack, though - the neat hack is the fact ranges that don't fit in 64-bits are still mostly supported *despite* that bug. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Mon Sep 23 16:23:37 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 24 Sep 2013 00:23:37 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130923142336.GA7989@ando> On Mon, Sep 23, 2013 at 05:04:10PM +0900, Stephen J. 
Turnbull wrote: > Executive summary: > > The ability to create a quick iterable with just a simple __getitem__ > is cool and not a "hack" (ie, no need whatsoever to deprecate it), but > it is clearly a "consenting adults" construction (which includes > "knowing where your children are at 10pm"). When I first raised the issue that some iterables are not recognised by collections.Iterable as iterable, I asked to be convinced that it is not a bug. Now I'm convinced. 1) Objects which inherit from the Iterable ABC are not merely iterables in the sense of "can be iterated over", but also iterables in the ABC sense. Obviously. These can be considered "official" iterables, or perhaps Iterables with a capital I. 2) Objects with a __getitem__ method that obey the sequence protocol are also iterable in the sense of "can be iterated over", but if they don't inherit from Iterable they don't pass the ABC isinstance test. Since "objects with a __getitem__ method that can be iterated over but that don't inherit from collections.Iterable" is a bit of a mouthful, for brevity I'm going to call them "de facto iterables", at the risk of being told off for inventing my own terminology *wink* 3) While it may appear strange to have something that can be iterated over not be recognised as an iterable, this is not very different from what can happen with duck-typing in general. We might write a class that duck-types as (say) a string, and have it not be recognised as such by isinstance(obj, string). Such is life. 4) If you want your de facto iterable to pass isinstance(obj, Iterable) tests, then you have to register it to make it official. [...] > This discussion is relevant because these are the kinds of things that > bothered the OP. Yes, we've certainly covered a lot of ground from the question of "Reiterable". 
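Points 2 and 4 of the summary above can be sketched in a few lines (a minimal illustration; the class is mine, and on Python 3.3+ the ABC also lives in collections.abc):

```python
from collections.abc import Iterable

class Squares:
    """A "de facto" iterable: __getitem__ only, no __iter__."""
    def __getitem__(self, index):
        return index ** 2

sq = Squares()
print(isinstance(sq, Iterable))  # -> False: iterable, but not "official"
next(iter(sq))                   # ...yet iter() happily accepts it

Iterable.register(Squares)       # point 4: explicit registration
print(isinstance(sq, Iterable))  # -> True
```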
-- Steven From mistersheik at gmail.com Tue Sep 24 02:19:40 2013 From: mistersheik at gmail.com (Neil Girdhar) Date: Mon, 23 Sep 2013 20:19:40 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130923142336.GA7989@ando> References: <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> Message-ID: If infinite sequences are so common, it might be better to add a collections.abc for them. Then, besides a default __iter__ being generated, you could automatically generate advanced slicing into the original infinite sequence that returns a Sequence. This slice would then support the automatically-generated .index method, etc. Also, the big advantage to inheriting from an abc rather than defining __getitem__and counting on Python silently allowing __iter__ to work is that the former is an explicit declaration of intent. The latter counts on people knowing a weird feature of python. Best Neil On Mon, Sep 23, 2013 at 10:23 AM, Steven D'Aprano wrote: > On Mon, Sep 23, 2013 at 05:04:10PM +0900, Stephen J. Turnbull wrote: > > Executive summary: > > > > The ability to create a quick iterable with just a simple __getitem__ > > is cool and not a "hack" (ie, no need whatsoever to deprecate it), but > > it is clearly a "consenting adults" construction (which includes > > "knowing where your children are at 10pm"). > > When I first raised the issue that some iterables are not recognised by > collections.Iterable as iterable, I asked to be convinced that it is not > a bug. Now I'm convinced. > > > 1) Objects which inherit from the Iterable ABC are not merely iterables > in the sense of "can be iterated over", but also iterables in the ABC > sense. Obviously. These can be considered "official" iterables, or > perhaps Iterables with a capital I. 
> > 2) Objects with a __getitem__ method that obey the sequence protocol are > also iterable in the sense of "can be iterated over", but if they don't > inherit from Iterable they don't pass the ABC isinstance test. > > Since "objects with a __getitem__ method that can be iterated over but > that don't inherit from collections.Iterable" is a bit of a mouthful, > for brevity I'm going to call them "de facto iterables", at the risk of > being told off for inventing my own terminology *wink* > > 3) While it may appear strange to have something that can be iterated > over not be recognised as an iterable, this is not very different from > what can happen with duck-typing in general. We might write a class that > duck-types as (say) a string, and have it not be recognised as such by > isinstance(obj, string). Such is life. > > 4) If you want your de facto iterable to pass isinstance(obj, Iterable) > tests, then you have to register it to make it official. > > > [...] > > This discussion is relevant because these are the kinds of things that > > bothered the OP. > > Yes, we've certainly covered a lot of ground from the question of > "Reiterable". > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/OumiLGDwRWA/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Tue Sep 24 03:04:03 2013 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 24 Sep 2013 10:04:03 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: References: <20130920094854.GO19939@ando> <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> Message-ID: <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> Neil Girdhar writes: > If infinite sequences are so common, it might be better to add a > collections.abc for them. I suspect this falls under the "not every 3-line function" clause, because it would really require a PEP to get right (changes to builtins like list and dict would be needed, IIUC). Just inherit from Sequence and add a __len__ which returns a unique object (probably could be None, actually), and check for that private protocol yourself. P.S. Please don't post via Google Groups. It results in spam for those of us who don't subscribe to the Google Group. From steve at pearwood.info Tue Sep 24 03:37:04 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 24 Sep 2013 11:37:04 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130924013704.GE7989@ando> On Tue, Sep 24, 2013 at 10:04:03AM +0900, Stephen J. Turnbull wrote: > Neil Girdhar writes: > > > If infinite sequences are so common, it might be better to add a > > collections.abc for them. > > I suspect this falls under the "not every 3-line function" clause, > because it would really require a PEP to get right (changes to > builtins like list and dict would be needed, IIUC). A lot of work for virtually no benefit. Besides, who said that infinite iterators are common? 
> Just inherit from Sequence and add a __len__ which returns a unique > object (probably could be None, actually), and check for that private > protocol yourself. Alas, that doesn't work. py> class X: ... def __len__(self): ... return None ... py> x = X() py> len(x) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'NoneType' object cannot be interpreted as an integer If you care about infinite iterators, you can add your own "isinfinite" flag on them. Personally, I wouldn't bother. I just consider this a case for programming by contract: unless the function you are calling promises to be safe with infinite iterators, you should not use them. -- Steven From stephen at xemacs.org Tue Sep 24 05:42:18 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 24 Sep 2013 12:42:18 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130924013704.GE7989@ando> References: <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> <20130924013704.GE7989@ando> Message-ID: <87vc1q28dh.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > A lot of work for virtually no benefit. Besides, who said that infinite > iterators are common? Infinite, no. Don't know the length until you're done, common. Length nondeterministic and in principle unbounded, common. > If you care about infinite iterators, you can add your own "isinfinite" > flag on them. Personally, I wouldn't bother. I just consider this a case > for programming by contract: unless the function you are calling > promises to be safe with infinite iterators, you should not use them. But finite iterators can cause problems too (eg, Nick's length=1google range -- even with an attosecond processor, that will take a while to exhaust :-). 
It would be nice if a program could choose its own value of "too big", and process "large finite" and "infinite" lists in the same way by taking "as much as possible". That's what frustrates the OP -- it's *hard* to write a function that makes a valid promise to be safe with all iterables. (Of course his definition of "safe" is much stricter, he requires "reiterable", not just "finite and of 'reasonable' size". But the principle is the same -- Python should make it easy to write safe functions. Of course Nick is right: "Although practicality beats purity.") From steve at pearwood.info Tue Sep 24 06:21:05 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 24 Sep 2013 14:21:05 +1000 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <87vc1q28dh.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> <20130924013704.GE7989@ando> <87vc1q28dh.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130924042105.GH7989@ando> On Tue, Sep 24, 2013 at 12:42:18PM +0900, Stephen J. Turnbull wrote: > Steven D'Aprano writes: > > > A lot of work for virtually no benefit. Besides, who said that infinite > > iterators are common? > > Infinite, no. Don't know the length until you're done, common. Which is why iterators don't require a length. Another way of spelling "length is unknown" is "object has no __len__". > Length nondeterministic and in principle unbounded, common. Maybe it's the mathematician in me speaking, but I don't think very many unbounded iterators are found outside of maths sequences. After all, even if you were to iterate over every atom in the universe, that would be bounded, and quite small compared to some of the numbers mathematicians deal with... :-) > > If you care about infinite iterators, you can add your own "isinfinite" > > flag on them. Personally, I wouldn't bother. 
I just consider this a case > > for programming by contract: unless the function you are calling > > promises to be safe with infinite iterators, you should not use them. > > But finite iterators can cause problems too (eg, Nick's length=1google > range -- even with an attosecond processor, that will take a while to > exhaust :-). It would be nice if a program could choose its own value > of "too big", and process "large finite" and "infinite" lists in the > same way by taking "as much as possible". You can already do that, although it requires a bit of manual work and preparation. Within Python, you can use itertools.islice, and take slices of everything to limit the number of items processed: process(islice(some_iterator, MAXIMUM)) Or you can use your operating system to manage resource limits, e.g. on Linux systems ulimit -v seems to work for me: py> def big(): ... while True: ... yield 1 ... py> list(big()) Traceback (most recent call last): File "<stdin>", line 1, in <module> MemoryError It would be nice if Python allowed you to tune memory consumption within Python itself, but failing that, that's what the OS is for. Mind you, I have repeatedly been bitten by accidentally calling list() on a too large iterator. So I'm sympathetic to the view that this is a hard problem to solve and Python should help solve it. -- Steven From stephen at xemacs.org Tue Sep 24 07:59:03 2013 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 24 Sep 2013 14:59:03 +0900 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130924042105.GH7989@ando> References: <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> <20130924013704.GE7989@ando> <87vc1q28dh.fsf@uwakimon.sk.tsukuba.ac.jp> <20130924042105.GH7989@ando> Message-ID: <87siwu221k.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > Maybe it's the mathematician in me speaking, but I don't think very many > unbounded iterators are found outside of maths sequences. Nothing *in* math is ever found *outside* of math. Even the number 0 is unreliable in quantum physics. :-) In the real world, the use of unbounded math models is intended to remind us of the fact that the Tokyo Electric Power Company learned the hard way on March 11, 2011: if you put a practical bound on the size of tsunamis, soon enough one of size BOUND + 1 comes along. > > It would be nice if a program could choose its own value of "too > > big", and process "large finite" and "infinite" lists in the same > > way by taking "as much as possible". > > You can already do that, Of course we can. Are we not Men? No, we are HACKERS.[1] :-) > although it requires a bit of manual work and preperation. Aye, and there's the rub. But I'll grant it "never is often better than right now". And the actual gripe ("re-iteration") hasn't really been given a math definition yet, and is therefore much harder to diagnose syntactically. Footnotes: [1] In Nick's sense, of course. From ram.rachum at gmail.com Tue Sep 24 13:49:20 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Tue, 24 Sep 2013 04:49:20 -0700 (PDT) Subject: [Python-ideas] `OrderedDict.sort` Message-ID: What do you think about providing an `OrderedDict.sort` method? 
I've been using my own `OrderedDict` subclass that defines `sort` for years, and I always wondered why the stdlib one doesn't provide `sort`. I can write the patch if needed. Thanks, Ram. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue Sep 24 14:13:15 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 24 Sep 2013 22:13:15 +1000 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: Message-ID: <20130924121315.GI7989@ando> On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote: > What do you think about providing an `OrderedDict.sort` method? I've been > using my own `OrderedDict` subclass that defines `sort` for years, and I > always wondered why the stdlib one doesn't provide `sort`. > > I can write the patch if needed. I'm not entirely sure why anyone would need an OrderedDict sort method. Ordered Dicts store keys by insertion order. Sorting the keys goes against the purpose of an OrderedDict. I can understand a request for a SortedDict, that keeps the keys in sorted order as they are deleted or inserted. I personally don't have any need for one, since when I need the keys in sorted order I just sort them on the fly: for key in sorted(dict): ... but in any case, that's a separate issue from sorting an OrderedDict. Can you explain the use-case for why somebody might want to throw away the insertion order and replace with sorted order? -- Steven From ram.rachum at gmail.com Tue Sep 24 14:27:08 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Tue, 24 Sep 2013 05:27:08 -0700 (PDT) Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <20130924121315.GI7989@ando> References: <20130924121315.GI7989@ando> Message-ID: <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> I think that your mistake is defining OrderedDict as a dict sorting by insertion order. 
I see no reason to define it that way, and the fact that insertion order is the default is not a reason in my opinion. It's just a dict with an order, and I see no reason to not let users move elements about as they wish. Yes, I'm aware that the documentation defined OrderedDict your way too; I still think it's a pointless restriction. Regarding examples: I've used my `OrderedDict.sort` at least 10 times. Just today I've used it again. I was putting three items in an ordered dict, with keys 'low', 'medium' and 'high'. I wanted to have them sorted as 'low', 'medium' and 'high' but the insertion order was different because of the algorithm that calculated them. (Also not all 3 items were guaranteed to exist, I wanted to sort those that existed.) So I created an OrderedDict of my subclass and called `.sort`. I'm sure you can think of a bunch more examples, if not I can give them to you. On Tuesday, September 24, 2013 3:13:15 PM UTC+3, Steven D'Aprano wrote: > > On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote: > > What do you think about providing an `OrderedDict.sort` method? I've > been > > using my own `OrderedDict` subclass that defines `sort` for years, and I > > always wondered why the stdlib one doesn't provide `sort`. > > > > I can write the patch if needed. > > I'm not entirely sure why anyone would need an OrderedDict sort method. > Ordered Dicts store keys by insertion order. Sorting the keys goes > against the purpose of an OrderedDict. > > I can understand a request for a SortedDict, that keeps the keys in > sorted order as they are deleted or inserted. I personally don't have > any need for one, since when I need the keys in sorted order I just > sort them on the fly: > > for key in sorted(dict): > ... > > > but in any case, that's a separate issue from sorting an OrderedDict. > Can you explain the use-case for why somebody might want to throw away > the insertion order and replace with sorted order? 
> > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python... at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Sep 24 14:29:55 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Sep 2013 14:29:55 +0200 Subject: [Python-ideas] `OrderedDict.sort` References: <20130924121315.GI7989@ando> Message-ID: <20130924142955.33f17503@fsol> On Tue, 24 Sep 2013 22:13:15 +1000 Steven D'Aprano wrote: > On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote: > > What do you think about providing an `OrderedDict.sort` method? I've been > > using my own `OrderedDict` subclass that defines `sort` for years, and I > > always wondered why the stdlib one doesn't provide `sort`. > > > > I can write the patch if needed. > > I'm not entirely sure why anyone would need an OrderedDict sort method. > Ordered Dicts store keys by insertion order. Sorting the keys goes > against the purpose of an OrderedDict. An OrderedDict is basically an associative container with a well-defined ordering. It's not only "insertion order", because you can use move_to_end() to reorder it piecewise. (at some point I also filed a feature request to rotate an OrderedDict: http://bugs.python.org/issue17100) However, sorting would be difficult to implement efficiently with the natural implementation of an OrderedDict, which uses linked lists. Basically, you're probably as good sorting the items separately and reinitializing the OrderedDict with them. Regards Antoine. From ram at rachum.com Tue Sep 24 14:50:08 2013 From: ram at rachum.com (Ram Rachum) Date: Tue, 24 Sep 2013 15:50:08 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <20130924142955.33f17503@fsol> References: <20130924121315.GI7989@ando> <20130924142955.33f17503@fsol> Message-ID: Antoine, my concern is not efficiency but convenience. 
Whoever has high efficiency requirements and wants to sort an ordered dict will have to find their own solution anyway, and the other 99% of people who just want to sort a 20-items-long ordered dict in their small web app could happily use `OrderedDict.sort`. And while we're on that subject, can we also add `OrderedDict.index`? On Tue, Sep 24, 2013 at 3:29 PM, Antoine Pitrou wrote: > On Tue, 24 Sep 2013 22:13:15 +1000 > Steven D'Aprano wrote: > > On Tue, Sep 24, 2013 at 04:49:20AM -0700, Ram Rachum wrote: > > > What do you think about providing an `OrderedDict.sort` method? I've > been > > > using my own `OrderedDict` subclass that defines `sort` for years, and > I > > > always wondered why the stdlib one doesn't provide `sort`. > > > > > > I can write the patch if needed. > > > > I'm not entirely sure why anyone would need an OrderedDict sort method. > > Ordered Dicts store keys by insertion order. Sorting the keys goes > > against the purpose of an OrderedDict. > > An OrderedDict is basically an associative container with a > well-defined ordering. It's not only "insertion order", because you can > use move_to_end() to reorder it piecewise. > (at some point I also filed a feature request to rotate an OrderedDict: > http://bugs.python.org/issue17100) > > However, sorting would be difficult to implement efficiently with the > natural implementation of an OrderedDict, which uses linked lists. > Basically, you're probably as good sorting the items separately and > reinitializing the OrderedDict with them. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe. 
> To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian at python.org Tue Sep 24 15:50:31 2013 From: brian at python.org (Brian Curtin) Date: Tue, 24 Sep 2013 08:50:31 -0500 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> Message-ID: On Tue, Sep 24, 2013 at 7:27 AM, Ram Rachum wrote: > I think that your mistake is defining OrderedDict as a dict sorting by > insertion order. That's the definition straight out of the documentation. "An OrderedDict is a dict that remembers the order that keys were first inserted." From ram.rachum at gmail.com Tue Sep 24 15:52:34 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Tue, 24 Sep 2013 16:52:34 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> Message-ID: Then how do you explain `move_to_end`? Sent from my phone. On Sep 24, 2013 3:50 PM, "Brian Curtin" wrote: > On Tue, Sep 24, 2013 at 7:27 AM, Ram Rachum wrote: > > I think that your mistake is defining OrderedDict as a dict sorting by > > insertion order. > > That's the definition straight out of the documentation. "An > OrderedDict is a dict that remembers the order that keys were first > inserted." > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Tue Sep 24 16:17:25 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 24 Sep 2013 07:17:25 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> Message-ID: <52419EF5.1070305@stoneleaf.us> On 09/24/2013 05:27 AM, Ram Rachum wrote: > > I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it > that way [...] How would you like it sorted? - ascending? you can write an algorithm for that - descending? you can write an algorithm for that - cyclic? you can write an algorithm for that - insertion order? you can *not* write an algorithm for that Insertion order is the one that you either remember, or is lost. As for a practical example, think of classes that want to know which order their attributes were created in -- OrderedDict to the rescue! :) -- ~Ethan~ From ram at rachum.com Tue Sep 24 17:23:58 2013 From: ram at rachum.com (Ram Rachum) Date: Tue, 24 Sep 2013 18:23:58 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <52419EF5.1070305@stoneleaf.us> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> Message-ID: Ethan, you've misunderstood my message and given a correct objection to an argument I did not make. I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created. Sent from my phone. On Sep 24, 2013 4:42 PM, "Ethan Furman" wrote: > On 09/24/2013 05:27 AM, Ram Rachum wrote: > >> >> I think that your mistake is defining OrderedDict as a dict sorting by >> insertion order. I see no reason to define it >> that way [...] 
>> > > How would you like it sorted? > > - ascending? you can write an algorithm for that > > - descending? you can write an algorithm for that > > - cyclic? you can write an algorithm for that > > - insertion order? you can *not* write an algorithm for that > > Insertion order is the one that you either remember, or is lost. > > As for a practical example, think of classes that want to know which order > their attributes were created in -- OrderedDict to the rescue! :) > > -- > ~Ethan~ > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/** > topic/python-ideas/-RFTqV8_**aS0/unsubscribe > . > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@**googlegroups.com > . > For more options, visit https://groups.google.com/**groups/opt_out > . > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Tue Sep 24 17:49:12 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 24 Sep 2013 17:49:12 +0200 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> Message-ID: <5241B478.5060605@egenix.com> On 24.09.2013 17:23, Ram Rachum wrote: > Ethan, you've misunderstood my message and given a correct objection to an > argument I did not make. > > I did not argue against ordering by insertion order on init. I agree with > that decision. I disagree with defining the entire class as an insertion > ordering class and refusing to allow users to reorder it as they wish after > it's created. 
The overhead introduced by completely recreating the internal
data structure after the sort is just as high as creating a
new OrderedDict, so I don't understand what you don't like about:

from collections import OrderedDict
o = OrderedDict(((3,4), (5,4), (1,2)))
p = OrderedDict(sorted(o.iteritems()))

This even allows you to keep the original insert order should
you need it again. If you don't need this, you can just use:

o = dict(((3,4), (5,4), (1,2)))
p = OrderedDict(sorted(o.iteritems()))

which is also faster than first creating an OrderedDict and
then recreating it with sorted entries.

Put those two lines into a function and you have:

def SortedOrderedDict(*args, **kws):
    o = dict(*args, **kws)
    return OrderedDict(sorted(o.iteritems()))

p = SortedOrderedDict(((3,4), (5,4), (1,2)))

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 24 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2013-09-11: Released eGenix PyRun 1.3.0 ...       http://egenix.com/go49
2013-09-28: PyDDF Sprint ...                                4 days to go
2013-10-14: PyCon DE 2013, Cologne, Germany ...            20 days to go

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From ram at rachum.com Tue Sep 24 17:51:43 2013
From: ram at rachum.com (Ram Rachum)
Date: Tue, 24 Sep 2013 18:51:43 +0300
Subject: [Python-ideas] `OrderedDict.sort`
In-Reply-To: <5241B478.5060605@egenix.com>
References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com>
Message-ID:

I get your point. It's a nice idea.
But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice. (By the way, couldn't you make the same argument about `list.sort`?) On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg wrote: > On 24.09.2013 17:23, Ram Rachum wrote: > > Ethan, you've misunderstood my message and given a correct objection to > an > > argument I did not make. > > > > I did not argue against ordering by insertion order on init. I agree with > > that decision. I disagree with defining the entire class as an insertion > > ordering class and refusing to allow users to reorder it as they wish > after > > it's created. > > The overhead introduced by completely recreating the internal > data structure after the sort is just as high as creating a > new OrderedDict, so I don't understand why you don't like about: > > from collections import OrderedDict > o = OrderedDict(((3,4), (5,4), (1,2))) > p = OrderedDict(sorted(o.iteritems())) > > This even allows you to keep the original insert order should > you need it again. If you don't need this, you can just use: > > o = dict(((3,4), (5,4), (1,2))) > p = OrderedDict(sorted(o.iteritems())) > > which is also faster than first creating an OrderedDict and > then recreating it with sorted entries. > > Put those two lines into a function and you have: > > def SortedOrderedDict(*args, **kws): > o = dict(*args, **kws) > return OrderedDict(sorted(o.iteritems())) > > p = SortedOrderedDict(((3,4), (5,4), (1,2))) > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, Sep 24 2013) > >>> Python Projects, Consulting and Support ... http://www.egenix.com/ > >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > 2013-09-11: Released eGenix PyRun 1.3.0 ... 
http://egenix.com/go49 > 2013-09-28: PyDDF Sprint ... 4 days to go > 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Tue Sep 24 17:36:17 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 24 Sep 2013 08:36:17 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> Message-ID: <5241B171.80501@stoneleaf.us> On 09/24/2013 08:23 AM, Ram Rachum wrote: > On Sep 24, 2013 4:42 PM, Ethan Furman wrote: >> On 09/24/2013 05:27 AM, Ram Rachum wrote: >>> >>> I think that your mistake is defining OrderedDict as a dict sorting by insertion order. I see no reason to define it >>> that way [...] >> >> Insertion order is the one that you either remember, or is lost. > > Ethan, you've misunderstood my message and given a correct objection to an argument I did not make. > > I did not argue against ordering by insertion order on init. I agree with that decision. I disagree with defining the > entire class as an insertion ordering class and refusing to allow users to reorder it as they wish after it's created. Two points: - What happens when a new element is added to the OrderedDict after the user sorts it? 
- If by 'init' you mean something like `d = OrderedDict(a=1, b=2, c=3)` -- this does not preserve an insertion order as the keywords end up in a regular, unsorted dict that is passed to OrderedDict.__init__ -- ~Ethan~ From ram at rachum.com Tue Sep 24 18:02:12 2013 From: ram at rachum.com (Ram Rachum) Date: Tue, 24 Sep 2013 19:02:12 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <5241B171.80501@stoneleaf.us> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B171.80501@stoneleaf.us> Message-ID: On Tue, Sep 24, 2013 at 6:36 PM, Ethan Furman wrote: > On 09/24/2013 08:23 AM, Ram Rachum wrote: > >> On Sep 24, 2013 4:42 PM, Ethan Furman wrote: >> >>> On 09/24/2013 05:27 AM, Ram Rachum wrote: >>> >>>> >>>> I think that your mistake is defining OrderedDict as a dict sorting by >>>> insertion order. I see no reason to define it >>>> that way [...] >>>> >>> >>> Insertion order is the one that you either remember, or is lost. >>> >> >> Ethan, you've misunderstood my message and given a correct objection to >> an argument I did not make. >> >> I did not argue against ordering by insertion order on init. I agree with >> that decision. I disagree with defining the >> entire class as an insertion ordering class and refusing to allow users >> to reorder it as they wish after it's created. >> > > Two points: > > - What happens when a new element is added to the OrderedDict after the > user sorts it? > The exact same thing that happens if the user does `.move_to_end` and then adds a new element, and the exact same thing that happens when a user does `list.sort` and adds a new element, and the exact same thing that happens when a user does `sorted(whatever)` and adds a new element. It just gets put in the end. 
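The claim above is easy to check: after a reordering (simulated here with `move_to_end`, since the proposed `.sort` doesn't exist), a freshly inserted key appends at the end just as it would under plain insertion order. Example data invented for illustration:

```python
from collections import OrderedDict

d = OrderedDict([('b', 2), ('c', 3), ('a', 1)])

# Simulate the proposed sort with the existing move_to_end method:
for k in sorted(d):
    d.move_to_end(k)
print(list(d))  # ['a', 'b', 'c']

# A key inserted after the reordering simply goes to the end:
d['aa'] = 0
print(list(d))  # ['a', 'b', 'c', 'aa']
```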
> > - If by 'init' you mean something like `d = OrderedDict(a=1, b=2, c=3)` > -- > this does not preserve an insertion order as the keywords end up in a > regular, unsorted dict that is passed to OrderedDict.__init__ Does this relate to my proposal in any way? I don't see how. (I meant __init__, I was typing from a phone.) > > > -- > ~Ethan~ > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > > -- > > --- You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit https://groups.google.com/d/** > topic/python-ideas/-RFTqV8_**aS0/unsubscribe > . > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@**googlegroups.com > . > For more options, visit https://groups.google.com/**groups/opt_out > . > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Tue Sep 24 18:02:33 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 25 Sep 2013 01:02:33 +0900 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> Message-ID: <87ppry1a3q.fsf@uwakimon.sk.tsukuba.ac.jp> Ram Rachum writes: > I disagree with defining the entire class as an insertion ordering > class and refusing There's no refusal. It's just not in the battery pack. > to allow users to reorder it as they wish after it's created. You can put your inefficient but useful implementation on PyPI. You can write a PEP in which you define the API. You can provide an efficient implementation suitable for the stdlib, or you can convince the gatekeepers that it doesn't need to be efficient. You can promise to maintain it for 5 years.[1] Why don't you? 
Four or five hackers do it every cycle (although sometimes it takes more than a cycle to actually get approval). Recent successes include Ethan and Steven, who are giving you the benefit of their experience. OTOH, the barrier for mere suggestions (even backed up by proof of concept implementations) these days is quite high. You need to convince somebody to do all of the above, which usually requires an argument that it's at least tricky to do right, and perhaps hard to do at all. Footnotes: [1] Or whatever the going rate is these days. From ericsnowcurrently at gmail.com Tue Sep 24 18:15:37 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 24 Sep 2013 10:15:37 -0600 Subject: [Python-ideas] Indicate if an iterable is ordered or not Message-ID: Iterables are not necessarily ordered (e.g. dict vs. OrderedDict). Sequences are but Sets aren't. I'm not aware of any good way currently to know if an arbitrary iterable is ordered. Without an explicit indicator of ordered-ness, you must know in advance for each specific type. One possible solution is an __isordered__ attribute (on the class), set to a boolean. The absence of the attribute would imply False. Such an attribute would be added to existing types: * collections.abc.Iterable (default: False) * list (True) * tuple (True) * set (False) * dict (False) * collections.OrderedDict (True) * ... Thoughts? -eric From ram at rachum.com Tue Sep 24 18:19:56 2013 From: ram at rachum.com (Ram Rachum) Date: Tue, 24 Sep 2013 19:19:56 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <87ppry1a3q.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <87ppry1a3q.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Sep 24, 2013 at 7:02 PM, Stephen J. Turnbull wrote: > Ram Rachum writes: > > > I disagree with defining the entire class as an insertion ordering > > class and refusing > > There's no refusal. 
It's just not in the battery pack. > > > to allow users to reorder it as they wish after it's created. > > You can put your inefficient but useful implementation on PyPI. > You can write a PEP in which > you define the API. > You can provide an efficient implementation suitable for the stdlib, or > you can convince the gatekeepers that it doesn't need to be efficient. > You can promise to maintain it for 5 years.[1] > I can do an inefficient implementation and put it on PyPI. I don't see the need for writing a PEP for a simple method. ("Define the API"? Anything I'm missing beyond a call signature `def sort(self, key=None)`?) If people here are opposed to allowing an implementation of `OrderedDict.sort` in the stdlib, I don't see a reason to waste my time putting an implementation on PyPI. What's that implementation going to help if you won't allow it anyway? Here's a simple inefficient implementation you can use: def sort(self, key=None): ''' Sort the items according to their keys, changing the order in-place. The optional `key` argument, (not to be confused with the dictionary keys,) will be passed to the `sorted` function as a key function. ''' sorted_keys = sorted(self.keys(), key=key) for key_ in sorted_keys[1:]: self.move_to_end(key_) Regarding committing to maintain it for N years: Sorry, that's beyond what I'm willing to do. If that's a requirement for contributing a minor feature to Python, I'll have to withdraw my suggestion. > > Why don't you? Four or five hackers do it every cycle (although > sometimes it takes more than a cycle to actually get approval). > Recent successes include Ethan and Steven, who are giving you the > benefit of their experience. > > OTOH, the barrier for mere suggestions (even backed up by proof of > concept implementations) these days is quite high. You need to > convince somebody to do all of the above, which usually requires an > argument that it's at least tricky to do right, and perhaps hard to do > at all. 
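For completeness, the sketch above drops straight into a subclass (the class name here is invented). The `[1:]` is safe because every key except the first sorted one is moved to the end in sorted order, which leaves the single unmoved key at the front:

```python
from collections import OrderedDict

class SortableOrderedDict(OrderedDict):
    """Hypothetical subclass carrying the sort() sketch from this thread."""

    def sort(self, key=None):
        # Move all but the first sorted key to the end, in sorted order.
        for k in sorted(self.keys(), key=key)[1:]:
            self.move_to_end(k)

d = SortableOrderedDict([('medium', 2), ('high', 3), ('low', 1)])

d.sort()
print(list(d))  # ['high', 'low', 'medium'] -- plain lexicographic order

d.sort(key=['low', 'medium', 'high'].index)
print(list(d))  # ['low', 'medium', 'high']
```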
> > Footnotes: > [1] Or whatever the going rate is these days. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Sep 24 18:22:49 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 24 Sep 2013 09:22:49 -0700 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: What do you want to do with this knowledge? On Tue, Sep 24, 2013 at 9:15 AM, Eric Snow wrote: > Iterables are not necessarily ordered (e.g. dict vs. OrderedDict). > Sequences are but Sets aren't. I'm not aware of any good way > currently to know if an arbitrary iterable is ordered. Without an > explicit indicator of ordered-ness, you must know in advance for each > specific type. > > One possible solution is an __isordered__ attribute (on the class), > set to a boolean. The absence of the attribute would imply False. > > Such an attribute would be added to existing types: > > * collections.abc.Iterable (default: False) > * list (True) > * tuple (True) > * set (False) > * dict (False) > * collections.OrderedDict (True) > * ... > > Thoughts? 
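One way the proposed `__isordered__` flag could be consumed: the attribute is only a suggestion in this thread, not a real Python feature, so this sketch falls back to a hand-maintained table mirroring the one proposed above (in 2013, plain dicts did not guarantee any ordering):

```python
from collections import OrderedDict

# Table mirroring the proposal above.
_KNOWN_ORDERED = {list: True, tuple: True, str: True,
                  set: False, frozenset: False, dict: False,
                  OrderedDict: True}

def is_ordered(obj):
    """Return whether iterating obj yields a meaningful order.

    Honors the *proposed* __isordered__ class attribute if present,
    else falls back to the table of known types.
    """
    flag = getattr(type(obj), '__isordered__', None)
    if flag is not None:
        return bool(flag)
    return _KNOWN_ORDERED.get(type(obj), False)

print(is_ordered([1, 2, 3]))      # True
print(is_ordered({1, 2, 3}))      # False
print(is_ordered(OrderedDict()))  # True
```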
> > -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) From abarnert at yahoo.com Tue Sep 24 18:27:08 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 24 Sep 2013 09:27:08 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> Message-ID: <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> On Sep 24, 2013, at 8:51, Ram Rachum wrote: > I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. So I think it's almost as good as having a `.sort` method, but not quite as nice. Honestly, I think having a sorted mapping in the stdlib would be even nicer in almost any situation where this might be nice. But, given that we don't have such a thing, and getting one into the stdlib is harder than it appears, maybe that's not an argument against your (obviously simpler) idea. Of course in most cases, you just want to iterate once in sorted order, and it's hard to beat this: for k, v in sorted(o.items()): > (By the way, couldn't you make the same argument about `list.sort`?) You could. Except that list.sort predates sorted. And it's faster and saves memory, which isn't true of your suggestion. I don't know if that would be enough to add it today, but it's more than enough to keep it around. > On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg wrote: >> On 24.09.2013 17:23, Ram Rachum wrote: >> > Ethan, you've misunderstood my message and given a correct objection to an >> > argument I did not make. >> > >> > I did not argue against ordering by insertion order on init. I agree with >> > that decision. 
I disagree with defining the entire class as an insertion >> > ordering class and refusing to allow users to reorder it as they wish after >> > it's created. >> >> The overhead introduced by completely recreating the internal >> data structure after the sort is just as high as creating a >> new OrderedDict, so I don't understand why you don't like about: >> >> from collections import OrderedDict >> o = OrderedDict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> This even allows you to keep the original insert order should >> you need it again. If you don't need this, you can just use: >> >> o = dict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> which is also faster than first creating an OrderedDict and >> then recreating it with sorted entries. >> >> Put those two lines into a function and you have: >> >> def SortedOrderedDict(*args, **kws): >> o = dict(*args, **kws) >> return OrderedDict(sorted(o.iteritems())) >> >> p = SortedOrderedDict(((3,4), (5,4), (1,2))) >> >> -- >> Marc-Andre Lemburg >> eGenix.com >> >> Professional Python Services directly from the Source (#1, Sep 24 2013) >> >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >> >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >> >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ >> ________________________________________________________________________ >> 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 >> 2013-09-28: PyDDF Sprint ... 4 days to go >> 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go >> >> eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 >> D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg >> Registered at Amtsgericht Duesseldorf: HRB 46611 >> http://www.egenix.com/company/contact/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram at rachum.com Tue Sep 24 18:33:15 2013 From: ram at rachum.com (Ram Rachum) Date: Tue, 24 Sep 2013 19:33:15 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> Message-ID: On Tue, Sep 24, 2013 at 7:27 PM, Andrew Barnert wrote: > On Sep 24, 2013, at 8:51, Ram Rachum wrote: > > I get your point. It's a nice idea. But I think it's slightly less elegant > to create another dict. So I think it's almost as good as having a `.sort` > method, but not quite as nice. > > > Honestly, I think having a sorted mapping in the stdlib would be even > nicer in almost any situation where this might be nice. But, given that we > don't have such a thing, and getting one into the stdlib is harder than it > appears, maybe that's not an argument against your (obviously simpler) idea. > For the record, I think that having a SortedDict in the stdlib would be awesome. > > Of course in most cases, you just want to iterate once in sorted order, > and it's hard to beat this: > > for k, v in sorted(o.items()): > I think that in most of my cases it won't work. Either because I iterate in Django templates, or I iterate several times which would make this cumbersome and wasteful. > > (By the way, couldn't you make the same argument about `list.sort`?) > > > You could. Except that list.sort predates sorted. 
And it's faster and > saves memory, which isn't true of your suggestion. I don't know if that > would be enough to add it today, but it's more than enough to keep it > around. > > On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg wrote: > >> On 24.09.2013 17:23, Ram Rachum wrote: >> > Ethan, you've misunderstood my message and given a correct objection to >> an >> > argument I did not make. >> > >> > I did not argue against ordering by insertion order on init. I agree >> with >> > that decision. I disagree with defining the entire class as an insertion >> > ordering class and refusing to allow users to reorder it as they wish >> after >> > it's created. >> >> The overhead introduced by completely recreating the internal >> data structure after the sort is just as high as creating a >> new OrderedDict, so I don't understand why you don't like about: >> >> from collections import OrderedDict >> o = OrderedDict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> This even allows you to keep the original insert order should >> you need it again. If you don't need this, you can just use: >> >> o = dict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> which is also faster than first creating an OrderedDict and >> then recreating it with sorted entries. >> >> Put those two lines into a function and you have: >> >> def SortedOrderedDict(*args, **kws): >> o = dict(*args, **kws) >> return OrderedDict(sorted(o.iteritems())) >> >> p = SortedOrderedDict(((3,4), (5,4), (1,2))) >> >> -- >> Marc-Andre Lemburg >> eGenix.com >> >> Professional Python Services directly from the Source (#1, Sep 24 2013) >> >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >> >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >> >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ >> ________________________________________________________________________ >> 2013-09-11: Released eGenix PyRun 1.3.0 ... 
http://egenix.com/go49 >> 2013-09-28 : PyDDF Sprint ... >> 4 days to go >> 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go >> >> eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 >> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg >> Registered at Amtsgericht Duesseldorf: HRB 46611 >> http://www.egenix.com/company/contact/ >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Tue Sep 24 18:37:26 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 24 Sep 2013 09:37:26 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <87ppry1a3q.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <24238F83-89EA-4746-819C-D821CA427284@yahoo.com> On Sep 24, 2013, at 9:19, Ram Rachum wrote: > If people here are opposed to allowing an implementation of `OrderedDict.sort` in the stdlib, I don't see a reason to waste my time putting an implementation on PyPI. What's that implementation going to help if you won't allow it anyway? Do you not see the benefit to ipython, numpy, requests, the various popular web frameworks, fancy collections like blist, tools like scrapy, etc. being a simple pip away? Why wouldn't the same be true for your module? A useful module on PyPI helps thousands of people who otherwise would have had to reproduce all the work themselves or settled for not having it. It also leads to de facto standard ways to do things, which makes it easier to communicate with devs on other projects. (Imagine trying to get help with "my custom multidimensional array class" or "a web scraper that I built from scratch" vs. numpy or scrapy.) 
Do you think your idea is so trivial that there really is no benefit in any of that? From ram at rachum.com Tue Sep 24 18:48:35 2013 From: ram at rachum.com (Ram Rachum) Date: Tue, 24 Sep 2013 19:48:35 +0300 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <24238F83-89EA-4746-819C-D821CA427284@yahoo.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <87ppry1a3q.fsf@uwakimon.sk.tsukuba.ac.jp> <24238F83-89EA-4746-819C-D821CA427284@yahoo.com> Message-ID: My code *is* on PyPI, just not isolated to the OrderedDict improvements. My OrderedDict improvements are here: http://pypi.python.org/pypi/python_toolbox This is a big package with all my stuff. On Tue, Sep 24, 2013 at 7:37 PM, Andrew Barnert wrote: > On Sep 24, 2013, at 9:19, Ram Rachum wrote: > > > If people here are opposed to allowing an implementation of > `OrderedDict.sort` in the stdlib, I don't see a reason to waste my time > putting an implementation on PyPI. What's that implementation going to help > if you won't allow it anyway? > > Do you not see the benefit to ipython, numpy, requests, the various > popular web frameworks, fancy collections like blist, tools like scrapy, > etc. being a simple pip away? Why wouldn't the same be true for your module? > > A useful module on PyPI helps thousands of people who otherwise would have > had to reproduce all the work themselves or settled for not having it. It > also leads to de facto standard ways to do things, which makes it easier to > communicate with devs on other projects. (Imagine trying to get help with > "my custom multidimensional array class" or "a web scraper that I built > from scratch" vs. numpy or scrapy.) > > Do you think your idea is so trivial that there really is no benefit in > any of that?
> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/-RFTqV8_aS0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Tue Sep 24 19:24:55 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 24 Sep 2013 13:24:55 -0400 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> Message-ID: <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> On Tue, Sep 24, 2013, at 12:33, Ram Rachum wrote: > For the record, I think that having a SortedDict in the stdlib would be > awesome. There are two issues with that. First of all, this demands that every element be orderable with every other element. Since not every element is going to be compared with every other element on insertion, it's easy to imagine a case where this won't be caught until it's sorted again later on. And this is ignoring the pathological behavior of floating-point NaN values, which already silently break list sorting. (Can someone explain to me how nan works as a dict key, by the way?) Secondly, a SortedDict (or SortedSet) implies that the sorting is used _instead of_ hashing, for lookup. This raises the question as to whether keys/elements should be required to be hashable. 
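[Archive note: both NaN points raised above — NaN silently breaking list sorting, and NaN working as a dict key — come down to the same facts. A small sketch of the CPython behaviour, where containers check identity before equality during key lookup:]

```python
nan = float('nan')

# NaN is incomparable: every ordering comparison returns False, which
# violates the strict weak ordering that list.sort() assumes -- so a
# list containing NaN can come out of sort() in a surprising order.
print(nan < 1.0, nan > 1.0, nan == 1.0)  # False False False

# As a dict key, lookup with the *same* NaN object still works, because
# CPython compares keys by identity before falling back to equality.
d = {nan: 'found'}
print(d[nan])  # found

# A *different* NaN object is neither identical nor equal, so it misses.
try:
    d[float('nan')]
except KeyError:
    print('distinct NaN object raises KeyError')
```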
On the one hand, requiring them to be hashable gives you the implied guarantee of an immutable equality relationship, which is _likely_ to also imply (on orderable types) an immutable ordering, whereas there is nothing else that can be used that directly implies an immutable ordering. From solipsis at pitrou.net Tue Sep 24 19:36:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Sep 2013 19:36:59 +0200 Subject: [Python-ideas] `OrderedDict.sort` References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> Message-ID: <20130924193659.63fca9fc@fsol> On Tue, 24 Sep 2013 18:51:43 +0300 Ram Rachum wrote: > I get your point. It's a nice idea. But I think it's slightly less elegant > to create another dict. So I think it's almost as good as having a `.sort` > method, but not quite as nice. > > (By the way, couldn't you make the same argument about `list.sort`?) list.sort() sorts the list in-place, it doesn't reallocate a new vector to replace the old one. (AFAIR anyway, but I trust Tim and Raymond here (or was it Tim, Tim, Raymond and Tim? :-)). Regards Antoine. From mal at egenix.com Tue Sep 24 20:10:36 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 24 Sep 2013 20:10:36 +0200 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> Message-ID: <5241D59C.9020009@egenix.com> On 24.09.2013 17:51, Ram Rachum wrote: > I get your point. It's a nice idea. But I think it's slightly less elegant > to create another dict. So I think it's almost as good as having a `.sort` > method, but not quite as nice. You can avoid the temp dict by doing some introspection of the arguments and using iterators instead. > (By the way, couldn't you make the same argument about `list.sort`?) The use case is different. 

With list.sort() you don't want to create a copy of the list, but instead have the list sort itself, since you're not interested in the original order. You'd only use an OrderedDict to begin with if you're interested in the insert order, otherwise you'd start out with a plain dict(). > On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg wrote: > >> On 24.09.2013 17:23, Ram Rachum wrote: >>> Ethan, you've misunderstood my message and given a correct objection to >> an >>> argument I did not make. >>> >>> I did not argue against ordering by insertion order on init. I agree with >>> that decision. I disagree with defining the entire class as an insertion >>> ordering class and refusing to allow users to reorder it as they wish >> after >>> it's created. >> >> The overhead introduced by completely recreating the internal >> data structure after the sort is just as high as creating a >> new OrderedDict, so I don't understand why you don't like about: >> >> from collections import OrderedDict >> o = OrderedDict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> This even allows you to keep the original insert order should >> you need it again. If you don't need this, you can just use: >> >> o = dict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> which is also faster than first creating an OrderedDict and >> then recreating it with sorted entries. >> >> Put those two lines into a function and you have: >> >> def SortedOrderedDict(*args, **kws): >> o = dict(*args, **kws) >> return OrderedDict(sorted(o.iteritems())) >> >> p = SortedOrderedDict(((3,4), (5,4), (1,2))) >> >> -- >> Marc-Andre Lemburg >> eGenix.com >> >> Professional Python Services directly from the Source (#1, Sep 24 2013) >>>>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>>>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ >> ________________________________________________________________________ >> 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 >> 2013-09-28: PyDDF Sprint ... 4 days to go >> 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go >> >> eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 >> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg >> Registered at Amtsgericht Duesseldorf: HRB 46611 >> http://www.egenix.com/company/contact/ >> > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 24 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 2013-09-28: PyDDF Sprint ... 4 days to go 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From kim.grasman at gmail.com Tue Sep 24 20:41:32 2013 From: kim.grasman at gmail.com (=?ISO-8859-1?Q?Kim_Gr=E4sman?=) Date: Tue, 24 Sep 2013 20:41:32 +0200 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: Hi all, On Sun, Aug 25, 2013 at 8:26 PM, Kim Gräsman wrote: > Ping? > > Can I clarify something to move this forward? It seems like a good > idea to me, but I don't have the history of Py_DeleteFileW -- maybe > somebody tried this already?
Is there a better place to look for opinions? I'm happy to see Python getting more link-aware on Windows, and I think this could help getting further in that direction. Thanks, - Kim From tjreedy at udel.edu Tue Sep 24 22:39:28 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 Sep 2013 16:39:28 -0400 Subject: [Python-ideas] Introduce collections.Reiterable In-Reply-To: <20130924042105.GH7989@ando> References: <20130922122149.GH19939@ando> <20130922234637.GK19939@ando> <87bo3k2ccl.fsf@uwakimon.sk.tsukuba.ac.jp> <20130923142336.GA7989@ando> <87wqm7114s.fsf@uwakimon.sk.tsukuba.ac.jp> <20130924013704.GE7989@ando> <87vc1q28dh.fsf@uwakimon.sk.tsukuba.ac.jp> <20130924042105.GH7989@ando> Message-ID: On 9/24/2013 12:21 AM, Steven D'Aprano wrote: > Maybe it's the mathematician in me speaking, but I don't think very many > unbounded iterators are found outside of maths sequences. Perhaps you are confusing the 'actual infinity' of mathematics with the potential infinity of iterators. Unbounded, or more exactly, potentially unbounded iterators are quite common. First, many source iterators based on external sources are or are potentially unbounded. For example, text-mode files are text line iterators. Files based on finite disk files are bounded, but others (based on keyboard, socket, or other input channels) may not be. Consider the following example (simplified, like all examples, for illustrative purposes). def source(prompt): "Yield user responses to prompt." while True: yield input(prompt) # Even if 'quit' were recognized and turned into StopIteration, # it still might never happen. or def measures(read_instrument): "Yield values returned by read_instrument." while True: yield read_instrument() A queue can yield an unbounded sequence even if it is always finite and even if it has a maximum size, perhaps because the pool of potential queue members is finite. Second, many transform iterators are unbounded if the input iterable is unbounded.
def transform(func, iterable): for item in iterable: try: yield func(item) except ValueError: pass for i in transform(int, source('Enter an integer: ')): # process unbounded stream of ints. Filter, map, and some itertools potentially produce infinite iterators. Itertools.islice turns infinite iterables finite. Itertools.cycle turns finite iterables infinite. At the highest level, interactive apps, including OSes, usually process indefinite streams of user-generated events. > After all, > even if you were to iterate over every atom in the universe, that would > be bounded, and quite small compared to some of the numbers > mathematicians deal with... :-) The atoms of the universe can be reused over and over again in the same or different combinations to keep the iteration going indefinitely. -- Terry Jan Reedy From timothy.c.delaney at gmail.com Tue Sep 24 22:42:19 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Wed, 25 Sep 2013 06:42:19 +1000 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <5241D59C.9020009@egenix.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <5241D59C.9020009@egenix.com> Message-ID: On 25 September 2013 04:10, M.-A. Lemburg wrote: > On 24.09.2013 17:51, Ram Rachum wrote: > > I get your point. It's a nice idea. But I think it's slightly less > elegant > > to create another dict. So I think it's almost as good as having a > `.sort` > > method, but not quite as nice. > > You can avoid the temp dict by doing some introspection of > the arguments and using iterators instead. > > > (By the way, couldn't you make the same argument about `list.sort`?) > > The use case is different. With list.sort() you don't want to create > a copy of the list, but instead have the list sort itself, since > you're not interested in the original order. 
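[Archive note: Terry Reedy's itertools observations above — `islice` turns infinite iterables finite, `cycle` turns finite iterables infinite — can be exercised with a self-contained sketch; the instrument here is a hypothetical stand-in, not a real device API:]

```python
import itertools

def measures(read_instrument):
    # Potentially unbounded source iterator: yields readings forever.
    while True:
        yield read_instrument()

# cycle() turns a finite list into an infinite stream of readings.
readings = itertools.cycle([1.0, 2.5, 0.7])
stream = measures(lambda: next(readings))

# islice() turns the potentially infinite stream back into a finite one.
print(list(itertools.islice(stream, 5)))  # [1.0, 2.5, 0.7, 1.0, 2.5]
```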
> > You'd only use an OrderedDict to begin with if you're interested in > the insert order, otherwise you'd start out with a plain dict(). Not quite. As Ram showed, it's perfectly possible to sort an OrderedDict in-place, which you couldn't do with a normal dict. In which case you're looking at equivalent semantics as for a list (where items are just added using append) - using Ram's implementation above: >>> import collections >>> >>> class SortableOrderedDict(collections.OrderedDict): ... def sort(self, key=None): ... sorted_keys = sorted(self.keys(), key=key) ... for key_ in sorted_keys[1:]: ... self.move_to_end(key_) ... >>> x = [] >>> x.append('c') >>> x.append('b') >>> x.sort() >>> x.append('a') >>> >>> y = SortableOrderedDict() >>> y['c'] = 1 >>> y['b'] = 2 >>> y.sort() >>> y['a'] = 3 >>> >>> print(x) ['b', 'c', 'a'] >>> print(y) SortableOrderedDict([('b', 2), ('c', 1), ('a', 3)]) >>> print(x == list(y.keys())) True >>> FWIW Ram I think you should put the implementation up on PyPI. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Tue Sep 24 23:16:40 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 Sep 2013 17:16:40 -0400 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: On 9/24/2013 12:15 PM, Eric Snow wrote: > Iterables are not necessarily ordered (e.g. dict vs. OrderedDict). > Sequences are but Sets aren't. I'm not aware of any good way > currently to know if an arbitrary iterable is ordered. Without an > explicit indicator of ordered-ness, you must know in advance for each > specific type. > > One possible solution is an __isordered__ attribute (on the class), > set to a boolean. The absence of the attribute would imply False. 
> > Such an attribute would be added to existing types: > > * collections.abc.Iterable (default: False) > * list (True) > * tuple (True) > * set (False) > * dict (False) > * collections.OrderedDict (True) > * ... > > Thoughts? The iterator protocol is intentionally simple. It only requires an __iter__ method or a __next__ method with a standard __iter__ method. This makes iterables -- and generator functions that produce iterators -- easy to write. A generator instance may and may not produce items in an intented order, so a class attribute is not possible. The same is generally true of transform iterators, like map and filter instances, and most itertools classes. It is also not true that lists (and tuples) always have a significant order. list(set) has the artificial order of set iteration. Both are reiterable with the same order. Why would you call one True and the other False? In general, list(iterable) has as much order as the iterable. The __isordered__ attribute would have to be an instance attribute, properly propagated. How would you do that with generator functions? or generator expression? Anyone is free to privately extend the protocol for special purposes and restrict their universe to object that follow. Builtins can be extended, wrapped, or mapped, or their internal iterator classes mapped, to make them conform. The following helps with the last idea. >>> for cls in list, tuple, set, frozenset, dict: type(iter(cls())) -- Terry Jan Reedy From ncoghlan at gmail.com Wed Sep 25 00:41:00 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Sep 2013 08:41:00 +1000 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> Message-ID: On 25 Sep 2013 01:52, "Ram Rachum" wrote: > > I get your point. It's a nice idea. But I think it's slightly less elegant to create another dict. 
So I think it's almost as good as having a `.sort` method, but not quite as nice. > > (By the way, couldn't you make the same argument about `list.sort`?) No, because list.sort() both predates the sorted builtin and is optimised to be blazingly fast with reasonable memory overhead by directly interacting with internal details of the list object. It's actually the pre-existing list sorting machinery that powers the builtin. The situation is different now: the sorted builtin provides a generic API to get a sorted version of any iterable. This means a proposed in-place sort() method on a container has to demonstrate a few things to overcome the "default deny" that is applied to any proposal to add more methods to an object interface: - there are common use cases that can't be handled by sorting the input when creating the container in the first place - there are significant speed gains from an in-place sorting operation - there are significant memory gains from an in-place sorting operation Now, in the case of OrderedDict it *may* be possible to back up one or more of those assertions (especially the latter two if you talk to Eric Snow about an in-place sort method for his C implementation of the API). However, in the absence of such evidence, the default reaction will always be to avoid expanding APIs with functionality that can be provided by applying external algorithms to the existing API. Cheers, Nick. > > > On Tue, Sep 24, 2013 at 6:49 PM, M.-A. Lemburg wrote: >> >> On 24.09.2013 17:23, Ram Rachum wrote: >> > Ethan, you've misunderstood my message and given a correct objection to an >> > argument I did not make. >> > >> > I did not argue against ordering by insertion order on init. I agree with >> > that decision. I disagree with defining the entire class as an insertion >> > ordering class and refusing to allow users to reorder it as they wish after >> > it's created. 
>> >> The overhead introduced by completely recreating the internal >> data structure after the sort is just as high as creating a >> new OrderedDict, so I don't understand why you don't like about: >> >> from collections import OrderedDict >> o = OrderedDict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> This even allows you to keep the original insert order should >> you need it again. If you don't need this, you can just use: >> >> o = dict(((3,4), (5,4), (1,2))) >> p = OrderedDict(sorted(o.iteritems())) >> >> which is also faster than first creating an OrderedDict and >> then recreating it with sorted entries. >> >> Put those two lines into a function and you have: >> >> def SortedOrderedDict(*args, **kws): >> o = dict(*args, **kws) >> return OrderedDict(sorted(o.iteritems())) >> >> p = SortedOrderedDict(((3,4), (5,4), (1,2))) >> >> -- >> Marc-Andre Lemburg >> eGenix.com >> >> Professional Python Services directly from the Source (#1, Sep 24 2013) >> >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >> >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >> >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ >> ________________________________________________________________________ >> 2013-09-11: Released eGenix PyRun 1.3.0 ... http://egenix.com/go49 >> 2013-09-28: PyDDF Sprint ... 4 days to go >> 2013-10-14: PyCon DE 2013, Cologne, Germany ... 20 days to go >> >> eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 >> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg >> Registered at Amtsgericht Duesseldorf: HRB 46611 >> http://www.egenix.com/company/contact/ > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
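[Archive note: Nick's contrast between an in-place `sort()` method and the generic `sorted()` API is easy to see with lists, where both spellings exist — a trivial sketch:]

```python
data = [3, 1, 2]
result = data.sort()   # in-place: mutates data and returns None
print(data, result)    # [1, 2, 3] None

orig = [3, 1, 2]
copy = sorted(orig)    # generic API: builds a new list, original untouched
print(orig, copy)      # [3, 1, 2] [1, 2, 3]
```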
URL: From ericsnowcurrently at gmail.com Wed Sep 25 01:00:18 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 24 Sep 2013 17:00:18 -0600 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: On Tue, Sep 24, 2013 at 10:22 AM, Guido van Rossum wrote: > What do you want to do with this knowledge? At this point, nothing. :) I realized while writing the message that my use case was not helped by knowing whether or not the iterable is ordered. I sent the message anyway because it does seem like there's a gap--just not one that perhaps anyone cares about. -eric From ncoghlan at gmail.com Wed Sep 25 01:01:47 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Sep 2013 09:01:47 +1000 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: On 25 Sep 2013 02:24, "Guido van Rossum" wrote: > > What do you want to do with this knowledge? My reaction is the same as Guido's. There's already an implicit expectation that iterables will be *consistent* in the absence of mutation (i.e. arbitrarily ordered rather than unordered), but I don't see how "ordered based on container internal details" is meaningfully different from "ordered by some external criterion". Cheers, Nick. > > On Tue, Sep 24, 2013 at 9:15 AM, Eric Snow wrote: > > Iterables are not necessarily ordered (e.g. dict vs. OrderedDict). > > Sequences are but Sets aren't. I'm not aware of any good way > > currently to know if an arbitrary iterable is ordered. Without an > > explicit indicator of ordered-ness, you must know in advance for each > > specific type. > > > > One possible solution is an __isordered__ attribute (on the class), > > set to a boolean. The absence of the attribute would imply False. 
> > > > Such an attribute would be added to existing types: > > > > * collections.abc.Iterable (default: False) > > * list (True) > > * tuple (True) > > * set (False) > > * dict (False) > > * collections.OrderedDict (True) > > * ... > > > > Thoughts? > > > > -eric > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Sep 25 01:01:40 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 24 Sep 2013 16:01:40 -0700 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: On Tue, Sep 24, 2013 at 4:00 PM, Eric Snow wrote: > On Tue, Sep 24, 2013 at 10:22 AM, Guido van Rossum wrote: >> What do you want to do with this knowledge? > > At this point, nothing. :) I realized while writing the message that > my use case was not helped by knowing whether or not the iterable is > ordered. I sent the message anyway because it does seem like there's > a gap--just not one that perhaps anyone cares about. > > -eric -- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Sep 25 01:02:06 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 24 Sep 2013 16:02:06 -0700 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: On Tue, Sep 24, 2013 at 4:01 PM, Guido van Rossum wrote: > On Tue, Sep 24, 2013 at 4:00 PM, Eric Snow wrote: >> On Tue, Sep 24, 2013 at 10:22 AM, Guido van Rossum wrote: >>> What do you want to do with this knowledge? >> >> At this point, nothing. 
:) I realized while writing the message that >> my use case was not helped by knowing whether or not the iterable is >> ordered. I sent the message anyway because it does seem like there's >> a gap--just not one that perhaps anyone cares about. To the contrary, I say there is no gap and there is nothing to gain by adding the proposed API. [Sorry for the blank reply earlier.] -- --Guido van Rossum (python.org/~guido) From ericsnowcurrently at gmail.com Wed Sep 25 01:10:38 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 24 Sep 2013 17:10:38 -0600 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: FYI, at this point I no longer have a use case for this feature, and I'm not in favor of this idea without one. On Tue, Sep 24, 2013 at 3:16 PM, Terry Reedy wrote: > The iterator protocol is intentionally simple. It only requires an __iter__ > method or a __next__ method with a standard __iter__ method. This makes > iterables -- and generator functions that produce iterators -- easy to > write. This is not a proposal for an addition to the iterator protocol. It is about indicating (without iterating) that the iteration order of instances of a particular class will be consistent. > A generator instance may and may not produce items in an intented order, so > a class attribute is not possible. The same is generally true of transform > iterators, like map and filter instances, and most itertools classes. It is > also not true that lists (and tuples) always have a significant order. > list(set) has the artificial order of set iteration. Both are reiterable > with the same order. Why would you call one True and the other False? In > general, list(iterable) has as much order as the iterable. However, once values are added to the list, that order is consistent.
-eric From ericsnowcurrently at gmail.com Wed Sep 25 01:44:04 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 24 Sep 2013 17:44:04 -0600 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: On Tue, Sep 24, 2013 at 5:01 PM, Nick Coghlan wrote: > There's already an implicit expectation that iterables will be *consistent* > in the absence of mutation (i.e. arbitrarily ordered rather than > unordered), but I don't see how "ordered based on container internal > details" is meaningfully different from "ordered by some external > criterion". "container internal details" is a good way to put it. "ordered" is a little too vague, isn't it. :) -eric From dreamingforward at gmail.com Wed Sep 25 03:50:44 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Tue, 24 Sep 2013 18:50:44 -0700 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: > Iterables are not necessarily ordered (e.g. dict vs. OrderedDict). > Sequences are but Sets aren't. I'm not aware of any good way > currently to know if an arbitrary iterable is ordered. Without an > explicit indicator of ordered-ness, you must know in advance for each > specific type. > > One possible solution is an __isordered__ attribute (on the class), > set to a boolean. The absence of the attribute would imply False. Isn't the traditional way to do this via "inheritance"? Then you call issubclass(list, OrderedContainer), etc. But, then, no Python hasn't completely ordered its data structures yet. Mark From shane at umbrellacode.com Wed Sep 25 04:09:56 2013 From: shane at umbrellacode.com (Shane Green) Date: Tue, 24 Sep 2013 19:09:56 -0700 Subject: [Python-ideas] Indicate if an iterable is ordered or not In-Reply-To: References: Message-ID: <54726930-A1B8-4874-B4D4-D06B8AFAB0E0@umbrellacode.com> I suppose you could support some subset of slice/index operations, with some serious limitations? 
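[Archive note: neither spelling discussed in this subthread exists in Python. Purely as a hypothetical sketch, Eric's proposed `__isordered__` flag and Mark's `issubclass(list, OrderedContainer)` idea might look like this — `OrderedContainer` and `__isordered__` are invented names from the thread, not real APIs:]

```python
from abc import ABC
from collections import OrderedDict

# Hypothetical marker ABC in the spirit of Mark's issubclass() idea:
# ordered containers are registered, and callers test with isinstance().
class OrderedContainer(ABC):
    pass

for ordered_type in (list, tuple, OrderedDict):
    OrderedContainer.register(ordered_type)

print(issubclass(list, OrderedContainer))   # True
print(isinstance(set(), OrderedContainer))  # False

# Eric's proposed class attribute, emulated with getattr(); absence of
# the flag defaults to False, as in the proposal.
def is_ordered(iterable):
    return getattr(type(iterable), '__isordered__', False)

class Recorder(list):
    __isordered__ = True  # hypothetical flag

print(is_ordered(Recorder()), is_ordered(set()))  # True False
```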
On Sep 24, 2013, at 4:01 PM, Guido van Rossum wrote: > On Tue, Sep 24, 2013 at 4:00 PM, Eric Snow wrote: >> On Tue, Sep 24, 2013 at 10:22 AM, Guido van Rossum wrote: >>> What do you want to do with this knowledge? >> >> At this point, nothing. :) I realized while writing the message that >> my use case was not helped by knowing whether or not the iterable is >> ordered. I sent the message anyway because it does seem like there's >> a gap--just not one that perhaps anyone cares about. >> >> -eric > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From abarnert at yahoo.com Wed Sep 25 05:27:36 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 24 Sep 2013 20:27:36 -0700 (PDT) Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> Message-ID: <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> From: "random832 at fastmail.us" Sent: Tuesday, September 24, 2013 10:24 AM > On Tue, Sep 24, 2013, at 12:33, Ram Rachum wrote: >> For the record, I think that having a SortedDict in the stdlib would be >> awesome. > > There are two issues with that. This discussion comes up at least once every two months, and I don't think anyone wants to have the whole discussion all over again. See http://stupidpythonideas.blogspot.com/2013/07/sorted-collections-in-stdlib.html, which I wrote one or two iterations ago to collect all of the issues, and please let me know if I missed any or you have anything to add.
Your two issues aren't really problems, just choices to be made, and I think everyone who's interested in this who has an opinion is unanimous. (There _is_ a problem, however: there are multiple good implementations out there, but none of them comes with someone who's willing to stdlibify it and maintain it for a few years?) But briefly: Yes, every key must be comparable with every other key, and the comparison must define a strict weak order, and the keys must be comparison-immutable, and there's no way to test either of those automatically. By comparison, a dict needs hashable keys, which can be tested automatically, and equality-immutable and hash-immutable keys, which can't really be tested but in practice hash is an acceptable test. But it's no worse than many other requirements in the stdlib that can't be tested automatically. And yes, NaN is a problem, but it's exactly the same problem it is everywhere else in Python. From stephen at xemacs.org Wed Sep 25 06:27:08 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 25 Sep 2013 13:27:08 +0900 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> Message-ID: <87k3i51q77.fsf@uwakimon.sk.tsukuba.ac.jp> Andrew Barnert writes: > See http://stupidpythonideas.blogspot.com/2013/07/sorted-collections-in-stdlib.html, > which I wrote one or two iterations ago to collect all of the > issues, and please let me know if I missed any or you have anything > to add. A small nit: SortedSequence and SortedDicts should be mappings, guaranteeing "fast" (preferably O(1)) access for any key (integral and arbitrary, respectively).
Therefore, in the case of a SortedDict the user should be no more surprised at a complaint about hashability than they should be in the case of a dict (especially considering the name!) I'll grant that some users might be perfectly happy with O(log N) "reasonably fast" access, but others would not be pleased. From g.brandl at gmx.net Wed Sep 25 08:59:05 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 25 Sep 2013 08:59:05 +0200 Subject: [Python-ideas] +1 button/counter for bugs.python.org In-Reply-To: <20130923101451.691dfa7b@pitrou.net> References: <20130923001239.GN19939@ando> <20130923101451.691dfa7b@pitrou.net> Message-ID: Am 23.09.2013 10:14, schrieb Antoine Pitrou: > Le Mon, 23 Sep 2013 10:12:39 +1000, > Steven D'Aprano a > ?crit : >> On Sun, Sep 22, 2013 at 05:52:58PM +0200, Tshepang Lekhonkhobe wrote: >> > On Sun, Sep 22, 2013 at 1:21 PM, anatoly techtonik >> > wrote: >> > > Does anybody think it is a good idea to personally approve good >> > > issues and messages on bugs.python.org? >> > > >> > > If yes, should it be a Google's +1 (easier to add), or a pythonic >> > > solution for Roundup? >> > >> > Is it not enough that one can subscribe to the bug? It's very easy >> > (click the '+' button, then hit subscribe). That way, one can also >> > keep track of where the conversation is going, instead of a mere >> > vote-n-forget. >> >> Exactly. >> >> I think that masses of +1 votes from people who care so little about >> an issue that they can't be bothered to add themselves to the Nosy >> list is next to worthless. > > I don't know about you, but I don't add myself to the Nosy list of > every bug that irks me on third-party software. There's no reason to > subscribe to an issue's messages when you are a mere end-user. That > doesn't mean the bug isn't affecting you. I agree. 
We don't have to call it "vote"; a second-tier nosy list would probably be the most useful thing for both sides: a button [This affects me] meaning "please count me among those who would like to see it fixed and send me an email when the issue is closed" But I think that effort may be better spent on some of the existing 98 open issues in the meta-tracker here: http://psf.upfronthosting.co.za/ unless this feature is contributed. cheers, Georg From oscar.j.benjamin at gmail.com Wed Sep 25 12:06:58 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Wed, 25 Sep 2013 11:06:58 +0100 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: On 24 September 2013 19:41, Kim Gräsman wrote: > On Sun, Aug 25, 2013 at 8:26 PM, Kim Gräsman wrote: >> Ping? >> >> Can I clarify something to move this forward? It seems like a good >> idea to me, but I don't have the history of Py_DeleteFileW -- maybe >> somebody tried this already? > > Is there a better place to look for opinions? > > I'm happy to see Python getting more link-aware on Windows, and I > think this could help getting further in that direction. Since no one has responded to this for some time I would estimate that not many people particularly dislike your idea. So feel free to open an issue about it on the tracker (after checking that there isn't already an open issue and that your problem is not already solved in the most recent release): http://bugs.python.org/ On the other hand evidently not many people are very enthusiastic about this idea so it's possible that the tracker issue will not go anywhere unless you write the patch yourself. 
Oscar From kim.grasman at gmail.com Wed Sep 25 13:01:59 2013 From: kim.grasman at gmail.com (=?ISO-8859-1?Q?Kim_Gr=E4sman?=) Date: Wed, 25 Sep 2013 13:01:59 +0200 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: Hi Oscar, On Wed, Sep 25, 2013 at 12:06 PM, Oscar Benjamin wrote: > > Since no one has responded to this for some time I would estimate that > not many people particularly dislike your idea. So feel free to open > an issue about it on the tracker (after checking that there isn't > already an open issue and that your problem is not already solved in > the most recent release): > http://bugs.python.org/ > > On the other hand evidently not many people are very enthusiastic > about this idea so it's possible that the tracker issue will not go > anywhere unless you write the patch yourself. Thanks for responding! I opened an issue before posting here: http://bugs.python.org/issue18314 I'd be happy to provide a patch, but I only want to put time into it if there's a reasonable chance it gets committed. That's why I wanted to hear if there were any objections, so I don't end up writing, testing and posting a patch only to end up in quibbles around the general idea. I'm new to Python development; would a concrete patch help move this forward? Thanks, - Kim From oscar.j.benjamin at gmail.com Wed Sep 25 13:50:14 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Wed, 25 Sep 2013 12:50:14 +0100 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: On 25 September 2013 12:01, Kim Gr?sman wrote: > Hi Oscar, > > On Wed, Sep 25, 2013 at 12:06 PM, Oscar Benjamin > wrote: >> >> Since no one has responded to this for some time I would estimate that >> not many people particularly dislike your idea. 
So feel free to open >> an issue about it on the tracker (after checking that there isn't >> already an open issue and that your problem is not already solved in >> the most recent release): >> http://bugs.python.org/ >> >> On the other hand evidently not many people are very enthusiastic >> about this idea so it's possible that the tracker issue will not go >> anywhere unless you write the patch yourself. > > Thanks for responding! > > I opened an issue before posting here: http://bugs.python.org/issue18314 Sorry, I've just looked back over this thread and I see that now. > > I'd be happy to provide a patch, but I only want to put time into it > if there's a reasonable chance it gets committed. That's why I wanted > to hear if there were any objections, so I don't end up writing, > testing and posting a patch only to end up in quibbles around the > general idea. > > I'm new to Python development; would a concrete patch help move this forward? It doesn't look like anyone else will write a patch so I don't think much will happen if you don't either. I don't know anything about junction points though so I have no idea how likely it is that a patch would be accepted. Oscar From ncoghlan at gmail.com Wed Sep 25 14:04:50 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Sep 2013 22:04:50 +1000 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: On 25 Sep 2013 21:51, "Oscar Benjamin" wrote: > > On 25 September 2013 12:01, Kim Gr?sman wrote: > > Hi Oscar, > > > > On Wed, Sep 25, 2013 at 12:06 PM, Oscar Benjamin > > wrote: > >> > >> Since no one has responded to this for some time I would estimate that > >> not many people particularly dislike your idea. 
So feel free to open > >> an issue about it on the tracker (after checking that there isn't > >> already an open issue and that your problem is not already solved in > >> the most recent release): > >> http://bugs.python.org/ > >> > >> On the other hand evidently not many people are very enthusiastic > >> about this idea so it's possible that the tracker issue will not go > >> anywhere unless you write the patch yourself. > > > > Thanks for responding! > > > > I opened an issue before posting here: http://bugs.python.org/issue18314 > > Sorry, I've just looked back over this thread and I see that now. > > > > > I'd be happy to provide a patch, but I only want to put time into it > > if there's a reasonable chance it gets committed. That's why I wanted > > to hear if there were any objections, so I don't end up writing, > > testing and posting a patch only to end up in quibbles around the > > general idea. > > > > I'm new to Python development; would a concrete patch help move this forward? > > It doesn't look like anyone else will write a patch so I don't think > much will happen if you don't either. I don't know anything about > junction points though so I have no idea how likely it is that a patch > would be accepted. My recollection is that permissions around junction points are a little weird at the Windows OS level (so the access denied might be genuine for a regular user account), but if a patch can make os.unlink handle them more like *nix symlinks, that sounds reasonable to me. Cheers, Nick. > > > Oscar > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.us Wed Sep 25 15:47:18 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 25 Sep 2013 09:47:18 -0400 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> Message-ID: <1380116838.26707.26290665.7C90A335@webmail.messagingengine.com> On Tue, Sep 24, 2013, at 23:27, Andrew Barnert wrote: > Yes, every key must be comparable with every other key, and the > comparison must define a strict weak order, and the keys must be > comparison-immutable, and there's no way to test either of those > automatically. By comparison, a dict needs hashable keys, which can be > tested automatically, and equality-immutable and hash-immutable keys, > which can't really be tested but in practice hash is an acceptable test. I think of this as part of the hashable protocol, whereas we know that lists are orderable despite being mutable. > But it's no worse?than many other requirements in the stdlib that can't > be tested automatically. > > And yes, NaN is a problem, but it's exactly the same problem it is > everywhere else in Python. I was serious about wanting to know how dictionaries handle NaN as a key. Is it a special case? The obvious way of implementing it would conclude it is a hash collision but not a match. I notice that Decimal('NaN') and float nan don't match each other (as do any other float/Decimal with the same value) but they do both work as dictionary keys. 
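[The behaviour being asked about here is easy to check directly. A minimal sketch (plain Python 3; in CPython and PyPy, dict lookup tries identity before falling back to equality, which is what makes a NaN key usable at all):]

```python
nan = float('nan')

# NaN compares unequal to everything, including itself.
assert nan != nan

# But dict lookup checks identity before equality, so the very same
# NaN object can still be stored and retrieved as a key.
d = {nan: 2}
assert d[nan] == 2

# A *different* NaN object is neither identical nor equal to the
# stored key, so the lookup misses with KeyError.
try:
    d[float('nan')]
    hit = True
except KeyError:
    hit = False
assert hit is False
```

[This is the same identity-first behaviour behind Oscar's set example in the next message, where `{float('nan'), float('nan')}` keeps two elements but `{a, a}` keeps one.]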
From oscar.j.benjamin at gmail.com Wed Sep 25 15:53:43 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Wed, 25 Sep 2013 14:53:43 +0100 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <1380116838.26707.26290665.7C90A335@webmail.messagingengine.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1380116838.26707.26290665.7C90A335@webmail.messagingengine.com> Message-ID: On 25 September 2013 14:47, wrote: >> And yes, NaN is a problem, but it's exactly the same problem it is >> everywhere else in Python. > > I was serious about wanting to know how dictionaries handle NaN as a > key. Is it a special case? The obvious way of implementing it would > conclude it is a hash collision but not a match. I notice that > Decimal('NaN') and float nan don't match each other (as do any other > float/Decimal with the same value) but they do both work as dictionary > keys. They're effectively compared by identity: >>> {float('nan'), float('nan')} set([nan, nan]) >>> a = float('nan') >>> {a, a} set([nan]) Oscar From vernondcole at gmail.com Wed Sep 25 16:11:41 2013 From: vernondcole at gmail.com (Vernon D. Cole) Date: Wed, 25 Sep 2013 15:11:41 +0100 Subject: [Python-ideas] Subject: Re: `OrderedDict.sort` Message-ID: > > > > And yes, NaN is a problem, but it's exactly the same problem it is > > everywhere else in Python. > > I was serious about wanting to know how dictionaries handle NaN as a > key. Is it a special case? The obvious way of implementing it would > conclude it is a hash collision but not a match. I notice that > Decimal('NaN') and float nan don't match each other (as do any other > float/Decimal with the same value) but they do both work as dictionary > keys. 
> NaN is, by definition, never equal to another NaN, which is why the following happens: >>> nan = float('NAN') >>> n2 = nan >>> n2 == nan False >>> n2 is nan True It turns out that many other things which I never thought about before can be dictionary keys... >>> d = {'a':1, nan: 2} >>> d[n2] 2 >>> d[NotImplemented] = 3 >>> d[...] = 4 >>> d[None] = 5 >>> d[True] = 6 >>> d[False] = 7 >>> d {'a': 1, nan: 2, False: 7, True: 6, NotImplemented: 3, Ellipsis: 4, None: 5} -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Sep 25 18:02:21 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 25 Sep 2013 09:02:21 -0700 Subject: [Python-ideas] Subject: Re: `OrderedDict.sort` In-Reply-To: References: Message-ID: On Sep 25, 2013, at 7:11, "Vernon D. Cole" wrote: >> >> > And yes, NaN is a problem, but it's exactly the same problem it is >> > everywhere else in Python. >> >> I was serious about wanting to know how dictionaries handle NaN as a >> key. Is it a special case? The obvious way of implementing it would >> conclude it is a hash collision but not a match. I notice that >> Decimal('NaN') and float nan don't match each other (as do any other >> float/Decimal with the same value) but they do both work as dictionary >> keys. > > NaN is, by definition, never equal to another NaN, which is why the following happens: > > >>> nan = float('NAN') > >>> n2 = nan > >>> n2 == nan > False > >>> n2 is nan > True > > It turns out that many other things which I never thought about before can be dictionary keys... > > >>> d = {'a':1, nan: 2} > >>> d[n2] > 2 > >>> d[NotImplemented] = 3 > >>> d[...] = 4 > >>> d[None] = 5 > >>> d[True] = 6 > >>> d[False] = 7 > >>> d > {'a': 1, nan: 2, False: 7, True: 6, NotImplemented: 3, Ellipsis: 4, None: 5} While some of these are odd things to use as keys, they don't have any odd behavior with equality except nan. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abarnert at yahoo.com Wed Sep 25 17:59:20 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 25 Sep 2013 08:59:20 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <87k3i51q77.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> <87k3i51q77.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <09D64CCF-A19D-489D-BDF6-045D5BA16C8A@yahoo.com> On Sep 24, 2013, at 21:27, "Stephen J. Turnbull" wrote: > Andrew Barnert writes: > >> See http://stupidpythonideas.blogspot.com/2013/07/sorted-collections-in-stdlib.html, >> which I wrote one or two iterations ago to collect all of the >> issues, and please let me know if I missed any or you have anything >> to add. > > A small nit: SortedSequence and SortedDicts should be mappings, > guaranteeing "fast" (preferably O(1)) access for any key (integral and > arbitrary, respectively). Therefore, in the case of a SortedDict the > user should be no more surprised at a complaint about hashability than > they should be in the case of a dict (especially considering the name!) > > I'll grant that some users might be perfectly happy with O(log N) > "reasonably fast" access, but others would not be pleased. O(log N) is fast enough for the standard mappings in C++, Java, etc., are python users more demanding of performance than C++? I don't know of any language that has a SortedAndHashedDict in it's stdlib, but there are many that have a SortedDict based on a tree. I don't know of any modules on PyPI that offer the former, but multiple popular modules offer the latter. 
Also, given a SortedSequence and a dict, you can trivially build a SortedAndHashedDict if you really want it for something; without SortedSequence, you can't. The other way around isn't true; if you want a SortedDict, without the time and space and requirements burden, a SortedAndHashedSequence is no help. If you think the name SortedDict is misleading, we could call it something different, with fewer implications. But, given that libraries like blist generally offer the type under a name like SortedDict, and in other languages that offer both tree-based and hash-based collections the names are always parallel (like map and unordered_map in C++), I don't think this is a problem. From abarnert at yahoo.com Wed Sep 25 18:35:14 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 25 Sep 2013 09:35:14 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <1380116838.26707.26290665.7C90A335@webmail.messagingengine.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1380116838.26707.26290665.7C90A335@webmail.messagingengine.com> Message-ID: On Sep 25, 2013, at 6:47, random832 at fastmail.us wrote: > On Tue, Sep 24, 2013, at 23:27, Andrew Barnert wrote: >> Yes, every key must be comparable with every other key, and the >> comparison must define a strict weak order, and the keys must be >> comparison-immutable, and there's no way to test either of those >> automatically. By comparison, a dict needs hashable keys, which can be >> tested automatically, and equality-immutable and hash-immutable keys, >> which can't really be tested but in practice hash is an acceptable test. 
> > I think of this as part of the hashable protocol, whereas we know that > lists are orderable despite being mutable. Please read the blog post rather than the one-line summary if you want to discuss the contents. >> But it's no worse than many other requirements in the stdlib that can't >> be tested automatically. >> >> And yes, NaN is a problem, but it's exactly the same problem it is >> everywhere else in Python. > > I was serious about wanting to know how dictionaries handle NaN as a > key. Is it a special case? The obvious way of implementing it would > conclude it is a hash collision but not a match. I believe that, at least in CPython and PyPy, a hash collision is a match if they're identical or equal, which is why NaN values work, and why float("nan") and Decimal("nan") aren't matches, and so on. But is there anything in the documentation that requires this, or is it just a side effect of implementation specifics? I don't know. > I notice that > Decimal('NaN') and float nan don't match each other (as do any other > float/Decimal with the same value) but they do both work as dictionary > keys. From ethan at stoneleaf.us Wed Sep 25 18:53:47 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 25 Sep 2013 09:53:47 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <09D64CCF-A19D-489D-BDF6-045D5BA16C8A@yahoo.com> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> <87k3i51q77.fsf@uwakimon.sk.tsukuba.ac.jp> <09D64CCF-A19D-489D-BDF6-045D5BA16C8A@yahoo.com> Message-ID: <5243151B.5030904@stoneleaf.us> On 09/25/2013 08:59 AM, Andrew Barnert wrote: > On Sep 24, 2013, at 21:27, "Stephen J. 
Turnbull" wrote: >> >> I'll grant that some users might be perfectly happy with O(log N) >> "reasonably fast" access, but others would not be pleased. > > O(log N) is fast enough for the standard mappings in C++, Java, etc., are python users more demanding of performance than C++? I admit I know next to nothing about C++ and Java, but in Python the dict is ubiquitous: modules have them, classes have them, nearly every user defined instance has them, they're passed into functions, they're used for dispatch tables, etc., etc.. So I suspect that Python is more demanding of its mapping than the others are. -- ~Ethan~ From abarnert at yahoo.com Wed Sep 25 21:29:51 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 25 Sep 2013 12:29:51 -0700 Subject: [Python-ideas] `OrderedDict.sort` In-Reply-To: <5243151B.5030904@stoneleaf.us> References: <20130924121315.GI7989@ando> <993eee00-84f4-4219-9637-797fd995055a@googlegroups.com> <52419EF5.1070305@stoneleaf.us> <5241B478.5060605@egenix.com> <35E8DF33-71BC-4831-8B3B-4A9695613A75@yahoo.com> <1380043495.6261.25918509.61E22EA3@webmail.messagingengine.com> <1380079656.50137.YahooMailNeo@web184704.mail.ne1.yahoo.com> <87k3i51q77.fsf@uwakimon.sk.tsukuba.ac.jp> <09D64CCF-A19D-489D-BDF6-045D5BA16C8A@yahoo.com> <5243151B.5030904@stoneleaf.us> Message-ID: <14468CAA-FD00-42BE-9C99-7D417409FAE7@yahoo.com> On Sep 25, 2013, at 9:53, Ethan Furman wrote: > On 09/25/2013 08:59 AM, Andrew Barnert wrote: >> On Sep 24, 2013, at 21:27, "Stephen J. Turnbull" wrote: >>> >>> I'll grant that some users might be perfectly happy with O(log N) >>> "reasonably fast" access, but others would not be pleased. >> >> O(log N) is fast enough for the standard mappings in C++, Java, etc., are python users more demanding of performance than C++? 
> > I admit I know next to nothing about C++ and Java, but in Python the dict is ubiquitous: modules have them, classes have them, nearly every user defined instance has them, they're passed into functions, they're used for dispatch tables, etc., etc.. > > So I suspect that Python is more demanding of its mapping than the others are. Nobody is suggesting replacing dict with a tree-based mapping, just adding one in the collections module for the use cases where it's what you want. From tjreedy at udel.edu Wed Sep 25 22:51:24 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 25 Sep 2013 16:51:24 -0400 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: On 9/25/2013 7:01 AM, Kim Gr?sman wrote: > I opened an issue before posting here: http://bugs.python.org/issue18314 I added as nosy the two Windows experts listed in http://docs.python.org/devguide/experts.html#experts I suspect that at least one of them knows enough about junction points to review a patch *were you to write one*. > I'd be happy to provide a patch, but I only want to put time into it > if there's a reasonable chance it gets committed. That's why I wanted > to hear if there were any objections, so I don't end up writing, > testing and posting a patch only to end up in quibbles around the > general idea. It is possible that one of the two might have an opinion to the contrary, but after reading the Wikipedia article, https://en.wikipedia.org/wiki/NTFS_junction_point It seems that you ought to be able to delete junction points from Python. That is no guarantee that any particular patch will be accepted. > I'm new to Python development; would a concrete patch help move this forward? Definitely. I added a note to the issue about testing and Windows versions. 
-- Terry Jan Reedy From kim.grasman at gmail.com Thu Sep 26 07:37:25 2013 From: kim.grasman at gmail.com (=?ISO-8859-1?Q?Kim_Gr=E4sman?=) Date: Thu, 26 Sep 2013 07:37:25 +0200 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: Hi Nick, On Wed, Sep 25, 2013 at 2:04 PM, Nick Coghlan wrote: > > My recollection is that permissions around junction points are a little > weird at the Windows OS level (so the access denied might be genuine for a > regular user account), but if a patch can make os.unlink handle them more > like *nix symlinks, that sounds reasonable to me. Thanks for the heads-up! I haven't observed any differences on XP or Windows 7. Now I'm stuck in an organization where they force all command prompts to be elevated, so it's been a while since I was able to test the more normal cases. - Kim From kim.grasman at gmail.com Thu Sep 26 07:38:49 2013 From: kim.grasman at gmail.com (=?ISO-8859-1?Q?Kim_Gr=E4sman?=) Date: Thu, 26 Sep 2013 07:38:49 +0200 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: Hi Oscar, On Wed, Sep 25, 2013 at 1:50 PM, Oscar Benjamin wrote: > >> I'm new to Python development; would a concrete patch help move this forward? > > It doesn't look like anyone else will write a patch so I don't think > much will happen if you don't either. I don't know anything about > junction points though so I have no idea how likely it is that a patch > would be accepted. I'll have to cook it up and see. I figured I'd run it by the community to see if anyone had considered it before, at least. Thanks! 
- Kim From kim.grasman at gmail.com Thu Sep 26 07:40:47 2013 From: kim.grasman at gmail.com (=?ISO-8859-1?Q?Kim_Gr=E4sman?=) Date: Thu, 26 Sep 2013 07:40:47 +0200 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: Hi Terry, On Wed, Sep 25, 2013 at 10:51 PM, Terry Reedy wrote: > >> I opened an issue before posting here: http://bugs.python.org/issue18314 > > > I added as nosy the two Windows experts listed in > http://docs.python.org/devguide/experts.html#experts > > I suspect that at least one of them knows enough about junction points to > review a patch *were you to write one*. OK, I'll get to it when I find time. Last time I looked at it, it seemed pretty trivial, but I need to get the development environment for Python up. > Definitely. I added a note to the issue about testing and Windows versions. Thanks for your help! - Kim From p.f.moore at gmail.com Thu Sep 26 09:23:20 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 26 Sep 2013 08:23:20 +0100 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: On 26 September 2013 06:37, Kim Gr?sman wrote: > I haven't observed any differences on XP or Windows 7. Now I'm stuck > in an organization where they force all command prompts to be > elevated, so it's been a while since I was able to test the more > normal cases. Er, does this not already work? >From an elevated Powershell prompt: PS 08:20 C:\Work\Scratch >new-symlink symps .\ps.vim Mode LastWriteTime Length Name ---- ------------- ------ ---- -a--- 26/09/2013 08:20 symps [C:\Work\Scratch\ps.vim] PS 08:20 C:\Work\Scratch >type symps set shell=powershell set shellcmdflag=-c set shellquote=\" set shellxquote= >From a non-elevated prompt: PS 08:20 C:\Work\Scratch >py Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> import os >>> os.unlink('symps') >>> ^Z PS 08:21 C:\Work\Scratch >type symps type : Cannot find path 'C:\Work\Scratch\symps' because it does not exist. At line:1 char:1 + type symps + ~~~~~~~~~~ + CategoryInfo : ObjectNotFound: (C:\Work\Scratch\symps:String) [Get-Content], ItemNotFoundException + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetContentCommand Paul From random832 at fastmail.us Thu Sep 26 14:51:55 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Thu, 26 Sep 2013 08:51:55 -0400 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: <1380199915.23284.26724813.20340F46@webmail.messagingengine.com> On Thu, Sep 26, 2013, at 3:23, Paul Moore wrote: > On 26 September 2013 06:37, Kim Gr?sman wrote: > > I haven't observed any differences on XP or Windows 7. Now I'm stuck > > in an organization where they force all command prompts to be > > elevated, so it's been a while since I was able to test the more > > normal cases. > > Er, does this not already work? Symlinks and junction points are not actually the same thing. From p.f.moore at gmail.com Thu Sep 26 18:04:31 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 26 Sep 2013 17:04:31 +0100 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: <1380199915.23284.26724813.20340F46@webmail.messagingengine.com> References: <1380199915.23284.26724813.20340F46@webmail.messagingengine.com> Message-ID: On 26 September 2013 13:51, wrote: >> Er, does this not already work? > > Symlinks and junction points are not actually the same thing. Sorry, I misread an earlier comment. You're right. Sorry for the noise. Paul From davidhalter88 at gmail.com Thu Sep 26 20:11:35 2013 From: davidhalter88 at gmail.com (David Halter) Date: Thu, 26 Sep 2013 22:41:35 +0430 Subject: [Python-ideas] Should we improve `dir`? 
In-Reply-To: References: Message-ID: Sorry for answering so late, but I've stayed in a very rural area of Afghanistan and enjoyed my life :-) I also realized that this discussion has been removed from python-ideas, sorry! 2013/9/15 Nick Coghlan > On 15 September 2013 16:06, David Halter wrote: > > > > 2013/9/15 Nick Coghlan > >> > >> If introspection tools want to show all the operations available *on the > >> class*, then they need to include "dir(type(cls))" as well. So there > may be > >> a legitimate feature request for a new section in the pydoc output > showing > >> "class only" methods and attributes. > > > > > > How about adding a keyword argument to `dir`: ``dir(object, > > with_class_methods=False)``? > > > > I get that there are compatibility issues with changing the default `dir` > > functionality. But at the same time adding such an option could make it > > easier for beginners, why type attributes are not being listed (because > one > > could read that in the `dir` docstring). > > It's actually the metaclass methods/attributes that are missing. The > trick with dir is it *stops at the class*, and thus always leaves out > the metaclass. While in a important sense "classes are just objects", > attribute access is a critical area where they're *different* from > most other objects, because they play different roles in the > descriptor protocol. > > That means the question is whether it is worth adding an appropriate > flag to dir(), over updating introspection tools (like IDLE's tab > completion as Chris points out) to consider "dir(type(cls))" when > appropriate. > > "dir" currently works roughly as follows for instances: > > - check the instance > - check the class MRO > > And for classes: > > - check the class MRO > > If an "include_metaclass" flag is added, then setting it to True has > an obvious meaning for classes: > > - check the class MRO > - check the metaclass MRO > > But what does "include_metaclass=True" mean for instances? 
You can't > access metaclass attributes and methods from an instance - the > attribute lookup only traverses one step. So, it could be reasonable > to have "include_metaclass=True" do nothing for instances, and only > change dir() behaviour for classes. > Good point. I haven't thought about this, but an "include_metaclass" option for dir can also be quite confusing (in the case of instances). > On the other hand, if the flag was called "include_class", then it > would need to be tri-valued: > > None: use appropriate default based on the kind of object > True: default for instances, forces inclusion of the metaclass MRO for > classes > False: default for classes, forces omission of the class MRO (and > thus all descriptors) for instances > > Alternatively, if we don't change dir() at all and just document that > getting a complete list of attributes means doing "sorted(set(dir(obj) > + dir(type(obj))))", we'd have something that works for all versions of > Python, rather than something that was only available in 3.4+: > Yes, IMHO that's the least we should do. But I would strongly suggest to adjust the `dir` method docstrings (not only the online docs). I think that the current documentation really needs improvement (it is quite confusing now). >>> def full_dir(obj): > ... return sorted(set(dir(obj) + dir(type(obj)))) > ... > >>> len(set(full_dir(1)) - set(dir(1))) > 0 > >>> len(set(full_dir(int)) - set(dir(int))) > 19 > >>> len(set(dir(type)) - set(dir(object))) > 19 > > That's why my preference is for the latter approach - this isn't new > behaviour, and it's introspection tools that don't handle metaclasses > properly that need updating, rather than changing the dir() builtin. Well, I would still opt for changing dir, but I can understand that that would cause serious backwards-compatibility issues. If you would do that, it would be something for a Python 4 (and we're not even close to that). So for now that really leaves us with documenting it better. 
I don't really like the "include_metaclass" option. Maybe your "include_class" might make a little bit more sense. But even that one would complicate things. Cheers! Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Fri Sep 27 12:07:18 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 27 Sep 2013 13:07:18 +0300 Subject: [Python-ideas] pprint in displayhook Message-ID: What do you think about using pprint.pprint() to output the result of evaluating an expression entered in an interactive Python session (and in IDLE)? From solipsis at pitrou.net Fri Sep 27 12:15:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 27 Sep 2013 12:15:18 +0200 Subject: [Python-ideas] pprint in displayhook References: Message-ID: <20130927121518.6a2d7bcb@pitrou.net> Le Fri, 27 Sep 2013 13:07:18 +0300, Serhiy Storchaka a écrit : > What do you think about using pprint.pprint() to output the result > of evaluating an expression entered in an interactive Python session > (and in IDLE)? I'm not sure I like this idea. AFAICT pprint() isn't bullet-proof, and it would be a pain to debug if it failed to display some objects properly. Regards Antoine. From storchaka at gmail.com Fri Sep 27 13:15:13 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 27 Sep 2013 14:15:13 +0300 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <20130927121518.6a2d7bcb@pitrou.net> References: <20130927121518.6a2d7bcb@pitrou.net> Message-ID: 27.09.13 13:15, Antoine Pitrou написав(ла): > Le Fri, 27 Sep 2013 13:07:18 +0300, > Serhiy Storchaka a > écrit : >> What do you think about using pprint.pprint() to output the result >> of evaluating an expression entered in an interactive Python session >> (and in IDLE)? > > I'm not sure I like this idea. AFAICT pprint() isn't bullet-proof, > and it would be a pain to debug if it failed to display some objects > properly.
We can set displayhook in site.py and for debug restore it from sys.__displayhook__. This is no more painful than using readline and enabling completion by default. From solipsis at pitrou.net Fri Sep 27 13:57:13 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 27 Sep 2013 13:57:13 +0200 Subject: [Python-ideas] pprint in displayhook References: <20130927121518.6a2d7bcb@pitrou.net> Message-ID: <20130927135713.552f9e84@pitrou.net> Le Fri, 27 Sep 2013 14:15:13 +0300, Serhiy Storchaka a écrit : > 27.09.13 13:15, Antoine Pitrou написав(ла): > > Le Fri, 27 Sep 2013 13:07:18 +0300, > > Serhiy Storchaka a > > écrit : > >> What do you think about using pprint.pprint() to output the result > >> of evaluating an expression entered in an interactive Python > >> session (and in IDLE)? > > > > I'm not sure I like this idea. AFAICT pprint() isn't bullet-proof, > > and it would be a pain to debug if it failed to display some objects > > properly. > > We can set displayhook in site.py and for debug restore it from > sys.__displayhook__. This is no more painful than using readline and > enabling completion by default. :-) I don't know, I'll let other people experiment with it. Regards Antoine. From ncoghlan at gmail.com Fri Sep 27 14:40:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Sep 2013 22:40:13 +1000 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <20130927135713.552f9e84@pitrou.net> References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> Message-ID: On 27 Sep 2013 21:58, "Antoine Pitrou" wrote: > > Le Fri, 27 Sep 2013 14:15:13 +0300, > Serhiy Storchaka a > écrit : > > 27.09.13 13:15, Antoine Pitrou написав(ла): > > > Le Fri, 27 Sep 2013 13:07:18 +0300, > > > Serhiy Storchaka a > > > écrit : > > >> What do you think about using pprint.pprint() to output the result > > >> of evaluating an expression entered in an interactive Python > > >> session (and in IDLE)? > > > > > > I'm not sure I like this idea.
AFAICT pprint() isn't bullet-proof, > > > and it would be a pain to debug if it failed to display some objects > > > properly. > > > > We can set displayhook in site.py and for debug restore it from > > sys.__displayhook__. This is no more painful than using readline and > > enabling completion by default. > > :-) I don't know, I'll let other people experiment with it. displayhook uses repr by default. Even normal print would make numbers and numeric string output ambiguous. Cheers, Nick. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Fri Sep 27 14:47:42 2013 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 27 Sep 2013 08:47:42 -0400 Subject: [Python-ideas] pprint in displayhook In-Reply-To: References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> Message-ID: <52457E6E.9040809@trueblade.com> On 9/27/2013 8:40 AM, Nick Coghlan wrote: > > On 27 Sep 2013 21:58, "Antoine Pitrou" > wrote: >> >> Le Fri, 27 Sep 2013 14:15:13 +0300, >> Serhiy Storchaka > a >> écrit : >> > 27.09.13 13:15, Antoine Pitrou написав(ла): >> > > Le Fri, 27 Sep 2013 13:07:18 +0300, >> > > Serhiy Storchaka > a >> > > écrit : >> > >> What do you think about using pprint.pprint() to output the result >> > >> of evaluating an expression entered in an interactive Python >> > >> session (and in IDLE)? >> > > >> > > I'm not sure I like this idea. AFAICT pprint() isn't bullet-proof, >> > > and it would be a pain to debug if it failed to display some objects >> > > properly. >> > >> > We can set displayhook in site.py and for debug restore it from >> > sys.__displayhook__. This is no more painful than using readline and >> > enabling completion by default. >> >> :-) I don't know, I'll let other people experiment with it.
> > displayhook uses repr by default. Even normal print would make numbers > and numeric string output ambiguous. Wouldn't this also invalidate the millions (I'm guessing) of examples in blogs, how-tos, etc. that show interactive command line examples? I'm sympathetic, but I don't think it's worth it. -- Eric. From solipsis at pitrou.net Fri Sep 27 15:05:50 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 27 Sep 2013 15:05:50 +0200 Subject: [Python-ideas] pprint in displayhook References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> <52457E6E.9040809@trueblade.com> Message-ID: <20130927150550.2a60cf96@pitrou.net> Le Fri, 27 Sep 2013 08:47:42 -0400, "Eric V. Smith" a écrit : > On 9/27/2013 8:40 AM, Nick Coghlan wrote: > > > > On 27 Sep 2013 21:58, "Antoine Pitrou" > > wrote: > >> > >> Le Fri, 27 Sep 2013 14:15:13 +0300, > >> Serhiy Storchaka >> > a écrit : > >> > 27.09.13 13:15, Antoine Pitrou написав(ла): > >> > > Le Fri, 27 Sep 2013 13:07:18 +0300, > >> > > Serhiy Storchaka >> > > > a écrit : > >> > >> What do you think about using pprint.pprint() to output the > >> > >> result of evaluating an expression entered in an interactive > >> > >> Python session (and in IDLE)? > >> > > > >> > > I'm not sure I like this idea. AFAICT pprint() isn't > >> > > bullet-proof, and it would be a pain to debug if it failed to > >> > > display some objects properly. > >> > > >> > We can set displayhook in site.py and for debug restore it from > >> > sys.__displayhook__. This is no more painful than using readline > >> > and enabling completion by default. > >> > >> :-) I don't know, I'll let other people experiment with it. > > > > displayhook uses repr by default. Even normal print would make > > numbers and numeric string output ambiguous. > > Wouldn't this also invalidate the millions (I'm guessing) of examples > in blogs, how-tos, etc. that show interactive command line examples? > I'm sympathetic, but I don't think it's worth it.
Oh and how about... doctest? :-) From storchaka at gmail.com Fri Sep 27 15:52:09 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 27 Sep 2013 16:52:09 +0300 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <20130927135713.552f9e84@pitrou.net> References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> Message-ID: 27.09.13 14:57, Antoine Pitrou написав(ла): > Le Fri, 27 Sep 2013 14:15:13 +0300, > Serhiy Storchaka a > écrit : >> 27.09.13 13:15, Antoine Pitrou написав(ла): >>> Le Fri, 27 Sep 2013 13:07:18 +0300, >>> Serhiy Storchaka a >>> écrit : >>>> What do you think about using pprint.pprint() to output the result >>>> of evaluating an expression entered in an interactive Python >>>> session (and in IDLE)? >>> >>> I'm not sure I like this idea. AFAICT pprint() isn't bullet-proof, >>> and it would be a pain to debug if it failed to display some objects >>> properly. >> >> We can set displayhook in site.py and for debug restore it from >> sys.__displayhook__. This is no more painful than using readline and >> enabling completion by default. > > :-) I don't know, I'll let other people experiment with it. http://bugs.python.org/issue19103 Of course we should first resolve some other pprint-related issues (i.e. #19100, #19104). From storchaka at gmail.com Fri Sep 27 15:55:12 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 27 Sep 2013 16:55:12 +0300 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <20130927150550.2a60cf96@pitrou.net> References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> <52457E6E.9040809@trueblade.com> <20130927150550.2a60cf96@pitrou.net> Message-ID: 27.09.13 16:05, Antoine Pitrou написав(ла): >> Wouldn't this also invalidate the millions (I'm guessing) of examples >> in blogs, how-tos, etc. that show interactive command line examples? >> I'm sympathetic, but I don't think it's worth it. > > Oh and how about... doctest?
:-) Doctest restores sys.__displayhook__. From solipsis at pitrou.net Fri Sep 27 15:59:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 27 Sep 2013 15:59:51 +0200 Subject: [Python-ideas] pprint in displayhook References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> <52457E6E.9040809@trueblade.com> <20130927150550.2a60cf96@pitrou.net> Message-ID: <20130927155951.63623b10@pitrou.net> Le Fri, 27 Sep 2013 16:55:12 +0300, Serhiy Storchaka a écrit : > 27.09.13 16:05, Antoine Pitrou написав(ла): > >> Wouldn't this also invalidate the millions (I'm guessing) of > >> examples in blogs, how-tos, etc. that show interactive command > >> line examples? I'm sympathetic, but I don't think it's worth it. > > > > Oh and how about... doctest? :-) > > Doctest restores sys.__displayhook__. I'm thinking more about the consistency of doctest output with actual interpreter output. One of the selling points of doctest is that it helps showcase API behaviour by showing interactive interpreter snippets. If the actual output starts being different, it might confuse people. Regards Antoine. From storchaka at gmail.com Fri Sep 27 15:57:29 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 27 Sep 2013 16:57:29 +0300 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <52457E6E.9040809@trueblade.com> References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> <52457E6E.9040809@trueblade.com> Message-ID: 27.09.13 15:47, Eric V. Smith написав(ла): > Wouldn't this also invalidate the millions (I'm guessing) of examples in > blogs, how-tos, etc. that show interactive command line examples? I'm > sympathetic, but I don't think it's worth it. Only if they are too long.
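Nick's point a few messages up — that the default displayhook uses repr(), and plain print would make numbers and numeric strings ambiguous — is easy to demonstrate:

```python
# str() collapses the number 42 and the string "42" to the same text,
# which is what print() would show at the interactive prompt:
assert str(42) == str("42") == "42"

# repr() preserves the quotes, so the default displayhook stays unambiguous:
assert repr(42) == "42"
assert repr("42") == "'42'"
```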
From storchaka at gmail.com Fri Sep 27 16:19:55 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 27 Sep 2013 17:19:55 +0300 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <20130927155951.63623b10@pitrou.net> References: <20130927121518.6a2d7bcb@pitrou.net> <20130927135713.552f9e84@pitrou.net> <52457E6E.9040809@trueblade.com> <20130927150550.2a60cf96@pitrou.net> <20130927155951.63623b10@pitrou.net> Message-ID: 27.09.13 16:59, Antoine Pitrou написав(ла): > Le Fri, 27 Sep 2013 16:55:12 +0300, > Serhiy Storchaka a > écrit : >> 27.09.13 16:05, Antoine Pitrou написав(ла): >>>> Wouldn't this also invalidate the millions (I'm guessing) of >>>> examples in blogs, how-tos, etc. that show interactive command >>>> line examples? I'm sympathetic, but I don't think it's worth it. >>> >>> Oh and how about... doctest? :-) >> >> Doctest restores sys.__displayhook__. > > I'm thinking more about the consistency of doctest output with actual > interpreter output. One of the selling points of doctest is that it > helps showcase API behaviour by showing interactive interpreter > snippets. If the actual output starts being different, it might confuse > people. Most doctests are significantly shorter than 80 columns. Actually a doctest can't be longer without violating PEP 8. From g.brandl at gmx.net Fri Sep 27 18:55:14 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 27 Sep 2013 18:55:14 +0200 Subject: [Python-ideas] pprint in displayhook In-Reply-To: References: Message-ID: Am 27.09.2013 12:07, schrieb Serhiy Storchaka: > What do you think about using pprint.pprint() to output the result of > evaluating an expression entered in an interactive Python session (and > in IDLE)? > This is something users can set in their sitecustomize.py; for various reasons people have already mentioned it is not a sensible choice for default interactive interpreters.
It might be different for IDLE; I don't know how faithfully it follows the interactive interpreter in other regards. Georg From bruce at leapyear.org Fri Sep 27 19:02:43 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 27 Sep 2013 10:02:43 -0700 Subject: [Python-ideas] pprint in displayhook In-Reply-To: References: Message-ID: It's a great idea to be able to do this. Fortunately, you already can. Changing the Python default is a terrible idea for all the other reasons people mentioned, not least of which is the fact that pprint doesn't work for all inputs. I suggest writing a recipe that provides a pprint-ish replacement for repr. I suggest producing results that are identical to repr with only two exceptions: (1) adding whitespace, (2) adding line breaks to strings. While I would never add this by default, there are certainly times where I would prefer prettier output. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Sep 27 21:25:15 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Sep 2013 15:25:15 -0400 Subject: [Python-ideas] pprint in displayhook In-Reply-To: References: Message-ID: On 9/27/2013 12:55 PM, Georg Brandl wrote: > Am 27.09.2013 12:07, schrieb Serhiy Storchaka: >> What are you think about using pprint.pprint() to output the result of >> evaluating an expression entered in an interactive Python session (and >> in IDLE)? >> > > This is something users can set in their sitecustomize.py; for various > reasons people have already mentioned it is not a sensible choice for > default interactive interpreters. I agree. The default interpreter cannot be configured on the fly. > It might be different for IDLE; It has a menu for both one-time actions and changing defaults. I think a menu item and hot-key to re-display the last output object with pprint would be a nice little feature. 
The object remains bound to '_' in the user process, so executing "pprint.pprint(_)" should be possible. (The minor problem is that even if pprint is loaded in sys.modules on startup, it might not be in the user global namespace.) Without seeing a bug-fixed pprint in action, I could not be sure about turning on pprint for all output. It is not needed often. The sitecustomize option would work for a permanent change. > I don't know how faithfully it follows > the interactive interpreter in other regards. We try to have it act the same except where there is a good reason not to. -- Terry Jan Reedy From steve at pearwood.info Sat Sep 28 02:16:13 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Sep 2013 10:16:13 +1000 Subject: [Python-ideas] pprint in displayhook In-Reply-To: References: Message-ID: <20130928001613.GP7989@ando> On Fri, Sep 27, 2013 at 01:07:18PM +0300, Serhiy Storchaka wrote: > What do you think about using pprint.pprint() to output the result of > evaluating an expression entered in an interactive Python session (and > in IDLE)? Well, let's try it and see... py> L = list(range(50)) py> print(L) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49] py> pprint.pprint(L) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49] That would be an Absolutely Not. However, if somebody wanted to give pprint some attention to make it actually pretty print, that would be very welcome.
-- Steven From raymond.hettinger at gmail.com Sat Sep 28 06:17:14 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 27 Sep 2013 21:17:14 -0700 Subject: [Python-ideas] pprint in displayhook In-Reply-To: References: Message-ID: <0BFAAF4A-F5C8-48FA-9C82-1B60D164A033@gmail.com> On Sep 27, 2013, at 3:07 AM, Serhiy Storchaka wrote: > What are you think about using pprint.pprint() to output the result of evaluating an expression entered in an interactive Python session (and in IDLE)? This might be a reasonable idea if pprint were in better shape. I think substantial work needs to be done on it, before it would be worthy of becoming the default method of display. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sat Sep 28 06:44:46 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 28 Sep 2013 07:44:46 +0300 Subject: [Python-ideas] Python 3.4 should include docopt as-is Message-ID: This - http://docopt.org/ - should be included with Python 3.4 distribution. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sat Sep 28 07:19:44 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 28 Sep 2013 08:19:44 +0300 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames Message-ID: FILE = os.path.abspath(__file__) DIR = os.path.abspath(os.path.dirname(__file__)) ? Repeated pattern for referencing resources relative to your scripts. Ideas about alternative names / locations are welcome. In PHP these are __FILE__ and __DIR__. For Python 3 adding __dir__ is impossible, because the name clashes with __dir__ method (which is not implemented for module object, but should be [ ] for consistency). Also current __file__ is rarely absolute path, because it is never normalized [ ]. 
So it will be nice to see normalization of Python file name after the import to reduce mess and make its behaviour predictable - http://stackoverflow.com/questions/7116889/python-file-attribute-absolute-or-relative ----[ possible spec. draft for a beautiful internal structure ]-- The Python interpreter should provide run-time information about: 1. order of import sequence 2. names of imported modules 3. unique location for each imported module which unambiguously identifies it 4. run-time import dependency tree (not sure about this, but it can help with debugging) 5. information about sys.path entry where this module was imported from 6. information about who and when added this sys.path entry -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Sep 28 09:24:43 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 28 Sep 2013 03:24:43 -0400 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <20130928001613.GP7989@ando> References: <20130928001613.GP7989@ando> Message-ID: On 9/27/2013 8:16 PM, Steven D'Aprano wrote: > On Fri, Sep 27, 2013 at 01:07:18PM +0300, Serhiy Storchaka wrote: >> What are you think about using pprint.pprint() to output the result of >> evaluating an expression entered in an interactive Python session (and >> in IDLE)? > > Well, let's try it and see... > > > py> L = list(range(50)) > py> print(L) > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, > 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, > 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49] > py> pprint.pprint(L) > [0, > 1, > 2, > 3, > 4, > 5, > 6, > 7, > 8, > 9, > 10, > 11, > 12, > 13, > 14, > 15, > 16, > 17, > 18, > 19, > 20, > 21, > 22, > 23, > 24, > 25, > 26, > 27, > 28, > 29, > 30, > 31, > 32, > 33, > 34, > 35, > 36, > 37, > 38, > 39, > 40, > 41, > 42, > 43, > 44, > 45, > 46, > 47, > 48, > 49] > > > That would be an Absolutely Not. 
This is why I suggested that I would consider making it available in Idle on a per-object basis, for things like this >>> L = ['This is the first sentence.', 'This is the second, lets make it longer', 'and this is the third, but do not stop yet'] >>> L ['This is the first sentence.', 'This is the second, lets make it longer', 'and this is the third, but do not stop yet'] >>> import pprint >>> pprint.pprint(L) ['This is the first sentence.', 'This is the second, lets make it longer', 'and this is the third, but do not stop yet'] But your example is more typical of my usage. > However, if somebody wanted to give pprint some attention to make it > actually pretty print, that would be very welcome. -- Terry Jan Reedy From steve at pearwood.info Sat Sep 28 10:10:09 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Sep 2013 18:10:09 +1000 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: References: Message-ID: <20130928081009.GU7989@ando> On Sat, Sep 28, 2013 at 07:44:46AM +0300, anatoly techtonik wrote: > This - http://docopt.org/ - should be included with Python 3.4 distribution. Are you the developer or maintainer of docopt? If so, you'll probably need to write a PEP. Otherwise, you'll need to ask the maintainer of docopt to write a PEP. Some questions that will need to be asked: - does the maintainer agree to distribute the software under the same licence as Python? - does the maintainer agree to stick to Python's release schedule? - is the maintainer happy with keeping the API frozen for the next ten or fifteen years? I see that docopt is now up to version 0.6.1. To me, that indicates that the API should not be considered stable, it's under version 1. Perhaps the maintainer disagrees, and would be happy to freeze the API now.
-- Steven From kwpolska at gmail.com Sat Sep 28 11:02:52 2013 From: kwpolska at gmail.com (Chris “Kwpolska” Warrick) Date: Sat, 28 Sep 2013 11:02:52 +0200 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: <20130928081009.GU7989@ando> References: <20130928081009.GU7989@ando> Message-ID: On Sat, Sep 28, 2013 at 10:10 AM, Steven D'Aprano wrote: > On Sat, Sep 28, 2013 at 07:44:46AM +0300, anatoly techtonik wrote: >> This - http://docopt.org/ - should be included with Python 3.4 distribution. > > Are you the developer or maintainer of docopt? He is not. I CC'd the developer, Vladimir Keleshev. > If so, you'll probably need to write a PEP. Otherwise, you'll need to > ask the maintainer of docopt to write a PEP. Some questions that will > need to be asked: > > - does the maintainer agree to distribute the software under the same > licence as Python? > > - does the maintainer agree to stick to Python's release schedule? > > - is the maintainer happy with keeping the API frozen for the next ten > or fifteen years? > > I see that docopt is now up to version 0.6.1. To me, that indicates that > the API should not be considered stable, it's under version 1. Perhaps > the maintainer disagrees, and would be happy to freeze the API now. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -- Chris “Kwpolska” Warrick PGP: 5EAAEA16 stop html mail | always bottom-post | only UTF-8 makes sense From breamoreboy at yahoo.co.uk Sat Sep 28 11:22:52 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 28 Sep 2013 10:22:52 +0100 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: References: Message-ID: On 28/09/2013 05:44, anatoly techtonik wrote: > This - http://docopt.org/ - should be included with Python 3.4 distribution. > -- > anatoly t.
> Have you had the courtesy to ask the maintainer of this library their opinions prior to placing this? -- Cheers. Mark Lawrence From techtonik at gmail.com Sat Sep 28 11:59:03 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 28 Sep 2013 12:59:03 +0300 Subject: [Python-ideas] AST Hash In-Reply-To: References: Message-ID: On Wed, Sep 11, 2013 at 8:05 PM, Amaury Forgeot d'Arc wrote: > 2013/9/11 anatoly techtonik >> >> Hi, >> >> We need a checksum for code pieces. The goal of the checksum is to >> reliably detect pieces of code with absolutely identical behaviour. >> Borders of such checksums can be functions, classes, modules. > > This looks like a nice project; I think this should first take the form of > an external package. > I'm sure there are many details to iron out before this kind of technique can be > widely adopted. Yes, it is just an idea. > For example: > - Is there only one kind of hash? you suggested to erase the differences in > variable names, are there other possible customizations? Yes. There are different kinds of hashes depending on purpose, that's why I explicitly mentioned that AST hashes are named. Every name corresponds to a single purpose and a single set of filtering rules. I can see at least these possible customizations: -- 1 comments, docstrings and whitespace handling -- 1. preserve all whitespace including comments 2. preserve comments 3. standard: erase comments, preserve docstrings 4. erase comments in addition to docstrings -- 2 variable names handling -- 1. preserve all 2. preserve external 3. preserve stdlib names (stdlib needs to be described to detect a namespace is from stdlib) 4. preserve thirdparty module names 5. preserve classes, rename variables 6. rename everything (abstract pattern matching) Are stdlib detection ideas welcome? > - To detect common patterns, is it interesting to hash and index all the > nodes of an AST tree?
I am not sure, I need these hashes for sharing and detecting updates to code snippets contained in various .py files across various Python projects. I like to think that snippets are constrained on function or class boundary, or else the management is rather tiresome. > - Is there a central repository to store hashes of recipes? Is Google Search > enough? Google search indexes hashes of each revision for Mercurial repositories. Sure it can do this too. Maintaining and downloading files and snippets by hash from PyPI would be interesting. It seems that most cloud storage solutions use hashes for storage, so implementing this should be even easier than installing PyPI mirror. > I don't need answers, only a reference implementation that people can > discuss! Reference implementation will take some time for sure. It may never be done even, because things like https://bitbucket.org/techtonik/python-stdlib/ have higher priority and don't have sponsors. -- anatoly t. From techtonik at gmail.com Sat Sep 28 12:30:35 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 28 Sep 2013 13:30:35 +0300 Subject: [Python-ideas] AST Hash In-Reply-To: <5230D853.6090108@egenix.com> References: <5230D853.6090108@egenix.com> Message-ID: On Wed, Sep 11, 2013 at 11:53 PM, M.-A. Lemburg wrote: > On 11.09.2013 18:05, anatoly techtonik wrote: >> Hi, >> >> We need a checksum for code pieces. The goal of the checksum is to >> reliably detect pieces of code with absolutely identical behaviour. >> Borders of such checksum can be functions, classes, modules,. 
>> Practical application for such checksums are: >> >> - detecting usage of recipes and examples across PyPI packages >> - detecting usage of standard stdlib calls >> - creating execution safe serialization formats for data >> - choosing class to deserialize data fields of the object based on its hash >> - enable consistent validation and testing of results across various AST tools >> >> There can be two approaches to build such checksum: >> 1. Code Section Hash >> 2. AST Hash >> >> Code Section Hash is built from a substring of a source code, cut on >> function or class boundaries. This hash is flaky - whitespace and >> comment differences ruin it, even when behaviour (and bytecode) stays >> the same. It is possible to reduce the effect of whitespace and >> comment changes by normalizing the substring - dedenting, reindenting >> with 4 spaces, stripping empty lines, comments and trailing >> whitespace. And it still will be unreliable and affected by whitespace >> changes in the middle of the string. Therefore a 2nd way of hashing is >> more preferable. >> >> AST Hash is build on AST. This excludes any comments, whitespace etc. >> and makes the hash strict and reliable. This is a canonical Default >> AST Hash. >> >> There are cases when Default AST Hash may not be enough for >> comparison. For example, if local variables are renamed, or docstrings >> changed, the behaviour of a function may not change, but its AST hash >> will. In these cases additional normalization rules apply. Such as >> changing all local variable names to var1, var2, ... in order of >> appearance, stripping docstrings etc. Every set of such normalization >> rules should have a name. This will also be the name of resulting >> custom AST Hash. >> >> Explicit naming of AST Hashes and hardlinking of names to rules that >> are used to build them will settle common ground (base) for AST tools >> interoperability and research papers. As such, it most likely require >> a separate PEP. 
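A minimal prototype of the "Default AST Hash" quoted above is possible with only the stdlib: ast.dump() omits comments, whitespace and, by default, line/column attributes, so formatting changes do not affect the digest. This is only a sketch of the idea, not the proposed named-rule-set machinery:

```python
import ast
import hashlib

def ast_hash(source: str) -> str:
    """Hex digest of the canonical AST dump of *source*."""
    tree = ast.parse(source)
    # ast.dump() excludes lineno/col_offset by default, so layout is ignored.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()

same_a = ast_hash("def f(x):\n    return x + 1\n")
same_b = ast_hash("def f(x):  # add one\n    return x + 1\n")  # comment only
other = ast_hash("def f(x):\n    return x + 2\n")              # behaviour differs

assert same_a == same_b
assert same_a != other
```

Custom hashes (renaming local variables, stripping docstrings) would need an ast.NodeTransformer normalization pass before dumping; pinning down those passes is exactly what the named rule sets in the proposal would standardize.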
> You might want to have a look at this paper which discussed > AST compression (for Java, but the ideas apply to Python just > as well): > > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.135.5917&rep=rep1&type=pdf > > If you compress the AST into a string and take its hash, > you should pretty much have what you want. Thanks for the link. The idea to transform the AST to a string is obvious - I don't know any other way to build a hash than to feed some kind of binary string to a function. But those guys addressed a different problem - bytecode is harder to compress than AST, and I agree, because it is easier to analyse common patterns in AST and tune compression algorithms accordingly. Structure and dependency in binary data matter, and binary compression algorithms are usually dumb. Google improved bsdiff compression a lot for executables by making it aware of binary structure. "..compressed AST provide ..., platform independence" - I thought that Java byte code was platform independent. If I could write a paper for every idea that I have (or at least draw a diagram), I could be a president of an academy of sciences already. =) Anyway, an interesting read. Unfortunately, not much time for that. I am not sure their implementation can be adopted as a prototype for a hash implementation; it seems that a simple tree walker should do the job. -- anatoly t. From ned at nedbatchelder.com Sat Sep 28 12:42:42 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 28 Sep 2013 06:42:42 -0400 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: References: Message-ID: <5246B2A2.4080508@nedbatchelder.com> On 9/28/13 12:44 AM, anatoly techtonik wrote: > This - http://docopt.org/ - should be included with Python 3.4 > distribution. In addition to the other questions already asked, you haven't answered the fundamental one: Why should docopt be included in the stdlib? It's right there on PyPI where anyone can get it. Why is it better in the stdlib than on PyPI?
--Ned. > -- > anatoly t. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Sat Sep 28 13:34:59 2013 From: flying-sheep at web.de (Philipp A.) Date: Sat, 28 Sep 2013 13:34:59 +0200 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: Message-ID: as much as i would like the convenience, python has very few magic globals, and they all have names encased in 4 underscores. if we really add more globals, why not __abs_file__ and __abs_dir__ or sth. like that? 2013/9/28 anatoly techtonik > FILE = os.path.abspath(__file__) > DIR = os.path.abspath(os.path.dirname(__file__)) > ? > > Repeated pattern for referencing resources relative to your scripts. Ideas > about alternative names / locations are welcome. > > In PHP these are __FILE__ and __DIR__. For Python 3 adding __dir__ is > impossible, because the name clashes with __dir__ method (which is not > implemented for module object, but should be [ ] for consistency). Also > current __file__ is rarely absolute path, because it is never normalized [ > ]. > > So it will be nice to see normalization of Python file name after the > import to reduce mess and make its behaviour predictable - > http://stackoverflow.com/questions/7116889/python-file-attribute-absolute-or-relative > > > ----[ possible spec. draft for a beautiful internal structure ]-- > The Python interpreter should provide run-time information about: > 1. order of import sequence > 2. names of imported modules > 3. unique location for each imported module which unambiguously identifies > it > 4. run-time import dependency tree (not sure about this, but it can help > with debugging) > 5. information about sys.path entry where this module was imported from > 6. 
information about who and when added this sys.path entry > -- > anatoly t. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sat Sep 28 13:48:02 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sat, 28 Sep 2013 13:48:02 +0200 Subject: [Python-ideas] PyPi per-file download counters Message-ID: I recently moved psutil .tar.gz and .exe files from Google Code to PyPi and noticed it doesn't show total per-file download counters: https://pypi.python.org/pypi?:action=display&name=psutil#downloads Why don't we add them? Thoughts? --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram.rachum at gmail.com Sat Sep 28 10:53:37 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Sat, 28 Sep 2013 01:53:37 -0700 (PDT) Subject: [Python-ideas] String formatting Message-ID: Any reason why string formatting using % doesn't work when the list of arguments is in a list rather than a tuple? -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Sat Sep 28 13:57:00 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sat, 28 Sep 2013 13:57:00 +0200 Subject: [Python-ideas] PyPi per-file download counters In-Reply-To: References: Message-ID: ...also, it seems the current counters are broken. I uploaded those files this morning and the page says there were over 5000 downloads in the last month. 
--- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ On Sat, Sep 28, 2013 at 1:48 PM, Giampaolo Rodola' wrote: > I recently moved psutil .tar.gz and .exe files from Google Code to PyPi > and noticed it doesn't show total per-file download counters: > https://pypi.python.org/pypi?:action=display&name=psutil#downloads > Why don't we add them? Thoughts? > > --- Giampaolo > http://code.google.com/p/pyftpdlib/ > http://code.google.com/p/psutil/ > http://code.google.com/p/pysendfile/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Sep 28 14:32:46 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Sep 2013 22:32:46 +1000 Subject: [Python-ideas] String formatting In-Reply-To: References: Message-ID: <20130928123246.GW7989@ando> On Sat, Sep 28, 2013 at 01:53:37AM -0700, Ram Rachum wrote: > Any reason why string formatting using % doesn't work when the list of > arguments is in a list rather than a tuple? Because it's not supposed to. It is part of the design of % that arbitrary objects, including lists, require only a single % target: py> L = list(range(8)) py> "Values: %s" % L 'Values: [0, 1, 2, 3, 4, 5, 6, 7]' The single deliberate exception to that is tuples. This is unavoidable, since there's otherwise no way for a binary operator like % to take arbitrary numbers of arguments. Besides, even if this were a good idea, 20+ years of code that expects lists to be treated as a single object for the purposes of % formatting says we can't change it now. -- Steven From python at mrabarnett.plus.com Sat Sep 28 18:51:29 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 28 Sep 2013 17:51:29 +0100 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: Message-ID: <52470911.5020809@mrabarnett.plus.com> On 28/09/2013 12:34, Philipp A.
wrote: > as much as i would like the convenience, python has very few magic > globals, and they all have names encased in 4 underscores. > > if we really add more globals, why not __abs_file__ and __abs_dir__ or > sth. like that? > +1 1. Do we need them? 2. If we do, then I agree with __abs_file__ and __abs_dir__. > > 2013/9/28 anatoly techtonik > > > FILE = os.path.abspath(__file__) > DIR = os.path.abspath(os.path.dirname(__file__)) > ? > > Repeated pattern for referencing resources relative to your scripts. > Ideas about alternative names / locations are welcome. > > In PHP these are __FILE__ and __DIR__. For Python 3 adding __dir__ > is impossible, because the name clashes with __dir__ method (which > is not implemented for module object, but should be [ ] for > consistency). Also current __file__ is rarely absolute path, because > it is never normalized [ ]. > > So it will be nice to see normalization of Python file name after > the import to reduce mess and make its behaviour predictable - > http://stackoverflow.com/questions/7116889/python-file-attribute-absolute-or-relative > > > ----[ possible spec. draft for a beautiful internal structure ]-- > The Python interpreter should provide run-time information about: > 1. order of import sequence > 2. names of imported modules > 3. unique location for each imported module which unambiguously > identifies it > 4. run-time import dependency tree (not sure about this, but it can > help with debugging) > 5. information about sys.path entry where this module was imported from > 6. 
information about who and when added this sys.path entry > From tjreedy at udel.edu Sat Sep 28 22:28:12 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 28 Sep 2013 16:28:12 -0400 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: <5246B2A2.4080508@nedbatchelder.com> References: <5246B2A2.4080508@nedbatchelder.com> Message-ID: On 9/28/2013 6:42 AM, Ned Batchelder wrote: > On 9/28/13 12:44 AM, anatoly techtonik wrote: >> This - http://docopt.org/ - should be included with Python 3.4 >> distribution. > > In addition to the other questions already asked, you haven't answered > the fundamental one: Why should docopt be included in the stdlib? It's > right there in PyPI where anyone can get it. Why is it better in the > stdlib than in PyPI? The stdlib has mostly switched from using optparse to argparse. The next question is what relation docopt has to either? What is its backend? Anyway, it strikes me as a wrapper module best kept as third party, similar to re and urllib wrappers. -- Terry Jan Reedy From tjreedy at udel.edu Sat Sep 28 22:34:03 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 28 Sep 2013 16:34:03 -0400 Subject: [Python-ideas] String formatting In-Reply-To: References: Message-ID: On 9/28/2013 4:53 AM, Ram Rachum wrote: > Any reason why string formatting using % doesn't work when the list of > arguments is in a list rather than a tuple? Because that would double the troublesome anomaly of a tuple being treated as a sequence of objects and not just an object itself. >>> 'object %s' % [1,2] 'object [1, 2]' >>> 'object %s' % (1,2) Traceback (most recent call last): File "", line 1, in 'object %s' % (1,2) TypeError: not all arguments converted during string formatting One of the reasons for .format() is to eliminate that anomaly.
>>> 'object {}'.format([1,2]) 'object [1, 2]' >>> 'object {}'.format((1,2)) 'object (1, 2)' -- Terry Jan Reedy From cs at zip.com.au Sun Sep 29 01:26:41 2013 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 29 Sep 2013 09:26:41 +1000 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: Message-ID: <20130928232641.GA15985@cskk.homeip.net> On 28Sep2013 08:19, anatoly techtonik wrote: | FILE = os.path.abspath(__file__) | DIR = os.path.abspath(os.path.dirname(__file__)) | ? | | Repeated pattern for referencing resources relative to your scripts. Ideas | about alternative names / locations are welcome. | | In PHP these are __FILE__ and __DIR__. For Python 3 adding __dir__ is | impossible, because the name clashes with __dir__ method (which is not | implemented for module object, but should be [ ] for consistency). Also | current __file__ is rarely absolute path, because it is never normalized [ | ]. Maybe I'm grumpy this morning (though I felt the same reading this yesterday). -1 for any names commencing with __ (or even _). -1 for new globals. -1 because I can imagine wanting different nuances on the definitions above; in particular for DIR I can well imagine wanting bare dirname(abspath(FILE)) - semantically different to your construction. There's lots of scope for bikeshedding here. -1 because this is trivial code. -1 because you can do all this with relative paths anyway, no need for abspath -1 because I can imagine being unable to compute abspath in certain circumstances (certainly on older UNIX systems you could be inside a directory without sufficient privileges to walk back up the tree for getcwd and its equivalents) -0 for adding some kind of convenience functions to importlib(?) for this (+0 except that I can see heaps of bikeshedding) Cheers, -- Cameron Simpson In an insane society, the sane man must appear insane. - Keith A.
Schauer From ncoghlan at gmail.com Sun Sep 29 02:21:43 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Sep 2013 10:21:43 +1000 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: <20130928232641.GA15985@cskk.homeip.net> References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: Note that any remaining occurrences of non-absolute values in __file__ are generally considered bugs in the import system. However, we tend not to fix them in maintenance releases, since converting relative paths to absolute paths runs a risk of breaking user code. We're definitely *not* going to further pollute the module namespace with values that can be trivially and reliably derived from existing values. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Sun Sep 29 05:28:45 2013 From: songofacandy at gmail.com (INADA Naoki) Date: Sun, 29 Sep 2013 12:28:45 +0900 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: os.path.abspath(__file__) returns wrong path after chdir. So I don't think abspath of module can be trivially and reliably derived from existing values. $ cat foo.py import os print(os.path.abspath(__file__)) os.chdir('work') print(os.path.abspath(__file__)) $ python foo.py /home/inada-n/foo.py /home/inada-n/work/foo.py On Sun, Sep 29, 2013 at 9:21 AM, Nick Coghlan wrote: > Note that any remaining occurrences of non-absolute values in __file__ are > generally considered bugs in the import system. However, we tend not to fix > them in maintenance releases, since converting relative paths to absolute > paths runs a risk of breaking user code. > > We're definitely *not* going to further pollute the module namespace with > values that can be trivially and reliably derived from existing values. > > Cheers, > Nick. 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From clay.sweetser at gmail.com Sun Sep 29 06:06:47 2013 From: clay.sweetser at gmail.com (Clay Sweetser) Date: Sun, 29 Sep 2013 00:06:47 -0400 Subject: [Python-ideas] An exhaust() function for iterators Message-ID: Currently, several strategies exist for exhausting an iterable when one does not care about what the iterable returns (such as when one merely wants a side effect of the iteration process). One can either use an empty for loop: for x in side_effect_iterable: pass A throwaway list comprehension: [x for x in side_effect_iterable] A try/except and a while: next = side_effect_iterable.next try: while True: next() except StopIteration: pass Or a number of other methods. The question is, which one is the fastest? Which one is the most memory efficient? Though these are all obvious methods, none of them are both the fastest and the most memory efficient (though the for/pass method comes close). As it turns out, the fastest and most efficient method available in the standard library is collections.deque's __init__ and extend methods. from collections import deque exhaust_iterable = deque(maxlen=0).extend exhaust_iterable(side_effect_iterable) When a deque object is initialized with a max length of zero or less, a special function, consume_iterator, is used instead of the regular element insertion calls. This function, found at http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c#l278, merely iterates through the iterator, without doing any work allocating the object to the deque's internal structure. I would like to propose that this function, or one very similar to it, be added to the standard library, either in the itertools module, or the standard namespace. 
If nothing else, doing so would at least give a single *obvious* way to exhaust an iterator, instead of the several miscellaneous methods available. -- "Evil begins when you begin to treat people as things." - Terry Pratchett From g.brandl at gmx.net Sun Sep 29 07:47:38 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 29 Sep 2013 07:47:38 +0200 Subject: [Python-ideas] An exhaust() function for iterators In-Reply-To: References: Message-ID: On 29.09.2013 06:06, Clay Sweetser wrote: > I would like to propose that this function, or one very similar to it, > be added to the standard library, either in the itertools module, or > the standard namespace. > If nothing else, doing so would at least give a single *obvious* way > to exhaust an iterator, instead of the several miscellaneous methods > available. YAGNI. This is not a very common operation. On the point of obvious ways, the first one you gave for _ in iterable: pass is perfectly obvious and simple enough AFAICS. cheers, Georg From g.brandl at gmx.net Sun Sep 29 07:50:35 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 29 Sep 2013 07:50:35 +0200 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: References: <5246B2A2.4080508@nedbatchelder.com> Message-ID: On 28.09.2013 22:28, Terry Reedy wrote: > On 9/28/2013 6:42 AM, Ned Batchelder wrote: >> On 9/28/13 12:44 AM, anatoly techtonik wrote: >>> This - http://docopt.org/ - should be included with Python 3.4 >>> distribution. >> >> In addition to the other questions already asked, you haven't answered >> the fundamental one: Why should docopt be included in the stdlib? It's >> right there in PyPI where anyone can get it. Why is it better in the >> stdlib than in PyPI? > > The stdlib has mostly switched from using optparse to argparse. The next > question is what relation docopt has to either? What is its backend? > Anyway, it strikes me as a wrapper module best kept as third party, > similar to re and urllib wrappers.
Especially since it's one of the more "magical" argument parsers, which is fine as a library but not something we like to put in the standard library. cheers, Georg From ncoghlan at gmail.com Sun Sep 29 08:15:17 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Sep 2013 16:15:17 +1000 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On 29 September 2013 13:28, INADA Naoki wrote: > os.path.abspath(__file__) returns wrong path after chdir. > So I don't think abspath of module can be trivially and reliably derived > from existing values. Hence the part about any remaining instances of non-absolute __file__ values being considered a bug in the import system. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From kim.grasman at gmail.com Sun Sep 29 11:52:52 2013 From: kim.grasman at gmail.com (=?ISO-8859-1?Q?Kim_Gr=E4sman?=) Date: Sun, 29 Sep 2013 11:52:52 +0200 Subject: [Python-ideas] Have os.unlink remove junction points on Windows In-Reply-To: References: Message-ID: Nick, On Wed, Sep 25, 2013 at 2:04 PM, Nick Coghlan wrote: > > My recollection is that permissions around junction points are a little > weird at the Windows OS level (so the access denied might be genuine for a > regular user account), but if a patch can make os.unlink handle them more > like *nix symlinks, that sounds reasonable to me. Just to follow up on this: I found a lot of Win32-specific code in Lib/test/symlink_support.py to detect whether the current user has the SeCreateSymbolicLink privilege (privileges in Windows are permissions not bound to a resource, rather like global access flags). So I think the "little weird" applies more to symbolic links than to junction points. At least I can't find any privileges that apply to junction points as such. 
There is a blurb on MSDN about the system-provided junction points from C:\Documents and Settings\... -> C:\Users\..., but that seems to concern actual file system object permissions for those specific paths rather than something general around junction points. http://msdn.microsoft.com/en-us/library/windows/desktop/bb968829(v=vs.85).aspx Cheers, - Kim From techtonik at gmail.com Sun Sep 29 19:36:06 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 29 Sep 2013 20:36:06 +0300 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On Sun, Sep 29, 2013 at 9:15 AM, Nick Coghlan wrote: > On 29 September 2013 13:28, INADA Naoki wrote: >> os.path.abspath(__file__) returns wrong path after chdir. >> So I don't think abspath of module can be trivially and reliably derived >> from existing values. > > Hence the part about any remaining instances of non-absolute __file__ > values being considered a bug in the import system. Bug that will not be fixed, i.e. a wart. And as a result we don't have a way to reliably reference filename of the current script and its directory. Hence the proposal. -- anatoly t. From techtonik at gmail.com Sun Sep 29 19:39:44 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 29 Sep 2013 20:39:44 +0300 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: <20130928232641.GA15985@cskk.homeip.net> References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On Sun, Sep 29, 2013 at 2:26 AM, Cameron Simpson wrote: > On 28Sep2013 08:19, anatoly techtonik wrote: > | FILE = os.path.abspath(__file__) > | DIR = os.path.abspath(os.path.dirname(__file__)) > | ? > | > | Repeated pattern for referencing resources relative to your scripts. Ideas > | about alternative names / locations are welcome. > | > | In PHP these are __FILE__ and __DIR__. 
For Python 3 adding __dir__ is > | impossible, because the name clashes with __dir__ method (which is not > | implemented for module object, but should be [ ] for consistency). Also > | current __file__ is rarely absolute path, because it is never normalized [ > | ]. > > Maybe I'm grumpy this morning (though I felt the same reading this yesterday). > > -1 for any names commencing with __ (or even _). > > -1 for new globals. > > -1 because I can imagine wanting different nuances on the definitions > above; in particular for DIR I can well imagine wanting bare > dirname(abspath(FILE)) - semantically different to your construction. > There's lots of scope for bikeshedding here. > > -1 because this is trivial code. > > -1 because you can do all this with relative paths anyway, no need for abspath > > -1 because I can imagine being unable to compute abspath in certain > circumstances (certainly on older UNIX systems you could be > inside a directory without sufficient privileges to walk back > up the tree for getcwd and its equivalents) > > -0 for adding some kind of convenience functions to importlib(?) for this > (+0 except that I can see heaps of bikeshedding) With all the -1s above, what is your preferred way to refer to resources that are placed into subdirectories of your script directory? -- anatoly t. From p.f.moore at gmail.com Sun Sep 29 20:16:59 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 29 Sep 2013 19:16:59 +0100 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On 29 September 2013 18:39, anatoly techtonik wrote: > With all the -1s above, what is your preferred way to refer to resources > that are placed into subdirectories of your script directory? If you are an imported module, pkgutil.get_data (because that handles modules in zipfiles, etc).
Otherwise, if you are running a single-file script: with open(os.path.join(os.path.dirname(__file__), 'path', 'to', 'my', 'data')) as f: data = f.read() Why is this a problem? Paul From tjreedy at udel.edu Sun Sep 29 21:13:59 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 29 Sep 2013 15:13:59 -0400 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On 9/29/2013 1:36 PM, anatoly techtonik wrote: > On Sun, Sep 29, 2013 at 9:15 AM, Nick Coghlan wrote: >> On 29 September 2013 13:28, INADA Naoki wrote: >>> os.path.abspath(__file__) returns wrong path after chdir. >>> So I don't think abspath of module can be trivially and reliably derived >>> from existing values. >> >> Hence the part about any remaining instances of non-absolute __file__ >> values being considered a bug in the import system. > > Bug that will not be fixed, i.e. a wart. Nick said "we tend not to fix them in maintenance releases", which I take to mean we can fix in new versions. > And as a result we don't have a way to reliably reference filename > of the current script and its directory. Hence the proposal. The proposed addition would not happen in maintenance releases either. -- Terry Jan Reedy From tjreedy at udel.edu Sun Sep 29 21:19:38 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 29 Sep 2013 15:19:38 -0400 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On 9/28/2013 11:28 PM, INADA Naoki wrote: > os.path.abspath(__file__) returns wrong path after chdir. So grab the path before chdir (which most programs do not do). > So I don't think abspath of module can be trivially and reliably derived > from existing values. It apparently can if you do so in a timely fashion. Grabbing it as soon as possible is the obvious time to do it. 
> $ cat foo.py > import os foopath = os.path.abspath(__file__) Now print it or do whatever you want with it. -- Terry Jan Reedy From storchaka at gmail.com Sun Sep 29 22:38:30 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 29 Sep 2013 23:38:30 +0300 Subject: [Python-ideas] pprint in displayhook In-Reply-To: <0BFAAF4A-F5C8-48FA-9C82-1B60D164A033@gmail.com> References: <0BFAAF4A-F5C8-48FA-9C82-1B60D164A033@gmail.com> Message-ID: On 28.09.13 07:17, Raymond Hettinger wrote: > This might be a reasonable idea if pprint were in better shape. > I think substantial work needs to be done on it, before it would > be worthy of becoming the default method of display. What should be changed in pprint? From storchaka at gmail.com Sun Sep 29 22:42:20 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 29 Sep 2013 23:42:20 +0300 Subject: [Python-ideas] An exhaust() function for iterators In-Reply-To: References: Message-ID: On 29.09.13 07:06, Clay Sweetser wrote: > I would like to propose that this function, or one very similar to it, > be added to the standard library, either in the itertools module, or > the standard namespace. > If nothing else, doing so would at least give a single *obvious* way > to exhaust an iterator, instead of the several miscellaneous methods > available. I would prefer to optimize the for loop so that it is also the most efficient way (it is already the most obvious way).
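For anyone comparing the idioms in this thread side by side, here is a minimal runnable sketch; `exhaust` is an illustrative name only, not an existing stdlib function:

```python
from collections import deque

def exhaust(iterable):
    """Run an iterable to completion for its side effects, discarding
    every value -- the deque(maxlen=0) trick discussed in this thread."""
    deque(iterable, maxlen=0)

seen = []

def side_effect_iterable():
    # A generator whose side effect (appending to `seen`) is the point;
    # the yielded values themselves are irrelevant.
    for i in range(5):
        seen.append(i)
        yield i

exhaust(side_effect_iterable())
print(seen)  # all five side effects ran; every yielded value was discarded
```

A plain `for _ in iterable: pass` behaves identically; the deque form simply moves the loop into C.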
From ncoghlan at gmail.com Mon Sep 30 00:18:41 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Sep 2013 08:18:41 +1000 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: <20130928232641.GA15985@cskk.homeip.net> Message-ID: On 30 Sep 2013 05:14, "Terry Reedy" wrote: > > On 9/29/2013 1:36 PM, anatoly techtonik wrote: >> >> On Sun, Sep 29, 2013 at 9:15 AM, Nick Coghlan wrote: >>> >>> On 29 September 2013 13:28, INADA Naoki wrote: >>>> >>>> os.path.abspath(__file__) returns wrong path after chdir. >>>> So I don't think abspath of module can be trivially and reliably derived >>>> from existing values. >>> >>> >>> Hence the part about any remaining instances of non-absolute __file__ >>> values being considered a bug in the import system. >> >> >> Bug that will not be fixed, i.e. a wart. > > > Nick said "we tend not to fix them in maintenance releases", which I take to mean we can fix in new versions. Correct, it's the kind of arguably backwards incompatible bug fix that users will generally tolerate in a feature release but would be justifiably upset about in a maintenance release. Cheers, Nick. > > >> And as a result we don't have a way to reliably reference filename >> of the current script and its directory. Hence the proposal. > > > The proposed addition would not happen in maintenance releases either. > > -- > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cs at zip.com.au Mon Sep 30 00:17:46 2013 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 30 Sep 2013 08:17:46 +1000 Subject: [Python-ideas] 'from os.path import FILE, DIR' or internal structure of filenames In-Reply-To: References: Message-ID: <20130929221746.GA11746@cskk.homeip.net> On 29Sep2013 20:39, anatoly techtonik wrote: | On Sun, Sep 29, 2013 at 2:26 AM, Cameron Simpson wrote: | > Maybe I'm grumpy this morning (though I felt the same reading this yesterday). [...] | > -1 because this is trivial trival code. | > -1 because you can do all this with relative paths anyway, no need for abspath [...] | With all -1 above, what is your preferred way to refer to resources | that are places into subdirectories of your script directory? Probably os.path.join(os.path.dirname(__file__), "datafile-here"). I've got some unit tests that want that. No need for abspath at all. Of course, chdir and passing paths to programs-which-are-not-my-children present scope for wanting abspath, but in the very common simple case: unnecessary and therefore undesirable. And I'm aware that modules-inside-zip-files don't work with this; let us ignore that; they won't work with abspath either:-) It is so trite that I can't imagine wanting to bolt it into the stdlib. Cheers, -- Cameron Simpson Thousands of years ago the Egyptians worshipped cats as gods. Cats have never forgotten this. 
- David Wren-Hardin From solipsis at pitrou.net Mon Sep 30 10:17:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 30 Sep 2013 10:17:59 +0200 Subject: [Python-ideas] Python 3.4 should include docopt as-is References: <5246B2A2.4080508@nedbatchelder.com> Message-ID: <20130930101759.0645c0b9@pitrou.net> On Sun, 29 Sep 2013 07:50:35 +0200, Georg Brandl wrote: > On 28.09.2013 22:28, Terry Reedy wrote: > > On 9/28/2013 6:42 AM, Ned Batchelder wrote: > >> On 9/28/13 12:44 AM, anatoly techtonik wrote: > >>> This - http://docopt.org/ - should be included with Python 3.4 > >>> distribution. > >> > >> In addition to the other questions already asked, you haven't > >> answered the fundamental one: Why should docopt be included in the > >> stdlib? It's right there in PyPI where anyone can get it. Why > >> is it better in the stdlib than in PyPI? > > > > The stdlib has mostly switched from using optparse to argparse. The > > next question is what relation docopt has to either? What is its > > backend? Anyway, it strikes me as a wrapper module best kept as > > third party, similar to re and urllib wrappers. > > Especially since it's one of the more "magical" argument parsers, > which is fine as a library but not something we like to put in the > standard library. Agreed. It's also not the most appealing API IMO. Regards Antoine. From vladimir at keleshev.com Mon Sep 30 20:58:53 2013 From: vladimir at keleshev.com (Vladimir Keleshev) Date: Mon, 30 Sep 2013 20:58:53 +0200 Subject: [Python-ideas] Python 3.4 should include docopt as-is In-Reply-To: References: <20130928081009.GU7989@ando> Message-ID: <53381380567533@web12m.yandex.ru> Thanks for notifying me, and sorry for the late reply. I think it would be awesome if docopt became part of the standard library. However, it's not ready yet. I expect 1.0.0 to be released not earlier than 2014. When it's ready I will definitely write a PEP.
According to the schedule the "feature freeze" will occur on Nov 24, 2013 together with the 3.4.0 beta 1 release. If "feature freeze" means no new things in the standard library, then neither docopt nor the PEP will be ready by that time. It seems docopt will need to wait another 2 years. But don't lose heart: docopt 1.0.0 will be much better, the language will be much more predictable and simple, the error messages will be much clearer, and it will be more parseable and portable. That's why I think it is worth the wait. Getopt is now about 33 years old and still widely used; so I want docopt to be ready for the year 2046. Cheers, Vladimir Keleshev 28.09.2013, 11:02, "Chris “Kwpolska” Warrick" : > On Sat, Sep 28, 2013 at 10:10 AM, Steven D'Aprano wrote: > >> On Sat, Sep 28, 2013 at 07:44:46AM +0300, anatoly techtonik wrote: >>> This - http://docopt.org/ - should be included with Python 3.4 distribution. >> Are you the developer or maintainer of docopt? > > He is not. I CC'd the developer, Vladimir Keleshev. > >> If so, you'll probably need to write a PEP. Otherwise, you'll need to >> ask the maintainer of docopt to write a PEP. Some questions that will >> need to be asked: >> >> - does the maintainer agree to distribute the software under the same >> licence as Python? >> >> - does the maintainer agree to stick to Python's release schedule? >> >> - is the maintainer happy with keeping the API frozen for the next ten >> or fifteen years? >> >> I see that docopt is now up to version 0.6.1. To me, that indicates that >> the API should not be considered stable, it's under version 1. Perhaps >> the maintainer disagrees, and would be happy to freeze the API now. >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas > > -- > Chris “Kwpolska” Warrick > PGP: 5EAAEA16 > stop html mail | always bottom-post | only UTF-8 makes sense
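For readers who haven't used docopt: the contrast drawn in this thread is that argparse (already in the stdlib) builds a parser imperatively, while docopt derives one from the usage string in a module's docstring. A rough argparse sketch of a tiny docopt-style interface - the program name and options below are invented for illustration:

```python
import argparse

# docopt would instead derive the whole parser from a usage string
# along the lines of:
#     Usage: naval_fate <name> [--speed=<kn>]
# With argparse, the equivalent parser is built up call by call:
parser = argparse.ArgumentParser(prog="naval_fate")
parser.add_argument("name", help="name of the ship")
parser.add_argument("--speed", type=int, default=10,
                    help="speed in knots [default: 10]")

# Parse an explicit argv for demonstration (normally sys.argv is used).
args = parser.parse_args(["Titanic", "--speed", "20"])
print(args.name, args.speed)
```

Either way the result is the same parsed values; the difference the thread debates is whether deriving them from a usage string is too "magical" for the stdlib.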