From ezio.melotti at gmail.com Mon Dec 1 02:17:50 2008 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Mon, 01 Dec 2008 03:17:50 +0200 Subject: [Python-ideas] A Wiki-style documentation with an approval process In-Reply-To: References: <4926BC57.8030509@gmail.com> <8763mgl3xz.fsf@xemacs.org> Message-ID: <49333B3E.4020409@gmail.com> Terry Reedy wrote: > Stephen J. Turnbull wrote: > >> C'mon, I bet you've let a typo or two slide because your brain was on >> fire to finish your latest hack. Haven't we all? If the doc you were >> reading was the wiki and a fix was a mouse click, two keystrokes, and >> another mouse click away, you might fix it in that situation. > > 1. There are very few overt typos left in the docs. > 2. Why would I use such an inferior version as a wiki version would be? > > Now, if someone wrote a Microsoft Help workalike program that also > included an 'email corrections' feature, that would be something else. > I agree with what Stephen J. Turnbull said. The main point here is "/making easy things easy/ and hard things possible". There are a several changes in the doc that don't require a related issue in the bug tracker and they would benefit from a wiki-like system (typos are just an example). If a change requires a discussion we can still use the bug tracker (and possibly edit the page directly at the end of the process, without using patches if they are not necessary.) This system is not intended as a replacement, but just an improvement of what we already have. Terry Reedy wrote: > I suspect that the doc maintainers would spend as much time rewriting > submissions as they do now and more time rejecting suggestions. Among the Defer/Approve/Reject radio buttons that Stephen suggested we can also add an "Open as a new issue" button. This can be used to "redirect" on the bug tracker the suggestions that are valid but still need some change. 
-- Ezio Melotti From clp at rebertia.com Thu Dec 4 08:51:55 2008 From: clp at rebertia.com (Chris Rebert) Date: Wed, 3 Dec 2008 23:51:55 -0800 Subject: [Python-ideas] Decimal literal? Message-ID: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> With Python 3.0 being released, and going over its many changes, I was reminded that decimal numbers (decimal.Decimal) are still relegated to a library and aren't built-in. Has there been any thought to adding decimal literals and making decimal a built-in type? I googled but was unable to locate any discussion of the exact issue. The closest I could find was a suggestion about making decimal the default instead of float: http://mail.python.org/pipermail/python-ideas/2008-May/001565.html It seems that decimal arithmetic is more intuitively correct that plain floating point and floating point's main (only?) advantage is speed, but it seems like premature optimization to favor speed over correctness by default at the language level. Obviously, making decimal the default instead of float would be fraught with backward compatibility problems and thus is not presently feasible, but at the least for now Python could make it easier to use decimals and their associated nice arithmetic by having a literal syntax for them and making them built-in. So what do people think of: 1. making decimal.Decimal a built-in type, named "decimal" (or "dec" if that's too long?) 2. adding a literal syntax for decimals; I'd naively suggest a 'd' suffix to the float literal syntax (which was suggested in the brief aforementioned thread) 3. (in Python 4.0/Python 4000) making decimal the default instead of float, with floats instead requiring a 'f' suffix Obviously #1 & #2 would be shooting for Python 3.1 or later. Cheers, Chris P.S. Yay for the long-awaited release of Python 3.0! Better than can be said for Perl 6. -- Follow the path of the Iguana... 
http://rebertia.com From python at rcn.com Thu Dec 4 09:00:04 2008 From: python at rcn.com (Raymond Hettinger) Date: Thu, 4 Dec 2008 00:00:04 -0800 Subject: [Python-ideas] Decimal literal? References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> Message-ID: <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> From: "Chris Rebert" > With Python 3.0 being released, and going over its many changes, I was > reminded that decimal numbers (decimal.Decimal) are still relegated to > a library and aren't built-in. > > Has there been any thought to adding decimal literals and making > decimal a built-in type? It's a non-starter until there is a fast, clean C implementation of decimal. The current module is hundreds of times slower than binary floats. Raymond From clp at rebertia.com Thu Dec 4 09:23:24 2008 From: clp at rebertia.com (Chris Rebert) Date: Thu, 4 Dec 2008 00:23:24 -0800 Subject: [Python-ideas] Decimal literal? In-Reply-To: <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> Message-ID: <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com> On Thu, Dec 4, 2008 at 12:00 AM, Raymond Hettinger wrote: > From: "Chris Rebert" > > >> With Python 3.0 being released, and going over its many changes, I was >> reminded that decimal numbers (decimal.Decimal) are still relegated to >> a library and aren't built-in. >> >> Has there been any thought to adding decimal literals and making >> decimal a built-in type? > > It's a non-starter until there is a fast, clean C implementation of decimal. > The current module is hundreds of times slower than binary floats. > Does performance matter quite *that* critically in most everyday programs? If people need such ruthless speed, they can use floats and accept the consequences or use another language entirely (e.g. C, C++, OCaml) as Python would be too slow even as it currently is. 
We're talking about giving people the option to explicitly, in a less
cumbersome way, make that choice of correctness over performance.

If slowing startup time for the interpreter is what worries you, a
'from __future__ import' directive could be required and the timeline
for full built-in-ness pushed back.

Also, by "built-in" I didn't mean to necessarily imply "written in C",
but rather "being present in the builtin namespace and available by
default".

That said, there appears to be decNumber
(http://speleotrove.com/decimal/#decNumber), an ANSI C implementation
of the General Decimal Arithmetic spec to which decimal.Decimal
adheres. At least there's a place to start.

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

>
> Raymond
>

From stephen at xemacs.org Thu Dec 4 09:52:22 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 04 Dec 2008 17:52:22 +0900
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
 <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
 <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>
Message-ID: <87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Rebert writes:

 > Does performance matter quite *that* critically in most everyday
 > programs?

Of course not. But that's the wrong question. Python is a
*general-purpose* programming language, not an "everyday application
where performance isn't critical programming language". There are
plenty of applications that just cry out for a Python implementation
where it does matter.

From clp at rebertia.com Thu Dec 4 10:10:40 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 01:10:40 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com> <87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <47c890dc0812040110w15cb30f3ld23e5bfdd0936a0f@mail.gmail.com> On Thu, Dec 4, 2008 at 12:52 AM, Stephen J. Turnbull wrote: > Chris Rebert writes: > > > Does performance matter quite *that* critically in most everyday > > programs? > > Of course not. But that's the wrong question. Python is a > *general-purpose* programming language, not an "everyday application > where performance isn't critical programming language". There are > plenty of applications that just cry out for a Python > implementation where it does matter. > We're talking about adding a feature, not taking speed away. If anything, this would increase adoption of Python as people writing programs that use decimals extensively would be able to use decimals with greater ease. Speed freaks could still use floats; there's no change as far as they're concerned. Yes, people who need BOTH decimals AND maximum speed would still be left out, but let's take this one step at a time, and in a later step maybe we can fully satisfy such people. We wouldn't want the perfect long term (speedy built-in decimals) getting in the way of the pretty good near term (built-in decimals). Additionally, your argument can be turned on its head ;-) Consider: > Does perfect accuracy matter quite *that* critically in most everyday programs? Of course not. But that's the wrong question. Python is a *general-purpose* programming language, not an "everyday application where accuracy isn't critical programming language". There are plenty of applications that just cry out for a Python implementation where it does matter. Cheers, Chris -- Follow the path of the Iguana... 
http://rebertia.com From rhamph at gmail.com Thu Dec 4 10:37:11 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 4 Dec 2008 02:37:11 -0700 Subject: [Python-ideas] Decimal literal? In-Reply-To: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> Message-ID: On Thu, Dec 4, 2008 at 12:51 AM, Chris Rebert wrote: > With Python 3.0 being released, and going over its many changes, I was > reminded that decimal numbers (decimal.Decimal) are still relegated to > a library and aren't built-in. > > Has there been any thought to adding decimal literals and making > decimal a built-in type? I googled but was unable to locate any > discussion of the exact issue. The closest I could find was a > suggestion about making decimal the default instead of float: > http://mail.python.org/pipermail/python-ideas/2008-May/001565.html > It seems that decimal arithmetic is more intuitively correct that > plain floating point and floating point's main (only?) advantage is > speed, but it seems like premature optimization to favor speed over > correctness by default at the language level. Intuitively, you'd think it's more correct, but for non-trivial usage I see no reason for it to be. The strongest arguments on [1] seem to be controllable precision and stricter standards. Controllable precision works just as well in a library. Stricter standards (ie very portable semantics) could be done with base-2 floats via software emulating on all platforms (and throwing performance out the window). Do you have some use cases that are (completely!) correct in decimal, and not in base-2 floating point? Something not trivial (emulating a schoolbook, writing a calculator, etc.) I see Decimal as a modest investment for a mild return. Not worth the effort to switch. 
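For readers following along, here is a minimal sketch of the kind of
difference being argued about. It uses only the standard decimal
module; the variable names are illustrative and do not come from the
thread. It is deliberately a "schoolbook" case, so it does not answer
the request for a non-trivial use case, and the last check shows that
Decimal also rounds once a result has no finite base-10 expansion.

```python
from decimal import Decimal

# 0.1, 0.2 and 0.3 have no exact base-2 representation, so the float
# sum carries a small representation error.
float_exact = (0.1 + 0.2 == 0.3)

# The same values built from strings are stored exactly in base 10.
decimal_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Decimal is not exact for every operation: 1/3 is rounded to the
# context precision (28 significant digits by default).
third_roundtrip = (Decimal(1) / Decimal(3) * 3 == Decimal(1))

print(float_exact, decimal_exact, third_roundtrip)
```

With the default decimal context this prints False True False.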
-- Adam Olsen, aka Rhamphoryncus From python at rcn.com Thu Dec 4 10:51:08 2008 From: python at rcn.com (Raymond Hettinger) Date: Thu, 4 Dec 2008 01:51:08 -0800 Subject: [Python-ideas] Decimal literal? References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> Message-ID: <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1> If decimals are to become built-in, there are a number of things that need to happen and one of them includes a C implementation, not just for speed, but also to integrate with the parser and the rest of the language. Last time I looked, the existing C implementations out there were license compatible with Python. Also, there are other integration issues to solved, including that of contexts (which are an integral part of the spec). None of this is a trivial exercise or I would have already done it. I do want to move decimal towards being a builtin but don't underestimate the difficulty of doing so. Also, there are other API issues. As it stands, the decimal module is not friendly to newbies and presents challenges even for expert users. And don't underestimate the significance of performance -- it is a top reason that people currently avoid the decimal module and it is an issue for the language itself (lots of companies avoid Python because of its speed disadvantage). One other thought, decimal literals are likely not very helpful in real programs. Most apps that have specific numeric requirements, will have code that manipulates numbers read-in from external sources and written back out -- the scripts themselves typically contain very few constants (and those are typically integers), so you don't get much help from a decimal literal. Raymond Hettinger ----- Original Message ----- From: "Chris Rebert" To: "Python-Ideas" Sent: Wednesday, December 03, 2008 11:51 PM Subject: [Python-ideas] Decimal literal? 
> http://rebertia.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From cesare.dimauro at a-tono.com Thu Dec 4 10:45:41 2008
From: cesare.dimauro at a-tono.com (Cesare Di Mauro)
Date: Thu, 04 Dec 2008 10:45:41 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To:
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID:

On 04 December 2008 at 10:37 AM, Adam Olsen wrote:

> Intuitively, you'd think it's more correct, but for non-trivial usage
> I see no reason for it to be. The strongest arguments on [1] seem to
> be controllable precision and stricter standards. Controllable
> precision works just as well in a library. Stricter standards (ie
> very portable semantics) could be done with base-2 floats via software
> emulating on all platforms (and throwing performance out the window).
>
> Do you have some use cases that are (completely!) correct in decimal,
> and not in base-2 floating point? Something not trivial (emulating a
> schoolbook, writing a calculator, etc.)
>
> I see Decimal as a modest investment for a mild return. Not worth the
> effort to switch.

But at least it will be more usable to have a short-hand for decimal
declaration:

    a = 1234.5678d

is simpler than:

    import decimal
    a = decimal.Decimal('1234.5678')

or:

    from decimal import Decimal
    a = Decimal('1234.5678')

Cheers
Cesare

From python at rcn.com Thu Dec 4 10:56:58 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 01:56:58 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>

From: "Cesare Di Mauro"
> But at least it will be more usable to have a short-hand for decimal
> declaration:
>
> a = 1234.5678d

How often do you put non-integer constants in real programs?
Don't you find that most real decimal apps start with external data sources instead of all the data values being hard-coded in your program? From clp at rebertia.com Thu Dec 4 10:54:47 2008 From: clp at rebertia.com (Chris Rebert) Date: Thu, 4 Dec 2008 01:54:47 -0800 Subject: [Python-ideas] Decimal literal? In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> Message-ID: <47c890dc0812040154h5c0d96ebn4e30b1bf87c95ef5@mail.gmail.com> On Thu, Dec 4, 2008 at 1:37 AM, Adam Olsen wrote: > On Thu, Dec 4, 2008 at 12:51 AM, Chris Rebert wrote: >> With Python 3.0 being released, and going over its many changes, I was >> reminded that decimal numbers (decimal.Decimal) are still relegated to >> a library and aren't built-in. >> >> Has there been any thought to adding decimal literals and making >> decimal a built-in type? I googled but was unable to locate any >> discussion of the exact issue. The closest I could find was a >> suggestion about making decimal the default instead of float: >> http://mail.python.org/pipermail/python-ideas/2008-May/001565.html >> It seems that decimal arithmetic is more intuitively correct that >> plain floating point and floating point's main (only?) advantage is >> speed, but it seems like premature optimization to favor speed over >> correctness by default at the language level. > > Intuitively, you'd think it's more correct, but for non-trivial usage > I see no reason for it to be. The strongest arguments on [1] seem to > be controllable precision and stricter standards. Controllable > precision works just as well in a library. Stricter standards (ie > very portable semantics) could be done with base-2 floats via software > emulating on all platforms (and throwing performance out the window). > > Do you have some use cases that are (completely!) correct in decimal, > and not in base-2 floating point? Something not trivial (emulating a > schoolbook, writing a calculator, etc.) 
No, not personally, but I assume there must be or the decimal module would never have been added in the first place. PEP 327 suggests that accurate financial calculations benefit from decimal. Someone must have had (a) sufficiently compelling use case(s) to get the BDFL to say yes. GvR doesn't approve PEPs indiscriminately. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com > > I see Decimal as a modest investment for a mild return. Not worth the > effort to switch. > > > -- > Adam Olsen, aka Rhamphoryncus > From clp at rebertia.com Thu Dec 4 11:10:33 2008 From: clp at rebertia.com (Chris Rebert) Date: Thu, 4 Dec 2008 02:10:33 -0800 Subject: [Python-ideas] Decimal literal? In-Reply-To: <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1> Message-ID: <47c890dc0812040210u1db3f187gcfc61791f04b996c@mail.gmail.com> On Thu, Dec 4, 2008 at 1:56 AM, Raymond Hettinger wrote: > From: "Cesare Di Mauro" >> >> But at least it will be more usable to have a short-hand for decimal >> declaration: >> >> a = 1234.5678d > > How often do you put non-integer constants in real programs? > Don't you find that most real decimal apps start with external > data sources instead of all the data values being hard-coded > in your program? In all fairness, by that same argument we shouldn't have float literals, yet we do despite that. They're useful in scripts where things are hardcoded. Later, the scripts grow and we do end up reading the numbers in from external sources. That doesn't mean the initial script version wasn't useful. Literals help when writing proofs-of-concept and rapid prototypes, areas where Python has historically done well. 
Java's designers probably used similar arguments against hard-coding when deciding not to include collection literals; meanwhile Python does have such literals and they appear to be much cherished as language features go. The parallels to the decimal situation are striking. Having decimal literals as well would at least keep things consistent. Sets are less common, yet they now have literals; why not decimals too? Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From cesare.dimauro at a-tono.com Thu Dec 4 11:19:12 2008 From: cesare.dimauro at a-tono.com (Cesare Di Mauro) Date: Thu, 04 Dec 2008 11:19:12 +0100 Subject: [Python-ideas] Decimal literal? In-Reply-To: <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1> Message-ID: On 04 dec 2008 at 10:56 AM, Raymond Hettinger wrote: >> But at least it will be more usable to have a short-hand for decimal >> declaration: >> >> a = 1234.5678d > > How often do you put non-integer constants in real programs? A few times indeed (except for strings). So why we are allowing floats literals? > Don't you find that most real decimal apps start with external > data sources instead of all the data values being hard-coded > in your program? The same happens with any kind of application: except for very common cases (like using integers and strings), constant definitions are rare. But working with financial applications, using decimal numerics is a very common practice. Even if implementation is slow, we prefer exact results over speed: there must be no possibility on failing calculations when we are manipulating moneys. 
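The point about exact results for money can be sketched roughly as
follows. This is only an illustration with the standard decimal
module, not code from any real financial application; CENT and
to_cents are hypothetical names made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")  # hypothetical helper: smallest currency unit


def to_cents(amount):
    """Round a Decimal amount half-up to two decimal places."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


# Line items add up exactly in Decimal ...
decimal_sum_ok = (Decimal("1.10") + Decimal("2.20") == Decimal("3.30"))
# ... while the same sum in binary floats is slightly off.
float_sum_ok = (1.10 + 2.20 == 3.30)

# A half-cent value rounds up, as an accountant would expect.
decimal_rounded = to_cents(Decimal("2.675"))
# The float 2.675 is stored slightly below 2.675, so it rounds down.
float_rounded = round(2.675, 2)

print(decimal_sum_ok, float_sum_ok, decimal_rounded, float_rounded)
```

This prints True False 2.68 2.67. Note that the Decimal values are
built from strings; Decimal(2.675) would inherit the float's binary
representation error.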
If you take a look at other languages / IDEs, like Delphi or CBuilder, there's support for BCD-like type, but I never appreciated the need to import its library to use it on my applications. Also keep in mind that having the possibility to define literals for a set of types can help a lot in generating a more optimized bytecode. That's because we can do a more aggressive static analysis (a field were can be done a lot of work to improve the performance of the language). Cheers Cesare From cesare.dimauro at a-tono.com Thu Dec 4 11:22:12 2008 From: cesare.dimauro at a-tono.com (Cesare Di Mauro) Date: Thu, 04 Dec 2008 11:22:12 +0100 Subject: [Python-ideas] Decimal literal? In-Reply-To: <47c890dc0812040210u1db3f187gcfc61791f04b996c@mail.gmail.com> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1> <47c890dc0812040210u1db3f187gcfc61791f04b996c@mail.gmail.com> Message-ID: On Thu, Dec 4, 2008 at 11:10 AM, Chris Rebert wrote: >> How often do you put non-integer constants in real programs? >> Don't you find that most real decimal apps start with external >> data sources instead of all the data values being hard-coded >> in your program? > > In all fairness, by that same argument we shouldn't have float > literals, yet we do despite that. They're useful in scripts where > things are hardcoded. Later, the scripts grow and we do end up reading > the numbers in from external sources. That doesn't mean the initial > script version wasn't useful. Literals help when writing > proofs-of-concept and rapid prototypes, areas where Python has > historically done well. > Java's designers probably used similar arguments against hard-coding > when deciding not to include collection literals; meanwhile Python > does have such literals and they appear to be much cherished as > language features go. The parallels to the decimal situation are > striking. 
> Having decimal literals as well would at least keep things consistent. > Sets are less common, yet they now have literals; why not decimals > too? > > Cheers, > Chris I absolutely agree. Literals, also, can help improve language speed. Cheers, Cesare From stephen at xemacs.org Thu Dec 4 11:43:43 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 04 Dec 2008 19:43:43 +0900 Subject: [Python-ideas] Decimal literal? In-Reply-To: <47c890dc0812040110w15cb30f3ld23e5bfdd0936a0f@mail.gmail.com> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com> <87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp> <47c890dc0812040110w15cb30f3ld23e5bfdd0936a0f@mail.gmail.com> Message-ID: <87oczsnrts.fsf@xemacs.org> Chris Rebert writes: > We're talking about adding a feature, not taking speed away. OK, that's reasonable. But adding features is expensive. BTW, don't listen to me, I've never done it. Listen to Raymond. > If anything, this would increase adoption of Python as people > writing programs that use decimals extensively would be able to use > decimals with greater ease. Maybe. I don't see a huge advantage of over import Decimal I also think that most of the (easy) advantage to Decimal will accrue to people who *never* have to deal with measurement error: accountants. But oops! they don't need Decimal per se; they're perfectly happy with big integers. People who really *do* need Decimal are not going to be deterred by 16 characters (counting the newline); they're already into real pain. > Additionally, your argument can be turned on its head ;-) Consider: > Does perfect accuracy matter quite *that* critically in most > everyday programs? Of course not. But that's the wrong question. > Python is a *general-purpose* programming language, not an > "everyday application where accuracy isn't critical programming > language". 
There are plenty of applications that just cry > out for a Python implementation where it does matter. I think you've misspelled "precision". Improved accuracy cannot be achieved simply by adding a new number type. From python at rcn.com Thu Dec 4 11:50:29 2008 From: python at rcn.com (Raymond Hettinger) Date: Thu, 4 Dec 2008 02:50:29 -0800 Subject: [Python-ideas] Decimal literal? References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1> Message-ID: <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1> From: "Raymond Hettinger" > Last time I looked, the existing C implementations out there were license compatible with Python. That should have said "incompatible". From clp at rebertia.com Thu Dec 4 12:02:08 2008 From: clp at rebertia.com (Chris Rebert) Date: Thu, 4 Dec 2008 03:02:08 -0800 Subject: [Python-ideas] Decimal literal? In-Reply-To: <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1> <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1> Message-ID: <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com> On Thu, Dec 4, 2008 at 2:50 AM, Raymond Hettinger wrote: > From: "Raymond Hettinger" >> >> Last time I looked, the existing C implementations out there were license >> compatible with Python. > > That should have said "incompatible". > decNumber is available under the ICU License, which seems to be a variant of the original BSD license. Depending on exactly how the acknowledgement clause is interpreted (IANAL), it seems like it might be compatible. If not, IBM, which has copyright on decNumber, seems to have a fairly pro-open-source stance historically; perhaps if asked nicely by the community, they would be willing to relicense decNumber under the revised BSD license (a very minor change vs. 
the ICU License), which would certainly be compatible with Python's licensing policy. Or maybe there exists another library that's already compatible. Perhaps I'll investigate. But the key here is we should first determine whether people want decimal to be built-in and have a literal. Once that's established, then the details as to implementing that should be investigated. But yes, practicality and feasibility certainly are factors in all this. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com From facundobatista at gmail.com Thu Dec 4 12:33:25 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 4 Dec 2008 09:33:25 -0200 Subject: [Python-ideas] Decimal literal? In-Reply-To: <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1> <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1> <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com> Message-ID: 2008/12/4 Chris Rebert : > Or maybe there exists another library that's already compatible. > Perhaps I'll investigate. > > But the key here is we should first determine whether people want > decimal to be built-in and have a literal. Once that's established, > then the details as to implementing that should be investigated. But I'd put it around. The best we can do *now* with Decimal, if we want it to be included as a literal *somewhen*, is to get it in C. There're already some first steps in that direction, but *please* investigate that other path you're suggesting. Thanks! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From aahz at pythoncraft.com Thu Dec 4 14:35:38 2008 From: aahz at pythoncraft.com (Aahz) Date: Thu, 4 Dec 2008 05:35:38 -0800 Subject: [Python-ideas] Decimal literal? 
In-Reply-To: <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
 <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
Message-ID: <20081204133538.GA21462@panix.com>

On Thu, Dec 04, 2008, Raymond Hettinger wrote:
>
> One other thought, decimal literals are likely not very helpful in real
> programs. Most apps that have specific numeric requirements, will have
> code that manipulates numbers read-in from external sources and written
> back out -- the scripts themselves typically contain very few constants
> (and those are typically integers), so you don't get much help from a
> decimal literal.

That's half-true. Most applications IME that manipulate numbers need
to express zero frequently as initializers. So yeah, it's easy to just
write things like::

    total = dzero
    balance = dzero

but I think there's definitely some utility from writing::

    total = 0.0d
    balance = 0.0d

How much utility (especially from the readability side) is of course
subject to debate, but please don't ignore it altogether.
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From lists at cheimes.de Thu Dec 4 15:31:02 2008
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 04 Dec 2008 15:31:02 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
 <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
Message-ID:

Raymond Hettinger wrote:
> It's a non-starter until there is a fast, clean C implementation of
> decimal.
> The current module is hundreds of times slower than binary floats.

If we are ever going to consider Cython for core development, the
decimal module could be the first module that uses Cython. IMHO it's
the perfect candidate for a proof of concept.
Christian

From tjreedy at udel.edu Thu Dec 4 19:04:07 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 04 Dec 2008 13:04:07 -0500
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: 

Chris Rebert wrote:

> It seems that decimal arithmetic is more intuitively correct than
> plain floating point and floating point's main (only?) advantage is
> speed, but it seems like premature optimization to favor speed over
> correctness by default at the language level.

One could say the same about rational arithmetic, which has also been
considered and so far rejected for fractional literals. In fact,
fractions are more accurate since there is never rounding unless one
requests it.

There is an advantage of binary floats that you missed. One can
prototype float functions in Python and then translate as necessary for
real speed to C and get the same results (using the same compiler on
the same hardware). But even prototypes need to run faster than
molasses.

One can also use Python to glue together C (or Fortran) double routines
without translating the numbers. The numerical module (now numpy) is
over a decade old and was, I believe, Python's first killer app.

> Obviously, making decimal the default instead of float would be
> fraught with backward compatibility problems and thus is not presently
> feasible, but at the least for now Python could make it easier to use
> decimals and their associated nice arithmetic by having a literal
> syntax for them and making them built-in.

Ditto for fractions.

> So what do people think of:
> 1. making decimal.Decimal a built-in type, named "decimal" (or "dec"
> if that's too long?)
> 2.
adding a literal syntax for decimals; I'd naively suggest a 'd' > suffix to the float literal syntax (which was suggested in the brief > aforementioned thread) I would just as soon do the same for fractions.Fraction, perhaps 1 f/ 2 or 1///2. Even with decimal literals, the functions would remain in the importable module, just as with math and cmath. > 3. (in Python 4.0/Python 4000) making decimal the default instead of > float, with floats instead requiring a 'f' suffix Decimal is not just a decimal arithmetic module. It implements and will track a particular complex, specialized, possibly changeable standard controlled by IBM, which already has a few crazy quirks present for commercial rather than technical reasons. This is fine for an add-on class but not, in my opinion, for Python's default fraction arithmetic. If Python's developers did consider replacing floats in that role, I would prefer either fractions or a much simplified decimal type designed by us for general purpose needs. Terry Jan Reedy From clp at rebertia.com Thu Dec 4 19:18:24 2008 From: clp at rebertia.com (Chris Rebert) Date: Thu, 4 Dec 2008 10:18:24 -0800 Subject: [Python-ideas] Decimal literal? In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> Message-ID: <47c890dc0812041018i5f0ce5ag6a5258052df972b9@mail.gmail.com> On Thu, Dec 4, 2008 at 10:04 AM, Terry Reedy wrote: > Chris Rebert wrote: >> 3. (in Python 4.0/Python 4000) making decimal the default instead of >> float, with floats instead requiring a 'f' suffix > > Decimal is not just a decimal arithmetic module. It implements and will > track a particular complex, specialized, possibly changeable standard > controlled by IBM, which already has a few crazy quirks present for > commercial rather than technical reasons. This is fine for an add-on class > but not, in my opinion, for Python's default fraction arithmetic. 
If > Python's developers did consider replacing floats in that role, I would > prefer either fractions or a much simplified decimal type designed by us for > general purpose needs. I'll just point out that GvR seemed to favor the general idea (along with a transition mechanism) in the old thread I mentioned in my original post; otherwise I'd have been much more wary of including #3. I can't speak to how good the standard is comparatively except that the Python devs must have chosen it over others or a custom one for good reason, and at least it's better than plain floats. The PEP mentions it being almost completely ANSI/IEEE-compliant and that it has already taken into account the evil corner cases. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com > > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From rhamph at gmail.com Thu Dec 4 19:18:49 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 4 Dec 2008 11:18:49 -0700 Subject: [Python-ideas] Decimal literal? In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1> Message-ID: On Thu, Dec 4, 2008 at 3:19 AM, Cesare Di Mauro wrote: > But working with financial applications, using decimal numerics > is a very common practice. Even if implementation is slow, we > prefer exact results over speed: there must be no possibility on > failing calculations when we are manipulating moneys. This has always bothered me: the suggestion that decimal *floats* are suitable for financial calculations, when fixed point is what you want. However, I now see some FAQ entries in http://docs.python.org/library/decimal.html that show how to get fixed point behaviour out of it. Including a wrapper around multiply and divide, for ease of use, heh. 
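
For readers who have not seen that FAQ entry, the general shape of such a wrapper is roughly as follows. This is only a sketch in the spirit of the FAQ, not a copy of it: the names CENT, mul and div, the two-place precision, and the ROUND_HALF_UP rounding mode are all assumptions chosen for illustration, and real financial rules may well require something different.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed fixed-point resolution: two decimal places (e.g. cents).
CENT = Decimal("0.01")


def mul(x, y, fp=CENT):
    """Multiply two Decimals, then round the result back to `fp` places."""
    return (x * y).quantize(fp, rounding=ROUND_HALF_UP)


def div(x, y, fp=CENT):
    """Divide two Decimals, then round the result back to `fp` places."""
    return (x / y).quantize(fp, rounding=ROUND_HALF_UP)


price = Decimal("19.99")
total = mul(price, Decimal("3"))   # stays at two places after multiplying
share = div(total, Decimal("7"))   # division is rounded to two places too
print(total, share)
```

Addition and subtraction of values that already have two places need no wrapper, which is why only multiply and divide are wrapped here.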
Regardless, although financial use cases are important, their behaviour
is not universal. The next country over or a few years down the road
may have different rules, different proportions, etc. Not something we
want to hardcode.

-- 
Adam Olsen, aka Rhamphoryncus

From python at rcn.com Thu Dec 4 19:55:47 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 10:55:47 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com><204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
Message-ID: <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>

From: "Aahz"
> That's half-true. Most applications IME that manipulate numbers need to
> express zero frequently as initializers.

No doubt that's true. Was just pointing out that much of the utility of
the decimal module is independent of whether literals are built into
the parser. Also noted that it is a non-trivial exercise to get
decimals fully integrated into the language. I would like to see both
things happen but it won't be easy.

FWIW, when I write decimal code, I use a brief form for the
constructor:

    from decimal import Decimal as D
    . . .
    balance = D(0)

From: "Christian Heimes"
> If we ever going to consider Cython for core development, the decimal
> module could be the first module that uses Cython. IMHO it's the perfect
> candidate for a proof of concept.

Certainly, Cython would be helpful. That being said, the decimal module
is likely a poor candidate to show off Cython's capabilities. The
current code is not set up in a way that translates well. Much better
speed-ups could be had from Cython if the module were rewritten to use
alternate data structures for decimal numbers and for contexts and to
let temporary numbers (accumulators) be mutated in-place.

From: "Facundo Batista":
> The best we can do *now* with Decimal, if we want it to be included as
> a literal *somewhen*, is to get it in C.

Well said.
From: "Facundo Batista": > There're already some first steps in that direction, but *please* > investigate that other path you're suggesting. IMO, those efforts have been somewhat misdirected. They were going down the path of direct translation. Instead, there needs to be a pure implementation of the spec, using better data structures and then separately adding python wrappers. The first component needs to have its own efficient context objects and fast, temporary accumulators. The latter should match the current API. Raymond From facundobatista at gmail.com Thu Dec 4 21:07:19 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 4 Dec 2008 18:07:19 -0200 Subject: [Python-ideas] Decimal literal? In-Reply-To: <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> Message-ID: 2008/12/4 Raymond Hettinger : >> There're already some first steps in that direction, but *please* >> investigate that other path you're suggesting. > > IMO, those efforts have been somewhat misdirected. They were > going down the path of direct translation. Instead, there needs to > be a pure implementation of the spec, using better data structures > and then separately adding python wrappers. The first component > needs to have its own efficient context objects and fast, temporary > accumulators. The latter should match the current API. I actually was talking about the issue 2486, which is the first step to "slowly, but steadily, replace parts of Decimal from Python to C as needed." I should have been more explicit, sorry for the confusion. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From leif.walsh at gmail.com Thu Dec 4 21:18:03 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Thu, 4 Dec 2008 15:18:03 -0500 Subject: [Python-ideas] Decimal literal? 
In-Reply-To: <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> Message-ID: Coming in to the thread _way_ late, here's my $0.015: Sure, it would be great to have an accurate and fast implementation of decimal/floating point numbers active by default in the language. We don't have that yet. We have a fast implementation, and we have an accurate one, and until we have both, there is a decision to be made: which one is easy to use (in builtins, has literals, (etc.?)), and which one is the "opt-in" implementation (needs a module import, needs a constructor)? We've been dealing with roughly the same fast and sometimes-inaccurate floating-point implementation for what, almost 40 years of C programming so far. Given that there exist accurate implementations of decimal numbers (GMP, MAPM), why hasn't C moved to make one of these the "default" implementation? Whatever the answer, it seems to me that this sets a sort of precedent in programming that fast floating-point numbers are favored over accurate floating-point numbers. GMP is blindingly fast, and it isn't C's default. Decimal is, I think I saw someone mention "hundreds of times slower" than the current float implementation. I think, until the decimal implementation approaches something like GMP's speed, there really isn't much point in even considering making it a default. Now, to the question of a 'decimal literal': Including support for something like '1.1d' requires that we include the decimal module in builtins. Now, I don't know that there's no way around this, but it seems like a slowdown for everyone just to let a few people type a bit less. -1 -- Cheers, Leif From rhamph at gmail.com Thu Dec 4 22:18:32 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 4 Dec 2008 14:18:32 -0700 Subject: [Python-ideas] Decimal literal? 
In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> Message-ID: On Thu, Dec 4, 2008 at 1:18 PM, Leif Walsh wrote: > Coming in to the thread _way_ late, here's my $0.015: > > Sure, it would be great to have an accurate and fast implementation of > decimal/floating point numbers active by default in the language. We > don't have that yet. We have a fast implementation, and we have an > accurate one, and until we have both, there is a decision to be made: > which one is easy to use (in builtins, has literals, (etc.?)), and > which one is the "opt-in" implementation (needs a module import, needs > a constructor)? > > We've been dealing with roughly the same fast and sometimes-inaccurate > floating-point implementation for what, almost 40 years of C > programming so far. Given that there exist accurate implementations > of decimal numbers (GMP, MAPM), why hasn't C moved to make one of > these the "default" implementation? > > Whatever the answer, it seems to me that this sets a sort of precedent > in programming that fast floating-point numbers are favored over > accurate floating-point numbers. GMP is blindingly fast, and it isn't > C's default. Decimal is, I think I saw someone mention "hundreds of > times slower" than the current float implementation. GMP may be blindingly fast for an arbitrary precision floating point implementation, but it's quite slow compared to hardware floating point. Even in hardware there's a temptation to optimize for single-precision and skip various IEEE 754 special cases that would slow things down. Performance really does count. You're not going to find a broad solution here. Decimal is mildly more precise, but substantially slower. It's also less convenient for interacting with C code. Given the importance of C extensions to Python, interacting with C is the strongest argument here. 
It's not an elegant reason, but it's very practical. Besides, any user WILL have to learn what floats do to their numbers, so you might as well make it obvious. If you really want to avoid it you should be using a symbolic math library instead. Personally, if I need a calculator I usually use Qalculate, rather than an interactive interpreter. -- Adam Olsen, aka Rhamphoryncus From tjreedy at udel.edu Thu Dec 4 23:09:36 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 04 Dec 2008 17:09:36 -0500 Subject: [Python-ideas] Decimal literal? In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> Message-ID: Leif Walsh wrote: > Coming in to the thread _way_ late, here's my $0.015: > > Sure, it would be great to have an accurate and fast implementation of > decimal/floating point numbers active by default in the language. We have one by many definitions of 'accurate'. Being off by a few or even a hundred parts per quintillion is pretty good by some standards. > We don't have that yet. I disagree. > We have a fast implementation, and we have an > accurate one, and until we have both, there is a decision to be made: The notion that decimal is more 'accurate' than float needs a lot of qualification. Yes, it is intended to give *exactly* the answer to various financial calculations that various jurisdictions mandate, but that is a rather specialized meaning of 'accurate'. tjr From leif.walsh at gmail.com Fri Dec 5 05:41:35 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Thu, 4 Dec 2008 23:41:35 -0500 Subject: [Python-ideas] Decimal literal? In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1> <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1> Message-ID: On Thu, Dec 4, 2008 at 5:09 PM, Terry Reedy wrote: > We have one by many definitions of 'accurate'. 
Being off by a few or even a
> hundred parts per quintillion is pretty good by some standards.

I agree. That's why I don't think the decimal module should be the
"default implementation".

> I disagree.

Okay. "Perfectly accurate" then.

> The notion that decimal is more 'accurate' than float needs a lot of
> qualification. Yes, it is intended to give *exactly* the answer to various
> financial calculations that various jurisdictions mandate, but that is a
> rather specialized meaning of 'accurate'.

You've said what I mean better than I could. The float implementation
is more than good enough for almost all applications, and it seems
ridiculous to me to slow them down for the precious few that need more
precision (and, at that, just don't want to type quite as much).

-- 
Cheers,
Leif

From python at rcn.com Fri Dec 5 05:59:54 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 20:59:54 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com><204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1><83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
Message-ID: <0DD3877E77AA4680A7CB6952AE54124C@RaymondLaptop1>

>> The notion that decimal is more 'accurate' than float needs a lot of
>> qualification. Yes, it is intended to give *exactly* the answer to various
>> financial calculations that various jurisdictions mandate, but that is a
>> rather specialized meaning of 'accurate'.
>
> You've said what I mean better than I could. The float implementation
> is more than good enough for almost all applications, and it seems
> ridiculous to me to slow them down for the precious few that need more
> precision (and, at that, just don't want to type quite as much).

While we're mincing words, I would state the case differently. Neither
"precision" nor "accuracy" captures the essential difference between
binary and decimal floating point. It is all about what is "exactly
representable".
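
A short sketch may help make "exactly representable" concrete. It only uses float.hex() and the string form of the Decimal constructor; the variable names are illustrative and not taken from the thread.

```python
from decimal import Decimal

# float.hex() shows the binary value actually stored for the literal 1.10.
stored = (1.10).hex()

# A Decimal built from a string keeps exactly the digits that were written.
exact = Decimal("1.10")

# Adding ten dimes: the binary float drifts, the decimal does not.
float_sum = sum([0.10] * 10)
dec_sum = sum([Decimal("0.10")] * 10)

print(stored)      # hexadecimal form of the nearest binary double
print(exact)       # 1.10
print(float_sum == 1.0)
print(dec_sum == Decimal("1.00"))
```

The float result is off by about one part in 10**16, which is harmless for most work but not acceptable when totals must agree to the cent.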
The main reason decimal is good for financial apps is that the numbers of interest are exactly representable in decimal floating point but not in binary floating point. In a financial app, it can matter that 1.10 is exact rather than some nearby value representable in binary floating point, 0x1.199999999999ap+0. Of course, there are other differences like control over rounding and variable precision, but the main story is about what is exactly representable. Raymond From jimjjewett at gmail.com Fri Dec 5 19:53:33 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 5 Dec 2008 13:53:33 -0500 Subject: [Python-ideas] Decimal literal? In-Reply-To: References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> Message-ID: On Thu, Dec 4, 2008 at 4:45 AM, Cesare Di Mauro wrote: > But at least it will be more usable to have a > short-hand for decimal declaration: In isolation, a decimal literal sounds nice. But it may not be used often enough to justify the extra mental complexity. What should the following mean? >>> a = 123X It isn't obvious, which means that either it gets used all the time (decimal won't) or people will have to look it up -- or just guess, and sometimes get it wrong. > a = 1234.567d To someone who hasn't programmed much with decimal floating point, what does the "d" mean? Could it indicate "use double-precision"? Could it just mean that the written representation is "decimal" as opposed to "octal" or "hexadecimal", but that the internal form is still binary? > a = 1234.567d > is simpler than: [reworded to be even shorter per use] >>> from decimal import Decimal as d >>> a = d('1234.5678') but if you really have enough Decimal literals for the difference to matter, you could always write your own helper function. 
>>> # pretend to be using the European decimal point
>>> a = d(1234,5678)

>>> # maps easily to the tuple-format constructor
>>> a = d(12345678, -4)

My own hunch is that until Decimal is used enough that people start
putting this sort of constructor into their personal libraries, it
probably doesn't need a literal.

-jJ

From qrczak at knm.org.pl Fri Dec 5 21:33:59 2008
From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk)
Date: Fri, 5 Dec 2008 21:33:59 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: 
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <3f4107910812051233o5c8c1c79k26967aee09d73b6@mail.gmail.com>

C# uses m (or M) as decimal suffix. Mnemonic: money.

-- 
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/

From bruce at leapyear.org Fri Dec 5 23:02:17 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 5 Dec 2008 14:02:17 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: 
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: 

There is a representation for decimal literals that nicely avoids the
problem of remembering that 0d is decimal and 0m is meters etc.:

>>> import decimal
>>> decimal.Decimal(3)
Decimal("3")
>>> Decimal("3")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'Decimal' is not defined

The error points out that I really need to do both:

>>> import decimal
>>> from decimal import Decimal

and I'd prefer the single import do both. Note that this anomaly of
repr is not limited to decimal as I think this is a bit worse:

>>> float('nan')
nan
>>> float('inf')
inf

--- Bruce

From clp at rebertia.com Fri Dec 5 23:52:48 2008
From: clp at rebertia.com (Chris Rebert)
Date: Fri, 5 Dec 2008 14:52:48 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: 
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <47c890dc0812051452n5c1d028eofdfb7aef0d549e12@mail.gmail.com>

On Fri, Dec 5, 2008 at 2:02 PM, Bruce Leban wrote:
> There is a representation for decimal literals that nicely avoids the
> problem of remembering that 0d is decimal and 0m is meters etc.:
>
> >>> import decimal
> >>> decimal.Decimal(3)
> Decimal("3")
> >>> Decimal("3")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'Decimal' is not defined
>
> The error points out that I really need to do both:
>
> >>> import decimal
> >>> from decimal import Decimal

You only need the second line there. The first line is unnecessary and
does not affect the second.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com

From idadesub at users.sourceforge.net Sun Dec 7 05:13:07 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Sat, 6 Dec 2008 20:13:07 -0800
Subject: [Python-ideas] Anyone interested in zsh-style subpattern matching for fnmatch/glob?
Message-ID: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com>

My project needs to extend fnmatch to support zsh-style globbing,
where you can use brackets to designate subexpressions.
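
The idea of {a,b} subpatterns can also be illustrated without touching fnmatch itself, by expanding the alternatives first and then matching each expanded pattern. The sketch below is not the approach taken by the patch in this message, which translates the braces into a regular expression; expand_braces and brace_match are hypothetical helper names, and nested braces are not handled properly.

```python
import fnmatch


def expand_braces(pattern):
    """Expand each {a,b,...} group into separate plain patterns.

    Hypothetical helper: handles flat groups only, and leaves an
    unclosed '{' in the pattern as a literal character.
    """
    start = pattern.find('{')
    end = pattern.find('}', start)
    if start == -1 or end == -1:
        return [pattern]
    head = pattern[:start]
    body = pattern[start + 1:end]
    tail = pattern[end + 1:]
    results = []
    for alternative in body.split(','):
        # Recurse so that later groups in the tail are expanded as well.
        results.extend(expand_braces(head + alternative + tail))
    return results


def brace_match(name, pattern):
    """Return True if name matches any brace-expanded form of pattern."""
    return any(fnmatch.fnmatchcase(name, p) for p in expand_braces(pattern))


print(expand_braces('foo.ext{1,2}'))
print(brace_match('bar/foo.ext2', '{foo,bar}/foo.{ext*}'))
```

Expanding first is simple but produces one pattern per combination of alternatives, so a regex-based translation scales better for patterns with many groups.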
Say you had a directory structure like this: foo/ foo.ext1 foo.ext2 bar/ foo.ext1 foo.ext2 The subexpressions will let you do patterns like this: >>> glob.glob('foo/foo.{ext1,ext2}') ['foo/foo.ext1', 'foo/foo.ext2'] >>> glob.glob('foo/foo.ext{1,2}') ['foo/foo.ext1', 'foo/foo.ext2'] >>> glob.glob('{foo,bar}') ['bar', 'foo'] >>> glob.glob('{foo,bar}/foo*') ['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2'] >>> glob.glob('{foo,bar}/foo.{ext*}') ['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2'] >>> glob.glob('{f?o,b?r}/foo.{ext*}') ['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2'] Would this be interesting to anyone else? It would unfortunately break fnmatch since it currently would ignore with {} in it. It'd be easy to work around that by adding a flag or using a different function name. Anyway, here's the patch against the head of py3k. -e Index: Lib/glob.py =================================================================== --- Lib/glob.py (revision 67629) +++ Lib/glob.py (working copy) @@ -72,8 +72,8 @@ return [] -magic_check = re.compile('[*?[]') -magic_check_bytes = re.compile(b'[*?[]') +magic_check = re.compile('[*?[{]') +magic_check_bytes = re.compile(b'[*?[{]') def has_magic(s): if isinstance(s, bytes): Index: Lib/fnmatch.py =================================================================== --- Lib/fnmatch.py (revision 67629) +++ Lib/fnmatch.py (working copy) @@ -22,10 +22,11 @@ Patterns are Unix shell style: - * matches everything - ? matches any single character - [seq] matches any character in seq - [!seq] matches any char not in seq + * matches everything + ? matches any single character + [seq] matches any character in seq + [!seq] matches any char not in seq + {pat1,pat2} matches subpattern pat1 or subpattern pat2 An initial period in FILENAME is not special. Both FILENAME and PATTERN are first case-normalized @@ -84,10 +85,15 @@ There is no way to quote meta-characters. 
""" - i, n = 0, len(pat) + return _translate(0, pat, '')[2] + '$' + +def _translate(i, pat, end): res = '' + n = len(pat) while i < n: c = pat[i] + if c in end: + return i, c, res i = i+1 if c == '*': res = res + '.*' @@ -111,6 +117,27 @@ elif stuff[0] == '^': stuff = '\\' + stuff res = '%s[%s]' % (res, stuff) + elif c == '{': + i, sub = _translate_subexpression(i, pat) + res += sub else: res = res + re.escape(c) - return res + "$" + return i, '', res + +def _translate_subexpression(i, pat): + j = i + subexpressions = [] + while True: + j, c, res = _translate(j, pat, ',}') + subexpressions.append(res) + + if c == ',': + j += 1 + elif c == '}': + j += 1 + break + else: + # turns out we didn't have a subpattern + return j, '{' + ','.join(subexpressions) + + return j, '(' + '|'.join(subexpressions) + ')' Index: Lib/test/test_fnmatch.py =================================================================== --- Lib/test/test_fnmatch.py (revision 67629) +++ Lib/test/test_fnmatch.py (working copy) @@ -37,6 +37,12 @@ check('a', r'[!\]') check('\\', r'[!\]', 0) + check('abcdefghi', 'ab{cd,12*}ef{gh?,34}') + check('ab1234ef34', 'ab{cd,12*}ef{gh?,34}') + + check('abcdefgh', 'ab{cd,12*}ef{gh?,34}', 0) + check('ab1234ef345', 'ab{cd,12*}ef{gh?,34}', 0) + def test_mix_bytes_str(self): self.assertRaises(TypeError, fnmatch, 'test', b'*') self.assertRaises(TypeError, fnmatch, b'test', '*') Index: Lib/test/test_glob.py =================================================================== --- Lib/test/test_glob.py (revision 67629) +++ Lib/test/test_glob.py (working copy) @@ -69,6 +69,7 @@ eq(self.glob('aa?'), map(self.norm, ['aaa', 'aab'])) eq(self.glob('aa[ab]'), map(self.norm, ['aaa', 'aab'])) eq(self.glob('*q'), []) + eq(self.glob('a{?a,?b}'), map(self.norm, ['aaa', 'aab'])) def test_glob_nested_directory(self): eq = self.assertSequencesEqual_noorder @@ -89,6 +90,9 @@ [self.norm('a', 'bcd', 'efg', 'ha')]) eq(self.glob('?a?', '*F'), map(self.norm, [os.path.join('aaa', 'zzzF'), 
os.path.join('aab', 'F')])) + eq(self.glob('a', 'b{c,x}d', '{*}', '*a'), + [self.norm('a', 'bcd', 'efg', 'ha')]) + eq(self.glob('a', 'b{x,y}d', '{*}', '*a'), []) def test_glob_directory_with_trailing_slash(self): # We are verifying that when there is wildcard pattern which From greg at krypto.org Sun Dec 7 05:46:47 2008 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 6 Dec 2008 20:46:47 -0800 Subject: [Python-ideas] Anyone interested in zsh-style subpattern matching for fnmatch/glob? In-Reply-To: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com> References: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com> Message-ID: <52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com> This looks useful. Please post it as a feature request issue with patch on bugs.python.org. Also, if you could include updates to the fnmatch documentation to describe exactly what your code allows that would help. thanks, -Greg On Sat, Dec 6, 2008 at 8:13 PM, Erick Tryzelaar < idadesub at users.sourceforge.net> wrote: > My project needs to extend fnmatch to support zsh-style globbing, > where you can use brackets to designate subexpressions. Say you had a > directory structure like this: > > foo/ > foo.ext1 > foo.ext2 > bar/ > foo.ext1 > foo.ext2 > > The subexpressions will let you do patterns like this: > > >>> glob.glob('foo/foo.{ext1,ext2}') > ['foo/foo.ext1', 'foo/foo.ext2'] > >>> glob.glob('foo/foo.ext{1,2}') > ['foo/foo.ext1', 'foo/foo.ext2'] > >>> glob.glob('{foo,bar}') > ['bar', 'foo'] > >>> glob.glob('{foo,bar}/foo*') > ['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2'] > >>> glob.glob('{foo,bar}/foo.{ext*}') > ['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2'] > >>> glob.glob('{f?o,b?r}/foo.{ext*}') > ['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2'] > > > Would this be interesting to anyone else? It would unfortunately break > fnmatch since it currently would ignore with {} in it. 
It'd be easy to > work around that by adding a flag or using a different function name. > Anyway, here's the patch against the head of py3k. > > -e > > > > Index: Lib/glob.py > =================================================================== > --- Lib/glob.py (revision 67629) > +++ Lib/glob.py (working copy) > @@ -72,8 +72,8 @@ > return [] > > > -magic_check = re.compile('[*?[]') > -magic_check_bytes = re.compile(b'[*?[]') > +magic_check = re.compile('[*?[{]') > +magic_check_bytes = re.compile(b'[*?[{]') > > def has_magic(s): > if isinstance(s, bytes): > Index: Lib/fnmatch.py > =================================================================== > --- Lib/fnmatch.py (revision 67629) > +++ Lib/fnmatch.py (working copy) > @@ -22,10 +22,11 @@ > > Patterns are Unix shell style: > > - * matches everything > - ? matches any single character > - [seq] matches any character in seq > - [!seq] matches any char not in seq > + * matches everything > + ? matches any single character > + [seq] matches any character in seq > + [!seq] matches any char not in seq > + {pat1,pat2} matches subpattern pat1 or subpattern pat2 > > An initial period in FILENAME is not special. > Both FILENAME and PATTERN are first case-normalized > @@ -84,10 +85,15 @@ > There is no way to quote meta-characters. 
> """ > > - i, n = 0, len(pat) > + return _translate(0, pat, '')[2] + '$' > + > +def _translate(i, pat, end): > res = '' > + n = len(pat) > while i < n: > c = pat[i] > + if c in end: > + return i, c, res > i = i+1 > if c == '*': > res = res + '.*' > @@ -111,6 +117,27 @@ > elif stuff[0] == '^': > stuff = '\\' + stuff > res = '%s[%s]' % (res, stuff) > + elif c == '{': > + i, sub = _translate_subexpression(i, pat) > + res += sub > else: > res = res + re.escape(c) > - return res + "$" > + return i, '', res > + > +def _translate_subexpression(i, pat): > + j = i > + subexpressions = [] > + while True: > + j, c, res = _translate(j, pat, ',}') > + subexpressions.append(res) > + > + if c == ',': > + j += 1 > + elif c == '}': > + j += 1 > + break > + else: > + # turns out we didn't have a subpattern > + return j, '{' + ','.join(subexpressions) > + > + return j, '(' + '|'.join(subexpressions) + ')' > Index: Lib/test/test_fnmatch.py > =================================================================== > --- Lib/test/test_fnmatch.py (revision 67629) > +++ Lib/test/test_fnmatch.py (working copy) > @@ -37,6 +37,12 @@ > check('a', r'[!\]') > check('\\', r'[!\]', 0) > > + check('abcdefghi', 'ab{cd,12*}ef{gh?,34}') > + check('ab1234ef34', 'ab{cd,12*}ef{gh?,34}') > + > + check('abcdefgh', 'ab{cd,12*}ef{gh?,34}', 0) > + check('ab1234ef345', 'ab{cd,12*}ef{gh?,34}', 0) > + > def test_mix_bytes_str(self): > self.assertRaises(TypeError, fnmatch, 'test', b'*') > self.assertRaises(TypeError, fnmatch, b'test', '*') > Index: Lib/test/test_glob.py > =================================================================== > --- Lib/test/test_glob.py (revision 67629) > +++ Lib/test/test_glob.py (working copy) > @@ -69,6 +69,7 @@ > eq(self.glob('aa?'), map(self.norm, ['aaa', 'aab'])) > eq(self.glob('aa[ab]'), map(self.norm, ['aaa', 'aab'])) > eq(self.glob('*q'), []) > + eq(self.glob('a{?a,?b}'), map(self.norm, ['aaa', 'aab'])) > > def test_glob_nested_directory(self): > eq = 
self.assertSequencesEqual_noorder > @@ -89,6 +90,9 @@ > [self.norm('a', 'bcd', 'efg', 'ha')]) > eq(self.glob('?a?', '*F'), map(self.norm, [os.path.join('aaa', > 'zzzF'), > os.path.join('aab', > 'F')])) > + eq(self.glob('a', 'b{c,x}d', '{*}', '*a'), > + [self.norm('a', 'bcd', 'efg', 'ha')]) > + eq(self.glob('a', 'b{x,y}d', '{*}', '*a'), []) > > def test_glob_directory_with_trailing_slash(self): > # We are verifying that when there is wildcard pattern which > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From idadesub at users.sourceforge.net Sun Dec 7 09:19:01 2008 From: idadesub at users.sourceforge.net (Erick Tryzelaar) Date: Sun, 7 Dec 2008 00:19:01 -0800 Subject: [Python-ideas] Anyone interested in zsh-style subpattern matching for fnmatch/glob? In-Reply-To: <52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com> References: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com> <52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com> Message-ID: <1ef034530812070019q27df5672mec1cbc8bb6a4f7b9@mail.gmail.com> On Sat, Dec 6, 2008 at 8:46 PM, Gregory P. Smith wrote: > This looks useful. > > Please post it as a feature request issue with patch on bugs.python.org. > Also, if you could include updates to the fnmatch documentation to describe > exactly what your code allows that would help. Thanks Greg. I've made issue4573 to track this. From ironfroggy at gmail.com Mon Dec 8 01:05:14 2008 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sun, 7 Dec 2008 19:05:14 -0500 Subject: [Python-ideas] Anyone interested in zsh-style subpattern matching for fnmatch/glob? 
In-Reply-To: <1ef034530812070019q27df5672mec1cbc8bb6a4f7b9@mail.gmail.com> References: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com> <52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com> <1ef034530812070019q27df5672mec1cbc8bb6a4f7b9@mail.gmail.com> Message-ID: <76fd5acf0812071605t4007a3f3j1ba735de6d454224@mail.gmail.com> Backported to 2.7 as I think it is just applicable there. On Sun, Dec 7, 2008 at 3:19 AM, Erick Tryzelaar wrote: > On Sat, Dec 6, 2008 at 8:46 PM, Gregory P. Smith wrote: >> This looks useful. >> >> Please post it as a feature request issue with patch on bugs.python.org. >> Also, if you could include updates to the fnmatch documentation to describe >> exactly what your code allows that would help. > > Thanks Greg. I've made issue4573 to track this. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From clp at rebertia.com Mon Dec 8 06:04:07 2008 From: clp at rebertia.com (Chris Rebert) Date: Sun, 7 Dec 2008 21:04:07 -0800 Subject: [Python-ideas] Decimal literal? 
In-Reply-To: <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com> References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com> <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1> <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1> <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com> Message-ID: <47c890dc0812072104r1bbf3201p147b7a746ee7b7da@mail.gmail.com> Ok, so just to summarize should anyone bring up this same issue again and come upon this thread: * Decimal literals are a possibly good idea, but a fast C implementation with a Python-compatible license (which as of this writing does not yet exist) would be a necessary prerequisite * Making decimals the default instead of floats would be controversial to say the least and would definitely require further analysis+discussion Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com On Thu, Dec 4, 2008 at 3:02 AM, Chris Rebert wrote: > On Thu, Dec 4, 2008 at 2:50 AM, Raymond Hettinger wrote: >> From: "Raymond Hettinger" >>> >>> Last time I looked, the existing C implementations out there were license >>> compatible with Python. >> >> That should have said "incompatible". >> > > decNumber is available under the ICU License, which seems to be a > variant of the original BSD license. Depending on exactly how the > acknowledgement clause is interpreted (IANAL), it seems like it might > be compatible. If not, IBM, which has copyright on decNumber, seems to > have a fairly pro-open-source stance historically; perhaps if asked > nicely by the community, they would be willing to relicense decNumber > under the revised BSD license (a very minor change vs. the ICU > License), which would certainly be compatible with Python's licensing > policy. > > Or maybe there exists another library that's already compatible. > Perhaps I'll investigate. > > But the key here is we should first determine whether people want > decimal to be built-in and have a literal. 
Once that's established, > then the details as to implementing that should be investigated. But > yes, practicality and feasibility certainly are factors in all this. > > Cheers, > Chris > > -- > Follow the path of the Iguana... > http://rebertia.com > From skip at pobox.com Thu Dec 11 15:18:32 2008 From: skip at pobox.com (skip at pobox.com) Date: Thu, 11 Dec 2008 08:18:32 -0600 Subject: [Python-ideas] This seems like a wart to me... Message-ID: <18753.8504.116845.736633@montanaro-dyndns-org.local> Python 2 and 3 both exhibit this behavior: >>> "".split() [] >>> "".split("*") [''] >>> "".split(" ") [''] It's not at all clear to me why splitting an empty string on implicit whitespace should yield an empty list but splitting it with a non-whitespace character or explicit whitespace should yield a list with an empty string as its lone element. I realize this is documented behavior, but I can't for the life of me understand what the rationale might be for the different behaviors. Seems like a wart which might best be removed sometime in 3.x. Skip From guido at python.org Thu Dec 11 16:51:38 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 11 Dec 2008 07:51:38 -0800 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <18753.8504.116845.736633@montanaro-dyndns-org.local> References: <18753.8504.116845.736633@montanaro-dyndns-org.local> Message-ID: On Thu, Dec 11, 2008 at 6:18 AM, wrote: > Python 2 and 3 both exhibit this behavior: > > >>> "".split() > [] > >>> "".split("*") > [''] > >>> "".split(" ") > [''] > > It's not at all clear to me why splitting an empty string on implicit > whitespace should yield an empty list but splitting it with a non-whitespace > character or explicit whitespace should yield a list with an empty string as > its lone element. I realize this is documented behavior, but I can't for > the life of me understand what the rationale might be for the different > behaviors. 
Seems like a wart which might best be removed sometime in 3.x. Which of the two would you choose for all? The empty string is the only reasonable behavior for split-with-argument, it is the logical consequence of how it behaves when the string is not empty. E.g. "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", "y"], ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't behave this way; it extracts the non-empty non-whitespace-containing substrings. If anything it's wrong, it's that they share the same name. This wasn't always the case. Do you really want to go back to .split() and .splitfields(sep)? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Dec 11 17:36:11 2008 From: skip at pobox.com (skip at pobox.com) Date: Thu, 11 Dec 2008 10:36:11 -0600 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> Message-ID: <18753.16763.980209.628297@montanaro-dyndns-org.local> Guido> Which of the two would you choose for all? The empty string is the Guido> only reasonable behavior for split-with-argument, it is the logical Guido> consequence of how it behaves when the string is not empty. E.g. Guido> "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", "y"], Guido> ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't behave Guido> this way; it extracts the non-empty non-whitespace-containing Guido> substrings. In my feeble way of thinking I go from something which evaluates to false to something which doesn't. It's almost like making matter out of empty space: bool("") -> False bool("".split()) -> False bool("".split("n")) -> True Guido> If anything it's wrong, it's that they share the same name. This Guido> wasn't always the case. Do you really want to go back to .split() Guido> and .splitfields(sep)? That might be preferable. 
The same method having such strikingly different
behavior throws me every time I try splitting a possibly empty string with a
non-whitespace character. It's a relatively uncommon case. Most of the
time when you split a string with a non-whitespace character I think you
know that the input can't be empty.

Skip

From tjreedy at udel.edu Thu Dec 11 20:55:13 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 11 Dec 2008 14:55:13 -0500
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
Message-ID:

Guido van Rossum wrote:
> On Thu, Dec 11, 2008 at 6:18 AM, wrote:

> If anything it's wrong, it's that they share the same name. This
> wasn't always the case. Do you really want to go back to .split() and
> .splitfields(sep)?

I hope not. I consider the current situation to be a definite
improvement. I sometimes forgot which was which.

From matt.horizon5 at gmail.com Thu Dec 11 21:11:41 2008
From: matt.horizon5 at gmail.com (Matthew Russell)
Date: Thu, 11 Dec 2008 20:11:41 +0000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
Message-ID: <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com>

It seems to me like splitting an empty string is something that makes little
sense to do, similar to dividing by zero in terms of an analogy.

How about str.split, partition and friends just raise a ValueError
when the value is the empty string?

Regards,
Matt

2008/12/11 Terry Reedy

> Guido van Rossum wrote:
>
>> On Thu, Dec 11, 2008 at 6:18 AM, wrote:
>>
>
>> If anything it's wrong, it's that they share the same name. This
>> wasn't always the case. Do you really want to go back to .split() and
>> .splitfields(sep)?
>>
>
> I hope not. I consider the current situation to be a definite improvement.
> I sometimes forgot which was which.
From guido at python.org Thu Dec 11 21:14:20 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 11 Dec 2008 12:14:20 -0800 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com> References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com> Message-ID: On Thu, Dec 11, 2008 at 12:11 PM, Matthew Russell wrote: > It seems to me like spliting an empty string is something that makes little > sense to do, > similar to dividing by zero in terms of an analogy. I guess you have never had the need. Let me assure you that you are mistaken. :-) > How about str.split, partition and friends just raise ValueError exception > when the value is the empty string? Absolutely not. > Regards, > Matt > > 2008/12/11 Terry Reedy >> >> Guido van Rossum wrote: >>> >>> On Thu, Dec 11, 2008 at 6:18 AM, wrote: >> >>> If anything it's wrong, it's that they share the same name. This >>> wasn't always the case. Do you really want to go back to .split() and >>> .splitfields(sep)? >> >> I hope not. I consider the current situation to be a definite >> improvement. I sometimes forgot which was which.
>> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bruce at leapyear.org Thu Dec 11 21:33:39 2008 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 11 Dec 2008 12:33:39 -0800 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com> Message-ID: I think splitting an empty string is more like dividing zero in half. No one expects that to raise a value exception. --- Bruce On Thu, Dec 11, 2008 at 12:14 PM, Guido van Rossum wrote: > On Thu, Dec 11, 2008 at 12:11 PM, Matthew Russell > wrote: > > It seems to me like spliting an empty string is something that makes > little > > sense to do, > > similar to dividing by zero in terms of an analogy. > > I guess you have never had the need. Let me assure you that you are > mistaken. :-) > > > How about str.split, partition and friends just raise ValueError > exception > > when the value is the empty string? > > Absolutely not. > > > Regards, > > Matt > > > > 2008/12/11 Terry Reedy > >> > >> Guido van Rossum wrote: > >>> > >>> On Thu, Dec 11, 2008 at 6:18 AM, wrote: > >> > >>> If anything it's wrong, it's that they share the same name. This > >>> wasn't always the case. Do you really want to go back to .split() and > >>> .splitfields(sep)? > >> > >> I hope not. I consider the current situation to be a definite > >> improvement. I sometimes forgot which was which. 
> >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.horizon5 at gmail.com Thu Dec 11 21:42:28 2008 From: matt.horizon5 at gmail.com (Matthew Russell) Date: Thu, 11 Dec 2008 20:42:28 +0000 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com> Message-ID: <3b5110850812111242k1ad67dc8m2b3802fd646c2214@mail.gmail.com> Sorry for wasting your brain power - I just read through some of my code and realised how stupid an idea it really was... I haven't even got the excuse of being remotely new to the language (or programming in general come to think of it) lesson(self): dont_post_at_end_of_day_without(thinking_a_lot_first) sorrly-mistaken-and-embarrassed Matt 2008/12/11 Bruce Leban > I think splitting an empty string is more like dividing zero in half. No > one expects that to raise a value exception. > > --- Bruce > > > On Thu, Dec 11, 2008 at 12:14 PM, Guido van Rossum wrote: > >> On Thu, Dec 11, 2008 at 12:11 PM, Matthew Russell >> wrote: >> > It seems to me like spliting an empty string is something that makes >> little >> > sense to do, >> > similar to dividing by zero in terms of an analogy. >> >> I guess you have never had the need. Let me assure you that you are >> mistaken. 
:-) >> >> > How about str.split, partition and friends just raise ValueError >> exception >> > when the value is the empty string? >> >> Absolutely not. >> >> > Regards, >> > Matt >> > >> > 2008/12/11 Terry Reedy >> >> >> >> Guido van Rossum wrote: >> >>> >> >>> On Thu, Dec 11, 2008 at 6:18 AM, wrote: >> >> >> >>> If anything it's wrong, it's that they share the same name. This >> >>> wasn't always the case. Do you really want to go back to .split() and >> >>> .splitfields(sep)? >> >> >> >> I hope not. I consider the current situation to be a definite >> >> improvement. I sometimes forgot which was which. >> >> >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> http://mail.python.org/mailman/listinfo/python-ideas >> > >> > >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > http://mail.python.org/mailman/listinfo/python-ideas >> > >> > >> >> >> >> -- >> --Guido van Rossum (home page: http://www.python.org/~guido/ >> ) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > -- Cheers, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From rrr at ronadam.com Fri Dec 12 00:58:38 2008 From: rrr at ronadam.com (Ron Adam) Date: Thu, 11 Dec 2008 17:58:38 -0600 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <18753.16763.980209.628297@montanaro-dyndns-org.local> References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: skip at pobox.com wrote: > Guido> Which of the two would you choose for all? The empty string is the > Guido> only reasonable behavior for split-with-argument, it is the logical > Guido> consequence of how it behaves when the string is not empty. E.g. 
> Guido> "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", "y"], > Guido> ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't behave > Guido> this way; it extracts the non-empty non-whitespace-containing > Guido> substrings. > > In my feeble way of thinking I go from something which evaluates to false to > something which doesn't. It's almost like making matter out of empty space: > > bool("") -> False > bool("".split()) -> False > bool("".split("n")) -> True > > Guido> If anything it's wrong, it's that they share the same name. This > Guido> wasn't always the case. Do you really want to go back to .split() > Guido> and .splitfields(sep)? > > That might be preferable. The same method having such strikingly different > behavior throws me every time I try splitting a possibly empty string with a > non-whitespace character. It's a relatively uncommon case. Most of the > time when you split a string with a non-whitespace character I think you > know that the input can't be empty. > > Skip It looks like there are several behaviors involved in split, and you want to split those behaviors out. Behaviors of string split: 1. Split on white space chrs by giving no argument. This has the effect of splitting on multiple characters. Strings with multiple white space characters are not multiply split. >>> ' '.split() [] >>> ' \t\n'.split() [] 2. Split on word by giving an argument. (A word can be one char.) In this case, the split is strict and does not combine/remove null string results. >>> ' '.split(' ') ['', '', '', '', '', '', '', ''] >>> ' \t\n'.split(' ') ['', '\t\n'] There doesn't seem to be an obvious way to split on different characters. 
A new-to-Python programmer might try:

>>> '1 (123) 456-7890'.split(' ()-')
['1 (123) 456-7890']

Expecting: ['1', '123', '456', '7890']

>>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

When I needed to split on multiple chars other than the default
whitespace, I have used .replace() to replace the different splitting
characters with a single char sequence which I could then split on.

It might be nice to have a .splitonchars() version of split with the
default being whitespace chars, and an argument to specify other multiple
characters to split on.

The other behavior could be called .splitonwords(arg). The .splitonwords()
method could possibly also accept a list of words.

That would let the current .split() behavior stay as it is, so it would
not break current code.

Alternatively, these could be functions in the string module. In that
case the current .split() could just continue to exist as is.

I find the name 'splitfields' to not be as intuitive as 'splitonwords' and
'splitonchars'. While both of those require more letters to type than
split, they are more readable, and when you do need the capability of
splitting on more than one char or word, they are far shorter and less
prone to errors than rolling your own function.

Ron

From bruce at leapyear.org Fri Dec 12 01:23:52 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 11 Dec 2008 16:23:52 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local>
Message-ID:

I think string.split(list) probably won't do what people expect either.
Here's what I would expect it to do: >>> '1 (123) 456-7890'.split([' ', '(', ')', '-']) ['1', '', '123', '', '456', '7890'] but what you probably want is: >>>re.split(r'[ ()-]*', '1 (123) 456-7890') ['1', '123', '456', '7890'] using allows you to do that and avoids ambiguity about what it does. --- Bruce On Thu, Dec 11, 2008 at 3:58 PM, Ron Adam wrote: > > > skip at pobox.com wrote: > >> Guido> Which of the two would you choose for all? The empty string is >> the >> Guido> only reasonable behavior for split-with-argument, it is the >> logical >> Guido> consequence of how it behaves when the string is not empty. E.g. >> Guido> "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", >> "y"], >> Guido> ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't >> behave >> Guido> this way; it extracts the non-empty non-whitespace-containing >> Guido> substrings. >> >> In my feeble way of thinking I go from something which evaluates to false >> to >> something which doesn't. It's almost like making matter out of empty >> space: >> >> bool("") -> False >> bool("".split()) -> False >> bool("".split("n")) -> True >> >> Guido> If anything it's wrong, it's that they share the same name. This >> Guido> wasn't always the case. Do you really want to go back to >> .split() >> Guido> and .splitfields(sep)? >> >> That might be preferable. The same method having such strikingly >> different >> behavior throws me every time I try splitting a possibly empty string with >> a >> non-whitespace character. It's a relatively uncommon case. Most of the >> time when you split a string with a non-whitespace character I think you >> know that the input can't be empty. >> >> Skip >> > > > It looks like there are several behaviors involved in split, and you want > to split those behaviors out. > > > > Behaviors of string split: > > > 1. Split on white space chrs by giving no argument. > > This has the effect of splitting on multiple characters. 
Strings with > multiple white space characters are not multiply split. > > >>> ' '.split() > [] > >>> ' \t\n'.split() > [] > > > > 2. Split on word by giving an argument. (A word can be one char.) > > In this case, the split is strict and does not combine/remove null string > results. > > >>> ' '.split(' ') > ['', '', '', '', '', '', '', ''] > >>> ' \t\n'.split(' ') > ['', '\t\n'] > > > There doesn't seem to be an obvious way to split on different characters. > > > A new to python programmer might try: > > >>> '1 (123) 456-7890'.split(' ()-') > ['1 (123) 456-7890'] > > Expecting: ['1', '123', '456', '7890'] > > > >>> '1 (123) 456-7890'.split([' ', '(', ')', '-']) > Traceback (most recent call last): > File "", line 1, in > TypeError: expected a character buffer object > > > When I needed to split on multiple chars other than the default white > space, I have used .replace() to replace different splitting character with > one single char sequence which I could then split on. > > > It might be nice to have a .splitonchars() version of split with the > default being whitespace chars, and an argument to specify other multiple > characters to split on. > > The other behavior could be called .splitonwords(arg). The .splitonwords() > method could possibly also accept a list of words. > > > That leaves the possibility to leave the current .split() behavior alone > and would not break current code. > > And alternately these could be functions in the string module. In that > case the current .split() could just continue to exist as is. > > I find the name 'splitfields' to not be as intuitive as 'splitonwords' and > 'splitonchars'. While both of those require more letters to type than > split, they are more readable, and when you do need the capability of > splitting on more than one char or word, they are far shorter and less prone > to errors than rolling your own function. 
> > Ron > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhamph at gmail.com Fri Dec 12 01:18:24 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 11 Dec 2008 17:18:24 -0700 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: On Thu, Dec 11, 2008 at 4:58 PM, Ron Adam wrote: > There doesn't seem to be an obvious way to split on different characters. > > > A new to python programmer might try: > >>>> '1 (123) 456-7890'.split(' ()-') > ['1 (123) 456-7890'] > > Expecting: ['1', '123', '456', '7890'] > > >>>> '1 (123) 456-7890'.split([' ', '(', ')', '-']) > Traceback (most recent call last): > File "", line 1, in > TypeError: expected a character buffer object >>> re.split('[ ()-]', '1 (123) 456-7890') ['1', '', '123', '', '456', '7890'] >>> re.split('[ ()-]+', '1 (123) 456-7890') ['1', '123', '456', '7890'] str.split() handles the simplest, most common cases. Let's not clutter it up with a bad[1] impersonation of regex. [1] And if you thought regex was ugly enough to begin with... -- Adam Olsen, aka Rhamphoryncus From rrr at ronadam.com Fri Dec 12 01:38:05 2008 From: rrr at ronadam.com (Ron Adam) Date: Thu, 11 Dec 2008 18:38:05 -0600 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: Bruce Leban wrote: > I think string.split(list) probably won't do what people expect either. 
> Here's what I would expect it to do:
>
> >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
> ['1', '', '123', '', '456', '7890']
>
> but what you probably want is:
>
> >>>re.split(r'[ ()-]*', '1 (123) 456-7890')
> ['1', '123', '456', '7890']
>
> using allows you to do that and avoids ambiguity about what it does.
>
> --- Bruce

Without getting into regular expressions, it's easier to just allow
adjacent char matches to act as one match so the following is true.

longstring.splitchars(string.whitespace) == longstring.split()

From rrr at ronadam.com Fri Dec 12 02:12:15 2008
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 11 Dec 2008 19:12:15 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local>
Message-ID:

Adam Olsen wrote:
> On Thu, Dec 11, 2008 at 4:58 PM, Ron Adam wrote:
>> There doesn't seem to be an obvious way to split on different characters.
>>
>>
>> A new to python programmer might try:
>>
>> >>> '1 (123) 456-7890'.split(' ()-')
>> ['1 (123) 456-7890']
>>
>> Expecting: ['1', '123', '456', '7890']
>>
>>
>> >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: expected a character buffer object
>
> >>> re.split('[ ()-]', '1 (123) 456-7890')
> ['1', '', '123', '', '456', '7890']
> >>> re.split('[ ()-]+', '1 (123) 456-7890')
> ['1', '123', '456', '7890']
>
> str.split() handles the simplest, most common cases. Let's not
> clutter it up with a bad[1] impersonation of regex.
>
>
> [1] And if you thought regex was ugly enough to begin with...

These examples were just what a "new" programmer might attempt. I have a
feeling that most new programmers do not attempt regular expressions,
i.e. the re module, until sometime after they have learned the basics of
Python.
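
As a rough illustration of the behaviors discussed in this thread (a sketch
only, not a proposed patch): the asserts below restate the str.split()
results quoted above, and split_on_chars is a hypothetical helper name,
invented here, that approximates the .splitchars() idea with re.split. It
uses '+' in the character class on the assumption that adjacent separator
characters should act as one match, the way argument-less split() treats
whitespace.

```python
import re
import string

# No argument: runs of whitespace are collapsed and empty results dropped.
assert "".split() == []
assert "  a \t b\n".split() == ["a", "b"]

# Explicit separator: every occurrence of the separator delimits a field,
# so empty fields are kept.
assert "".split(":") == [""]
assert "x::y".split(":") == ["x", "", "y"]
assert ":".split(":") == ["", ""]


def split_on_chars(text, chars=string.whitespace):
    """Hypothetical helper: split text on runs of any character in chars.

    Empty fields are dropped, mirroring argument-less str.split().
    """
    if not chars:
        raise ValueError("chars must not be empty")
    # re.escape keeps characters such as '-' and ')' literal inside the class.
    pattern = "[" + re.escape(chars) + "]+"
    return [field for field in re.split(pattern, text) if field]


print(split_on_chars("1 (123) 456-7890", " ()-"))
# prints ['1', '123', '456', '7890']
```

Dropping the empty fields is a design choice in this sketch: it mirrors
argument-less split(), whereas split(sep) semantics would keep them.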
Ron From bruce at leapyear.org Fri Dec 12 02:08:24 2008 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 11 Dec 2008 17:08:24 -0800 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: -inf That breaks existing code in two different ways which I don't think makes it easy. it does NOT collapse adjacent characters: >>> "a&&b".split("&") ['a', '', 'b'] the separator it splits on is a string, not a character: >>> "ad".split("><") ['ad'] --- Bruce On Thu, Dec 11, 2008 at 4:38 PM, Ron Adam wrote: > > > Bruce Leban wrote: > >> I think string.split(list) probably won't do what people expect either. >> Here's what I would expect it to do: >> >> >>> '1 (123) 456-7890'.split([' ', '(', ')', '-']) >> ['1', '', '123', '', '456', '7890'] >> >> but what you probably want is: >> >> >>>re.split(r'[ ()-]*', '1 (123) 456-7890') >> ['1', '123', '456', '7890'] >> >> using allows you to do that and avoids ambiguity about what it does. >> >> --- Bruce >> > > Without getting into regular expressions, it's easier to just allow > adjacent char matches to act as one match so the following is true. > > longstring.splitchars(string.whitespace) = longstring.split() > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Dec 12 02:55:26 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 12 Dec 2008 14:55:26 +1300 Subject: [Python-ideas] This seems like a wart to me... 
In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: <4941C48E.50005@canterbury.ac.nz> Ron Adam wrote: > There doesn't seem to be an obvious way to split on different characters. Remember there's always re.split for the less common use cases. -- Greg From rrr at ronadam.com Fri Dec 12 03:20:51 2008 From: rrr at ronadam.com (Ron Adam) Date: Thu, 11 Dec 2008 20:20:51 -0600 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: Bruce Leban wrote: > -inf > > That breaks existing code in two different ways which I don't think > makes it easy. Correct, it would break existing code. Which is why it should have a different name rather than altering the existing split function. > it does NOT collapse adjacent characters: > >>> "a&&b".split("&") > ['a', '', 'b'] Also correct. But that is the behavior when splitting on the default white space. ie.. split() with no argument. ' '.split() is not the same as ' '.split(' '). Q: Would it be good to have a new method or function which extends the same behavior of whitespace splitting to other user specified characters? I would find it useful at times. > the separator it splits on is a string, not a character: > >>> "ad".split("><") > ['ad'] Yes, I know. To split on multiple chars in a given argument string it will need to be called something other than .split(). Such as .splitchars(), as in the example equality I gave. longstring.splitchars(string.whitespace) == longstring.split() Note: longstring.split() has no arguments. .split(arg) splits on a string as you stated. > --- Bruce > > On Thu, Dec 11, 2008 at 4:38 PM, Ron Adam > > wrote: > > > > Bruce Leban wrote: > > I think string.split(list) probably won't do what people expect > either. 
Here's what I would expect it to do: > > >>> '1 (123) 456-7890'.split([' ', '(', ')', '-']) > ['1', '', '123', '', '456', '7890'] > > but what you probably want is: > > >>>re.split(r'[ ()-]*', '1 (123) 456-7890') > ['1', '123', '456', '7890'] > > using allows you to do that and avoids ambiguity about what it does. > > --- Bruce > > > Without getting into regular expressions, it's easier to just allow > adjacent char matches to act as one match so the following is true. > > longstring.splitchars(string.whitespace) = longstring.split() > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From stephen at xemacs.org Fri Dec 12 03:32:35 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 12 Dec 2008 11:32:35 +0900 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: <87zlj286nw.fsf@xemacs.org> Ron Adam writes: > These examples was just what a "new" programmer might attempt. I have a > feeling that most new programmers do not attempt regular expressions ie.. > the re module, until sometime after they have learned the basics of python. Adding a str.split_on_any_of() would violate TOOWDTI, though. I think this is best addressed by an xref to re.split in the doc for str.split. From python at rcn.com Fri Dec 12 03:34:51 2008 From: python at rcn.com (Raymond Hettinger) Date: Thu, 11 Dec 2008 18:34:51 -0800 Subject: [Python-ideas] This seems like a wart to me... 
References: <18753.8504.116845.736633@montanaro-dyndns-org.local><18753.16763.980209.628297@montanaro-dyndns-org.local> Message-ID: <26EBB4D460D34D8D88CD70517CCB508C@RaymondLaptop1> From: "Adam Olsen" > str.split() handles the simplest, most common cases. Let's not > clutter it up with a bad[1] impersonation of regex. I concur and am -1 on *any* change to str.split(). It has been around for a very long time and is widely used. If there were any subtle change, even in 3.0, it would create migration problems that are very hard to diagnose and repair. Raymond From carl at carlsensei.com Fri Dec 12 05:46:04 2008 From: carl at carlsensei.com (Carl Johnson) Date: Thu, 11 Dec 2008 18:46:04 -1000 Subject: [Python-ideas] This seems like a wart to me... Message-ID: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> Ron Adam wrote: > These examples was just what a "new" programmer might attempt. I > have a feeling that most new programmers do not attempt regular > expressions ie.. the re module, until sometime after they have > learned the basics of python. Feel what you like, but I assumed that .split meant .splitonchars when I was learning Python in 2007 and was confused when my script didn't work. I was also confused about why it stopped getting rid of empty strings. And I still don't know how to write regexs, so now when I want to split on multiple chars, I end up .replace-ing a bunch first, which I recognize to be terribly inefficient, but the scripts are throwaways, so it's hardly worth the time to learn a whole other language first. -- Carl From carl at carlsensei.com Fri Dec 12 05:58:21 2008 From: carl at carlsensei.com (Carl Johnson) Date: Thu, 11 Dec 2008 18:58:21 -1000 Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> Message-ID: <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> Rereading your message along with your other ones, I see that I misinterpreted it. I thought you meant, "I can't imagine a new programmer wanting something as regex-like as string.splitonchars," but what you meant was "I can't imagine new programmers wanting to go into the re module to learn how to do something like string.splitonchars." To which I say: Yes! I heartily agree! :-D Embarrassedly-yours, Carl > Ron Adam wrote: >> These examples was just what a "new" programmer might attempt. I >> have a feeling that most new programmers do not attempt regular >> expressions ie.. the re module, until sometime after they have >> learned the basics of python. > > Feel what you like, but I assumed that .split meant .splitonchars > when I was learning Python in 2007 and was confused when my script > didn't work. I was also confused about why it stopped getting rid of > empty strings. And I still don't know how to write regexs, so now I > when I want to split on multiple chars, I end up .replace-ing a > bunch first, which I recognize to be terribly inefficient, but the > scripts are throwaways, so it's hardly worth the time to learn a > whole other language first. > > -- Carl From rhamph at gmail.com Fri Dec 12 06:07:23 2008 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 11 Dec 2008 22:07:23 -0700 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <87zlj286nw.fsf@xemacs.org> References: <18753.8504.116845.736633@montanaro-dyndns-org.local> <18753.16763.980209.628297@montanaro-dyndns-org.local> <87zlj286nw.fsf@xemacs.org> Message-ID: On Thu, Dec 11, 2008 at 7:32 PM, Stephen J. Turnbull wrote: > Ron Adam writes: > > > These examples was just what a "new" programmer might attempt. 
I have a > > feeling that most new programmers do not attempt regular expressions ie.. > > the re module, until sometime after they have learned the basics of python. > > Adding a str.split_on_any_of() would violate TOOWDTI, though. > > I think this is best addressed by an xref to re.split in the doc for > str.split. +1 -- Adam Olsen, aka Rhamphoryncus

From turnbull at sk.tsukuba.ac.jp Fri Dec 12 08:14:23 2008 From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 12 Dec 2008 16:14:23 +0900 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> Message-ID: <87r64d986o.fsf@xemacs.org>

Carl Johnson writes:

> what you meant was "I can't imagine new programmers wanting to go
> into the re module to learn how to do something like
> string.splitonchars." To which I say: Yes! I heartily agree! :-D

I don't understand this point of view at all. True, regexps are a complex subject, with an unfortunately large number of dialects. Is it the confusion of dialects problem, or do you really never use regexps in any language?

Anyway, for this purpose you only have to learn one idiom, that

    longstring.splitonchars(["x", "y", "z"])

is spelled

    import re
    re.split("[xyz]", longstring)

In fact, I personally would like to deprecate the with-argument implementation of string.split(), and have

    def split(self, delimiter=None):
        if delimiter is None:
            # no argument: keep the usual whitespace splitting
            return self.usual_magic_splitting()
        else:
            # any argument is treated as a regular expression
            import re
            return re.split(delimiter, self)

(of course, that's because that's precisely the way split-string works in Emacs).

Then the idiom would be

    longstring.split("[xyz]")

Would that work for you?

From carl at carlsensei.com Fri Dec 12 08:51:23 2008 From: carl at carlsensei.com (Carl Johnson) Date: Thu, 11 Dec 2008 21:51:23 -1000 Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87r64d986o.fsf@xemacs.org> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> Message-ID: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> Stephen J. Turnbull wrote: > I don't understand this point of view at all. True, regexps are a > complex subject, with an unfortunately large number of dialects. Is > it the confusion of dialects problem, or do you really never use > regexps in any language? I have half-heartedly tried to learn regexps before, but always given up after reading about the basics. Obviously, this would be shameless behavior for a professional programmer, but I'm just a dilettante, and the famed saying of Jamie Zawinski ("Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems.") is not highly motivating. :-D > Anyway, for this purpose you only have to learn one idiom, that > > longstring.splitonchars (["x", "y", "z"]) > > is spelled > > import re > re.split ("[xyz]", longstring) > > In fact, I personally would like to deprecate the with-argument > implementation of string.split(), and have > > def split (self, delimiter = None): > if delimiters is None: > return self.usual_magic_splitting () > else: > import re > return re.split (delimiter, self) > > (of course, that's because that's precisely the way split-string works > in Emacs). > > Then the idiom would be > > longstring.split ("[xyz]") > > Would that work for you? Wouldn't that subtly break the code of everyone who has written something like: lines = bigtext.splitlines() delimiter = lines[0] del lines[0] splitlines = [line.split(delimiter) for line in lines] ? Since suddenly if your delimiter uses one of the reserved regexp characters, such as brackets and parentheses, the code would stop working. (That's one of the things I dislike about regexps -- too many magical characters.) 
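A small sketch of the concern just raised, assuming Python 3; the sample line and delimiter are made up for illustration. If a delimiter string were passed to re.split unchanged, a delimiter that happens to be a regexp metacharacter (here ".") would stop splitting the way str.split does, whereas wrapping it in re.escape keeps the literal meaning.

```python
import re

line = "a.b.c"      # hypothetical sample data
delimiter = "."     # a literal delimiter that is also a regexp metacharacter

# Plain string splitting: "." is just a character.
literal_parts = line.split(delimiter)

# Treated as a regexp, "." means "any character", so every character is a delimiter.
regex_parts = re.split(delimiter, line)

# re.escape neutralises the metacharacter and restores the literal behaviour.
escaped_parts = re.split(re.escape(delimiter), line)

print(literal_parts)
print(regex_parts)
print(escaped_parts)
```

The same re.escape wrapping would be needed for delimiters containing brackets or parentheses, which is the backward-compatibility hazard described above.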
Here's a backward compatible idea instead:

    # Sequence here is the abstract base class from the collections module
    def split(self, delimiter=None):
        if delimiter is None:
            return self.usual_magic_splitting()
        elif isinstance(delimiter, str):
            return self.usual_delimiter_based_splitting()
        elif isinstance(delimiter, Sequence):
            return self.treat_delimiters_given_by_sequence_as_interchangeable()
        else:
            raise TypeError("coercing to Unicode: need string or buffer or Sequence, "
                            + repr(type(delimiter)) + " found")

Since right now passing a list or tuple raises a TypeError, this would be backwards compatible. The idiom for doing re.split-like things would then be bigtext.split(list(" ;.,-!?")).

It might even be a good idea to add a keyword (only?) argument called "dropempty" to recreate the magical behavior of passing None as the delimiter where empty strings are dropped. That would also solve skip's original problem: just set it to text.split(None, dropempty=False).

-- Carl

From stephen at xemacs.org Fri Dec 12 09:43:22 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 12 Dec 2008 17:43:22 +0900 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> Message-ID: <87prjx942d.fsf@xemacs.org>

Carl Johnson writes:

> the famed saying of Jamie Zawinski ("Some people, when confronted with
> a problem, think 'I know, I'll use regular expressions.' Now they
> have two problems.") is not highly motivating. :-D

Jamie was talking about the "to a man with a hammer, all problems look like thumbs" phenomenon. I've never heard anybody complain that shell globs are complex. But regexps will take you a lot farther with just character classes [] (which most modern shells implement), the wildcard character . (usually ?
in shells), and the repetition operators * and/or + (available only as a variable-length wildcard * in shell globs). > > In fact, I personally would like to deprecate the with-argument > > implementation of string.split(), .... > > > > Would that work for you? > > Wouldn't that subtly break the code of everyone who has written > something like: Indeed it would. That was not a serious proposal. At this point, I'm trying to understand the resistance to regexps, not propose an improvement for .split(). From rhamph at gmail.com Fri Dec 12 10:23:46 2008 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 12 Dec 2008 02:23:46 -0700 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <87prjx942d.fsf@xemacs.org> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> <87prjx942d.fsf@xemacs.org> Message-ID: On Fri, Dec 12, 2008 at 1:43 AM, Stephen J. Turnbull wrote: > Carl Johnson writes: > > > the famed saying of Jamie Zawinski ("Some people, when confronted with > > a problem, think 'I know, I'll use regular expressions.' Now they > > have two problems.") is not highly motivating. :-D > > Jamie was talking about the "to a man with a hammer, all problems look > like thumbs" phenomenon. I've never heard anybody complain that shell > globs are complex. But regexps will take you a lot farther with just > character classes [] (which most modern shells implement), the > wildcard character . (usually ? in shells), and the repetition > operators * and/or + (available only as a variable-length wildcard * > in shell globs). > > > > In fact, I personally would like to deprecate the with-argument > > > implementation of string.split(), .... > > > > > > Would that work for you? > > > > Wouldn't that subtly break the code of everyone who has written > > something like: > > Indeed it would. That was not a serious proposal.
At this point, I'm > trying to understand the resistance to regexps, not propose an > improvement for .split(). I'd say the lack of diagnostics when they "fail" is the biggest issue. I could easily spend half an hour trying random permutations of a pattern before I figure out why the original didn't work... and I've had a moderate amount of experience. -- Adam Olsen, aka Rhamphoryncus From python at rcn.com Fri Dec 12 10:35:33 2008 From: python at rcn.com (Raymond Hettinger) Date: Fri, 12 Dec 2008 01:35:33 -0800 Subject: [Python-ideas] This seems like a wart to me... References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> Message-ID: <2F18D71A714746E4BEC8B3050ADEE1D4@RaymondLaptop1> From: "Carl Johnson" > And I still don't know how to write regexs, ... Maybe you should learn some of the fundamental tools provided by the language before you get in the business of demanding that the language be changed. Regexes occur in other languages and some command-line tools. Taking a little time to learn them will provide you with a lifelong skill that will serve you well in a number of contexts. This is doubly true in your case (since you've shown an interest in text processing). Raymond From skip at pobox.com Fri Dec 12 16:14:34 2008 From: skip at pobox.com (skip at pobox.com) Date: Fri, 12 Dec 2008 09:14:34 -0600 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <87r64d986o.fsf@xemacs.org> References: <87r64d986o.fsf@xemacs.org> Message-ID: <18754.32730.304523.362137@montanaro-dyndns-org.local> >> what you meant was "I can't imagine new programmers wanting to go >> into the re module to learn how to do something like >> string.splitonchars." To which I say: Yes! I heartily agree! :-D Steve> I don't understand this point of view at all. True, regexps are Steve> a complex subject, with an unfortunately large number of Steve> dialects. Is it the confusion of dialects problem, or do you Steve> really never use regexps in any language?
Getting more than a little bit off the original topic, but... I think a person's affinity for regular expressions has a lot to do with their editing & programming environments. I work with some very experienced programmers (C++ & Python mostly, not much Perl, and generally very basic Emacs usage) who never (or almost never) use regular expressions. * C/C++: My impression was always that the C regex(3) API presented a lot of barriers to casual use. Maybe that's changed over time. * Python: You can go a long way without using regular expressions in Python because it has other easy-to-use string searching stuff (str.find, etc) as well as shell-style globbing for file name matching. * Emacs: I think part of the reason that I find re's so easy-to-use is that I've been using some dialect of Emacs for about 20 years and it exposes re's in a way that is real easy to experiment with: incremental search. i-search+re's - what a fabulous combination. * Perl: I suspect Perl mongers are as adept at re's as Emacs types because that's the primary (only?) way to search for patterns in strings. * vi: Probably somewhere between Perl and Emacs. vim does support incremental search but it's not the default. Are there other editors besides Emacs and vi for which regular expressions are so common? Bringing this back on-topic, I can see that I'm going to lose this argument. I still view "".split(':') as a wart. I guess I'll have to live with it though. Skip From skip at pobox.com Fri Dec 12 16:18:31 2008 From: skip at pobox.com (skip at pobox.com) Date: Fri, 12 Dec 2008 09:18:31 -0600 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> References: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> Message-ID: <18754.32967.192347.822469@montanaro-dyndns-org.local> Carl> I have half-heartedly tried to learn regexps before, but always Carl> given up after reading about the basics. 
Just out of curiosity, what editor do you use? Reading and doing are two different things. Carl> ... the famed saying of Jamie Zawinski ("Some people, when Carl> confronted with a problem, think 'I know, I'll use regular Carl> expressions.' Now they have two problems.") is not highly Carl> motivating. :-D Sure, but that addresses the topic of some peoples' desire to use regular expressions to parse everything from LL1 grammars to the tea leaves in the bottom of a cup. If you use them in an environment where there is almost no penalty for mistakes (incremental search) I think you will quickly gain an understanding of the syntax. Then your challenge will be not to fall into Jamie's re tar pit. ;-) Skip From stephen at xemacs.org Sat Dec 13 13:44:36 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 13 Dec 2008 21:44:36 +0900 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> <87prjx942d.fsf@xemacs.org> Message-ID: <87myf08csr.fsf@xemacs.org> Adam Olsen writes: > I could easily spend half an hour trying random permutations of a > pattern before I figure out why the original didn't work... and I've > had a moderate amount of experience. It takes a moderate amount of experience to get that far, though. In particular, in this case, all you need to understand is "[abc]" matches any of the characters "a", "b", or "c", and *that* is familiar to anybody who has used a decent shell (any Unix shell, and I believe 4DOS and friends provided it too but I haven't used them for 20 years). So I don't think that lack of diagnostics explains widespread reluctance to even substitute ".*" for "*", but instead propose something as ugly as .split(list("abc")). From stephen at xemacs.org Sat Dec 13 16:48:00 2008 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Sun, 14 Dec 2008 00:48:00 +0900 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <20081213085458.5844e108@bhuda.mired.org> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> <87prjx942d.fsf@xemacs.org> <87myf08csr.fsf@xemacs.org> <20081213085458.5844e108@bhuda.mired.org> Message-ID: <87iqpo84b3.fsf@xemacs.org> Mike Meyer writes: > Except that if you're doing anything *interesting* - like splitting on > punctuation, which is far more common than splitting on alphanumerics > - then it's not nearly that simple. The equivalent of > .splitset("^()-[]") is *much* more complicated than just "[^()-[]]" Yeah, it's "[][)(^-]". So much for complexity of the needed regexp (see below for difficulty of composition). > > So I don't think that lack of diagnostics explains widespread reluctance > > to even substitute ".*" for "*", but instead propose something as ugly > > as .split(list("abc")). > > It isn't the lack of diagnostics, it's the write-once nature of re's. They're hardly write-once in this context. The above regexp is hard to write, agreed, because you have to remember to move the close bracket to the start, the hyphen to the end, and the caret away from the start. (Note that there is never a need to put the close bracket and hyphen in other positions, so this is not a particularly hard rule to remember IMO YMMV.) However, precisely because of the oddity of the positions of the close bracket and hyphen it's easy enough to read once you've learned to write it. As far as I can see, that is the hardest regexp that most people will ever want to write for re.split(). Again, I just don't see that (limited) use of regular expressions makes programs harder to read or write than proliferating special case functions that provide nowhere near the power of a single regular expression-based function.
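As a sketch of the character-class rule described above (Python 3 assumed, and the sample strings are made up for illustration): the hand-written class "[][)(^-]" puts the close bracket first, the hyphen last, and the caret away from the start. For anyone who would rather not remember those positions, building the class with re.escape is one possible alternative. A trailing "+" collapses runs of delimiters, as in the phone-number example earlier in the thread.

```python
import re

text = "a(b)c[d]e^f-g"   # hypothetical input with one delimiter between letters

# Hand-written class: "]" first, "-" last, "^" not at the start.
hand_written = re.compile("[][)(^-]")

# Alternative: let re.escape quote each character, then wrap in brackets.
generated = re.compile("[" + re.escape("^()-[]") + "]")

parts = hand_written.split(text)
same_parts = generated.split(text)

# "+" treats a run of adjacent delimiters as a single separator.
phone = re.split("[ ()-]+", "1 (123) 456-7890")

print(parts)
print(same_parts)
print(phone)
```

One caveat worth knowing: since Python 3.7, re.split also splits on empty matches, so a pattern ending in "*" rather than "+" behaves differently there than it did when this thread was written.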
From rhamph at gmail.com Sat Dec 13 19:03:48 2008 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 13 Dec 2008 11:03:48 -0700 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <87iqpo84b3.fsf@xemacs.org> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> <87prjx942d.fsf@xemacs.org> <87myf08csr.fsf@xemacs.org> <20081213085458.5844e108@bhuda.mired.org> <87iqpo84b3.fsf@xemacs.org> Message-ID: On Sat, Dec 13, 2008 at 8:48 AM, Stephen J. Turnbull wrote: > Again, I just don't see that (limited) use of regular expressions > makes programs harder to read or write than proliferating special case > functions that provide nowhere near the power of a single regular > expression-based function. A lot of it is evil by association -- we're taught that regex is evil and should never be used -- only later do we figure out that regex is often the best tool regardless of being evil. In this case the docs are pretty overwhelming for the minor task. A simple "this is all you need for this task" tutorial might help. -- Adam Olsen, aka Rhamphoryncus From ggpolo at gmail.com Sat Dec 13 19:58:52 2008 From: ggpolo at gmail.com (Guilherme Polo) Date: Sat, 13 Dec 2008 16:58:52 -0200 Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else Message-ID: Hi there, Probably many of you have seen/written/used a lot of recipes for flattening lists/tuples, several of them are similar diverging just a bit, some use more obscure code than others, some are not fast enough, but I have noticed all those have one thing in common: none of them mention _tkinter._flatten, not even in comments (if the site allows that). Apparently _tkinter._flatten is unknown, and it being marked private doesn't help, and it also lives under _tkinter but doesn't depend on anything from tcl/tk.
This _flatten has the advantage of being faster than the alternatives I have seen coded in Python, since it is done in C and its code is simple and it doesn't try to be too smart. It is also already part of Python, it is just unknown to most apparently. So, I would like to know what you think about moving _tkinter._flatten to some other module and renaming it to "flatten"? -- -- Guilherme H. Polo Goncalves From greg at krypto.org Sat Dec 13 20:22:51 2008 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 13 Dec 2008 11:22:51 -0800 Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else In-Reply-To: References: Message-ID: <52dc1c820812131122u62b9e44fxb91ab7c46c292edc@mail.gmail.com> On Sat, Dec 13, 2008 at 10:58 AM, Guilherme Polo wrote: > Hi there, > > Probably many of you have seen/written/used a lot of recipes for > flattening lists/tuples, several of them are similar diverging just a > bit, some use more obscure code than others, some are not fast > enough, but I have noticed all those have one thing in common: none of > them mention _tkinter._flatten, not even in comments (if the site > allows that). > > Apparently _tkinter._flatten is unknown, and it being marked private > doesn't help, and it also lives under _tkinter but doesn't depend on > anything from tcl/tk. This _flatten has the advantage of being faster > than the alternatives I have seen coded in Python, since it is done in > C and its code is simple and it doesn't try to be too smart. It is > also already part of Python, it is just unknown to most apparently. > > So, I would like to know what you think about moving > _tkinter._flatten to some other module and renaming it to "flatten"? > Per the irc discussion... If this is to be made a public API somewhere it should be modernized to support the iterator protocol at which point it could find a home in itertools. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From rhamph at gmail.com Sat Dec 13 20:25:04 2008 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 13 Dec 2008 12:25:04 -0700 Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else In-Reply-To: References: Message-ID: On Sat, Dec 13, 2008 at 11:58 AM, Guilherme Polo wrote: > Hi there, > > Probably many of you have seen/written/used a lot of recipes for > flattening lists/tuples, several of them are similar diverging just a > bit, some uses more obscure code than others, some are not fast > enough, but I have noticed all those have one thing in common: none of > them mention _tkinter._flatten, not even in comments (if the site > allows that). > > Apparently _tkinter._flatten is unknown, and it being marked private > doesn't help, and it also lives under _tkinter but doesn't depend on > anything from tcl/tk. This _flatten has the advantage of being faster > then the alternatives I have seen coded in Python, since it is done in > C and its code is simple and it doesn't try to be too smart. It is > also already part of Python, it is just unknown to most apparently. > > So, I would like to know what do you think about moving > _tkinter._flatten to some other module and renaming it to "flatten" ? The problem is people often have a datastructure like this: x = [['foo'], ['bar', 'baz'], ['quux']] Although _flatten seems to work, it's actually overkill. It treats the input as a tree (using type checks to differentiate leaf from branch), rather than as a mere nested list. The simplest solution is itertools.chain(*x). If we did want to generalize flatten it should be made into iterator form and take an arbitrary predicate function to distinguish leaf vs branch. The default should either be depth based (subsuming chain, probably good in the long run) or use an appropriate ABC. I prefer depth based. 
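The alternatives mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python 3 syntax (yield from), and the names flatten, flatten_if, depth and is_branch are hypothetical, not an existing standard-library API and not the _tkinter._flatten implementation.

```python
from itertools import chain


def flatten(iterable, depth=1):
    """Depth-based sketch: yield items with `depth` levels of nesting removed."""
    if depth <= 0:
        # Nothing left to unwrap: pass the items through unchanged.
        yield from iterable
        return
    for item in iterable:
        yield from flatten(item, depth - 1)


def flatten_if(iterable, is_branch):
    """Predicate-based sketch: descend only into items that `is_branch` accepts."""
    for item in iterable:
        if is_branch(item):
            yield from flatten_if(item, is_branch)
        else:
            yield item


x = [['foo'], ['bar', 'baz'], ['quux']]

chained = list(chain(*x))        # the simple one-level case from the message above
by_depth = list(flatten(x))      # depth=1 gives the same result as chain(*x)
by_predicate = list(
    flatten_if([1, [2, (3, [4])], 'ab'],
               lambda value: isinstance(value, (list, tuple)))
)

print(chained)
print(by_depth)
print(by_predicate)
```

With the depth-based default no type checks are needed, which is why strings such as 'foo' stay intact at depth 1; the predicate form is the tree-style behaviour, with the leaf/branch decision made explicit by the caller.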
-- Adam Olsen, aka Rhamphoryncus From carl at carlsensei.com Sat Dec 13 21:05:49 2008 From: carl at carlsensei.com (Carl Johnson) Date: Sat, 13 Dec 2008 10:05:49 -1000 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: <87myf08csr.fsf@xemacs.org> References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> <87prjx942d.fsf@xemacs.org> <87myf08csr.fsf@xemacs.org> Message-ID: I think this discussion is drifting from the point. We all agree that regexps are great and powerful and no professional programmer should fail to learn them. But at the same time, it's worth noting that they are a different language from Python proper, and it's very easy to get weird results without knowing why. Anyway, apparently the proposal to allow splitting on a list is dead. What do people think of the proposal to add a dropitem keyword to allow the dropping (or retaining) of empty results? -- Carl From bruce at leapyear.org Sat Dec 13 21:14:07 2008 From: bruce at leapyear.org (Bruce Leban) Date: Sat, 13 Dec 2008 12:14:07 -0800 Subject: [Python-ideas] This seems like a wart to me... In-Reply-To: References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com> <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com> <87r64d986o.fsf@xemacs.org> <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com> <87prjx942d.fsf@xemacs.org> <87myf08csr.fsf@xemacs.org> Message-ID: [i for i in s.split(x) if i] is simple enough if I don't know how to write "(" + re.escape(x) + ")+". I would like to be able to drop "i for" in cases like this and just write [i in s.split(x) if i]. --- Bruce On Sat, Dec 13, 2008 at 12:05 PM, Carl Johnson wrote: > I think this discussion is drifting from the point. We all agree that > regexps are great and powerful and no professional programmer should fail to > learn them. 
> But at the same time, it's worth noting that they are a
> different language from Python proper, and it's very easy to get weird
> results without knowing why.
>
> Anyway, apparently the proposal to allow splitting on a list is dead. What
> do people think of the proposal to add a dropitem keyword to allow the
> dropping (or retaining) of empty results?
>
> -- Carl
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From carl at carlsensei.com Sat Dec 13 21:30:17 2008
From: carl at carlsensei.com (Carl Johnson)
Date: Sat, 13 Dec 2008 10:30:17 -1000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
 <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
 <87r64d986o.fsf@xemacs.org>
 <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
 <87prjx942d.fsf@xemacs.org>
 <87myf08csr.fsf@xemacs.org>
Message-ID:

Bruce Leban wrote:

> [i for i in s.split(x) if i] is simple enough if I don't know how to
> write "(" + re.escape(x) + ")+".

The point of the dropempty keyword would be less the dropempty=True
case as the s.split(None, dropempty=False) case, which would otherwise
require a regexp.

-- Carl

From stephen at xemacs.org Sun Dec 14 01:24:33 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 14 Dec 2008 09:24:33 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
 <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
 <87r64d986o.fsf@xemacs.org>
 <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
 <87prjx942d.fsf@xemacs.org>
 <87myf08csr.fsf@xemacs.org>
Message-ID: <87hc578uym.fsf@xemacs.org>

Carl Johnson writes:
 > Bruce Leban wrote:
 >
 > > [i for i in s.split(x) if i] is simple enough if I don't know how to
 > > write "(" + re.escape(x) + ")+".
 >
 > The point of the dropempty keyword would be less the dropempty=True
 > case as the s.split(None, dropempty=False) case, which would otherwise
 > require a regexp.

-0. Eliminating str.split()'s implementation in favor of using
str.split() in the no argument case and re.split when an argument is
present is backward incompatible, so I can't really object although I
prefer a fix by documenting re.split() in appropriate places.

I do file a technical objection and ask the judge to strike the
wording "require a regexp" from the transcript as prejudicial to the
accused. Preferred phrasing is "would otherwise require an
import of re."

From guido at python.org Sun Dec 14 17:22:40 2008
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Dec 2008 08:22:40 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87hc578uym.fsf@xemacs.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
 <87r64d986o.fsf@xemacs.org>
 <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
 <87prjx942d.fsf@xemacs.org>
 <87myf08csr.fsf@xemacs.org>
 <87hc578uym.fsf@xemacs.org>
Message-ID:

Whoa. I haven't wasted much time trying to follow this (IMO rather
silly) argument about consistency. We're not going to introduce
backwards incompatibilities or deprecate existing usage of str.split()
with or without arguments are we? A dropempty argument also seems
excessive -- we can't possibly add ad-hoc filtering options to every
function that returns a list or iterator, that would be madness.
As long as the discussion is just about giving regexps a bad name I
don't really care enough to comment; but I have to draw the line when
actual API changes are being considered seriously.

--Guido van Rossum (home page: http://www.python.org/~guido/)

On Sat, Dec 13, 2008 at 4:24 PM, Stephen J. Turnbull wrote:
> Carl Johnson writes:
> > Bruce Leban wrote:
> >
> > > [i for i in s.split(x) if i] is simple enough if I don't know how to
> > > write "(" + re.escape(x) + ")+".
> >
> > The point of the dropempty keyword would be less the dropempty=True
> > case as the s.split(None, dropempty=False) case, which would otherwise
> > require a regexp.
>
> -0. Eliminating str.split()'s implementation in favor of using
> str.split() in the no argument case and re.split when an argument is
> present is backward incompatible, so I can't really object although I
> prefer a fix by documenting re.split() in appropriate places.
>
> I do file a technical objection and ask the judge to strike the
> wording "require a regexp" from the transcript as prejudicial to the
> accused. Preferred phrasing is "would otherwise require an
> import of re."
>

From skip at pobox.com Mon Dec 15 15:11:58 2008
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 15 Dec 2008 08:11:58 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To:
References:
Message-ID: <18758.26030.201594.303415@montanaro-dyndns-org.local>

    Guido> We're not going to introduce backwards incompatibilities or
    Guido> deprecate existing usage of str.split() with or without arguments
    Guido> are we?

As the OP, I long ago (in terms of number of posts on the topic) gave up on
the thought that the str.split() API might change. Someone suggested this
idiom:

    [elt for elt in s.split(",") if elt]

which I will adopt (with a comment explaining why it's necessary).
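
As a small illustration of why that comment is worth having, the sketch
below contrasts plain str.split(sep) with the filtering idiom. The
helper name split_nonempty is made up for this example; the re.split
line uses a non-capturing variant of the pattern mentioned earlier in
the thread, and is shown only to point out that it still leaves a
trailing empty string.

```python
import re


def split_nonempty(s, sep):
    """Split `s` on `sep` and drop the empty strings (hypothetical helper)."""
    # str.split(sep) keeps an empty string for each pair of adjacent
    # separators and for a leading or trailing separator.
    return [elt for elt in s.split(sep) if elt]


s = "a,,b,"
print(s.split(","))            # ['a', '', 'b', '']
print(split_nonempty(s, ","))  # ['a', 'b']

# Collapsing runs of the separator with a regexp still leaves the
# trailing empty string, so it is not a drop-in replacement.
print(re.split("(?:" + re.escape(",") + ")+", s))  # ['a', 'b', '']
```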

Skip

From grosser.meister.morti at gmx.net Mon Dec 15 19:38:21 2008
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Mon, 15 Dec 2008 19:38:21 +0100
Subject: [Python-ideas] returning anonymous functions
Message-ID: <4946A41D.6020100@gmx.net>

It is a very common case to return a nested function, e.g. in a decorator:

def deco(f):
    def _f(*args,**kwargs):
        do_something()
        try:
            return f(*args,**kwargs)
        finally:
            do_something_different()
    return _f

While I agree that the current way with non-anonymous functions works
perfectly, it looks ugly to a lot of people. Maybe one of these syntax
variants would be an option:

def deco(f):
    return def(*args,**kwargs):
        do_something()
        try:
            return f(*args,**kwargs)
        finally:
            do_something_different()

def deco(f):
    return(*args,**kwargs):
        do_something()
        try:
            return f(*args,**kwargs)
        finally:
            do_something_different()

def deco(f):
    def return(*args,**kwargs):
        do_something()
        try:
            return f(*args,**kwargs)
        finally:
            do_something_different()

Ok, the last one is not serious. This would not be some kind of anonymous
function expression but an extended form of the return statement. Maybe you
could do the same thing for yield? Well, I guess not, because yield can have
a return value and this would be awkward:

x = yield(a,b):
    return a + b

And this would not be an option:

f(yield(a,b):
    return a + b)

But extending return and not extending yield feels wrong.

What do you think? Does anyone have a better idea or is the current way the
only thinkable one for Python (which might very well be the case)?

-panzi

From scott+python-ideas at scottdial.com Mon Dec 15 19:52:20 2008
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Mon, 15 Dec 2008 13:52:20 -0500
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <4946A41D.6020100@gmx.net>
References: <4946A41D.6020100@gmx.net>
Message-ID: <4946A764.5080905@scottdial.com>

Mathias Panzenböck wrote:
> It is a very common case to return a nested function, e.g.
> in a decorator:
>
> def deco(f):
>     def _f(*args,**kwargs):
>         do_something()
>         try:
>             return f(*args,**kwargs)
>         finally:
>             do_something_different()
>     return _f

Is it really that common? In this case, you are misrepresenting the
pattern. The appropriate version of this would require references to _f
to make its signature match that of f's, and therefore this entire
argument is specious. In reality, the number of times you can get away
with returning a truly anonymous function (that isn't a glorified
lambda) is rare, I think.

def deco(f):
    def _f(*args,**kwargs):
        ...
    functools.update_wrapper(_f, f)
    return _f

Or:

def deco(f):
    @functools.wraps(f)
    def _f(*args,**kwargs):
        ...
    return _f

Even in the second case, it would be awkward to inline with the return
statement because of the need to invoke a decorator.

-Scott

--
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From grosser.meister.morti at gmx.net Mon Dec 15 20:12:49 2008
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Mon, 15 Dec 2008 20:12:49 +0100
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <4946A764.5080905@scottdial.com>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
Message-ID: <4946AC31.1040604@gmx.net>

Scott Dial wrote:
>
> Is it really that common? In this case, you are misrepresenting the
> pattern. The appropriate version of this would require references to _f
> to make its signature match that of f's, and therefore this entire
> argument is specious. In reality, the number of times you can get away
> with returning a truly anonymous function (that isn't a glorified
> lambda) is rare, I think.
>
> def deco(f):
>     def _f(*args,**kwargs):
>         ...
>     functools.update_wrapper(_f, f)
>     return _f
>
> Or:
>
> def deco(f):
>     @functools.wraps(f)
>     def _f(*args,**kwargs):
>         ...
>     return _f
>
> Even in the second case, it would be awkward to inline with the return
> statement because of the need to invoke a decorator.
>

I see.

Ok, maybe something even more different: curried functions.

def plus2(f)(x):
    return f(x+2)

@plus2
def foo(x):
    ...

Or I don't know. Just a thought. And yes, you cannot use functools.wraps on
that either. And this is just plain ugly/utter crap:

def plus2(f)
    @functools.wraps(f)
    def (x):
        return f(x+2)


-panzi

From ziade.tarek at gmail.com Tue Dec 16 01:57:33 2008
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Tue, 16 Dec 2008 01:57:33 +0100
Subject: [Python-ideas] Python Isolated Environment (PIE)
Message-ID: <94bdd2610812151657gd38e864pcdb84fb3ce94ac07@mail.gmail.com>

Hello,

I would like to propose a new mechanism in Python, to deal with
package dependencies.

It is described here
http://tarekziade.wordpress.com/2008/12/15/python-isolated-environment-pie/

As PJE mentioned in the blog comments, I still need to describe in
detail how the versions are handled, but does it sound like a good
idea?

Regards,
Tarek

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From grosser.meister.morti at gmx.net Tue Dec 16 09:40:17 2008
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Tue, 16 Dec 2008 09:40:17 +0100
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <4946AC31.1040604@gmx.net>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
 <4946AC31.1040604@gmx.net>
Message-ID: <49476971.3050206@gmx.net>

Mathias Panzenböck wrote:
> Scott Dial wrote:
>> Is it really that common? In this case, you are misrepresenting the
>> pattern. The appropriate version of this would require references to _f
>> to make its signature match that of f's, and therefore this entire
>> argument is specious. In reality, the number of times you can get away
>> with returning a truly anonymous function (that isn't a glorified
>> lambda) is rare, I think.
>>
>> def deco(f):
>>     def _f(*args,**kwargs):
>>         ...
>>     functools.update_wrapper(_f, f)
>>     return _f
>>
>> Or:
>>
>> def deco(f):
>>     @functools.wraps(f)
>>     def _f(*args,**kwargs):
>>         ...
>>     return _f
>>
>> Even in the second case, it would be awkward to inline with the return
>> statement because of the need to invoke a decorator.
>>
>
>
> I see.
>
> Ok, maybe something even more different: curried functions.
>
> def plus2(f)(x):
>     return f(x+2)
>
> @plus2
> def foo(x):
>     ...
>
> Or I don't know. Just a thought. And yes, you cannot use functools.wraps on
> that either. And this is just plain ugly/utter crap:
>
> def plus2(f)
>     @functools.wraps(f)
>     def (x):
>         return f(x+2)
>
>
> -panzi

Thinking of it, we do not need any new syntax:

def curry(f):
    def _f(x):
        def _f2(*args,**kwargs):
            return f(x,*args,**kwargs)
        return _f2
    return _f

@curry
def foo(a,b,c):
    return a+b+c

Maybe we can somehow also use functools.update_wrapper here.

-panzi

From leif.walsh at gmail.com Tue Dec 16 10:13:27 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Tue, 16 Dec 2008 04:13:27 -0500
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <49476971.3050206@gmx.net>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
 <4946AC31.1040604@gmx.net> <49476971.3050206@gmx.net>
Message-ID:

On Tue, Dec 16, 2008 at 3:40 AM, Mathias Panzenböck wrote:
> Thinking of it, we do not need any new syntax:
>
> def curry(f):
>     def _f(x):
>         def _f2(*args,**kwargs):
>             return f(x,*args,**kwargs)
>         return _f2
>     return _f
>
> @curry
> def foo(a,b,c):
>     return a+b+c

This forces you to call foo(1)(2)(3) if you want an answer. How about:

def curry(f):
    def _f(*c_args, **c_kwargs):
        def _f2(*args, **kwargs):
            return f(*c_args, *args, **c_kwargs, **kwargs)
        return _f2
    return _f

@curry
def foo(a, b, c):
    return a + b + c

foo(1, 2)(3)

I think this still prevents us from currying multiple times --- that
is, you curry once, and you get a non-curriable function.
I remember something in the wiki about a decorator in the form of an
object, that accumulated arguments when __call__()ed, and I think that
worked best (and still didn't need new syntax).

--
Cheers,
Leif

From leif.walsh at gmail.com Tue Dec 16 10:14:10 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Tue, 16 Dec 2008 04:14:10 -0500
Subject: [Python-ideas] returning anonymous functions
In-Reply-To:
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
 <4946AC31.1040604@gmx.net> <49476971.3050206@gmx.net>
Message-ID:

On Tue, Dec 16, 2008 at 4:13 AM, Leif Walsh wrote:
> This forces you to call foo(1)(2)(3) if you want an answer. How about:

Sorry, said that wrong. It doesn't let you curry more than one variable.

--
Cheers,
Leif

From greg.ewing at canterbury.ac.nz Wed Dec 17 23:55:35 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Dec 2008 11:55:35 +1300
Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else
In-Reply-To:
References:
Message-ID: <49498367.3090506@canterbury.ac.nz>

Guilherme Polo wrote:

> So, I would like to know what do you think about moving
> _tkinter._flatten to some other module and renaming it to "flatten" ?

In my experience, use cases for flattening lists are very thin on the
ground, and when one does arise, it usually has some
application-specific criterion for when to stop recursing.

--
Greg