From ezio.melotti at gmail.com  Mon Dec  1 02:17:50 2008
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Mon, 01 Dec 2008 03:17:50 +0200
Subject: [Python-ideas] A Wiki-style documentation with an approval
	process
In-Reply-To: <gg9qdr$jbq$1@ger.gmane.org>
References: <4926BC57.8030509@gmail.com>	<gg79k7$k5e$1@ger.gmane.org>	<8763mgl3xz.fsf@xemacs.org>
	<gg9qdr$jbq$1@ger.gmane.org>
Message-ID: <49333B3E.4020409@gmail.com>

Terry Reedy wrote:
> Stephen J. Turnbull wrote:
>
>> C'mon, I bet you've let a typo or two slide because your brain was on
>> fire to finish your latest hack.  Haven't we all?  If the doc you were
>> reading was the wiki and a fix was a mouse click, two keystrokes, and
>> another mouse click away, you might fix it in that situation.
>
> 1. There are very few overt typos left in the docs.
> 2. Why would I use such an inferior version as a wiki version would be?
>
> Now, if someone wrote a Microsoft Help workalike program that also 
> included an 'email corrections' feature, that would be something else.
>
I agree with what Stephen J. Turnbull said. The main point here is 
"/making easy things easy/ and hard things possible".
There are several changes in the doc that don't require a related 
issue in the bug tracker and they would benefit from a wiki-like system 
(typos are just an example). If a change requires a discussion we can 
still use the bug tracker (and possibly edit the page directly at the 
end of the process, without using patches if they are not necessary.)
This system is not intended as a replacement, but just an improvement of 
what we already have.

Terry Reedy wrote:
> I suspect that the doc maintainers would spend as much time rewriting 
> submissions as they do now and more time rejecting suggestions.
Alongside the Defer/Approve/Reject radio buttons that Stephen suggested, we 
can also add an "Open as a new issue" button. This can be used to 
"redirect" to the bug tracker the suggestions that are valid but still 
need some changes.

-- 
Ezio Melotti


From clp at rebertia.com  Thu Dec  4 08:51:55 2008
From: clp at rebertia.com (Chris Rebert)
Date: Wed, 3 Dec 2008 23:51:55 -0800
Subject: [Python-ideas] Decimal literal?
Message-ID: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>

With Python 3.0 being released, and going over its many changes, I was
reminded that decimal numbers (decimal.Decimal) are still relegated to
a library and aren't built-in.

Has there been any thought to adding decimal literals and making
decimal a built-in type? I googled but was unable to locate any
discussion of the exact issue. The closest I could find was a
suggestion about making decimal the default instead of float:
http://mail.python.org/pipermail/python-ideas/2008-May/001565.html
It seems that decimal arithmetic is more intuitively correct than
plain floating point and floating point's main (only?) advantage is
speed, but it seems like premature optimization to favor speed over
correctness by default at the language level.
Obviously, making decimal the default instead of float would be
fraught with backward compatibility problems and thus is not presently
feasible, but at the least for now Python could make it easier to use
decimals and their associated nice arithmetic by having a literal
syntax for them and making them built-in.
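
A minimal sketch of the kind of surprise meant by "intuitively correct",
using only the standard decimal module. The particular values are chosen
purely for illustration:

```python
from decimal import Decimal

# Binary floats cannot represent 1.1 or 2.2 exactly, so the sum is slightly off.
float_sum = 1.1 + 2.2

# Decimal values built from strings keep the digits exactly as written.
decimal_sum = Decimal("1.1") + Decimal("2.2")

print(float_sum == 3.3)                # False
print(decimal_sum == Decimal("3.3"))   # True
```

Note that the Decimal values are constructed from strings; that is the
step a literal syntax would shorten.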

So what do people think of:
1. making decimal.Decimal a built-in type, named "decimal" (or "dec"
if that's too long?)
2. adding a literal syntax for decimals; I'd naively suggest a 'd'
suffix to the float literal syntax (which was suggested in the brief
aforementioned thread)
3. (in Python 4.0/Python 4000) making decimal the default instead of
float, with floats instead requiring a 'f' suffix

Obviously #1 & #2 would be shooting for Python 3.1 or later.

Cheers,
Chris

P.S. Yay for the long-awaited release of Python 3.0! Better than can
be said for Perl 6.

--
Follow the path of the Iguana...
http://rebertia.com


From python at rcn.com  Thu Dec  4 09:00:04 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 00:00:04 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>

From: "Chris Rebert" <clp at rebertia.com>


> With Python 3.0 being released, and going over its many changes, I was
> reminded that decimal numbers (decimal.Decimal) are still relegated to
> a library and aren't built-in.
> 
> Has there been any thought to adding decimal literals and making
> decimal a built-in type?

It's a non-starter until there is a fast, clean C implementation of decimal.
The current module is hundreds of times slower than binary floats.
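
A rough way to measure the gap on a given interpreter is sketched here
with the standard timeit module. The operands and the iteration count are
arbitrary assumptions, and the ratio will vary a great deal between
Python versions and machines, so no particular figure is implied:

```python
import timeit
from decimal import Decimal

ITERATIONS = 100_000  # arbitrary; large enough to give a stable reading

# Time the same addition with binary floats and with Decimal operands.
float_time = timeit.timeit(
    "a + b", globals={"a": 1.1, "b": 2.2}, number=ITERATIONS
)
decimal_time = timeit.timeit(
    "a + b", globals={"a": Decimal("1.1"), "b": Decimal("2.2")}, number=ITERATIONS
)

print(f"float:   {float_time:.4f}s")
print(f"Decimal: {decimal_time:.4f}s")
print(f"ratio:   {decimal_time / float_time:.1f}x")
```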


Raymond


From clp at rebertia.com  Thu Dec  4 09:23:24 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 00:23:24 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
Message-ID: <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>

On Thu, Dec 4, 2008 at 12:00 AM, Raymond Hettinger <python at rcn.com> wrote:
> From: "Chris Rebert" <clp at rebertia.com>
>
>
>> With Python 3.0 being released, and going over its many changes, I was
>> reminded that decimal numbers (decimal.Decimal) are still relegated to
>> a library and aren't built-in.
>>
>> Has there been any thought to adding decimal literals and making
>> decimal a built-in type?
>
> It's a non-starter until there is a fast, clean C implementation of decimal.
> The current module is hundreds of times slower than binary floats.
>

Does performance matter quite *that* critically in most everyday
programs? If people need such ruthless speed, they can use floats and
accept the consequences or use another language entirely (e.g. C, C++,
OCaml) as Python would be too slow even as it currently is. We're
talking about giving people the option to explicitly, in a less
cumbersome way, make that choice of correctness over performance.
If slowing startup time for the interpreter is what worries you, a
'from __future__ import' directive could be required and the timeline
for full built-in-ness pushed back.

Also, by "built-in" I didn't mean to necessarily imply "written in C",
but rather "being present in the builtin namespace and available by
default". That said, there appears to be decNumber
(http://speleotrove.com/decimal/#decNumber), an ANSI C implementation
of the General Decimal Arithmetic spec to which decimal.Decimal
adheres. At least there's a place to start.

Cheers,
Chris
-- 
Follow the path of the Iguana...
http://rebertia.com

>
> Raymond
>


From stephen at xemacs.org  Thu Dec  4 09:52:22 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 04 Dec 2008 17:52:22 +0900
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>
Message-ID: <87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris Rebert writes:

 > Does performance matter quite *that* critically in most everyday
 > programs?

Of course not.  But that's the wrong question.  Python is a
*general-purpose* programming language, not an "everyday application
where performance isn't critical programming language".  There are
plenty of applications that just cry out<wink> for a Python
implementation where it does matter.


From clp at rebertia.com  Thu Dec  4 10:10:40 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 01:10:40 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>
	<87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <47c890dc0812040110w15cb30f3ld23e5bfdd0936a0f@mail.gmail.com>

On Thu, Dec 4, 2008 at 12:52 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Chris Rebert writes:
>
>  > Does performance matter quite *that* critically in most everyday
>  > programs?
>
> Of course not.  But that's the wrong question.  Python is a
> *general-purpose* programming language, not an "everyday application
> where performance isn't critical programming language".  There are
> plenty of applications that just cry out<wink> for a Python
> implementation where it does matter.
>

We're talking about adding a feature, not taking speed away. If
anything, this would increase adoption of Python as people writing
programs that use decimals extensively would be able to use decimals
with greater ease. Speed freaks could still use floats; there's no
change as far as they're concerned. Yes, people who need BOTH decimals
AND maximum speed would still be left out, but let's take this one
step at a time, and in a later step maybe we can fully satisfy such
people.
We wouldn't want the perfect long term (speedy built-in decimals)
getting in the way of the pretty good near term (built-in decimals).

Additionally, your argument can be turned on its head ;-) Consider:
 > Does perfect accuracy matter quite *that* critically in most
everyday programs?
Of course not.  But that's the wrong question.  Python is a
*general-purpose* programming language, not an "everyday application
where accuracy isn't critical programming language".  There are plenty
of applications that just cry out<wink> for a Python implementation
where it does matter.

<grin>

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From rhamph at gmail.com  Thu Dec  4 10:37:11 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 4 Dec 2008 02:37:11 -0700
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>

On Thu, Dec 4, 2008 at 12:51 AM, Chris Rebert <clp at rebertia.com> wrote:
> With Python 3.0 being released, and going over its many changes, I was
> reminded that decimal numbers (decimal.Decimal) are still relegated to
> a library and aren't built-in.
>
> Has there been any thought to adding decimal literals and making
> decimal a built-in type? I googled but was unable to locate any
> discussion of the exact issue. The closest I could find was a
> suggestion about making decimal the default instead of float:
> http://mail.python.org/pipermail/python-ideas/2008-May/001565.html
> It seems that decimal arithmetic is more intuitively correct than
> plain floating point and floating point's main (only?) advantage is
> speed, but it seems like premature optimization to favor speed over
> correctness by default at the language level.

Intuitively, you'd think it's more correct, but for non-trivial usage
I see no reason for it to be.  The strongest arguments on [1] seem to
be controllable precision and stricter standards.  Controllable
precision works just as well in a library.  Stricter standards (ie
very portable semantics) could be done with base-2 floats via software
emulating on all platforms (and throwing performance out the window).

Do you have some use cases that are (completely!) correct in decimal,
and not in base-2 floating point?  Something not trivial (emulating a
schoolbook, writing a calculator, etc.)

I see Decimal as a modest investment for a mild return.  Not worth the
effort to switch.


-- 
Adam Olsen, aka Rhamphoryncus


From python at rcn.com  Thu Dec  4 10:51:08 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 01:51:08 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>

If decimals are to become built-in, there are a number of things that need to happen and one of them includes a C implementation, 
not just for speed, but also to integrate with the parser and the rest of the language.

Last time I looked, the existing C implementations out there were license compatible with Python. Also, there are other integration 
issues to be solved, including that of contexts (which are an integral part of the spec).  None of this is a trivial exercise or I 
would have already done it.   I do want to move decimal towards being a builtin but don't underestimate the difficulty of doing so.
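
To illustrate why contexts complicate integration, here is a small sketch
using the existing decimal module. The precisions chosen are arbitrary;
the point is only that the same expression gives different results
depending on the context in force:

```python
from decimal import Decimal, getcontext, localcontext

# The thread-wide context controls precision for every arithmetic operation.
getcontext().prec = 28
one_third_default = Decimal(1) / Decimal(3)

# A local context temporarily overrides it; the same expression now rounds
# to 6 significant digits.
with localcontext() as ctx:
    ctx.prec = 6
    one_third_short = Decimal(1) / Decimal(3)

print(one_third_default)  # 28 significant digits
print(one_third_short)    # 0.333333
```

Any built-in decimal type would presumably have to decide how this
context is found and applied by the language itself.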

Also, there are other API issues.  As it stands, the decimal module is not friendly to newbies and presents challenges even for 
expert users. And don't underestimate the significance of performance -- it is a top reason that people currently avoid the decimal 
module and it is an issue for the language itself (lots of companies avoid Python because of its speed disadvantage).

One other thought: decimal literals are likely not very helpful in real programs. Most apps that have specific numeric 
requirements will have code that manipulates numbers read in from external sources and written back out -- the scripts themselves 
typically contain very few constants (and those are typically integers), so you don't get much help from a decimal literal.
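
A sketch of that typical pattern, with hypothetical in-memory CSV data
standing in for an external source. Every Decimal is built from a string
that was read in, so a literal syntax would not come into play:

```python
import csv
import io
from decimal import Decimal

# Hypothetical input; in a real program this would come from a file or database.
raw = "item,amount\nwidget,19.99\ngadget,5.01\n"

reader = csv.DictReader(io.StringIO(raw))

# Each amount is converted from its text form; the only constant is the zero start value.
total = sum((Decimal(row["amount"]) for row in reader), Decimal("0"))

print(total)  # 25.00
```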


Raymond Hettinger



From cesare.dimauro at a-tono.com  Thu Dec  4 10:45:41 2008
From: cesare.dimauro at a-tono.com (Cesare Di Mauro)
Date: Thu, 04 Dec 2008 10:45:41 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
Message-ID: <op.ulmtefu303jqhe@cesareprova.org>

On 04 December 2008 at 10:37 AM, Adam Olsen <rhamph at gmail.com> wrote:

> Intuitively, you'd think it's more correct, but for non-trivial usage
> I see no reason for it to be.  The strongest arguments on [1] seem to
> be controllable precision and stricter standards.  Controllable
> precision works just as well in a library.  Stricter standards (ie
> very portable semantics) could be done with base-2 floats via software
> emulating on all platforms (and throwing performance out the window).
>
> Do you have some use cases that are (completely!) correct in decimal,
> and not in base-2 floating point?  Something not trivial (emulating a
> schoolbook, writing a calculator, etc.)
>
> I see Decimal as a modest investment for a mild return.  Not worth the
> effort to switch.

But at least it will be more usable to have a short-hand for decimal
declaration:

a = 1234.5678d

is simpler than:

import decimal
a = decimal.Decimal('1234.5678')

or:

from decimal import Decimal
a = Decimal('1234.5678')
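
The quotes in the constructor call matter. As a sketch (assuming a Python
version in which Decimal accepts a float argument directly), passing a
float instead of a string carries the binary rounding error into the
Decimal:

```python
from decimal import Decimal

from_string = Decimal("0.1")   # exactly one tenth
from_float = Decimal(0.1)      # exact value of the nearest binary float

print(from_string)                # 0.1
print(from_float)                 # 0.1000000000000000055511151231257827021181583404541015625
print(from_string == from_float)  # False
```

A 'd' suffix would presumably be read from the source text directly, so
it would behave like the string form.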

Cheers
Cesare


From python at rcn.com  Thu Dec  4 10:56:58 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 01:56:58 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com><aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
Message-ID: <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>

From: "Cesare Di Mauro" <cesare.dimauro at a-tono.com>
> But at least it will be more usable to have a short-hand for decimal
> declaration:
> 
> a = 1234.5678d

How often do you put non-integer constants in real programs?
Don't you find that most real decimal apps start with external
data sources instead of all the data values being hard-coded
in your program?


From clp at rebertia.com  Thu Dec  4 10:54:47 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 01:54:47 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
Message-ID: <47c890dc0812040154h5c0d96ebn4e30b1bf87c95ef5@mail.gmail.com>

On Thu, Dec 4, 2008 at 1:37 AM, Adam Olsen <rhamph at gmail.com> wrote:
> On Thu, Dec 4, 2008 at 12:51 AM, Chris Rebert <clp at rebertia.com> wrote:
>> With Python 3.0 being released, and going over its many changes, I was
>> reminded that decimal numbers (decimal.Decimal) are still relegated to
>> a library and aren't built-in.
>>
>> Has there been any thought to adding decimal literals and making
>> decimal a built-in type? I googled but was unable to locate any
>> discussion of the exact issue. The closest I could find was a
>> suggestion about making decimal the default instead of float:
>> http://mail.python.org/pipermail/python-ideas/2008-May/001565.html
>> It seems that decimal arithmetic is more intuitively correct than
>> plain floating point and floating point's main (only?) advantage is
>> speed, but it seems like premature optimization to favor speed over
>> correctness by default at the language level.
>
> Intuitively, you'd think it's more correct, but for non-trivial usage
> I see no reason for it to be.  The strongest arguments on [1] seem to
> be controllable precision and stricter standards.  Controllable
> precision works just as well in a library.  Stricter standards (ie
> very portable semantics) could be done with base-2 floats via software
> emulating on all platforms (and throwing performance out the window).
>
> Do you have some use cases that are (completely!) correct in decimal,
> and not in base-2 floating point?  Something not trivial (emulating a
> schoolbook, writing a calculator, etc.)

No, not personally, but I assume there must be or the decimal module
would never have been added in the first place. PEP 327 suggests that
accurate financial calculations benefit from decimal.
Someone must have had (a) sufficiently compelling use case(s) to get
the BDFL to say yes. GvR doesn't approve PEPs indiscriminately.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com

>
> I see Decimal as a modest investment for a mild return.  Not worth the
> effort to switch.
>
>
> --
> Adam Olsen, aka Rhamphoryncus
>


From clp at rebertia.com  Thu Dec  4 11:10:33 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 02:10:33 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>
Message-ID: <47c890dc0812040210u1db3f187gcfc61791f04b996c@mail.gmail.com>

On Thu, Dec 4, 2008 at 1:56 AM, Raymond Hettinger <python at rcn.com> wrote:
> From: "Cesare Di Mauro" <cesare.dimauro at a-tono.com>
>>
>> But at least it will be more usable to have a short-hand for decimal
>> declaration:
>>
>> a = 1234.5678d
>
> How often do you put non-integer constants in real programs?
> Don't you find that most real decimal apps start with external
> data sources instead of all the data values being hard-coded
> in your program?

In all fairness, by that same argument we shouldn't have float
literals, yet we do despite that. They're useful in scripts where
things are hardcoded. Later, the scripts grow and we do end up reading
the numbers in from external sources. That doesn't mean the initial
script version wasn't useful. Literals help when writing
proofs-of-concept and rapid prototypes, areas where Python has
historically done well.
Java's designers probably used similar arguments against hard-coding
when deciding not to include collection literals; meanwhile Python
does have such literals and they appear to be much cherished as
language features go. The parallels to the decimal situation are
striking.
Having decimal literals as well would at least keep things consistent.
Sets are less common, yet they now have literals; why not decimals
too?

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com



From cesare.dimauro at a-tono.com  Thu Dec  4 11:19:12 2008
From: cesare.dimauro at a-tono.com (Cesare Di Mauro)
Date: Thu, 04 Dec 2008 11:19:12 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>
Message-ID: <op.ulmuyazm03jqhe@cesareprova.org>

On 04 dec 2008 at 10:56 AM, Raymond Hettinger <python at rcn.com> wrote:

>> But at least it will be more usable to have a short-hand for decimal
>> declaration:
>>
>> a = 1234.5678d
>
> How often do you put non-integer constants in real programs?

Rarely, indeed (except for strings). So why are we allowing
float literals?

> Don't you find that most real decimal apps start with external
> data sources instead of all the data values being hard-coded
> in your program?

The same happens with any kind of application: except for
very common cases (like using integers and strings), constant
definitions are rare.

But when working on financial applications, using decimal numerics
is a very common practice. Even if the implementation is slow, we
prefer exact results over speed: there must be no possibility of
calculations failing when we are manipulating money.
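
A small sketch of the kind of failure meant here: rounding a price to
cents. The amount is hypothetical; ROUND_HALF_UP is one of the rounding
modes the decimal module provides:

```python
from decimal import Decimal, ROUND_HALF_UP

# The float 2.675 is really stored as slightly less than 2.675,
# so rounding to two places goes down instead of up.
rounded_float = round(2.675, 2)

# The Decimal holds exactly 2.675 and rounds as an accountant would expect.
rounded_decimal = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(rounded_float)    # 2.67
print(rounded_decimal)  # 2.68
```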

If you take a look at other languages / IDEs, like Delphi or CBuilder,
there's support for a BCD-like type, but I never appreciated
having to import its library to use it in my applications.

Also keep in mind that being able to define literals for
a set of types can help a lot in generating more optimized
bytecode.
That's because we can do a more aggressive static analysis
(a field where a lot of work can be done to improve the
performance of the language).
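
The difference can be seen by inspecting compiled code objects. This is
only a sketch of the current situation, not of any proposed
implementation: a float literal is stored as a ready-made constant, while
the Decimal form is a name lookup plus a call made at run time:

```python
# Compile both spellings without executing them.
literal_code = compile("x = 1234.5678", "<example>", "exec")
call_code = compile("x = Decimal('1234.5678')", "<example>", "exec")

# The float value itself is stored among the code object's constants.
print(1234.5678 in literal_code.co_consts)   # True

# The Decimal version only stores the string; 'Decimal' must be looked up
# by name and called each time the statement runs.
print("1234.5678" in call_code.co_consts)    # True
print("Decimal" in call_code.co_names)       # True
```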

Cheers
Cesare


From cesare.dimauro at a-tono.com  Thu Dec  4 11:22:12 2008
From: cesare.dimauro at a-tono.com (Cesare Di Mauro)
Date: Thu, 04 Dec 2008 11:22:12 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812040210u1db3f187gcfc61791f04b996c@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>
	<47c890dc0812040210u1db3f187gcfc61791f04b996c@mail.gmail.com>
Message-ID: <op.ulmu3ah903jqhe@cesareprova.org>

On Thu, Dec 4, 2008 at 11:10 AM, Chris Rebert <clp at rebertia.com> wrote:

>> How often do you put non-integer constants in real programs?
>> Don't you find that most real decimal apps start with external
>> data sources instead of all the data values being hard-coded
>> in your program?
>
> In all fairness, by that same argument we shouldn't have float
> literals, yet we do despite that. They're useful in scripts where
> things are hardcoded. Later, the scripts grow and we do end up reading
> the numbers in from external sources. That doesn't mean the initial
> script version wasn't useful. Literals help when writing
> proofs-of-concept and rapid prototypes, areas where Python has
> historically done well.
> Java's designers probably used similar arguments against hard-coding
> when deciding not to include collection literals; meanwhile Python
> does have such literals and they appear to be much cherished as
> language features go. The parallels to the decimal situation are
> striking.
> Having decimal literals as well would at least keep things consistent.
> Sets are less common, yet they now have literals; why not decimals
> too?
>
> Cheers,
> Chris

I absolutely agree. Literals, also, can help improve language speed.

Cheers,
Cesare


From stephen at xemacs.org  Thu Dec  4 11:43:43 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 04 Dec 2008 19:43:43 +0900
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812040110w15cb30f3ld23e5bfdd0936a0f@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<47c890dc0812040023i3b887b28l4cb89db68ae539e5@mail.gmail.com>
	<87y6ywwce1.fsf@uwakimon.sk.tsukuba.ac.jp>
	<47c890dc0812040110w15cb30f3ld23e5bfdd0936a0f@mail.gmail.com>
Message-ID: <87oczsnrts.fsf@xemacs.org>

Chris Rebert writes:

 > We're talking about adding a feature, not taking speed away.

OK, that's reasonable.  But adding features is expensive.  BTW, don't
listen to me, I've never done it.  Listen to Raymond.

 > If anything, this would increase adoption of Python as people
 > writing programs that use decimals extensively would be able to use
 > decimals with greater ease.

Maybe.  I don't see a huge advantage of


over

import Decimal

I also think that most of the (easy) advantage to Decimal will accrue
to people who *never* have to deal with measurement error:
accountants.  But oops! they don't need Decimal per se; they're
perfectly happy with big integers.  People who really *do* need
Decimal are not going to be deterred by 16 characters (counting the
newline<wink>); they're already into real pain.
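
The big-integer approach can be sketched as follows, assuming amounts are
kept as whole cents. The prices are hypothetical:

```python
# Keep money as an integer number of cents; Python ints never lose precision.
unit_price_cents = 1999
quantity = 3

total_cents = unit_price_cents * quantity

# Convert to a display string only at the edges of the program.
dollars, cents = divmod(total_cents, 100)
formatted = f"{dollars}.{cents:02d}"

print(formatted)  # 59.97
```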

 > Additionally, your argument can be turned on its head ;-) Consider:
 > Does perfect accuracy matter quite *that* critically in most
 > everyday programs?  Of course not.  But that's the wrong question.
 > Python is a *general-purpose* programming language, not an
 > "everyday application where accuracy isn't critical programming
 > language".  There are plenty of applications that just cry
 > out<wink> for a Python implementation where it does matter.

I think you've misspelled "precision".<wink>  Improved accuracy cannot
be achieved simply by adding a new number type.
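
One way to read this distinction, as a sketch: raising the precision of
the decimal context gives more digits, but a value such as 1/3 is still
not represented exactly, so the result of a round trip is still not
exactly 1. The precision of 50 is an arbitrary choice:

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 50                      # many more digits than a binary double
    third = Decimal(1) / Decimal(3)    # 0.333... rounded to 50 digits
    round_trip = third * 3             # 0.999... with 50 nines

print(round_trip == Decimal(1))  # False
```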


From python at rcn.com  Thu Dec  4 11:50:29 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 02:50:29 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
Message-ID: <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1>

From: "Raymond Hettinger" 
> Last time I looked, the existing C implementations out there were license compatible with Python.

That should have said "incompatible".


From clp at rebertia.com  Thu Dec  4 12:02:08 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 03:02:08 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
	<92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1>
Message-ID: <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com>

On Thu, Dec 4, 2008 at 2:50 AM, Raymond Hettinger <python at rcn.com> wrote:
> From: "Raymond Hettinger"
>>
>> Last time I looked, the existing C implementations out there were license
>> compatible with Python.
>
> That should have said "incompatible".
>

decNumber is available under the ICU License, which seems to be a
variant of the original BSD license. Depending on exactly how the
acknowledgement clause is interpreted (IANAL), it seems like it might
be compatible. If not, IBM, which has copyright on decNumber, seems to
have a fairly pro-open-source stance historically; perhaps if asked
nicely by the community, they would be willing to relicense decNumber
under the revised BSD license (a very minor change vs. the ICU
License), which would certainly be compatible with Python's licensing
policy.

Or maybe there exists another library that's already compatible.
Perhaps I'll investigate.

But the key here is we should first determine whether people want
decimal to be built-in and have a literal. Once that's established,
then the details as to implementing that should be investigated. But
yes, practicality and feasibility certainly are factors in all this.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From facundobatista at gmail.com  Thu Dec  4 12:33:25 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Thu, 4 Dec 2008 09:33:25 -0200
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
	<92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1>
	<47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com>
Message-ID: <e04bdf310812040333k472a74cfl96d4ef0a4a04eb6@mail.gmail.com>

2008/12/4 Chris Rebert <clp at rebertia.com>:

> Or maybe there exists another library that's already compatible.
> Perhaps I'll investigate.
>
> But the key here is we should first determine whether people want
> decimal to be built-in and have a literal. Once that's established,
> then the details as to implementing that should be investigated. But

I'd put it the other way around.

The best we can do *now* with Decimal, if we want it to be included as
a literal *someday*, is to get it in C.

There're already some first steps in that direction, but *please*
investigate that other path you're suggesting.

Thanks!

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


From aahz at pythoncraft.com  Thu Dec  4 14:35:38 2008
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 4 Dec 2008 05:35:38 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
Message-ID: <20081204133538.GA21462@panix.com>

On Thu, Dec 04, 2008, Raymond Hettinger wrote:
>
> One other thought, decimal literals are likely  not very helpful in real 
> programs. Most apps that have specific numeric requirements, will have 
> code that manipulates numbers read-in from external sources and written 
> back out -- the scripts themselves typically contain very few constants 
> (and those are typically integers), so you don't get much help from a 
> decimal literal.

That's half-true.  Most applications IME that manipulate numbers need to
express zero frequently as initializers.  So yeah, it's easy to just
write things like::

    total = dzero
    balance = dzero

but I think there's definitely some utility from writing::

    total = 0.0d
    balance = 0.0d

How much utility (especially from the readability side) is of course
subject to debate, but please don't ignore it altogether.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan


From lists at cheimes.de  Thu Dec  4 15:31:02 2008
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 04 Dec 2008 15:31:02 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
Message-ID: <gh8pj6$fh6$1@ger.gmane.org>

Raymond Hettinger wrote:
> It's a non-starter until there is a fast, clean C implementation of 
> decimal.
> The current module is hundreds of times slower than binary floats.

If we are ever going to consider Cython for core development, the decimal 
module could be the first module that uses Cython. IMHO it's the perfect 
candidate for a proof of concept.

Christian


From tjreedy at udel.edu  Thu Dec  4 19:04:07 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 04 Dec 2008 13:04:07 -0500
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
Message-ID: <gh962j$i7$1@ger.gmane.org>

Chris Rebert wrote:

> It seems that decimal arithmetic is more intuitively correct than
> plain floating point and floating point's main (only?) advantage is
> speed, but it seems like premature optimization to favor speed over
> correctness by default at the language level.

One could say the same about rational arithmetic, which has also been 
considered and so far rejected for fractional literals.  In fact, 
fractions are more accurate since there is never rounding unless one 
requests it.
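
As a small illustration of that point, here is a sketch using the standard
fractions module. The variable names are only for illustration, and the
comparison with float is an added example, not part of the original proposal:

```python
from fractions import Fraction

# Exact rational arithmetic: nothing is rounded along the way.
third = Fraction(1, 3)
exact_sum = third + third + third        # exactly 1

# Three tenths: exact as a fraction, slightly off as a binary float.
fraction_sum = Fraction(1, 10) * 3       # exactly 3/10
float_sum = 0.1 + 0.1 + 0.1              # not exactly 0.3

# Rounding happens only when asked for, here to a denominator <= 1000.
rounded = Fraction("3.1415926535897932").limit_denominator(1000)
```

The last line is the only place where information is deliberately thrown
away; everything before it stays exact.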

There is an advantage of binary floats that you missed.  One can 
prototype float functions in Python and then translate as necessary for 
real speed to C  and get the same results (using the same compiler on 
the same hardware).  But even prototypes need to run faster than 
molasses. One can also use Python to glue together C (or Fortran) double 
routines without translating the numbers.  The numerical module (now 
numpy) is over a decade old and was, I believe, Python's first killer app.

> Obviously, making decimal the default instead of float would be
> fraught with backward compatibility problems and thus is not presently
> feasible, but at the least for now Python could make it easier to use
> decimals and their associated nice arithmetic by having a literal
> syntax for them and making them built-in.

Ditto for fractions.

> So what do people think of:
> 1. making decimal.Decimal a built-in type, named "decimal" (or "dec"
> if that's too long?)
> 2. adding a literal syntax for decimals; I'd naively suggest a 'd'
> suffix to the float literal syntax (which was suggested in the brief
> aforementioned thread)

I would just as soon do the same for fractions.Fraction, perhaps 1 f/ 2 
or 1///2.  Even with decimal literals, the functions would remain in the 
importable module, just as with math and cmath.

> 3. (in Python 4.0/Python 4000) making decimal the default instead of
> float, with floats instead requiring a 'f' suffix

Decimal is not just a decimal arithmetic module.  It implements and will 
track a particular complex, specialized, possibly changeable standard 
controlled by IBM, which already has a few crazy quirks present for 
commercial rather than technical reasons.  This is fine for an add-on 
class but not, in my opinion, for Python's default fraction arithmetic. 
  If Python's developers did consider replacing floats in that role, I 
would prefer either fractions or a much simplified decimal type designed 
by us for general purpose needs.

Terry Jan Reedy


From clp at rebertia.com  Thu Dec  4 19:18:24 2008
From: clp at rebertia.com (Chris Rebert)
Date: Thu, 4 Dec 2008 10:18:24 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <gh962j$i7$1@ger.gmane.org>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<gh962j$i7$1@ger.gmane.org>
Message-ID: <47c890dc0812041018i5f0ce5ag6a5258052df972b9@mail.gmail.com>

On Thu, Dec 4, 2008 at 10:04 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> Chris Rebert wrote:
<snip>
>> 3. (in Python 4.0/Python 4000) making decimal the default instead of
>> float, with floats instead requiring a 'f' suffix
>
> Decimal is not just a decimal arithmetic module.  It implements and will
> track a particular complex, specialized, possibly changeable standard
> controlled by IBM, which already has a few crazy quirks present for
> commercial rather than technical reasons.  This is fine for an add-on class
> but not, in my opinion, for Python's default fraction arithmetic.  If
> Python's developers did consider replacing floats in that role, I would
> prefer either fractions or a much simplified decimal type designed by us for
> general purpose needs.

I'll just point out that GvR seemed to favor the general idea (along
with a transition mechanism) in the old thread I mentioned in my
original post; otherwise I'd have been much more wary of including #3.
I can't speak to how good the standard is comparatively except that
the Python devs must have chosen it over others or a custom one for
good reason, and at least it's better than plain floats. The PEP
mentions it being almost completely ANSI/IEEE-compliant and that it
has already taken into account the evil corner cases.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com



From rhamph at gmail.com  Thu Dec  4 19:18:49 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 4 Dec 2008 11:18:49 -0700
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <op.ulmuyazm03jqhe@cesareprova.org>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<429565A0F2E34960A00B14E9D15F2D7F@RaymondLaptop1>
	<op.ulmuyazm03jqhe@cesareprova.org>
Message-ID: <aac2c7cb0812041018i3ed45ccej491f3963e475ea81@mail.gmail.com>

On Thu, Dec 4, 2008 at 3:19 AM, Cesare Di Mauro
<cesare.dimauro at a-tono.com> wrote:
> But working with financial applications, using decimal numerics
> is a very common practice. Even if implementation is slow, we
> prefer exact results over speed: there must be no possibility of
> failing calculations when we are manipulating money.

This has always bothered me: the suggestion that decimal *floats* are
suitable for financial calculations, when fixed point is what you
want.  However, I now see some FAQ entries in
http://docs.python.org/library/decimal.html that show how to get fixed
point behaviour out of it.  Including a wrapper around multiply and
divide, for ease of use, heh.
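
A minimal sketch of that fixed-point idiom, assuming a resolution of two
decimal places and round-half-up. The names CENT, mul and div are chosen
here for illustration; they are not part of the decimal module, and the
rounding mode is an assumption rather than any jurisdiction's rule:

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")  # assumed fixed-point resolution: two places

def mul(x, y, fp=CENT):
    """Multiply, then round the product back to the fixed-point resolution."""
    return (x * y).quantize(fp, rounding=ROUND_HALF_UP)

def div(x, y, fp=CENT):
    """Divide, then round the quotient back to the fixed-point resolution."""
    return (x / y).quantize(fp, rounding=ROUND_HALF_UP)

price = Decimal("19.99")
tax = mul(price, Decimal("0.0825"))   # 1.649175 rounds to 1.65
share = div(price, Decimal(3))        # 6.6633... rounds to 6.66
```

Addition and subtraction of values that already have two places keep two
places, so only multiply and divide need the wrapper.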

Regardless, although financial use cases are important, their
behaviour is not universal.  The next country over or a few years down
the road may have different rules, different proportions, etc.  Not
something we want to hardcode.


-- 
Adam Olsen, aka Rhamphoryncus


From python at rcn.com  Thu Dec  4 19:55:47 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 10:55:47 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com><204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<gh8pj6$fh6$1@ger.gmane.org>
Message-ID: <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>

From: "Aahz"
> That's half-true.  Most applications IME that manipulate numbers need to
> express zero frequently as initializers. 

No doubt that's true.  I was just pointing out that much of the utility
of the decimal module is independent of whether literals are built into
the parser.  I also noted that it is a non-trivial exercise to get decimals
fully integrated into the language.  I would like to see both things
happen but it won't be easy.

FWIW, when I write decimal code, I use a brief-form for the constructor:

   from decimal import Decimal as D
   . . .
   balance = D(0)
 

From: "Christian Heimes" <lists at cheimes.de>
> If we are ever going to consider Cython for core development, the decimal 
> module could be the first module that uses Cython. IMHO it's the perfect 
> candidate for a proof of concept.

Certainly, Cython would be helpful.  That being said, the decimal module
is likely a poor candidate to show off Cython's capabilities.  The current
code is not set up in a way that translates well.  Much better speed-ups
could be had from Cython if the module were rewritten to use alternate
data structures for decimal numbers and for contexts and to let temporary
numbers (accumulators) be mutated in-place.


From: "Facundo Batista":
> The best we can do *now* with Decimal, if we want it to be included as
> a literal *somewhen*, is to get it in C.

Well said.

From: "Facundo Batista":
> There're already some first steps in that direction, but *please*
> investigate that other path you're suggesting.

IMO, those efforts have been somewhat misdirected.  They were
going down the path of direct translation.  Instead, there needs to
be a pure implementation of the spec, using better data structures
and then separately adding python wrappers.  The first component
needs to have its own efficient context objects and fast, temporary
accumulators.  The latter should match the current API.


Raymond


From facundobatista at gmail.com  Thu Dec  4 21:07:19 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Thu, 4 Dec 2008 18:07:19 -0200
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<gh8pj6$fh6$1@ger.gmane.org>
	<83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
Message-ID: <e04bdf310812041207ub4865dfibda49e7d8016f4fc@mail.gmail.com>

2008/12/4 Raymond Hettinger <python at rcn.com>:

>> There're already some first steps in that direction, but *please*
>> investigate that other path you're suggesting.
>
> IMO, those efforts have been somewhat misdirected.  They were
> going down the path of direct translation.  Instead, there needs to
> be a pure implementation of the spec, using better data structures
> and then separately adding python wrappers.  The first component
> needs to have its own efficient context objects and fast, temporary
> accumulators.  The latter should match the current API.

I actually was talking about the issue 2486, which is the first step
to "slowly, but steadily, replace parts of Decimal from Python to C as
needed."

I should have been more explicit, sorry for the confusion.

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


From leif.walsh at gmail.com  Thu Dec  4 21:18:03 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Thu, 4 Dec 2008 15:18:03 -0500
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<gh8pj6$fh6$1@ger.gmane.org>
	<83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
Message-ID: <cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com>

Coming in to the thread _way_ late, here's my $0.015:

Sure, it would be great to have an accurate and fast implementation of
decimal/floating point numbers active by default in the language.  We
don't have that yet.  We have a fast implementation, and we have an
accurate one, and until we have both, there is a decision to be made:
which one is easy to use (in builtins, has literals, (etc.?)), and
which one is the "opt-in" implementation (needs a module import, needs
a constructor)?

We've been dealing with roughly the same fast and sometimes-inaccurate
floating-point implementation for what, almost 40 years of C
programming so far.  Given that there exist accurate implementations
of decimal numbers (GMP, MAPM), why hasn't C moved to make one of
these the "default" implementation?

Whatever the answer, it seems to me that this sets a sort of precedent
in programming that fast floating-point numbers are favored over
accurate floating-point numbers.  GMP is blindingly fast, and it isn't
C's default.  Decimal is, I think I saw someone mention "hundreds of
times slower" than the current float implementation.

I think, until the decimal implementation approaches something like
GMP's speed, there really isn't much point in even considering making
it a default.

Now, to the question of a 'decimal literal':  Including support for
something like '1.1d' requires that we include the decimal module in
builtins.  Now, I don't know that there's no way around this, but it
seems like a slowdown for everyone just to let a few people type a bit
less.  -1

-- 
Cheers,
Leif


From rhamph at gmail.com  Thu Dec  4 22:18:32 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 4 Dec 2008 14:18:32 -0700
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<gh8pj6$fh6$1@ger.gmane.org>
	<83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
	<cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com>
Message-ID: <aac2c7cb0812041318t37c54c63oe19013f8d2c4b306@mail.gmail.com>

On Thu, Dec 4, 2008 at 1:18 PM, Leif Walsh <leif.walsh at gmail.com> wrote:
> Coming in to the thread _way_ late, here's my $0.015:
>
> Sure, it would be great to have an accurate and fast implementation of
> decimal/floating point numbers active by default in the language.  We
> don't have that yet.  We have a fast implementation, and we have an
> accurate one, and until we have both, there is a decision to be made:
> which one is easy to use (in builtins, has literals, (etc.?)), and
> which one is the "opt-in" implementation (needs a module import, needs
> a constructor)?
>
> We've been dealing with roughly the same fast and sometimes-inaccurate
> floating-point implementation for what, almost 40 years of C
> programming so far.  Given that there exist accurate implementations
> of decimal numbers (GMP, MAPM), why hasn't C moved to make one of
> these the "default" implementation?
>
> Whatever the answer, it seems to me that this sets a sort of precedent
> in programming that fast floating-point numbers are favored over
> accurate floating-point numbers.  GMP is blindingly fast, and it isn't
> C's default.  Decimal is, I think I saw someone mention "hundreds of
> times slower" than the current float implementation.

GMP may be blindingly fast for an arbitrary precision floating point
implementation, but it's quite slow compared to hardware floating
point.  Even in hardware there's a temptation to optimize for
single-precision and skip various IEEE 754 special cases that would
slow things down.  Performance really does count.

You're not going to find a broad solution here.  Decimal is mildly
more precise, but substantially slower.  It's also less convenient for
interacting with C code.

Given the importance of C extensions to Python, interacting with C is
the strongest argument here.  It's not an elegant reason, but it's
very practical.

Besides, any user WILL have to learn what floats do to their numbers,
so you might as well make it obvious.  If you really want to avoid it
you should be using a symbolic math library instead.  Personally, if I
need a calculator I usually use Qalculate, rather than an interactive
interpreter.


-- 
Adam Olsen, aka Rhamphoryncus


From tjreedy at udel.edu  Thu Dec  4 23:09:36 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 04 Dec 2008 17:09:36 -0500
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>	<gh8pj6$fh6$1@ger.gmane.org>	<83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
	<cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com>
Message-ID: <gh9kes$m55$1@ger.gmane.org>

Leif Walsh wrote:
> Coming in to the thread _way_ late, here's my $0.015:
> 
> Sure, it would be great to have an accurate and fast implementation of
> decimal/floating point numbers active by default in the language.

We have one by many definitions of 'accurate'.  Being off by a few or 
even a hundred parts per quintillion is pretty good by some standards.

>  We don't have that yet.

I disagree.

>  We have a fast implementation, and we have an
> accurate one, and until we have both, there is a decision to be made:

The notion that decimal is more 'accurate' than float needs a lot of 
qualification.  Yes, it is intended to give *exactly* the answer to 
various financial calculations that various jurisdictions mandate, but 
that is a rather specialized meaning of 'accurate'.

tjr


From leif.walsh at gmail.com  Fri Dec  5 05:41:35 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Thu, 4 Dec 2008 23:41:35 -0500
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <gh9kes$m55$1@ger.gmane.org>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1>
	<gh8pj6$fh6$1@ger.gmane.org>
	<83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1>
	<cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com>
	<gh9kes$m55$1@ger.gmane.org>
Message-ID: <cc7430500812042041g26a9eac7uf48da55e035dab9b@mail.gmail.com>

On Thu, Dec 4, 2008 at 5:09 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> We have one by many definitions of 'accurate'.  Being off by a few or even a
> hundred parts per quintillion is pretty good by some standards.

I agree.  That's why I don't think the decimal module should be the
"default implementation".

> I disagree.

Okay.  "Perfectly accurate" then.

> The notion that decimal is more 'accurate' than float needs a lot of
> qualification.  Yes, it is intended to give *exactly* the answer to various
> financial calculations that various jurisdictions mandate, but that is a
> rather specialized meaning of 'accurate'.

You've said what I mean better than I could.  The float implementation
is more than good enough for almost all applications, and it seems
ridiculous to me to slow them down for the precious few that need more
precision (and, at that, just don't want to type quite as much).

-- 
Cheers,
Leif


From python at rcn.com  Fri Dec  5 05:59:54 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 4 Dec 2008 20:59:54 -0800
Subject: [Python-ideas] Decimal literal?
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com><204A1AE06E8341E7BE37050BF2674E55@RaymondLaptop1><gh8pj6$fh6$1@ger.gmane.org><83DBFB195B034B5AA8248FE8789BEDA4@RaymondLaptop1><cc7430500812041218s10611ed6sa8491ee4b1b59bd3@mail.gmail.com><gh9kes$m55$1@ger.gmane.org>
	<cc7430500812042041g26a9eac7uf48da55e035dab9b@mail.gmail.com>
Message-ID: <0DD3877E77AA4680A7CB6952AE54124C@RaymondLaptop1>

>> The notion that decimal is more 'accurate' than float needs a lot of
>> qualification.  Yes, it is intended to give *exactly* the answer to various
>> financial calculations that various jurisdictions mandate, but that is a
>> rather specialized meaning of 'accurate'.
> 
> You've said what I mean better than I could.  The float implementation
> is more than good enough for almost all applications, and it seems
> ridiculous to me to slow them down for the precious few that need more
> precision (and, at that, just don't want to type quite as much).

While we're mincing words, I would state the case differently.
Neither "precision" nor "accuracy" captures the essential difference between
binary and decimal floating point.  It is all about what is "exactly representable".
The main reason decimal is good for financial apps is that the numbers of interest 
are exactly representable in decimal floating point but not in binary floating point.
In a financial app, it can matter that 1.10 is exact rather than some nearby
value representable in binary floating point, 0x1.199999999999ap+0.

Of course, there are other differences like control over rounding and
variable precision, but the main story is about what is exactly representable.
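
A short sketch of what "exactly representable" means in practice, using
only the standard library. The variable names and the ten-dimes sum are
illustrative additions:

```python
from decimal import Decimal

# The float literal 1.10 is stored as the nearest binary fraction.
as_float = 1.10
float_hex = as_float.hex()                  # '0x1.199999999999ap+0'
exact_value_of_float = Decimal(as_float)    # exact expansion of that binary value

# The decimal value is stored exactly as written.
as_decimal = Decimal("1.10")

# Summing ten dimes: exact in decimal, slightly off in binary.
float_total = sum([0.10] * 10)
decimal_total = sum([Decimal("0.10")] * 10)
```

Here exact_value_of_float differs from as_decimal, and float_total does not
compare equal to 1.0, while decimal_total is exactly Decimal("1.00").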


Raymond


From jimjjewett at gmail.com  Fri Dec  5 19:53:33 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 5 Dec 2008 13:53:33 -0500
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <op.ulmtefu303jqhe@cesareprova.org>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
Message-ID: <fb6fbf560812051053l5276b57flb75db8b3b55f5002@mail.gmail.com>

On Thu, Dec 4, 2008 at 4:45 AM, Cesare Di Mauro
<cesare.dimauro at a-tono.com> wrote:
> But at least it will be more usable to have a
> short-hand for decimal declaration:

In isolation, a decimal literal sounds nice.

But it may not be used often enough to justify the extra mental complexity.

What should the following mean?

>>> a = 123X

It isn't obvious, which means that either it gets used all the time
(decimal won't) or people will have to look it up -- or just guess,
and sometimes get it wrong.

> a = 1234.567d

To someone who hasn't programmed much with decimal floating point,
what does the "d" mean?

Could it indicate "use double-precision"?

Could it just mean that the written representation is "decimal" as
opposed to "octal" or "hexadecimal", but that the internal form is
still binary?

> a = 1234.567d

> is simpler than:

[reworded to be even shorter per use]

>>> from decimal import Decimal as d
>>> a = d('1234.5678')

but if you really have enough Decimal literals for the difference to
matter, you could always write your own helper function.

>>> # pretend to be using the European decimal point
>>> a = d(1234,5678)

>>> # maps easily to the tuple-format constructor
>>> a = d(12345678, -4)

My own hunch is that until Decimal is used enough that people start
putting this sort of constructor into their personal libraries, it
probably doesn't need a literal.
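
For concreteness, a rough sketch of what such personal helpers might look
like. The names d_scaled and d_parts are made up for this example, and
passing the fractional digits as a string (so leading zeros survive) is an
assumption that goes beyond the two-integer call shown above:

```python
from decimal import Decimal

def d_scaled(coefficient, exponent=0):
    """Build a Decimal from an integer coefficient and a power-of-ten exponent."""
    sign = 1 if coefficient < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(coefficient)))
    # Maps directly onto the (sign, digits, exponent) tuple constructor.
    return Decimal((sign, digits, exponent))

def d_parts(whole, fraction_digits):
    """Build a Decimal from the digits before and after the decimal point.

    fraction_digits is a string so that "05" is not confused with "5".
    """
    return Decimal("%d.%s" % (whole, fraction_digits))

a = d_scaled(12345678, -4)      # Decimal('1234.5678')
b = d_parts(1234, "5678")       # Decimal('1234.5678')
```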

-jJ


From qrczak at knm.org.pl  Fri Dec  5 21:33:59 2008
From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk)
Date: Fri, 5 Dec 2008 21:33:59 +0100
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <fb6fbf560812051053l5276b57flb75db8b3b55f5002@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<fb6fbf560812051053l5276b57flb75db8b3b55f5002@mail.gmail.com>
Message-ID: <3f4107910812051233o5c8c1c79k26967aee09d73b6@mail.gmail.com>

C# uses m (or M) as decimal suffix. Mnemonic: money.

-- 
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/


From bruce at leapyear.org  Fri Dec  5 23:02:17 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 5 Dec 2008 14:02:17 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <fb6fbf560812051053l5276b57flb75db8b3b55f5002@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<fb6fbf560812051053l5276b57flb75db8b3b55f5002@mail.gmail.com>
Message-ID: <cf5b87740812051402x2ec3ab65oa09fc25712970a5@mail.gmail.com>

There is a representation for decimal literals that nicely avoids the
problem of remembering that 0d is decimal and 0m is meters etc.:

>>> import decimal
>>> decimal.Decimal(3)
Decimal("3")
>>> Decimal("3")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'Decimal' is not defined

The error points out that I really need to do both:

>>> import decimal
>>> from decimal import Decimal

and I'd prefer the single import do both. Note that this anomaly of repr is
not limited to decimal as I think this is a bit worse:

>>> float('nan')
nan
>>> float('inf')
inf

--- Bruce


From clp at rebertia.com  Fri Dec  5 23:52:48 2008
From: clp at rebertia.com (Chris Rebert)
Date: Fri, 5 Dec 2008 14:52:48 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <cf5b87740812051402x2ec3ab65oa09fc25712970a5@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<aac2c7cb0812040137t551c89eci995ff0a51b8efca@mail.gmail.com>
	<op.ulmtefu303jqhe@cesareprova.org>
	<fb6fbf560812051053l5276b57flb75db8b3b55f5002@mail.gmail.com>
	<cf5b87740812051402x2ec3ab65oa09fc25712970a5@mail.gmail.com>
Message-ID: <47c890dc0812051452n5c1d028eofdfb7aef0d549e12@mail.gmail.com>

On Fri, Dec 5, 2008 at 2:02 PM, Bruce Leban <bruce at leapyear.org> wrote:
> There is a representation for decimal literals that nicely avoids the
> problem of remembering that 0d is decimal and 0m is meters etc.:
>
>>>> import decimal
>>>> decimal.Decimal(3)
> Decimal("3")
>>>> Decimal("3")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'Decimal' is not defined
>
> The error points out that I really need to do both:
>
>>>> import decimal
>>>> from decimal import Decimal

You only need the second line there. The first line is unnecessary and
does not affect the second.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com



From idadesub at users.sourceforge.net  Sun Dec  7 05:13:07 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Sat, 6 Dec 2008 20:13:07 -0800
Subject: [Python-ideas] Anyone interested in zsh-style subpattern matching
	for fnmatch/glob?
Message-ID: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com>

My project needs to extend fnmatch to support zsh-style globbing,
where you can use braces to designate subexpressions. Say you had a
directory structure like this:

foo/
  foo.ext1
  foo.ext2
bar/
  foo.ext1
  foo.ext2

The subexpressions will let you do patterns like this:

>>> glob.glob('foo/foo.{ext1,ext2}')
['foo/foo.ext1', 'foo/foo.ext2']
>>> glob.glob('foo/foo.ext{1,2}')
['foo/foo.ext1', 'foo/foo.ext2']
>>> glob.glob('{foo,bar}')
['bar', 'foo']
>>> glob.glob('{foo,bar}/foo*')
['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2']
>>> glob.glob('{foo,bar}/foo.{ext*}')
['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2']
>>> glob.glob('{f?o,b?r}/foo.{ext*}')
['bar/foo.ext1', 'bar/foo.ext2', 'foo/foo.ext1', 'foo/foo.ext2']


Would this be interesting to anyone else? It would unfortunately break
existing uses of fnmatch, since it currently gives no special meaning to
{} in a pattern. It'd be easy to work around that by adding a flag or
using a different function name.
Anyway, here's the patch against the head of py3k.
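
For readers who want the behaviour without patching the standard library,
here is a rough sketch of a different technique from the one in the patch:
instead of translating braces into a regular expression, it expands each
brace group into several plain patterns and hands those to the existing
fnmatch. The function names expand_braces, brace_fnmatch and brace_filter
are hypothetical and not part of fnmatch or glob:

```python
import fnmatch

def expand_braces(pattern):
    """Expand the first top-level {a,b,...} group and recurse on the results."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]
    depth = 0
    parts, current = [], []
    for i in range(start, len(pattern)):
        c = pattern[i]
        if c == "{":
            depth += 1
            if depth == 1:
                continue            # skip the opening brace of this group
        elif c == "}":
            depth -= 1
            if depth == 0:
                parts.append("".join(current))
                end = i
                break
        elif c == "," and depth == 1:
            parts.append("".join(current))
            current = []
            continue
        current.append(c)           # nested braces are kept for the recursion
    else:
        return [pattern]            # unbalanced brace: treat the pattern literally
    head, tail = pattern[:start], pattern[end + 1:]
    results = []
    for part in parts:
        results.extend(expand_braces(head + part + tail))
    return results

def brace_fnmatch(name, pattern):
    """True if name matches any brace expansion of pattern."""
    return any(fnmatch.fnmatchcase(name, p) for p in expand_braces(pattern))

def brace_filter(names, pattern):
    """Sorted subset of names matching the brace pattern."""
    return sorted(n for n in names if brace_fnmatch(n, pattern))

names = ["foo/foo.ext1", "foo/foo.ext2", "bar/foo.ext1", "bar/foo.ext2"]
matches = brace_filter(names, "{f?o,b?r}/foo.{ext*}")
```

Because the expansion happens before matching, existing fnmatch behaviour
for patterns without braces is untouched, at the cost of one match attempt
per expanded pattern.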

-e


Index: Lib/glob.py
===================================================================
--- Lib/glob.py	(revision 67629)
+++ Lib/glob.py	(working copy)
@@ -72,8 +72,8 @@
     return []


-magic_check = re.compile('[*?[]')
-magic_check_bytes = re.compile(b'[*?[]')
+magic_check = re.compile('[*?[{]')
+magic_check_bytes = re.compile(b'[*?[{]')

 def has_magic(s):
     if isinstance(s, bytes):
Index: Lib/fnmatch.py
===================================================================
--- Lib/fnmatch.py	(revision 67629)
+++ Lib/fnmatch.py	(working copy)
@@ -22,10 +22,11 @@

     Patterns are Unix shell style:

-    *       matches everything
-    ?       matches any single character
-    [seq]   matches any character in seq
-    [!seq]  matches any char not in seq
+    *           matches everything
+    ?           matches any single character
+    [seq]       matches any character in seq
+    [!seq]      matches any char not in seq
+    {pat1,pat2} matches subpattern pat1 or subpattern pat2

     An initial period in FILENAME is not special.
     Both FILENAME and PATTERN are first case-normalized
@@ -84,10 +85,15 @@
     There is no way to quote meta-characters.
     """

-    i, n = 0, len(pat)
+    return _translate(0, pat, '')[2] + '$'
+
+def _translate(i, pat, end):
     res = ''
+    n = len(pat)
     while i < n:
         c = pat[i]
+        if c in end:
+            return i, c, res
         i = i+1
         if c == '*':
             res = res + '.*'
@@ -111,6 +117,27 @@
                 elif stuff[0] == '^':
                     stuff = '\\' + stuff
                 res = '%s[%s]' % (res, stuff)
+        elif c == '{':
+            i, sub = _translate_subexpression(i, pat)
+            res += sub
         else:
             res = res + re.escape(c)
-    return res + "$"
+    return i, '', res
+
+def _translate_subexpression(i, pat):
+    j = i
+    subexpressions = []
+    while True:
+        j, c, res = _translate(j, pat, ',}')
+        subexpressions.append(res)
+
+        if c == ',':
+            j += 1
+        elif c == '}':
+            j += 1
+            break
+        else:
+            # turns out we didn't have a subpattern
+            return j, '{' + ','.join(subexpressions)
+
+    return j, '(' + '|'.join(subexpressions) + ')'
Index: Lib/test/test_fnmatch.py
===================================================================
--- Lib/test/test_fnmatch.py	(revision 67629)
+++ Lib/test/test_fnmatch.py	(working copy)
@@ -37,6 +37,12 @@
         check('a', r'[!\]')
         check('\\', r'[!\]', 0)

+        check('abcdefghi', 'ab{cd,12*}ef{gh?,34}')
+        check('ab1234ef34', 'ab{cd,12*}ef{gh?,34}')
+
+        check('abcdefgh', 'ab{cd,12*}ef{gh?,34}', 0)
+        check('ab1234ef345', 'ab{cd,12*}ef{gh?,34}', 0)
+
     def test_mix_bytes_str(self):
         self.assertRaises(TypeError, fnmatch, 'test', b'*')
         self.assertRaises(TypeError, fnmatch, b'test', '*')
Index: Lib/test/test_glob.py
===================================================================
--- Lib/test/test_glob.py	(revision 67629)
+++ Lib/test/test_glob.py	(working copy)
@@ -69,6 +69,7 @@
         eq(self.glob('aa?'), map(self.norm, ['aaa', 'aab']))
         eq(self.glob('aa[ab]'), map(self.norm, ['aaa', 'aab']))
         eq(self.glob('*q'), [])
+        eq(self.glob('a{?a,?b}'), map(self.norm, ['aaa', 'aab']))

     def test_glob_nested_directory(self):
         eq = self.assertSequencesEqual_noorder
@@ -89,6 +90,9 @@
            [self.norm('a', 'bcd', 'efg', 'ha')])
         eq(self.glob('?a?', '*F'), map(self.norm, [os.path.join('aaa', 'zzzF'),
                                                    os.path.join('aab', 'F')]))
+        eq(self.glob('a', 'b{c,x}d', '{*}', '*a'),
+           [self.norm('a', 'bcd', 'efg', 'ha')])
+        eq(self.glob('a', 'b{x,y}d', '{*}', '*a'), [])

     def test_glob_directory_with_trailing_slash(self):
         # We are verifying that when there is wildcard pattern which


From greg at krypto.org  Sun Dec  7 05:46:47 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 6 Dec 2008 20:46:47 -0800
Subject: [Python-ideas] Anyone interested in zsh-style subpattern
	matching for fnmatch/glob?
In-Reply-To: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com>
References: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com>
Message-ID: <52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com>

This looks useful.

Please post it as a feature request issue with patch on bugs.python.org.
Also, if you could include updates to the fnmatch documentation to describe
exactly what your code allows that would help.

thanks,
-Greg


From idadesub at users.sourceforge.net  Sun Dec  7 09:19:01 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Sun, 7 Dec 2008 00:19:01 -0800
Subject: [Python-ideas] Anyone interested in zsh-style subpattern
	matching for fnmatch/glob?
In-Reply-To: <52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com>
References: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com>
	<52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com>
Message-ID: <1ef034530812070019q27df5672mec1cbc8bb6a4f7b9@mail.gmail.com>

On Sat, Dec 6, 2008 at 8:46 PM, Gregory P. Smith <greg at krypto.org> wrote:
> This looks useful.
>
> Please post it as a feature request issue with patch on bugs.python.org.
> Also, if you could include updates to the fnmatch documentation to describe
> exactly what your code allows that would help.

Thanks Greg. I've made issue4573 to track this.


From ironfroggy at gmail.com  Mon Dec  8 01:05:14 2008
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 7 Dec 2008 19:05:14 -0500
Subject: [Python-ideas] Anyone interested in zsh-style subpattern
	matching for fnmatch/glob?
In-Reply-To: <1ef034530812070019q27df5672mec1cbc8bb6a4f7b9@mail.gmail.com>
References: <1ef034530812062013h1871a01djb50146406abbbbda@mail.gmail.com>
	<52dc1c820812062046x4e8969cdmf9055e8445da18a3@mail.gmail.com>
	<1ef034530812070019q27df5672mec1cbc8bb6a4f7b9@mail.gmail.com>
Message-ID: <76fd5acf0812071605t4007a3f3j1ba735de6d454224@mail.gmail.com>

Backported to 2.7, as I think it is just as applicable there.

On Sun, Dec 7, 2008 at 3:19 AM, Erick Tryzelaar
<idadesub at users.sourceforge.net> wrote:
> On Sat, Dec 6, 2008 at 8:46 PM, Gregory P. Smith <greg at krypto.org> wrote:
>> This looks useful.
>>
>> Please post it as a feature request issue with patch on bugs.python.org.
>> Also, if you could include updates to the fnmatch documentation to describe
>> exactly what your code allows that would help.
>
> Thanks Greg. I've made issue4573 to track this.


-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


From clp at rebertia.com  Mon Dec  8 06:04:07 2008
From: clp at rebertia.com (Chris Rebert)
Date: Sun, 7 Dec 2008 21:04:07 -0800
Subject: [Python-ideas] Decimal literal?
In-Reply-To: <47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com>
References: <47c890dc0812032351r22e5bf09q1d7ab634358186a5@mail.gmail.com>
	<3A55A52D0AFB41C9AF79A7B4C272DB53@RaymondLaptop1>
	<92081E2A58C34372A270D5D54994DAC4@RaymondLaptop1>
	<47c890dc0812040302u48b45aeft6d55ba267cce797b@mail.gmail.com>
Message-ID: <47c890dc0812072104r1bbf3201p147b7a746ee7b7da@mail.gmail.com>

Ok, so just to summarize should anyone bring up this same issue again
and come upon this thread:

* Decimal literals are a possibly good idea, but a fast C
implementation with a Python-compatible license (which as of this
writing does not yet exist) would be a necessary prerequisite
* Making decimals the default instead of floats would be controversial
to say the least and would definitely require further
analysis+discussion

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com

On Thu, Dec 4, 2008 at 3:02 AM, Chris Rebert <clp at rebertia.com> wrote:
> On Thu, Dec 4, 2008 at 2:50 AM, Raymond Hettinger <python at rcn.com> wrote:
>> From: "Raymond Hettinger"
>>>
>>> Last time I looked, the existing C implementations out there were license
>>> compatible with Python.
>>
>> That should have said "incompatible".
>>
>
> decNumber is available under the ICU License, which seems to be a
> variant of the original BSD license. Depending on exactly how the
> acknowledgement clause is interpreted (IANAL), it seems like it might
> be compatible. If not, IBM, which has copyright on decNumber, seems to
> have a fairly pro-open-source stance historically; perhaps if asked
> nicely by the community, they would be willing to relicense decNumber
> under the revised BSD license (a very minor change vs. the ICU
> License), which would certainly be compatible with Python's licensing
> policy.
>
> Or maybe there exists another library that's already compatible.
> Perhaps I'll investigate.
>
> But the key here is we should first determine whether people want
> decimal to be built-in and have a literal. Once that's established,
> then the details as to implementing that should be investigated. But
> yes, practicality and feasibility certainly are factors in all this.
>
> Cheers,
> Chris
>
> --
> Follow the path of the Iguana...
> http://rebertia.com
>


From skip at pobox.com  Thu Dec 11 15:18:32 2008
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 11 Dec 2008 08:18:32 -0600
Subject: [Python-ideas] This seems like a wart to me...
Message-ID: <18753.8504.116845.736633@montanaro-dyndns-org.local>

Python 2 and 3 both exhibit this behavior:

    >>> "".split()
    []
    >>> "".split("*")
    ['']
    >>> "".split(" ")
    ['']

It's not at all clear to me why splitting an empty string on implicit
whitespace should yield an empty list but splitting it with a non-whitespace
character or explicit whitespace should yield a list with an empty string as
its lone element.  I realize this is documented behavior, but I can't for
the life of me understand what the rationale might be for the different
behaviors.  Seems like a wart which might best be removed sometime in 3.x.

Skip


From guido at python.org  Thu Dec 11 16:51:38 2008
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Dec 2008 07:51:38 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <18753.8504.116845.736633@montanaro-dyndns-org.local>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
Message-ID: <ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>

On Thu, Dec 11, 2008 at 6:18 AM,  <skip at pobox.com> wrote:
> Python 2 and 3 both exhibit this behavior:
>
>    >>> "".split()
>    []
>    >>> "".split("*")
>    ['']
>    >>> "".split(" ")
>    ['']
>
> It's not at all clear to me why splitting an empty string on implicit
> whitespace should yield an empty list but splitting it with a non-whitespace
> character or explicit whitespace should yield a list with an empty string as
> its lone element.  I realize this is documented behavior, but I can't for
> the life of me understand what the rationale might be for the different
> behaviors.  Seems like a wart which might best be removed sometime in 3.x.

Which of the two would you choose for all? The empty string is the
only reasonable behavior for split-with-argument, it is the logical
consequence of how it behaves when the string is not empty. E.g.
"x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", "y"],
":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't behave
this way; it extracts the non-empty non-whitespace-containing
substrings.
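
The two behaviours can be modelled in a few lines of Python. This is only
an illustrative sketch, not how CPython implements str.split; the names
split_fields and split_whitespace are hypothetical.

```python
import re

def split_fields(s, sep):
    """Model of s.split(sep): always yields s.count(sep) + 1 pieces."""
    if not sep:
        raise ValueError("empty separator")
    pieces = []
    start = 0
    while True:
        idx = s.find(sep, start)
        if idx == -1:
            pieces.append(s[start:])   # the final field, possibly empty
            return pieces
        pieces.append(s[start:idx])
        start = idx + len(sep)

def split_whitespace(s):
    """Model of s.split(): the maximal runs of non-whitespace characters."""
    return re.findall(r'\S+', s)

for text in ["x:y", "x::y", ":", ""]:
    print(repr(text), split_fields(text, ":"), split_whitespace(text))
```

Seen this way, the empty string is not a special case for either function:
it contains zero separators, hence one (empty) field, and it contains zero
runs of non-whitespace, hence no items.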

If anything is wrong, it's that they share the same name. This
wasn't always the case. Do you really want to go back to .split() and
.splitfields(sep)?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip at pobox.com  Thu Dec 11 17:36:11 2008
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 11 Dec 2008 10:36:11 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
Message-ID: <18753.16763.980209.628297@montanaro-dyndns-org.local>


    Guido> Which of the two would you choose for all? The empty string is the
    Guido> only reasonable behavior for split-with-argument, it is the logical
    Guido> consequence of how it behaves when the string is not empty. E.g.
    Guido> "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", "y"],
    Guido> ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't behave
    Guido> this way; it extracts the non-empty non-whitespace-containing
    Guido> substrings.

In my feeble way of thinking I go from something which evaluates to false to
something which doesn't. It's almost like making matter out of empty space:

    bool("") -> False
    bool("".split()) -> False
    bool("".split("n")) -> True

    Guido> If anything is wrong, it's that they share the same name. This
    Guido> wasn't always the case. Do you really want to go back to .split()
    Guido> and .splitfields(sep)?

That might be preferable.  The same method having such strikingly different
behavior throws me every time I try splitting a possibly empty string with a
non-whitespace character.  It's a relatively uncommon case.  Most of the
time when you split a string with a non-whitespace character I think you
know that the input can't be empty.
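
When an empty input should mean "no fields", one possible workaround is a
small wrapper along these lines (split_nonempty is a hypothetical helper,
not something in the standard library):

```python
def split_nonempty(s, sep):
    """Like s.split(sep), but an empty string yields an empty list."""
    return s.split(sep) if s else []

print(split_nonempty("", "n"))
print(split_nonempty("anb", "n"))
```

Only the completely empty input is treated specially; a string such as "n"
still splits into two empty fields, exactly as with str.split.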

Skip


From tjreedy at udel.edu  Thu Dec 11 20:55:13 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 11 Dec 2008 14:55:13 -0500
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
Message-ID: <ghrr6v$b2m$1@ger.gmane.org>

Guido van Rossum wrote:
> On Thu, Dec 11, 2008 at 6:18 AM,  <skip at pobox.com> wrote:

> If anything is wrong, it's that they share the same name. This
> wasn't always the case. Do you really want to go back to .split() and
> .splitfields(sep)?

I hope not.  I consider the current situation to be a definite 
improvement.  I sometimes forgot which was which.


From matt.horizon5 at gmail.com  Thu Dec 11 21:11:41 2008
From: matt.horizon5 at gmail.com (Matthew Russell)
Date: Thu, 11 Dec 2008 20:11:41 +0000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ghrr6v$b2m$1@ger.gmane.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<ghrr6v$b2m$1@ger.gmane.org>
Message-ID: <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com>

It seems to me that splitting an empty string is something that makes little
sense to do, similar to dividing by zero, by way of analogy.
How about having str.split, partition and friends just raise a ValueError
when the value is the empty string?

Regards,
Matt


From guido at python.org  Thu Dec 11 21:14:20 2008
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Dec 2008 12:14:20 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<ghrr6v$b2m$1@ger.gmane.org>
	<3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com>
Message-ID: <ca471dc20812111214l5044cbefmc3e124d3040d1e7f@mail.gmail.com>

On Thu, Dec 11, 2008 at 12:11 PM, Matthew Russell
<matt.horizon5 at gmail.com> wrote:
> It seems to me like splitting an empty string is something that makes little
> sense to do,
> similar to dividing by zero in terms of an analogy.

I guess you have never had the need. Let me assure you that you are
mistaken. :-)

> How about str.split, partition and friends just raise ValueError exception
> when the value is the empty string?

Absolutely not.



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From bruce at leapyear.org  Thu Dec 11 21:33:39 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 11 Dec 2008 12:33:39 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ca471dc20812111214l5044cbefmc3e124d3040d1e7f@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<ghrr6v$b2m$1@ger.gmane.org>
	<3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com>
	<ca471dc20812111214l5044cbefmc3e124d3040d1e7f@mail.gmail.com>
Message-ID: <cf5b87740812111233u2987ddb0hc6620492662b085f@mail.gmail.com>

I think splitting an empty string is more like dividing zero in half. No one
expects that to raise a value exception.

--- Bruce


From matt.horizon5 at gmail.com  Thu Dec 11 21:42:28 2008
From: matt.horizon5 at gmail.com (Matthew Russell)
Date: Thu, 11 Dec 2008 20:42:28 +0000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <cf5b87740812111233u2987ddb0hc6620492662b085f@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<ghrr6v$b2m$1@ger.gmane.org>
	<3b5110850812111211w5287ccbdp4f0f35aac9b5ed46@mail.gmail.com>
	<ca471dc20812111214l5044cbefmc3e124d3040d1e7f@mail.gmail.com>
	<cf5b87740812111233u2987ddb0hc6620492662b085f@mail.gmail.com>
Message-ID: <3b5110850812111242k1ad67dc8m2b3802fd646c2214@mail.gmail.com>

Sorry for wasting your brain power -
I just read through some of my code and realised how stupid an idea it
really was... I haven't even got the excuse of being remotely new to the
language (or programming in general come to think of it)

lesson(self): dont_post_at_end_of_day_without(thinking_a_lot_first)

sorely-mistaken-and-embarrassed
Matt


-- 
Cheers,
Matt

From rrr at ronadam.com  Fri Dec 12 00:58:38 2008
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 11 Dec 2008 17:58:38 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <18753.16763.980209.628297@montanaro-dyndns-org.local>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
Message-ID: <ghs9fj$tge$1@ger.gmane.org>


skip at pobox.com wrote:
>     Guido> Which of the two would you choose for all? The empty string is the
>     Guido> only reasonable behavior for split-with-argument, it is the logical
>     Guido> consequence of how it behaves when the string is not empty. E.g.
>     Guido> "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "", "y"],
>     Guido> ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't behave
>     Guido> this way; it extracts the non-empty non-whitespace-containing
>     Guido> substrings.
> 
> In my feeble way of thinking I go from something which evaluates to false to
> something which doesn't. It's almost like making matter out of empty space:
> 
>     bool("") -> False
>     bool("".split()) -> False
>     bool("".split("n")) -> True
> 
>     Guido> If anything is wrong, it's that they share the same name. This
>     Guido> wasn't always the case. Do you really want to go back to .split()
>     Guido> and .splitfields(sep)?
> 
> That might be preferable.  The same method having such strikingly different
> behavior throws me every time I try splitting a possibly empty string with a
> non-whitespace character.  It's a relatively uncommon case.  Most of the
> time when you split a string with a non-whitespace character I think you
> know that the input can't be empty.
> 
> Skip


It looks like there are several behaviors involved in split, and you want 
to split those behaviors out.


Behaviors of string split:


1. Split on whitespace characters by giving no argument.

This has the effect of splitting on multiple characters. A run of 
consecutive whitespace characters counts as a single separator, and 
leading or trailing whitespace is dropped, so no empty strings are produced.

 >>> '       '.split()
[]
 >>> ' \t\n'.split()
[]


2. Split on word by giving an argument. (A word can be one char.)

In this case, the split is strict and does not combine/remove null string 
results.

 >>> '       '.split(' ')
['', '', '', '', '', '', '', '']
 >>> ' \t\n'.split(' ')
['', '\t\n']


There doesn't seem to be an obvious way to split on several different 
characters at once.


A programmer new to Python might try:

 >>> '1 (123) 456-7890'.split(' ()-')
['1 (123) 456-7890']

Expecting: ['1', '123', '456', '7890']


 >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object


When I needed to split on multiple chars other than the default 
whitespace, I used .replace() to replace each splitting character with 
one single char sequence which I could then split on.
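
That .replace() workaround might look roughly like the sketch below. The 
helper name split_via_replace is an assumption for illustration, not 
anything in the standard library, and a plain space is used as the single 
stand-in separator.

```python
def split_via_replace(text, chars):
    """Split text on any character in chars (hypothetical helper)."""
    for ch in chars:
        # Map every splitting character onto one stand-in: a space.
        text = text.replace(ch, " ")
    # The no-argument split() collapses runs of whitespace and drops
    # empty strings, so adjacent separators act as one.
    return text.split()


print(split_via_replace('1 (123) 456-7890', '()-'))
# ['1', '123', '456', '7890']
```

One caveat of this approach: any whitespace already in the text also 
becomes a separator, whether or not that was intended.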


It might be nice to have a .splitonchars() version of split with the 
default being whitespace chars, and an argument to specify other multiple 
characters to split on.

The other behavior could be called .splitonwords(arg). The .splitonwords() 
method could possibly also accept a list of words.


That would leave the current .split() behavior alone and would not break 
existing code.

Alternatively, these could be functions in the string module.  In that 
case the current .split() could just continue to exist as is.

I find the name 'splitfields' to not be as intuitive as 'splitonwords' and 
'splitonchars'.   While both of those require more letters to type than 
split, they are more readable, and when you do need the capability of 
splitting on more than one char or word, they are far shorter and less 
prone to errors than rolling your own function.
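
For concreteness, here is a minimal sketch of what a splitonchars() could 
do, written as a plain function rather than a str method. The name and the 
signature are only the ones floated above; nothing like this exists in the 
standard library, and the empty-string handling is an assumption modelled 
on the no-argument split().

```python
import string


def splitonchars(text, chars=string.whitespace):
    """Split text on runs of any character in chars (hypothetical)."""
    pieces = []
    current = []
    for ch in text:
        if ch in chars:
            # A separator ends the current piece; consecutive
            # separators add nothing, so no empty strings appear.
            if current:
                pieces.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        pieces.append("".join(current))
    return pieces


print(splitonchars('1 (123) 456-7890', ' ()-'))
# ['1', '123', '456', '7890']
```

With the default argument it should agree with the no-argument split() 
for ASCII text, which is the equality the proposal is aiming for.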

Ron


From bruce at leapyear.org  Fri Dec 12 01:23:52 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 11 Dec 2008 16:23:52 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ghs9fj$tge$1@ger.gmane.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
	<ghs9fj$tge$1@ger.gmane.org>
Message-ID: <cf5b87740812111623u2c4e52f7jd501c8849118d5a3@mail.gmail.com>

I think string.split(list) probably won't do what people expect either.
Here's what I would expect it to do:

>>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
['1', '', '123', '', '456', '7890']

but what you probably want is:

>>> re.split(r'[ ()-]*', '1 (123) 456-7890')
['1', '123', '456', '7890']

Using re.split allows you to do that and avoids ambiguity about what it does.
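
A hedge for anyone trying this on a newer interpreter: since Python 3.7, 
re.split() also splits on empty matches, so a pattern that can match the 
empty string (such as one ending in *) no longer produces the output 
shown above. A sketch using + instead, which needs at least one 
separator character per match:

```python
import re

phone = '1 (123) 456-7890'

# '+' requires one or more separator characters, so the pattern can
# never match the empty string and runs of separators collapse.
parts = re.split(r'[ ()-]+', phone)
print(parts)  # ['1', '123', '456', '7890']
```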

--- Bruce

On Thu, Dec 11, 2008 at 3:58 PM, Ron Adam <rrr at ronadam.com> wrote:

>
>
> skip at pobox.com wrote:
>
>>    Guido> Which of the two would you choose for all? The empty string is
>> the
>>    Guido> only reasonable behavior for split-with-argument, it is the
>> logical
>>    Guido> consequence of how it behaves when the string is not empty. E.g.
>>    Guido> "x:y".split(":") -> ["x", "y"], "x::y".split(":") -> ["x", "",
>> "y"],
>>    Guido> ":".split(":") -> ["", ""]. OTOH split-on-whitespace doesn't
>> behave
>>    Guido> this way; it extracts the non-empty non-whitespace-containing
>>    Guido> substrings.
>>
>> In my feeble way of thinking I go from something which evaluates to false
>> to
>> something which doesn't. It's almost like making matter out of empty
>> space:
>>
>>    bool("") -> False
>>    bool("".split()) -> False
>>    bool("".split("n")) -> True
>>
>>    Guido> If anything it's wrong, it's that they share the same name. This
>>    Guido> wasn't always the case. Do you really want to go back to
>> .split()
>>    Guido> and .splitfields(sep)?
>>
>> That might be preferable.  The same method having such strikingly
>> different
>> behavior throws me every time I try splitting a possibly empty string with
>> a
>> non-whitespace character.  It's a relatively uncommon case.  Most of the
>> time when you split a string with a non-whitespace character I think you
>> know that the input can't be empty.
>>
>> Skip
>>
>
>
> It looks like there are several behaviors involved in split, and you want
> to split those behaviors out.
>
>
>
> Behaviors of string split:
>
>
> 1. Split on white space chrs by giving no argument.
>
> This has the effect of splitting on multiple characters. Strings with
> multiple white space characters are not multiply split.
>
> >>> '       '.split()
> []
> >>> ' \t\n'.split()
> []
>
>
>
> 2. Split on word by giving an argument. (A word can be one char.)
>
> In this case, the split is strict and does not combine/remove null string
> results.
>
> >>> '       '.split(' ')
> ['', '', '', '', '', '', '', '']
> >>> ' \t\n'.split(' ')
> ['', '\t\n']
>
>
> There doesn't seem to be an obvious way to split on different characters.
>
>
> A new to python programmer might try:
>
> >>> '1 (123) 456-7890'.split(' ()-')
> ['1 (123) 456-7890']
>
> Expecting: ['1', '123', '456', '7890']
>
>
> >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> TypeError: expected a character buffer object
>
>
> When I needed to split on multiple chars other than the default white
> space, I have used .replace() to replace different splitting character with
> one single char sequence which I could then split on.
>
>
> It might be nice to have a .splitonchars() version of split with the
> default being whitespace chars, and an argument to specify other multiple
> characters to split on.
>
> The other behavior could be called .splitonwords(arg). The .splitonwords()
> method could possibly also accept a list of words.
>
>
> That leaves the possibility to leave the current .split() behavior alone
> and would not break current code.
>
> And alternately these could be functions in the string module.  In that
> case the current .split() could just continue to exist as is.
>
> I find the name 'splitfields' to not be as intuitive as 'splitonwords' and
> 'splitonchars'.   While both of those require more letters to type than
> split, they are more readable, and when you do need the capability of
> splitting on more than one char or word, they are far shorter and less prone
> to errors than rolling your own function.
>
> Ron
>
>
>
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From rhamph at gmail.com  Fri Dec 12 01:18:24 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 11 Dec 2008 17:18:24 -0700
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ghs9fj$tge$1@ger.gmane.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
	<ghs9fj$tge$1@ger.gmane.org>
Message-ID: <aac2c7cb0812111618v416bd13an679153cf6c1d98b3@mail.gmail.com>

On Thu, Dec 11, 2008 at 4:58 PM, Ron Adam <rrr at ronadam.com> wrote:
> There doesn't seem to be an obvious way to split on different characters.
>
>
> A new to python programmer might try:
>
>>>> '1 (123) 456-7890'.split(' ()-')
> ['1 (123) 456-7890']
>
> Expecting: ['1', '123', '456', '7890']
>
>
>>>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> TypeError: expected a character buffer object

>>> re.split('[ ()-]', '1 (123) 456-7890')
['1', '', '123', '', '456', '7890']
>>> re.split('[ ()-]+', '1 (123) 456-7890')
['1', '123', '456', '7890']

str.split() handles the simplest, most common cases.  Let's not
clutter it up with a bad[1] impersonation of regex.


[1] And if you thought regex was ugly enough to begin with...

-- 
Adam Olsen, aka Rhamphoryncus


From rrr at ronadam.com  Fri Dec 12 01:38:05 2008
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 11 Dec 2008 18:38:05 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <cf5b87740812111623u2c4e52f7jd501c8849118d5a3@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>	<18753.16763.980209.628297@montanaro-dyndns-org.local>	<ghs9fj$tge$1@ger.gmane.org>
	<cf5b87740812111623u2c4e52f7jd501c8849118d5a3@mail.gmail.com>
Message-ID: <ghsbpe$40i$1@ger.gmane.org>


Bruce Leban wrote:
> I think string.split(list) probably won't do what people expect either. 
> Here's what I would expect it to do:
> 
>  >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
> ['1', '', '123', '', '456', '7890']
> 
> but what you probably want is:
> 
>  >>>re.split(r'[ ()-]*', '1 (123) 456-7890')
> ['1', '123', '456', '7890']
> 
> using allows you to do that and avoids ambiguity about what it does.
> 
> --- Bruce

Without getting into regular expressions, it's easier to just allow 
adjacent char matches to act as one match so the following is true.

     longstring.splitchars(string.whitespace)  ==  longstring.split()


From rrr at ronadam.com  Fri Dec 12 02:12:15 2008
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 11 Dec 2008 19:12:15 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <aac2c7cb0812111618v416bd13an679153cf6c1d98b3@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>	<18753.16763.980209.628297@montanaro-dyndns-org.local>	<ghs9fj$tge$1@ger.gmane.org>
	<aac2c7cb0812111618v416bd13an679153cf6c1d98b3@mail.gmail.com>
Message-ID: <ghsdpj$8qi$1@ger.gmane.org>


Adam Olsen wrote:
> On Thu, Dec 11, 2008 at 4:58 PM, Ron Adam <rrr at ronadam.com> wrote:
>> There doesn't seem to be an obvious way to split on different characters.
>>
>>
>> A new to python programmer might try:
>>
>>>>> '1 (123) 456-7890'.split(' ()-')
>> ['1 (123) 456-7890']
>>
>> Expecting: ['1', '123', '456', '7890']
>>
>>
>>>>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in <module>
>> TypeError: expected a character buffer object
> 
>>>> re.split('[ ()-]', '1 (123) 456-7890')
> ['1', '', '123', '', '456', '7890']
>>>> re.split('[ ()-]+', '1 (123) 456-7890')
> ['1', '123', '456', '7890']
> 
> str.split() handles the simplest, most common cases.  Let's not
> clutter it up with a bad[1] impersonation of regex.
> 
> 
> [1] And if you thought regex was ugly enough to begin with...

These examples were just what a "new" programmer might attempt.  I have a 
feeling that most new programmers do not attempt regular expressions, i.e. 
the re module, until sometime after they have learned the basics of Python.

Ron


From bruce at leapyear.org  Fri Dec 12 02:08:24 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Thu, 11 Dec 2008 17:08:24 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ghsbpe$40i$1@ger.gmane.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
	<ghs9fj$tge$1@ger.gmane.org>
	<cf5b87740812111623u2c4e52f7jd501c8849118d5a3@mail.gmail.com>
	<ghsbpe$40i$1@ger.gmane.org>
Message-ID: <cf5b87740812111708p103aec67m4575a227327132a8@mail.gmail.com>

-inf

That breaks existing code in two different ways which I don't think makes it
easy.

it does NOT collapse adjacent characters:
        >>> "a&&b".split("&")
        ['a', '', 'b']

the separator it splits on is a string, not a character:
        >>> "a<b><c>d".split("><")
        ['a<b', 'c>d']

--- Bruce

On Thu, Dec 11, 2008 at 4:38 PM, Ron Adam <rrr at ronadam.com> wrote:

>
>
> Bruce Leban wrote:
>
>> I think string.split(list) probably won't do what people expect either.
>> Here's what I would expect it to do:
>>
>>  >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
>> ['1', '', '123', '', '456', '7890']
>>
>> but what you probably want is:
>>
>>  >>>re.split(r'[ ()-]*', '1 (123) 456-7890')
>> ['1', '123', '456', '7890']
>>
>> using allows you to do that and avoids ambiguity about what it does.
>>
>> --- Bruce
>>
>
> Without getting into regular expressions, it's easier to just allow
> adjacent char matches to act as one match so the following is true.
>
>    longstring.splitchars(string.whitespace)  =  longstring.split()
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From greg.ewing at canterbury.ac.nz  Fri Dec 12 02:55:26 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 12 Dec 2008 14:55:26 +1300
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ghs9fj$tge$1@ger.gmane.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
	<ghs9fj$tge$1@ger.gmane.org>
Message-ID: <4941C48E.50005@canterbury.ac.nz>

Ron Adam wrote:

> There doesn't seem to be an obvious way to split on different characters.

Remember there's always re.split for the less common
use cases.

-- 
Greg


From rrr at ronadam.com  Fri Dec 12 03:20:51 2008
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 11 Dec 2008 20:20:51 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <cf5b87740812111708p103aec67m4575a227327132a8@mail.gmail.com>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>	<18753.16763.980209.628297@montanaro-dyndns-org.local>	<ghs9fj$tge$1@ger.gmane.org>	<cf5b87740812111623u2c4e52f7jd501c8849118d5a3@mail.gmail.com>	<ghsbpe$40i$1@ger.gmane.org>
	<cf5b87740812111708p103aec67m4575a227327132a8@mail.gmail.com>
Message-ID: <ghshqo$hk5$1@ger.gmane.org>


Bruce Leban wrote:
> -inf
> 
> That breaks existing code in two different ways which I don't think 
> makes it easy.

Correct, it would break existing code, which is why it should have a 
different name rather than altering the existing split function.


> it does NOT collapse adjacent characters:
>         >>> "a&&b".split("&")
>         ['a', '', 'b']

Also correct.  But collapsing adjacent separators is the behavior when 
splitting on the default whitespace, i.e. split() with no argument.  
'   '.split() is not the same as '   '.split(' ').

Q: Would it be good to have a new method or function which extends the same 
behavior of whitespace splitting to other user specified characters?

I would find it useful at times.


> the separator it splits on is a string, not a character:
>         >>> "a<b><c>d".split("><")
>         ['a<b', 'c>d']

Yes, I know.  To split on multiple chars in a given argument string, it 
will need to be called something other than .split(), such as 
.splitchars(), as in the example equality I gave.

     longstring.splitchars(string.whitespace)  ==  longstring.split()

Note: longstring.split() has no arguments.  .split(arg) splits on a string, 
as you stated.


> --- Bruce
> 
> On Thu, Dec 11, 2008 at 4:38 PM, Ron Adam 
> <rrr at ronadam.com 
> <mailto:rrr at ronadam.com>> wrote:
> 
> 
> 
>     Bruce Leban wrote:
> 
>         I think string.split(list) probably won't do what people expect
>         either. Here's what I would expect it to do:
> 
>          >>> '1 (123) 456-7890'.split([' ', '(', ')', '-'])
>         ['1', '', '123', '', '456', '7890']
> 
>         but what you probably want is:
> 
>          >>>re.split(r'[ ()-]*', '1 (123) 456-7890')
>         ['1', '123', '456', '7890']
> 
>         using allows you to do that and avoids ambiguity about what it does.
> 
>         --- Bruce
> 
> 
>     Without getting into regular expressions, it's easier to just allow
>     adjacent char matches to act as one match so the following is true.
> 
>        longstring.splitchars(string.whitespace)  =  longstring.split()
> 
> 
> 
>     _______________________________________________
>     Python-ideas mailing list
>     Python-ideas at python.org
>     <mailto:Python-ideas at python.org>
>     http://mail.python.org/mailman/listinfo/python-ideas
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From stephen at xemacs.org  Fri Dec 12 03:32:35 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 12 Dec 2008 11:32:35 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ghsdpj$8qi$1@ger.gmane.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
	<ghs9fj$tge$1@ger.gmane.org>
	<aac2c7cb0812111618v416bd13an679153cf6c1d98b3@mail.gmail.com>
	<ghsdpj$8qi$1@ger.gmane.org>
Message-ID: <87zlj286nw.fsf@xemacs.org>

Ron Adam writes:

 > These examples was just what a "new" programmer might attempt.  I have a 
 > feeling that most new programmers do not attempt regular expressions ie.. 
 > the re module, until sometime after they have learned the basics of python.

Adding a str.split_on_any_of() would violate TOOWDTI, though.

I think this is best addressed by an xref to re.split in the doc for
str.split.


From python at rcn.com  Fri Dec 12 03:34:51 2008
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 11 Dec 2008 18:34:51 -0800
Subject: [Python-ideas] This seems like a wart to me...
References: <18753.8504.116845.736633@montanaro-dyndns-org.local><ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com><18753.16763.980209.628297@montanaro-dyndns-org.local><ghs9fj$tge$1@ger.gmane.org>
	<aac2c7cb0812111618v416bd13an679153cf6c1d98b3@mail.gmail.com>
Message-ID: <26EBB4D460D34D8D88CD70517CCB508C@RaymondLaptop1>

 From: "Adam Olsen" <rhamph at gmail.com>
> str.split() handles the simplest, most common cases.  Let's not
> clutter it up with a bad[1] impersonation of regex.

I concur and am -1 on *any* change to str.split().  It has been
around for a very long time and is widely used.  If there were
any subtle change, even in 3.0, it would create migration problems
that are very hard to diagnose and repair.


Raymond


From carl at carlsensei.com  Fri Dec 12 05:46:04 2008
From: carl at carlsensei.com (Carl Johnson)
Date: Thu, 11 Dec 2008 18:46:04 -1000
Subject: [Python-ideas] This seems like a wart to me...
Message-ID: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>

Ron Adam wrote:
> These examples was just what a "new" programmer might attempt. I  
> have a feeling that most new programmers do not attempt regular  
> expressions ie.. the re module, until sometime after they have  
> learned the basics of python.

Feel what you like, but I assumed that .split meant .splitonchars when  
I was learning Python in 2007 and was confused when my script didn't  
work. I was also confused about why it stopped getting rid of empty  
strings. And I still don't know how to write regexes, so now when I  
want to split on multiple chars, I end up .replace-ing a bunch first,  
which I recognize to be terribly inefficient, but the scripts are  
throwaways, so it's hardly worth the time to learn a whole other  
language first.

-- Carl


From carl at carlsensei.com  Fri Dec 12 05:58:21 2008
From: carl at carlsensei.com (Carl Johnson)
Date: Thu, 11 Dec 2008 18:58:21 -1000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
Message-ID: <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>

Rereading your message along with your other ones, I see that I  
misinterpreted it. I thought you meant, "I can't imagine a new  
programmer wanting something as regex-like as string.splitonchars,"  
but what you meant was "I can't imagine new programmers wanting to go  
into the re module to learn how to do something like  
string.splitonchars." To which I say: Yes! I heartily agree! :-D

Embarrassedly-yours,

Carl

> Ron Adam wrote:
>> These examples was just what a "new" programmer might attempt. I  
>> have a feeling that most new programmers do not attempt regular  
>> expressions ie.. the re module, until sometime after they have  
>> learned the basics of python.
>
> Feel what you like, but I assumed that .split meant .splitonchars  
> when I was learning Python in 2007 and was confused when my script  
> didn't work. I was also confused about why it stopped getting rid of  
> empty strings. And I still don't know how to write regexs, so now I  
> when I want to split on multiple chars, I end up .replace-ing a  
> bunch first, which I recognize to be terribly inefficient, but the  
> scripts are throwaways, so it's hardly worth the time to learn a  
> whole other language first.
>
> -- Carl


From rhamph at gmail.com  Fri Dec 12 06:07:23 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 11 Dec 2008 22:07:23 -0700
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87zlj286nw.fsf@xemacs.org>
References: <18753.8504.116845.736633@montanaro-dyndns-org.local>
	<ca471dc20812110751y1704d01fu1e521edcd057b29d@mail.gmail.com>
	<18753.16763.980209.628297@montanaro-dyndns-org.local>
	<ghs9fj$tge$1@ger.gmane.org>
	<aac2c7cb0812111618v416bd13an679153cf6c1d98b3@mail.gmail.com>
	<ghsdpj$8qi$1@ger.gmane.org> <87zlj286nw.fsf@xemacs.org>
Message-ID: <aac2c7cb0812112107o276c1a0ew4b1eab88ec01b35e@mail.gmail.com>

On Thu, Dec 11, 2008 at 7:32 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Ron Adam writes:
>
>  > These examples was just what a "new" programmer might attempt.  I have a
>  > feeling that most new programmers do not attempt regular expressions ie..
>  > the re module, until sometime after they have learned the basics of python.
>
> Adding a str.split_on_any_of() would violate TOOWDTI, though.
>
> I think this is best addressed by an xref to re.split in the doc for
> str.split.

+1


-- 
Adam Olsen, aka Rhamphoryncus


From turnbull at sk.tsukuba.ac.jp  Fri Dec 12 08:14:23 2008
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Fri, 12 Dec 2008 16:14:23 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
Message-ID: <87r64d986o.fsf@xemacs.org>

Carl Johnson writes:

 > what you meant was "I can't imagine new programmers wanting to go  
 > into the re module to learn how to do something like  
 > string.splitonchars." To which I say: Yes! I heartily agree! :-D

I don't understand this point of view at all.  True, regexps are a
complex subject, with an unfortunately large number of dialects.  Is
it the confusion of dialects problem, or do you really never use
regexps in any language?

Anyway, for this purpose you only have to learn one idiom, that

    longstring.splitonchars (["x", "y", "z"])

is spelled

    import re
    re.split ("[xyz]", longstring)

In fact, I personally would like to deprecate the with-argument
implementation of string.split(), and have

    def split (self, delimiter = None):
        if delimiter is None:
            return self.usual_magic_splitting ()
        else:
            import re
            return re.split (delimiter, self)

(of course, that's because that's precisely the way split-string works
in Emacs).

Then the idiom would be

    longstring.split ("[xyz]")

Would that work for you?


From carl at carlsensei.com  Fri Dec 12 08:51:23 2008
From: carl at carlsensei.com (Carl Johnson)
Date: Thu, 11 Dec 2008 21:51:23 -1000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87r64d986o.fsf@xemacs.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
Message-ID: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>

Stephen J. Turnbull wrote:

> I don't understand this point of view at all.  True, regexps are a
> complex subject, with an unfortunately large number of dialects.  Is
> it the confusion of dialects problem, or do you really never use
> regexps in any language?

I have half-heartedly tried to learn regexps before, but always given  
up after reading about the basics. Obviously, this would be shameless  
behavior for a professional programmer, but I'm just a dilettante, and  
the famed saying of Jamie Zawinski ("Some people, when confronted with  
a problem, think 'I know, I'll use regular expressions.'  Now they  
have two problems.") is not highly motivating. :-D

> Anyway, for this purpose you only have to learn one idiom, that
>
>    longstring.splitonchars (["x", "y", "z"])
>
> is spelled
>
>    import re
>    re.split ("[xyz]", longstring)
>
> In fact, I personally would like to deprecate the with-argument
> implementation of string.split(), and have
>
>    def split (self, delimiter = None):
>        if delimiters is None:
>            return self.usual_magic_splitting ()
>        else:
>            import re
>            return re.split (delimiter, self)
>
> (of course, that's because that's precisely the way split-string works
> in Emacs).
>
> Then the idiom would be
>
>    longstring.split ("[xyz]")
>
> Would that work for you?

Wouldn't that subtly break the code of everyone who has written  
something like:

lines = bigtext.splitlines()
delimiter = lines[0]
del lines[0]
splitlines = [line.split(delimiter) for line in lines]

? Since suddenly if your delimiter uses one of the reserved regexp  
characters, such as brackets and parentheses, the code would stop  
working. (That's one of the things I dislike about regexps -- too many  
magical characters.)
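
A small sketch of that breakage, using "." as the delimiter since it is 
one of the magical characters. The sample data is made up; re.escape() is 
the standard way to make a delimiter literal again.

```python
import re

line = "a.b.c"
delimiter = "."  # in a regexp, "." matches any single character

# Today's behaviour: the delimiter is taken literally.
literal = line.split(delimiter)

# Treated as a regexp, every character of the line is a separator,
# so only empty strings are left.
as_regexp = re.split(delimiter, line)

# Escaping the delimiter restores the literal meaning.
escaped = re.split(re.escape(delimiter), line)

print(literal)    # ['a', 'b', 'c']
print(as_regexp)  # ['', '', '', '', '', '']
print(escaped)    # ['a', 'b', 'c']
```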

Here's a backward compatible idea instead:

    def split (self, delimiter = None):
        if delimiter is None:
            return self.usual_magic_splitting ()
        elif isinstance(delimiter, str):
            return self.usual_delimiter_based_splitting()
        elif isinstance(delimiter, Sequence):
            return self.treat_delimiters_given_by_sequence_as_interchangable()
        else:
            raise TypeError("coercing to Unicode: need string or buffer "
                            "or Sequence, " + repr(type(delimiter)) + " found")

Since right now passing a list or tuple raises a TypeError, this would  
be backwards compatible. The idiom for doing re.split-like things  
would then be bigtext.split(list(" ;.,-!?")). It might even be a good  
idea to add a keyword (only?) argument called "dropempty" to recreate the  
magical behavior of passing None as the delimiter where empty strings  
are dropped. That would also solve skip's original problem: just set  
it to text.split(None, dropempty=False).
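
To make the idea concrete, here is a rough, runnable sketch written as a 
free function. The name flexible_split is made up, the sequence branch 
leans on the re module internally, and the dropempty handling for the 
None case is left out because its exact semantics are not pinned down 
above. Treat all of it as an assumption, not a proposal for the real 
implementation.

```python
import re
from collections.abc import Sequence


def flexible_split(text, delimiter=None, dropempty=False):
    """Sketch of a split() that also accepts a sequence of delimiters."""
    if delimiter is None:
        # Unchanged: whitespace splitting already drops empty strings.
        return text.split()
    if isinstance(delimiter, str):
        # Unchanged: split on the whole string, keeping empty strings.
        pieces = text.split(delimiter)
    elif isinstance(delimiter, Sequence):
        if not delimiter or not all(isinstance(d, str) and d for d in delimiter):
            raise ValueError("need a non-empty sequence of non-empty strings")
        # Try longer delimiters first so that 'ab' wins over 'a'.
        ordered = sorted(delimiter, key=len, reverse=True)
        pattern = "|".join(re.escape(d) for d in ordered)
        pieces = re.split(pattern, text)
    else:
        raise TypeError("need str, Sequence or None, found "
                        + repr(type(delimiter)))
    return [p for p in pieces if p] if dropempty else pieces


print(flexible_split('1 (123) 456-7890', list(' ()-')))
# ['1', '', '123', '', '456', '7890']
print(flexible_split('1 (123) 456-7890', list(' ()-'), dropempty=True))
# ['1', '123', '456', '7890']
```

The str check has to come before the Sequence check, because a str is 
itself a Sequence.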

-- Carl


From stephen at xemacs.org  Fri Dec 12 09:43:22 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 12 Dec 2008 17:43:22 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
Message-ID: <87prjx942d.fsf@xemacs.org>

Carl Johnson writes:

 > the famed saying of Jamie Zawinski ("Some people, when confronted with  
 > a problem, think 'I know, I'll use regular expressions.'  Now they  
 > have two problems.") is not highly motivating. :-D

Jamie was talking about the "to a man with a hammer, all problems look
like thumbs" phenomenon.  I've never heard anybody complain that shell
globs are complex.  But regexps will take you a lot farther with just
character classes [] (which most modern shells implement), the
wildcard character . (usually ? in shells), and the repetition
operators * and/or + (available only as a variable-length wildcard *
in shell globs).
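
As a rough illustration of how close the two notations are, here is the 
same match written as a shell-style glob and as a regexp. The file name 
is made up; fnmatch.fnmatchcase and re.fullmatch are standard library 
calls.

```python
import fnmatch
import re

name = "report7.txt"

# Character class plus single-character wildcard: ? in a glob, . in a regexp.
glob_ok = fnmatch.fnmatchcase(name, "report[0-9].tx?")
regex_ok = re.fullmatch(r"report[0-9]\.tx.", name) is not None

# The glob * is a wildcard and a repetition rolled into one; the regexp
# spells it as the wildcard . followed by the repetition operator *.
star_glob = fnmatch.fnmatchcase(name, "rep*.txt")
star_regex = re.fullmatch(r"rep.*\.txt", name) is not None

print(glob_ok, regex_ok, star_glob, star_regex)  # True True True True
```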

 > > In fact, I personally would like to deprecate the with-argument
 > > implementation of string.split(), ....
 > >
 > > Would that work for you?
 > 
 > Wouldn't that subtly break the code of everyone who has written  
 > something like:

Indeed it would.  That was not a serious proposal.  At this point, I'm
trying to understand the resistance to regexps, not propose an
improvement for .split().


From rhamph at gmail.com  Fri Dec 12 10:23:46 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 12 Dec 2008 02:23:46 -0700
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87prjx942d.fsf@xemacs.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
Message-ID: <aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>

On Fri, Dec 12, 2008 at 1:43 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Carl Johnson writes:
>
>  > the famed saying of Jamie Zawinski ("Some people, when confronted with
>  > a problem, think 'I know, I'll use regular expressions.'  Now they
>  > have two problems.") is not highly motivating. :-D
>
> Jamie was talking about the "to a man with a hammer, all problems look
> like thumbs" phenomenon.  I've never heard anybody complain that shell
> globs are complex.  But regexps will take you a lot farther with just
> character classes [] (which most modern shells implement), the
> wildcard character . (usually ? in shells), and the repetition
> operators * and/or + (available only as a variable-length wildcard *
> in shell globs).
>
>  > > In fact, I personally would like to deprecate the with-argument
>  > > implementation of string.split(), ....
>  > >
>  > > Would that work for you?
>  >
>  > Wouldn't that subtly break the code of everyone who has written
>  > something like:
>
> Indeed it would.  That was not a serious proposal.  At this point, I'm
> trying to understand the resistence to regexps, not propose an
> improvement for .split().

I'd say the lack of diagnostics when they "fail" is the biggest issue.
 I could easily spend half an hour trying random permutations of a
pattern before I figure out why the original didn't work... and I've
had a moderate amount of experience.
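
One way to shorten that guessing, sketched under the assumption that the 
pattern can be written as a list of pieces whose every prefix is itself a 
valid regexp. The helper name first_failing_part is made up.

```python
import re


def first_failing_part(parts, text):
    """Return the index of the first piece that stops the match, or None.

    The pieces are joined cumulatively and matched at the start of text.
    """
    for count in range(1, len(parts) + 1):
        prefix = "".join(parts[:count])
        if re.match(prefix, text) is None:
            return count - 1
    return None


parts = [r"\d+", r"-", r"\d+"]
print(first_failing_part(parts, "123-456"))  # None: the whole pattern matches
print(first_failing_part(parts, "123 456"))  # 1: the "-" piece is the culprit
```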


-- 
Adam Olsen, aka Rhamphoryncus


From python at rcn.com  Fri Dec 12 10:35:33 2008
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 12 Dec 2008 01:35:33 -0800
Subject: [Python-ideas] This seems like a wart to me...
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
Message-ID: <2F18D71A714746E4BEC8B3050ADEE1D4@RaymondLaptop1>

From: "Carl Johnson" <carl at carlsensei.com>
> And I still don't know how to write regexs, ...

Maybe you should learn some of the fundamental tools provided by the language before you get in the business of demanding that the 
language be changed.

Regexes occur in other languages and some command-line tools.  Taking a little time to learn them will provide you with a life long 
skill that will serve you well in a number of contexts.  This is doubly true in your case (since you've shown an interest in text 
processing).


Raymond


From skip at pobox.com  Fri Dec 12 16:14:34 2008
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 12 Dec 2008 09:14:34 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87r64d986o.fsf@xemacs.org>
References: <87r64d986o.fsf@xemacs.org>
Message-ID: <18754.32730.304523.362137@montanaro-dyndns-org.local>


    >> what you meant was "I can't imagine new programmers wanting to go
    >> into the re module to learn how to do something like
    >> string.splitonchars." To which I say: Yes! I heartily agree! :-D

    Steve> I don't understand this point of view at all.  True, regexps are
    Steve> a complex subject, with an unfortunately large number of
    Steve> dialects.  Is it the confusion of dialects problem, or do you
    Steve> really never use regexps in any language?

Getting more than a little bit off the original topic, but...  I think a
person's affinity for regular expressions has a lot to do with their editing
& programming environments.  I work with some very experienced programmers
(C++ & Python mostly, not much Perl, and generally very basic Emacs usage)
who never (or almost never) use regular expressions.

* C/C++: My impression was always that the C regex(3) API presented a lot of
  barriers to casual use.  Maybe that's changed over time.

* Python: You can go a long way without using regular expressions in Python
  because it has other easy-to-use string searching stuff (str.find, etc) as
  well as shell-style globbing for file name matching.

* Emacs: I think part of the reason that I find re's so easy-to-use is that
  I've been using some dialect of Emacs for about 20 years and it exposes
  re's in a way that is real easy to experiment with: incremental search.
  i-search+re's - what a fabulous combination.

* Perl: I suspect Perl mongers are as adept at re's as Emacs types because
  that's the primary (only?) way to search for patterns in strings.

* vi: Probably somewhere between Perl and Emacs.  vim does support
  incremental search but it's not the default.

Are there other editors besides Emacs and vi for which regular expressions
are so common?

Bringing this back on-topic, I can see that I'm going to lose this argument.
I still view "".split(':') as a wart.  I guess I'll have to live with it
though.

Skip


From skip at pobox.com  Fri Dec 12 16:18:31 2008
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 12 Dec 2008 09:18:31 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
References: <0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
Message-ID: <18754.32967.192347.822469@montanaro-dyndns-org.local>


    Carl> I have half-heartedly tried to learn regexps before, but always
    Carl> given up after reading about the basics.

Just out of curiosity, what editor do you use?  Reading and doing are two
different things.

    Carl> ... the famed saying of Jamie Zawinski ("Some people, when
    Carl> confronted with a problem, think 'I know, I'll use regular
    Carl> expressions.'  Now they have two problems.") is not highly
    Carl> motivating. :-D

Sure, but that addresses the topic of some people's desire to use regular
expressions to parse everything from LL1 grammars to the tea leaves in the
bottom of a cup.  If you use them in an environment where there is almost no
penalty for mistakes (incremental search) I think you will quickly gain an
understanding of the syntax.  Then your challenge will be not to fall into
Jamie's re tar pit. ;-)

Skip


From stephen at xemacs.org  Sat Dec 13 13:44:36 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 13 Dec 2008 21:44:36 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
Message-ID: <87myf08csr.fsf@xemacs.org>

Adam Olsen writes:

 >  I could easily spend half an hour trying random permutations of a
 > pattern before I figure out why the original didn't work... and I've
 > had a moderate amount of experience.

It takes a moderate amount of experience to get that far, though.  In
particular, in this case, all you need to understand is "[abc]"
matches any of the characters "a", "b", or "c", and *that* is familiar
to anybody who has used a decent shell (any Unix shell, and I believe
4DOS and friends provided it too but I haven't used them for 20 years).

So I don't think that lack of diagnostics explains widespread reluctance
to even substitute ".*" for "*", but instead propose something as ugly
as .split(list("abc")).
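
A minimal sketch of the character-class point, assuming a made-up sample
string; only the standard re module is used:

```python
import re

# A bracketed class matches any single one of the listed characters,
# so re.split() can split on several one-character separators at once.
text = "1a2b3c4"  # made-up input
parts = re.split(r"[abc]", text)
print(parts)  # ['1', '2', '3', '4']
```

The same pattern syntax is what shell globbing uses for "[abc]", which is
why the class notation tends to look familiar.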


From stephen at xemacs.org  Sat Dec 13 16:48:00 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 14 Dec 2008 00:48:00 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <20081213085458.5844e108@bhuda.mired.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org>
	<20081213085458.5844e108@bhuda.mired.org>
Message-ID: <87iqpo84b3.fsf@xemacs.org>

Mike Meyer writes:

 > Except that if you're doing anything *interesting* - like splitting on
 > punctuation, which is far more common than splitting on alphanumerics
 > - then it's not nearly that simple.  The equivalent of
 > .splitset("^()-[]") is *much* more complicated than just "[^()-[]]"

Yeah, it's "[][)(^-]".  So much for complexity of the needed regexp
(see below for difficulty of composition).

 > > So I don't think that lack of diagnostics explains widespread reluctance
 > > to even substitute ".*" for "*", but instead propose something as ugly
 > > as .split(list("abc")).
 > 
 > It isn't the lack of diagnostics, it's the write-once nature of re's.

They're hardly write-once in this context.  The above regexp is hard
to write, agreed, because you have to remember to move the close
bracket to the start, the hyphen to the end, and the caret away from
the start.  (Note that there is never a need to put the close bracket
and hyphen in other positions, so this is not a particularly hard rule
to remember IMO YMMV.)  However, precisely because of the oddity of
the positions of the close bracket and hyphen it's easy enough to read
once you've learned to write it.  As far as I can see, that is the
hardest regexp that most people will ever want to write for
re.split().

Again, I just don't see that (limited) use of regular expressions
makes programs harder to read or write than proliferating special case
functions that provide nowhere near the power of a single regular
expression-based function.
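
As a rough illustration of the placement rules described above, the sketch
below compares the hand-written class with one built by re.escape().  The
sample input and the variable names are made up for the example:

```python
import re

# Hand-written class from the message: "]" first, "-" last,
# and "^" anywhere except the first position.
HAND_WRITTEN = r"[][)(^-]"

# Alternative that avoids remembering the placement rules: escape
# every separator character and wrap the result in brackets.
separators = "^()-[]"
ESCAPED = "[" + re.escape(separators) + "]"

sample = "a(b)c^d-e[f]g"  # made-up input
by_hand = re.split(HAND_WRITTEN, sample)
by_escape = re.split(ESCAPED, sample)
print(by_hand)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```

Both patterns describe the same set of six characters, so the two splits
should agree on any input.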


From rhamph at gmail.com  Sat Dec 13 19:03:48 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 13 Dec 2008 11:03:48 -0700
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87iqpo84b3.fsf@xemacs.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org> <20081213085458.5844e108@bhuda.mired.org>
	<87iqpo84b3.fsf@xemacs.org>
Message-ID: <aac2c7cb0812131003v1da7aa1av543ac4494b602225@mail.gmail.com>

On Sat, Dec 13, 2008 at 8:48 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Again, I just don't see that (limited) use of regular expressions
> makes programs harder to read or write than proliferating special case
> functions that provide nowhere near the power of a single regular
> expression-based function.

A lot of it is evil by association - we're taught that regex is evil
and should never be used - only later do we figure out that regex is
often the best tool regardless of being evil.

In this case the docs are pretty overwhelming for the minor task.  A
simple "this is all you need for this task" tutorial might help.


-- 
Adam Olsen, aka Rhamphoryncus

From ggpolo at gmail.com  Sat Dec 13 19:58:52 2008
From: ggpolo at gmail.com (Guilherme Polo)
Date: Sat, 13 Dec 2008 16:58:52 -0200
Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else
Message-ID: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>

Hi there,

Probably many of you have seen/written/used a lot of recipes for
flattening lists/tuples. Several of them are similar, diverging just a
bit; some use more obscure code than others; some are not fast
enough. But I have noticed they all have one thing in common: none of
them mention _tkinter._flatten, not even in comments (if the site
allows that).

Apparently _tkinter._flatten is unknown, and it being marked private
doesn't help. It also lives under _tkinter but doesn't depend on
anything from tcl/tk. This _flatten has the advantage of being faster
than the alternatives I have seen coded in Python, since it is done in
C, its code is simple, and it doesn't try to be too smart. It is
also already part of Python; it is just unknown to most, apparently.

So, I would like to know what you think about moving
_tkinter._flatten to some other module and renaming it to "flatten"?

-- 
-- Guilherme H. Polo Goncalves


From greg at krypto.org  Sat Dec 13 20:22:51 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 13 Dec 2008 11:22:51 -0800
Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else
In-Reply-To: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>
References: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>
Message-ID: <52dc1c820812131122u62b9e44fxb91ab7c46c292edc@mail.gmail.com>

On Sat, Dec 13, 2008 at 10:58 AM, Guilherme Polo <ggpolo at gmail.com> wrote:

> Hi there,
>
> Probably many of you have seen/written/used a lot of recipes for
> flattening lists/tuples. Several of them are similar, diverging just a
> bit; some use more obscure code than others; some are not fast
> enough. But I have noticed they all have one thing in common: none of
> them mention _tkinter._flatten, not even in comments (if the site
> allows that).
>
> Apparently _tkinter._flatten is unknown, and it being marked private
> doesn't help. It also lives under _tkinter but doesn't depend on
> anything from tcl/tk. This _flatten has the advantage of being faster
> than the alternatives I have seen coded in Python, since it is done in
> C, its code is simple, and it doesn't try to be too smart. It is
> also already part of Python; it is just unknown to most, apparently.
>
> So, I would like to know what you think about moving
> _tkinter._flatten to some other module and renaming it to "flatten"?
>

Per the irc discussion...  If this is to be made a public API somewhere it
should be modernized to support the iterator protocol at which point it
could find a home in itertools.

From rhamph at gmail.com  Sat Dec 13 20:25:04 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 13 Dec 2008 12:25:04 -0700
Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else
In-Reply-To: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>
References: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>
Message-ID: <aac2c7cb0812131125o426d4fc2jc8a5bb3d36baf1ff@mail.gmail.com>

On Sat, Dec 13, 2008 at 11:58 AM, Guilherme Polo <ggpolo at gmail.com> wrote:
> Hi there,
>
> Probably many of you have seen/written/used a lot of recipes for
> flattening lists/tuples. Several of them are similar, diverging just a
> bit; some use more obscure code than others; some are not fast
> enough. But I have noticed they all have one thing in common: none of
> them mention _tkinter._flatten, not even in comments (if the site
> allows that).
>
> Apparently _tkinter._flatten is unknown, and it being marked private
> doesn't help. It also lives under _tkinter but doesn't depend on
> anything from tcl/tk. This _flatten has the advantage of being faster
> than the alternatives I have seen coded in Python, since it is done in
> C, its code is simple, and it doesn't try to be too smart. It is
> also already part of Python; it is just unknown to most, apparently.
>
> So, I would like to know what you think about moving
> _tkinter._flatten to some other module and renaming it to "flatten"?

The problem is people often have a datastructure like this:

x = [['foo'], ['bar', 'baz'], ['quux']]

Although _flatten seems to work, it's actually overkill.  It treats
the input as a tree (using type checks to differentiate leaf from
branch), rather than as a mere nested list.

The simplest solution is itertools.chain(*x).

If we did want to generalize flatten it should be made into iterator
form and take an arbitrary predicate function to distinguish leaf vs
branch.  The default should either be depth based (subsuming chain,
probably good in the long run) or use an appropriate ABC.  I prefer
depth based.
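
A minimal sketch of what such a generalized, iterator-based flatten might
look like.  The name flatten, the depth parameter and the is_leaf predicate
are hypothetical, chosen only for illustration; this is not the _tkinter
implementation:

```python
from itertools import chain

x = [['foo'], ['bar', 'baz'], ['quux']]

# One level of nesting: chain.from_iterable avoids unpacking x with *.
one_level = list(chain.from_iterable(x))
print(one_level)  # ['foo', 'bar', 'baz', 'quux']


def flatten(iterable, depth=1, is_leaf=None):
    """Lazily flatten `iterable` up to `depth` levels (hypothetical helper).

    `is_leaf` is an optional predicate; items for which it returns True
    are yielded as-is instead of being descended into.
    """
    for item in iterable:
        if depth <= 0 or (is_leaf is not None and is_leaf(item)):
            yield item
        else:
            # Non-iterable items above the depth limit raise TypeError,
            # just as itertools.chain would.
            yield from flatten(item, depth - 1, is_leaf)
```

With the default depth of 1 this behaves like chain.from_iterable; passing
a larger depth together with a predicate such as
lambda v: isinstance(v, str) gives tree-style flattening that stops at
strings.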


-- 
Adam Olsen, aka Rhamphoryncus


From carl at carlsensei.com  Sat Dec 13 21:05:49 2008
From: carl at carlsensei.com (Carl Johnson)
Date: Sat, 13 Dec 2008 10:05:49 -1000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87myf08csr.fsf@xemacs.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org>
Message-ID: <FB8C8BEE-8D29-483E-B444-E5269045B926@carlsensei.com>

I think this discussion is drifting from the point. We all agree that  
regexps are great and powerful and no professional programmer should  
fail to learn them. But at the same time, it's worth noting that they  
are a different language from Python proper, and it's very easy to get  
weird results without knowing why.

Anyway, apparently the proposal to allow splitting on a list is dead.  
What do people think of the proposal to add a dropitem keyword to  
allow the dropping (or retaining) of empty results?

-- Carl


From bruce at leapyear.org  Sat Dec 13 21:14:07 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Sat, 13 Dec 2008 12:14:07 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <FB8C8BEE-8D29-483E-B444-E5269045B926@carlsensei.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org>
	<FB8C8BEE-8D29-483E-B444-E5269045B926@carlsensei.com>
Message-ID: <cf5b87740812131214g7ca37ffet28dd67eeedf84927@mail.gmail.com>

[i for i in s.split(x) if i] is simple enough if I don't know how to write
"(" + re.escape(x) + ")+".

I would like to be able to drop "i for" in cases like this and just write [i
in s.split(x) if i].
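
A small sketch comparing the two spellings, with a made-up input.  Note one
assumption that differs from the pattern as written above: a non-capturing
group is used, because re.split() also returns the text matched by
capturing groups:

```python
import re

s = "a,,b,"  # made-up input
x = ","

# Filtering idiom: drops every empty string, including the trailing one.
filtered = [i for i in s.split(x) if i]

# Regex version: one or more consecutive separators count as a single
# split point.  (?:...) groups without capturing.
pattern = "(?:" + re.escape(x) + ")+"
via_re = re.split(pattern, s)

print(filtered)  # ['a', 'b']
print(via_re)    # ['a', 'b', '']
```

The two are not quite interchangeable: the regex collapses runs of
separators but still yields an empty string at a leading or trailing
separator.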

--- Bruce

On Sat, Dec 13, 2008 at 12:05 PM, Carl Johnson <carl at carlsensei.com> wrote:

> I think this discussion is drifting from the point. We all agree that
> regexps are great and powerful and no professional programmer should fail to
> learn them. But at the same time, it's worth noting that they are a
> different language from Python proper, and it's very easy to get weird
> results without knowing why.
>
> Anyway, apparently the proposal to allow splitting on a list is dead. What
> do people think of the proposal to add a dropitem keyword to allow the
> dropping (or retaining) of empty results?
>
> -- Carl
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From carl at carlsensei.com  Sat Dec 13 21:30:17 2008
From: carl at carlsensei.com (Carl Johnson)
Date: Sat, 13 Dec 2008 10:30:17 -1000
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <cf5b87740812131213o5e59f7d8g9365173ee744fdde@mail.gmail.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org>
	<FB8C8BEE-8D29-483E-B444-E5269045B926@carlsensei.com>
	<cf5b87740812131213o5e59f7d8g9365173ee744fdde@mail.gmail.com>
Message-ID: <C1C1A274-83FB-4E09-90E6-8EAF0598217A@carlsensei.com>

Bruce Leban wrote:

> [i for i in s.split(x) if i] is simple enough if I don't know how to  
> write "(" + re.escape(x) + ")+".

The point of the dropempty keyword would be less the dropempty=True  
case as the s.split(None, dropempty=False) case, which would otherwise  
require a regexp.
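
For reference, a sketch of the behaviour being discussed, using a made-up
input.  The dropempty keyword itself is only a proposal and does not exist;
the re.split() call is one way to get a similar result today:

```python
import re

s = "a  b\tc"  # two spaces, then a tab (made-up input)

# Default whitespace splitting collapses runs and drops empty strings.
collapsed = s.split()
print(collapsed)  # ['a', 'b', 'c']

# Splitting on each single whitespace character keeps the empty strings,
# roughly what split(None, dropempty=False) was meant to offer.
kept = re.split(r"\s", s)
print(kept)  # ['a', '', 'b', 'c']
```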

-- Carl


From stephen at xemacs.org  Sun Dec 14 01:24:33 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 14 Dec 2008 09:24:33 +0900
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <C1C1A274-83FB-4E09-90E6-8EAF0598217A@carlsensei.com>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<58DB5F52-E02B-4535-A246-B4C7DBDE9D7D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org>
	<FB8C8BEE-8D29-483E-B444-E5269045B926@carlsensei.com>
	<cf5b87740812131213o5e59f7d8g9365173ee744fdde@mail.gmail.com>
	<C1C1A274-83FB-4E09-90E6-8EAF0598217A@carlsensei.com>
Message-ID: <87hc578uym.fsf@xemacs.org>

Carl Johnson writes:
 > Bruce Leban wrote:
 > 
 > > [i for i in s.split(x) if i] is simple enough if I don't know how to  
 > > write "(" + re.escape(x) + ")+".
 > 
 > The point of the dropempty keyword would be less the dropempty=True  
 > case as the s.split(None, dropempty=False) case, which would otherwise  
 > require a regexp.

-0.  Eliminating str.split()'s implementation in favor of using
str.split() in the no argument case and re.split when an argument is
present is backward incompatible, so I can't really object although I
prefer a fix by documenting re.split() in appropriate places.

I do file a technical objection and ask the judge to strike the
wording "require a regexp" from the transcript as prejudicial to the
accused.<wink>  Preferred phrasing is "would otherwise require an
import of re."


From guido at python.org  Sun Dec 14 17:22:40 2008
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Dec 2008 08:22:40 -0800
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <87hc578uym.fsf@xemacs.org>
References: <72DFCC20-8325-4B13-B609-DAED9664050D@carlsensei.com>
	<87r64d986o.fsf@xemacs.org>
	<0B09B7C6-99BE-4B5B-9021-6EAF70D71180@carlsensei.com>
	<87prjx942d.fsf@xemacs.org>
	<aac2c7cb0812120123w732aee27ucb919d628ca61384@mail.gmail.com>
	<87myf08csr.fsf@xemacs.org>
	<FB8C8BEE-8D29-483E-B444-E5269045B926@carlsensei.com>
	<cf5b87740812131213o5e59f7d8g9365173ee744fdde@mail.gmail.com>
	<C1C1A274-83FB-4E09-90E6-8EAF0598217A@carlsensei.com>
	<87hc578uym.fsf@xemacs.org>
Message-ID: <ca471dc20812140822w5426c957v695dfbcd07058c9@mail.gmail.com>

Whoa. I haven't wasted much time trying to follow this (IMO rather
silly) argument about consistency. We're not going to introduce
backwards incompatibilities or deprecate existing usage of str.split()
with or without arguments are we? A dropempty argument also seems
excessive -- we can't possibly add ad-hoc filtering options to every
function that returns a list or iterator, that would be madness. As
long as the discussion is just about giving regexps a bad name I don't
really care enough to comment; but I have to draw the line when actual
API changes are being considered seriously.

--Guido van Rossum (home page: http://www.python.org/~guido/)


On Sat, Dec 13, 2008 at 4:24 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Carl Johnson writes:
>  > Bruce Leban wrote:
>  >
>  > > [i for i in s.split(x) if i] is simple enough if I don't know how to
>  > > write "(" + re.escape(x) + ")+".
>  >
>  > The point of the dropempty keyword would be less the dropempty=True
>  > case as the s.split(None, dropempty=False) case, which would otherwise
>  > require a regexp.
>
> -0.  Eliminating str.split()'s implementation in favor of using
> str.split() in the no argument case and re.split when an argument is
> present is backward incompatible, so I can't really object although I
> prefer a fix by documenting re.split() in appropriate places.
>
> I do file a technical objection and ask the judge to strike the
> wording "require a regexp" from the transcript as prejudicial to the
> accused.<wink>  Preferred phrasing is "would otherwise require an
> import of re."


From skip at pobox.com  Mon Dec 15 15:11:58 2008
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 15 Dec 2008 08:11:58 -0600
Subject: [Python-ideas] This seems like a wart to me...
In-Reply-To: <ca471dc20812140822w5426c957v695dfbcd07058c9@mail.gmail.com>
References: <ca471dc20812140822w5426c957v695dfbcd07058c9@mail.gmail.com>
Message-ID: <18758.26030.201594.303415@montanaro-dyndns-org.local>


    Guido> We're not going to introduce backwards incompatibilities or
    Guido> deprecate existing usage of str.split() with or without arguments
    Guido> are we?

As the OP, I long ago (in terms of number of posts on the topic) gave up on
the thought that the str.split() API might change.  Someone suggested this
idiom:

    [elt for elt in s.split(",") if elt]

which I will adopt (with a comment explaining why it's necessary).
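
A short sketch of the behaviour in question and of the idiom wrapped in a
helper.  The name split_nonempty is hypothetical, used only for this
example:

```python
# With an explicit separator, an empty string yields one empty field;
# with no separator argument, it yields no fields at all.
with_sep = "".split(":")
without_sep = "".split()
print(with_sep)     # ['']
print(without_sep)  # []


def split_nonempty(s, sep):
    """Split s on sep and discard empty fields (hypothetical helper)."""
    return [elt for elt in s.split(sep) if elt]


print(split_nonempty("", ","))      # []
print(split_nonempty("a,,b", ","))  # ['a', 'b']
```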

Skip


From grosser.meister.morti at gmx.net  Mon Dec 15 19:38:21 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-15?Q?Mathias_Panzenb=F6ck?=)
Date: Mon, 15 Dec 2008 19:38:21 +0100
Subject: [Python-ideas] returning anonymous functions
Message-ID: <4946A41D.6020100@gmx.net>

It is a very common case to return a nested function, e.g. in a decorator:

def deco(f):
   def _f(*args,**kwargs):
      do_something()
      try:
         return f(*args,**kwargs)
      finally:
         do_something_different()
   return _f

While I agree that the current way with non-anonymous functions perfectly works,
it looks ugly to a lot of people. Maybe one of these syntax variants would be an
option:

def deco(f):
   return def(*args,**kwargs):
      do_something()
      try:
         return f(*args,**kwargs)
      finally:
         do_something_different()

def deco(f):
   return(*args,**kwargs):
      do_something()
      try:
         return f(*args,**kwargs)
      finally:
         do_something_different()

def deco(f):
   def return(*args,**kwargs):
      do_something()
      try:
         return f(*args,**kwargs)
      finally:
         do_something_different()

Ok, the last one is not serious.

This would not be some kind of anonymous function expression but an extended
form of the return statement. Maybe you could do the same thing for yield? Well,
I guess not, because yield can have a return value and this would be awkward:

x = yield(a,b):
   return a + b

And this would not be an option:

f(yield(a,b):
   return a + b)

But extending return and not extending yield feels wrong.

What do you think? Does anyone have a better idea or is the current way the only
thinkable for python (which might very well be the case)?

	-panzi


From scott+python-ideas at scottdial.com  Mon Dec 15 19:52:20 2008
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Mon, 15 Dec 2008 13:52:20 -0500
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <4946A41D.6020100@gmx.net>
References: <4946A41D.6020100@gmx.net>
Message-ID: <4946A764.5080905@scottdial.com>

Mathias Panzenböck wrote:
> It is a very common case to return a nested function, e.g. in a decorator:
> 
> def deco(f):
>    def _f(*args,**kwargs):
>       do_something()
>       try:
>          return f(*args,**kwargs)
>       finally:
>          do_something_different()
>    return _f

Is it really that common? In this case, you are misrepresenting the
pattern. The appropriate version of this would require references to _f
to make its signature match that of f's, and therefore this entire
argument is specious. In reality, the number of times you can get away
with returning a truly anonymous function (that isn't a glorified
lambda) is rare, I think.

def deco(f):
    def _f(*args,**kwargs):
        ...
    functools.update_wrapper(_f, f)
    return _f

Or:

def deco(f):
    @functools.wraps(f)
    def _f(*args,**kwargs):
        ...
    return _f

Even in the second case, it would be awkward to inline with the return
statement because of the need to invoke a decorator.
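
A runnable sketch of the second form, to show what functools.wraps
preserves.  The calls list and the greet function are made up for the
example and stand in for do_something() and a decorated function:

```python
import functools

calls = []  # made-up log standing in for do_something()


def deco(f):
    @functools.wraps(f)  # copies __name__, __doc__, etc. from f onto _f
    def _f(*args, **kwargs):
        calls.append("before")
        try:
            return f(*args, **kwargs)
        finally:
            calls.append("after")
    return _f


@deco
def greet(name):
    """Return a greeting."""
    return "hello " + name


result = greet("world")
print(greet.__name__)  # greet
```

Without the wraps line, greet.__name__ would report "_f", which is the
reason the inner function needs a name to refer to.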

-Scott

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


From grosser.meister.morti at gmx.net  Mon Dec 15 20:12:49 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Mon, 15 Dec 2008 20:12:49 +0100
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <4946A764.5080905@scottdial.com>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
Message-ID: <4946AC31.1040604@gmx.net>

Scott Dial wrote:
>
> Is it really that common? In this case, you are misrepresenting the
> pattern. The appropriate version of this would require references to _f
> to make its signature match that of f's, and therefore this entire
> argument is specious. In reality, the number of times you can get away
> with returning a truly anonymous function (that isn't a glorified
> lambda) is rare, I think.
>
> def deco(f):
>     def _f(*args,**kwargs):
>         ...
>     functools.update_wrapper(_f, f)
>     return _f
>
> Or:
>
> def deco(f):
>     @functools.wraps(f)
>     def _f(*args,**kwargs):
>         ...
>     return _f
>
> Even in the second case, it would be awkward to inline with the return
> statement because of the need to invoke a decorator.
>


ic.

Ok, maybe something even more different: curried functions.

def plus2(f)(x):
   return f(x+2)

@plus2
def foo(x):
   ...

Or I don't know. Just a thought. And yes, you cannot use functools.wraps on
that either. And this is just plain ugly/utter crap:

def plus2(f)
@functools.wraps(f)
def (x):
	return f(x+2)


	-panzi


From ziade.tarek at gmail.com  Tue Dec 16 01:57:33 2008
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 16 Dec 2008 01:57:33 +0100
Subject: [Python-ideas] Python Isolated Environment (PIE)
Message-ID: <94bdd2610812151657gd38e864pcdb84fb3ce94ac07@mail.gmail.com>

Hello,

I would like to propose a new mechanism in Python, to deal with
package dependencies.

It is described here
http://tarekziade.wordpress.com/2008/12/15/python-isolated-environment-pie/

Like PJE mentioned in the blog comments, I still need to describe in
detail how the versions are handled.

But does it sound like a good idea?

Regards,
Tarek

-- 
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/


From grosser.meister.morti at gmx.net  Tue Dec 16 09:40:17 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Tue, 16 Dec 2008 09:40:17 +0100
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <4946AC31.1040604@gmx.net>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
	<4946AC31.1040604@gmx.net>
Message-ID: <49476971.3050206@gmx.net>

Mathias Panzenböck wrote:
> Scott Dial wrote:
>> Is it really that common? In this case, you are misrepresenting the
>> pattern. The appropriate version of this would require references to _f
>> to make its signature match that of f's, and therefore this entire
>> argument is specious. In reality, the number of times you can get away
>> with returning a truly anonymous function (that isn't a glorified
>> lambda) is rare, I think.
>>
>> def deco(f):
>>     def _f(*args,**kwargs):
>>         ...
>>     functools.update_wrapper(_f, f)
>>     return _f
>>
>> Or:
>>
>> def deco(f):
>>     @functools.wraps(f)
>>     def _f(*args,**kwargs):
>>         ...
>>     return _f
>>
>> Even in the second case, it would be awkward to inline with the return
>> statement because of the need to invoke a decorator.
>>
>
>
> ic.
>
> Ok, maybe something even more different: curried functions.
>
> def plus2(f)(x):
>    return f(x+2)
>
> @plus2
> def foo(x):
>    ...
>
> Or I don't know. Just a thought. And yes, you cannot use functools.wraps on
> that either. And this is just plain ugly/utter crap:
>
> def plus2(f)
> @functools.wraps(f)
> def (x):
> 	return f(x+2)
>
>
> 	-panzi

Thinking of it, we do not need any new syntax:

def curry(f):
    def _f(x):
        def _f2(*args,**kwargs):
            return f(x,*args,**kwargs)
        return _f2
    return _f

@curry
def foo(a,b,c):
    return a+b+c


Maybe we can somehow also use functools.update_wrapper here.
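
One possible way to do that, as a sketch only: call
functools.update_wrapper on both inner functions so that each level keeps
the name and docstring of f.  The docstring on foo is added for the
example:

```python
import functools


def curry(f):
    def _f(x):
        def _f2(*args, **kwargs):
            return f(x, *args, **kwargs)
        # Copy f's metadata onto the innermost function as well.
        functools.update_wrapper(_f2, f)
        return _f2
    functools.update_wrapper(_f, f)
    return _f


@curry
def foo(a, b, c):
    """Add three numbers."""
    return a + b + c


print(foo(1)(2, 3))  # 6
print(foo.__name__)  # foo
```

This only curries the first argument; the remaining ones still have to be
passed together in the second call.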

	-panzi


From leif.walsh at gmail.com  Tue Dec 16 10:13:27 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Tue, 16 Dec 2008 04:13:27 -0500
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <49476971.3050206@gmx.net>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
	<4946AC31.1040604@gmx.net> <49476971.3050206@gmx.net>
Message-ID: <cc7430500812160113x65443594t6400b54c74d131ae@mail.gmail.com>

On Tue, Dec 16, 2008 at 3:40 AM, Mathias Panzenböck
<grosser.meister.morti at gmx.net> wrote:
> Thinking of it, we do not need any new syntax:
>
> def curry(f):
>    def _f(x):
>        def _f2(*args,**kwargs):
>            return f(x,*args,**kwargs)
>        return _f2
>    return _f
>
> @curry
> def foo(a,b,c):
>    return a+b+c

This forces you to call foo(1)(2)(3) if you want an answer.  How about:

def curry(f):
  def _f(*c_args, **c_kwargs):
    def _f2(*args, **kwargs):
      return f(*c_args, *args, **c_kwargs, **kwargs)
    return _f2
  return _f

@curry
def foo(a, b, c):
  return a + b + c

foo(1, 2)(3)

I think this still prevents us from currying multiple times --- that
is, you curry once, and you get a non-curriable function.  I remember
something in the wiki about a decorator in the form of an object, that
accumulated arguments when __call__()ed, and I think that worked best
(and still didn't need new syntax).
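
A hedged sketch of that object-based idea.  This is not the wiki recipe,
which is not reproduced here; the class name Curry and the use of
inspect.signature to decide when enough arguments have arrived are
assumptions made for the example:

```python
import inspect


class Curry:
    """Accumulate arguments across calls; call f once they suffice."""

    def __init__(self, f, args=(), kwargs=None):
        self.f = f
        self.args = tuple(args)
        self.kwargs = dict(kwargs or {})

    def __call__(self, *args, **kwargs):
        new_args = self.args + args
        new_kwargs = {**self.kwargs, **kwargs}
        try:
            # bind() raises TypeError while required arguments are missing.
            # Caveat: it also raises for too many arguments, in which case
            # this sketch keeps returning Curry objects instead of failing.
            inspect.signature(self.f).bind(*new_args, **new_kwargs)
        except TypeError:
            return Curry(self.f, new_args, new_kwargs)
        return self.f(*new_args, **new_kwargs)


@Curry
def foo(a, b, c):
    return a + b + c


print(foo(1)(2)(3))  # 6
print(foo(1, 2)(3))  # 6
```

Each partial call returns a new Curry object rather than mutating the old
one, so an intermediate result such as foo(1) can be reused with different
later arguments.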

-- 
Cheers,
Leif

From leif.walsh at gmail.com  Tue Dec 16 10:14:10 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Tue, 16 Dec 2008 04:14:10 -0500
Subject: [Python-ideas] returning anonymous functions
In-Reply-To: <cc7430500812160113x65443594t6400b54c74d131ae@mail.gmail.com>
References: <4946A41D.6020100@gmx.net> <4946A764.5080905@scottdial.com>
	<4946AC31.1040604@gmx.net> <49476971.3050206@gmx.net>
	<cc7430500812160113x65443594t6400b54c74d131ae@mail.gmail.com>
Message-ID: <cc7430500812160114m4aa7b3aekf605357bb862fb38@mail.gmail.com>

On Tue, Dec 16, 2008 at 4:13 AM, Leif Walsh <leif.walsh at gmail.com> wrote:
> This forces you to call foo(1)(2)(3) if you want an answer.  How about:

Sorry, said that wrong.  It doesn't let you curry more than one variable.

-- 
Cheers,
Leif


From greg.ewing at canterbury.ac.nz  Wed Dec 17 23:55:35 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Dec 2008 11:55:35 +1300
Subject: [Python-ideas] Moving _tkinter._flatten to somewhere else
In-Reply-To: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>
References: <ac2200130812131058n50fcb6b3g20c5b93e1a90250a@mail.gmail.com>
Message-ID: <49498367.3090506@canterbury.ac.nz>

Guilherme Polo wrote:

> So, I would like to know what you think about moving
> _tkinter._flatten to some other module and renaming it to "flatten"?

In my experience, use cases for flattening lists are
very thin on the ground, and when one does arise, it
usually has some application-specific criterion for
when to stop recursing.

-- 
Greg