Why not FP for Money?

Andrew Dalke adalke at mindspring.com
Sat Sep 25 14:30:36 EDT 2004


Carlos Ribeiro wrote:
> Why not? It may be a good idea for Py3k -- instead of raw strings, why
> not regular expressions strings?

When Python was first released it used the regex module
for regular expressions.  These were not thread safe at
all (the match result was attached to the pattern object)
and used an old style of pattern definition while perl's
regexp syntax basically dominated common usage.  (I
still find it hard to believe Perl6 is going to use a
new syntax.)

Python adapted by having a new module, 're', to handle
the new syntax.  It was based on the pcre C library.
Old code worked with regex, new with re.  Then /F
worked on a replacement for pcre called 'sre', in
part for Unicode support.

I've also seen Python bindings for a POSIX compliant
regular expression library.

If regular expression strings were cooked into
the Python language then we wouldn't have this
ability to change to meet the changing definitions
and requirements for regexps while also supporting
backwards compatibility.

> What is the 'value' of a RE literal? If it's taken to be the RE
> itself (in its compiled form), I can't see much practical use for
> expressions involving literals - even concatenation (ex: re1 + re2)
> has nasty side effects in the general case; it's better to concat the
> original RE strings and recompile the RE anyway.

Interestingly enough I happen to have a regexp engine
I wrote which allows addition on evaluated patterns.
It works by merging the parse trees.

 >>> from Martel import Re
 >>> word = Re("(\w+)")
 >>> spaces = Re("\s+")
 >>> pat = word + spaces + word + spaces + word
 >>> print pat
([\dA-Z_a-z]+)[\t-\r ]+([\dA-Z_a-z]+)[\t-\r ]+([\dA-Z_a-z]+)
 >>> word
<Martel.Expression.Group instance at 0x5a9dc8>

It's very useful for building large patterns because

(?P<AC_block>(?P<AC>AC 
(?P<bioformat:dbid?type=accession&dbname=sp>(?P<ac_number>[\dA-Z_a-z]+))\;(
(?P<bioformat:dbid?type=accession>(?P<ac_number>[\dA-Z_a-z]+))\;)*(\n|\r\n?))+)

is harder to write by hand than

AC = Group("AC",
            Str("AC   ") +
            Std.dbid(Martel.Word("ac_number"),
                    {"type": "accession",
                     "dbname": "sp"}) + \
            Str(";") +
            Rep(Martel.Str(" ") +
                Std.dbid(Martel.Word("ac_number"),
                         {"type": "accession"}) +
                Str(";")) +
            Martel.AnyEol())

AC_block = Group("AC_block", Martel.Rep1(AC))
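
For readers without Martel at hand, the idea of adding
evaluated patterns can be sketched in a few lines of Python.
This is a minimal illustration only, not Martel's code or API:
the classes Expr, Raw, Lit, Seq and Named are hypothetical
names made up for this sketch, and all it does is render a
small tree back into a pattern string for the standard re
module.

```python
import re


class Expr:
    """Node in a tiny pattern tree (hypothetical; not Martel's classes)."""

    def __add__(self, other):
        # "Adding" two patterns builds a sequence node from both trees
        # instead of gluing pattern strings together.
        return Seq(_items(self) + _items(other))

    def render(self):
        raise NotImplementedError

    def compile(self, flags=0):
        return re.compile(self.render(), flags)


def _items(node):
    # Flatten nested sequences so a + b + c stays one flat Seq.
    return list(node.items) if isinstance(node, Seq) else [node]


class Raw(Expr):
    """An existing regular expression fragment, checked when created."""

    def __init__(self, pattern):
        re.compile(pattern)  # fail here, not when the big pattern is built
        self.pattern = pattern

    def render(self):
        # The non-capturing group keeps alternations like "a|b" contained.
        return "(?:%s)" % self.pattern


class Lit(Expr):
    """Literal text, escaped so it cannot change the pattern's meaning."""

    def __init__(self, text):
        self.text = text

    def render(self):
        return re.escape(self.text)


class Seq(Expr):
    """A run of sub-patterns matched one after another."""

    def __init__(self, items):
        self.items = list(items)

    def render(self):
        return "".join(item.render() for item in self.items)


class Named(Expr):
    """Wrap a sub-tree in a named group."""

    def __init__(self, name, expr):
        self.name = name
        self.expr = expr

    def render(self):
        return "(?P<%s>%s)" % (self.name, self.expr.render())


word = Raw(r"\w+")
spaces = Raw(r"\s+")
line = Lit("AC") + spaces + Named("ac_number", word) + Lit(";")

match = line.compile().match("AC   P12345;")
print(line.render())
print(match.group("ac_number"))
```

Because each fragment is checked when it is created, a
mistake shows up next to the piece that caused it rather
than somewhere inside one long pattern string.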


(Even with re.VERBOSE, re.compile doesn't tell you the
character position of the syntax error as well as
Python does, nor does my editor help me out.)
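
As a point of comparison, more recent Python 3 releases do
attach position information to the exception that re.compile
raises. The sketch below assumes Python 3.5 or later, where
the error object carries pos, lineno and colno attributes;
the helper name locate_error is made up for this example and
is not part of the re module.

```python
import re


def locate_error(pattern, flags=0):
    """Return (line, column, message) for a bad pattern, else None.

    locate_error is a hypothetical helper, not part of the re module.
    """
    try:
        re.compile(pattern, flags)
    except re.error as exc:
        # lineno and colno are available on the exception in Python 3.5+.
        return exc.lineno, exc.colno, exc.msg
    return None


# A pattern laid out over three lines; the last group is never closed.
bad = "(\\w+)\n\\s+\n(\\d+"
print(locate_error(bad, re.VERBOSE))
print(locate_error("(\\w+)", re.VERBOSE))
```

This still only reports where the regexp parser gave up,
which is not always where the mistake was made.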

> Because (in my opinion) a decimal point literal is in the same general
> class as the objects you've mentioned.

So are rationals.  And more esoterically, so are
polynomials.

I'm also biased the other way in that I don't think
I'll be using decimal all that often.  The only case
that comes to mind is one data field a few years
back that stored X-ray resolution and where "2.0"
and "2.00" were different because the latter meant
there was higher precision.

I'm not sure though if keeping that as a Decimal
literal would work right.
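
How the decimal module treats that case can be checked
directly. A small sketch, assuming the standard decimal
module and its string constructor (there is no Decimal
literal, so strings stand in for one here): trailing zeros
are kept in the stored value and when printing, but the two
values still compare equal.

```python
from decimal import Decimal

coarse = Decimal("2.0")
fine = Decimal("2.00")

# The written precision is kept in the exponent of each value.
coarse_exponent = coarse.as_tuple().exponent
fine_exponent = fine.as_tuple().exponent

print(coarse, fine)                    # trailing zeros survive printing
print(coarse == fine)                  # comparison is purely numeric
print(coarse_exponent, fine_exponent)  # the only place they differ
print(coarse + fine)                   # result keeps the finer exponent
```

So the distinction survives storage and display, but code
that needs "2.0" and "2.00" to be different would have to
compare the exponents as well as the values.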

> Because the call-style syntax requires the use of a string as an
> argument to avoid practical and philosophical problems involving the
> conversion of binary floats to decimal floats, and the string looks
> out of place in numeric expressions.

That argument makes much more sense to me than saying
that using 'd' is a compromise to having no way to write
it as a literal.
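
The conversion problem in the quoted argument is easy to
reproduce. This sketch assumes a current Python 3, where
Decimal accepts a float and converts its exact binary value;
the decimal module as first shipped refused floats and asked
for a string instead.

```python
from decimal import Decimal

from_string = Decimal("1.1")   # exactly one and one tenth
from_float = Decimal(1.1)      # exact value of the nearest binary double

print(from_string)
print(from_float)
print(from_string == from_float)

# The same effect shows up in arithmetic.
print(0.1 + 0.2 == 0.3)
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```

The string argument is what keeps the binary approximation
out of the decimal value, which is the role a dedicated
literal syntax would take over.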

> Because, in the prompt, having an easy way to write numeric literals of
> any kind is important to allow seamless interactive use -- be it as
> powerful calculator, or to prototype small functions, or to test
> concepts.

That to me just isn't a good argument.  I use
regular expressions a lot.  Having a regexp literal
would also simplify interactive use.  I wouldn't have
to 'import re' and use 're.compile'.

Unless you are arguing for a large increase in the number
of syntactic literals in Python, my problem with this
argument is that it doesn't recognize the tension I
mentioned previously.  It doesn't say when it shouldn't
be used.

> It's strange, because complex numbers are much more of a special case
> than decimals. Decimals have a single value, complex numbers are a
> tuple; decimals are orderable; but complex made it to the language. It
> was another time, and I don't know how many people rely on Python
> complex numbers, but I bet that decimals have the potential to become
> more important in the long run.

Complex is rarely used, I agree.  I just read through
the original discussion on adding complex to Python.
The thread was titled "Should I add complex numbers to
the Python code?" starting Dec. 10, 1995.

It reminded me that there is no complex literal in Python,
only an imaginary literal, which creates a complex instance
with 0.0 as the real component.
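
This can be confirmed from Python itself. A short sketch,
assuming Python 3 and the standard ast module: "3j" alone is
a literal, while "2 + 3j" parses as an addition of an integer
literal and an imaginary literal.

```python
import ast

z = 3j  # imaginary literal: a complex with real part 0.0
print(type(z).__name__, z.real, z.imag)

# "2 + 3j" is not one literal; the parser sees a binary operation.
tree = ast.parse("2 + 3j", mode="eval").body
print(type(tree).__name__, type(tree.op).__name__)

# The call-style spelling builds the same value without special syntax.
print(complex(2.0, 3.0) == 2.0 + 3.0j)
```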

The basic arguments from Guido, summarized by /F:

   - It adds 10-20k to the interpreter.
   + It's useful in many existing applications.
   + Many other languages have similar constructs.
   + It does not conflict with any existing code.

At that time Python's audience was much more limited and
nearly all of the responses were along the lines of
(quoting James C. Ahlstrom)

    "However, many of us are engineers or retread
     physicists, so we _like_ complex numbers."

Here's a statement from Chris Hoffmann using an argument
very much along your lines:
 > To me, complex numbers are not some special case object.
 > They are a fundamental numeric type, just like integers
 > and reals. Most languages don't have them builtin (or
 > even a convenient way to add them as an extension), but
 > some do. In those that do have them builtin, I'd argue
 > that a fair percentage of the users of those languages
 > take advantage of them. Offhand, I don't know of any
 > general purpose language that has date-time types or
 > coordinate types, as useful as they might be.


Python's audience has changed a bit since then.  There
were some non-math people then.  Their comments went
something like

Greg Stein:
 > something like complex number handling seems to me
 > to be much more of an optional-type feature rather
 > than a core  feature.

Mark Lutz:
 > They're obviously  useful to others.
 >
 > What's 20K among friends?,

Others argued against it because:

Barry Merriman:
 > Because the base object of these domains are not part
 > of the python core---i.e. python does not have core
 > objects for polynomials, fourier transforms, most
 > special functions---it does not seem reasonable that
 > the numerical basis that exists essentially only for
 > these specialized domains, complex numbers, must be
 > integrated into the core.

There were one or two people who wanted an alternate
notation, like

David Redish:
 > Define the constant i = sqrt(-1 + 0i) in the module complex.
   ..
 >  import complex
 >  i = complex.i
 >
 >  x = 2 + 4*i
 >
 > Both of these seem to be much more in the spirit of python,
 > than changing the core language.

That was shot down because 'i' as a standalone variable
is used too often as an index.

Michael McLay wanted to
 > ... settle for second class with the option for an
 > upgrade to first class on some future flight:-)

by using "complex(2.0,3.0)"


As you have seen, I'm with Michael McLay in my viewpoint.
Why wasn't it done?  Guido posted that he had already
decided that, if complex was added to the Python code,
it would use the "x+yj"-style syntax (possibly with an
"i" instead of "j").

An interesting point, made by Paul Dubois, is that
 > Expressions like those shown never occur in real
 > life. In fact, complex literals are RARE. Usually
 > complex numbers occur as the output of a function,
 > and usually in arrays (often, BIG arrays).  So in
 > fact this issue is not very important.

Is the same true for decimal?

BTW, at this time I was a reader of c.l.py but not
an active user.  I didn't post to the thread.  I
wanted complex because, well, I am a "retread
physicist."  But I've never actually used complex
for any of my Python work and suspect that I
wouldn't have a problem using (say)
   cmath.complex(1.12, 2.9)
instead if I did.

				Andrew
				dalke at dalkescientific.com


