[Python-ideas] real numbers with SI scale factors

Sun Aug 28 23:29:45 EDT 2016

On Mon, Aug 29, 2016 at 12:33:16PM +1000, Chris Angelico wrote:
> On Mon, Aug 29, 2016 at 11:44 AM, Ken Kundert
> <python-ideas at shalmirane.com> wrote:
> > When working with a general purpose programming language, the above numbers
> > become:
> >
> >     780kpc -> 7.8e+05
> >     108MPa -> 1.08e+08
> >     600TW  -> 6e+14
> >     3.2Gb  -> 3.2e+09
> >     53pm   -> 5.3e-11
> >     $8G    -> 8e+09
> >
> > Notice that the numbers become longer, harder to read, harder to type, harder to
> > say, and harder to hear.
> >
> 
> And easier to compare. The SI prefixes are almost consistent in using
> uppercase for larger units and lowercase for smaller, but not quite;
> and there's no particular pattern in which letter is larger. For
> someone who isn't extremely familiar with them, that makes them
> completely unordered - which is larger, peta or exa? Which is smaller,
> nano or pico? Plus, there's a limit to how far you can go with these
> kinds of numbers, currently yotta at e+24. Exponential notation scales
> to infinity (to 1e308 in IEEE 64-bit binary floats, but plenty further
> in decimal.Decimal - I believe its limit is about 1e+(1e6), and REXX
> on OS/2 had a limit of 1e+(1e10) for its arithmetic), remaining
> equally readable at all scales.
> 
> So we can't get rid of exponential notation, no matter what happens.
> Mathematics cannot usefully handle a system in which we have to
> represent large exponents with ridiculous compound scale factors:
> 
> sys.float_info.max = 179.76931348623157*Y*Y*Y*Y*Y*Y*Y*Y*Y*Y*Y*Y*E
> 
> (It's even worse if the Exa collision means you stop at Peta.
> 179.76931348623157*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*M, anyone?)
> 
> Which means that these tags are a duplicate way of representing a
> specific set of exponents.
> 

Yes, of course.  No one is suggesting abandoning exponential notation. I am not 
suggesting we force people to use SI scale factors, only that we allow them to.  
What I am suggesting is that we stop saying to them things like 'you must use
exponential notation because we have decided that it's better. See, you can
easily compare the size of numbers by looking at the exponents.'

What is wrong with having two ways of doing things? We have many ways of
specifying the value of the integer 16: 0b10000, 0o20, 16, 0x10, 16L, ....
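
To make the comparison concrete, here is a small sketch of the integer forms
listed above. It assumes Python 3, where the trailing-L form (16L) is no longer
accepted, so that one is left out. It also shows that the alternate bases are
supported on input and output through the built-in int() and format():

```python
# Four literal spellings of the same integer (Python 3 syntax).
values = [0b10000, 0o20, 16, 0x10]

# Reading the same spellings from text: base 0 tells int() to honor the prefix.
parsed = [int(text, 0) for text in ("0b10000", "0o20", "16", "0x10")]

# Writing them back out: the '#' flag keeps the base prefix.
rendered = [format(16, "#b"), format(16, "#o"), format(16, "#x")]

print(values, parsed, rendered)
```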

> > Before we expend any more effort on this topic, let's put aside the question
> > of how it should be done, or how it will be used after it's done, and just
> > focus on whether we do it at all. Should Python support real numbers
> > specified with SI scale factors as first class citizens?
> 
> Except that those are exactly the important questions to be answered.
> How *could* it be done? With the units stripped off, your examples
> become:
> 
>     780k == 7.8e+05 == 780*k
>     108M == 1.08e+08 == 108*M
>     600T == 6e+14 == 600*T
>     3.2G == 3.2e+09 == 3.2*G
>     53p == 5.3e-11 == 53*p
>     8G == 8e+09 == 8*G
> 
> Without any support whatsoever, you can already use the third column
> notation, simply by creating this module:
> 
> # si.py
> k, M, G, T, P, E, Z, Y = 1e3, 1e6, 1e9, 1e12, 1e15, 1e18, 1e21, 1e24
> m, μ, n, p, f, a, z, y = 1e-3, 1e-6, 1e-9, 1e-12, 1e-15, 1e-18, 1e-21, 1e-24
> u = μ
> K = k
> 
> And using it as "from si import *" at the top of your code. Do we see
> a lot of code in the wild doing this? "[H]ow it will be used after
> it's done" is exactly the question that this would answer.

Because by focusing on the implementation details, we miss the big picture.  We 
have already done that, and we ended up going down countless ratholes.
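
For anyone who wants to try the module-based notation quoted above, here is
a minimal standalone sketch. The scale factors are copied from the quoted si.py
and defined inline rather than imported, and the ASCII name u stands in for μ.
Because each value is produced by a floating-point multiplication rather than
read as a literal, the sketch compares with math.isclose instead of assuming
the results are bit-for-bit equal to the exponential literals:

```python
import math

# Scale factors as in the quoted si.py, inlined so this runs on its own.
k, M, G, T, P, E, Z, Y = 1e3, 1e6, 1e9, 1e12, 1e15, 1e18, 1e21, 1e24
m, u, n, p, f, a, z, y = 1e-3, 1e-6, 1e-9, 1e-12, 1e-15, 1e-18, 1e-21, 1e-24

# (multiplied form, exponential literal) for the six examples in the thread.
pairs = [
    (780 * k, 7.8e+05),
    (108 * M, 1.08e+08),
    (600 * T, 6e+14),
    (3.2 * G, 3.2e+09),
    (53 * p, 5.3e-11),
    (8 * G, 8e+09),
]

# True when every multiplied form agrees with its literal to within rounding.
all_close = all(math.isclose(scaled, literal) for scaled, literal in pairs)
print(all_close)
```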

> 
> > Don't Python's users in the scientific and engineering communities deserve
> > the same treatment?  These are, after all, core communities for Python.
> 
> Yes. That's why we have things like the @ matrix multiplication
> operator (because the numeric computational community asked for it),
> and %-formatting for bytes strings (because the networking, mainly
> HTTP serving, community asked for it). Python *does* have a history of
> supporting things that are needed by specific sub-communities of
> Python coders. But there first needs to be a demonstrable need. How
> much are people currently struggling due to the need to transform
> "gigapascal" into "e+9"? Can you show convoluted real-world code that
> would be made dramatically cleaner by language support?

Can you show code that would have been convoluted if Python had used a library 
rather than built-in support for hexadecimal numbers?

So, in summary, you are suggesting that we tell the scientific and engineering 
communities that we refuse to provide native support for their preferred way of 
writing numbers because:
1. our way is better,
2. their way is bad because some uneducated person might see the numbers and not 
   understand them,
3. we already have a way of representing numbers that we came up with in the '60s
   and we simply cannot do another,
4. well, we could do it, but we have decided that if you would only adapt to this
   new way of doing things that we just came up with, then we would not have to
   do any work, and that is better for us. Oh, and this new way of writing
   numbers, it only works in the program itself. You're out of luck when it comes
   to IO.

These do not seem like good reasons for not doing this.
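
To illustrate the IO point in item 4, here is a rough sketch of what reading and
writing SI-scaled numbers looks like when it has to be done by hand. The names
parse_si and format_si are hypothetical helpers invented for this example; they
are not part of Python or of any concrete proposal in this thread. The sketch
assumes unitless input such as "780k", uses u for micro, and falls back to
exponential notation outside the yocto-to-yotta range:

```python
import math
import re

# Hypothetical helpers for illustration only; not an existing Python API.
EXPONENTS = {
    "Y": 24, "Z": 21, "E": 18, "P": 15, "T": 12, "G": 9, "M": 6, "k": 3,
    "": 0,
    "m": -3, "u": -6, "n": -9, "p": -12, "f": -15, "a": -18, "z": -21, "y": -24,
}
PREFIXES = {exponent: prefix for prefix, exponent in EXPONENTS.items()}
_SI_NUMBER = re.compile(r"([-+]?(?:\d+\.?\d*|\.\d+))([YZEPTGMkmunpfazy]?)")


def parse_si(text):
    """Convert text such as '780k' or '53p' to a float."""
    match = _SI_NUMBER.fullmatch(text.strip())
    if match is None:
        raise ValueError("not an SI-scaled number: {!r}".format(text))
    mantissa, prefix = match.groups()
    # Let float() do the scaling so the result matches the equivalent literal.
    return float("{}e{}".format(mantissa, EXPONENTS[prefix]))


def format_si(value, digits=3):
    """Render a number with an SI scale factor, rounded to `digits` digits."""
    if not math.isfinite(value):
        return str(value)
    # Round first, then split, so 999999 becomes 1.00e+06 rather than 1000k.
    mantissa, exponent = "{:.{}e}".format(value, digits - 1).split("e")
    exponent = int(exponent)
    shift = exponent % 3  # digits to move into the mantissa (0, 1, or 2)
    prefix = PREFIXES.get(exponent - shift)
    if prefix is None:
        # Outside the range of defined prefixes: use exponential notation.
        return "{:.{}g}".format(value, digits)
    scaled = float("{}e{}".format(mantissa, shift))
    # At least 3 significant digits so values like 800 are not shown as 8e+02.
    return "{:.{}g}{}".format(scaled, max(digits, 3), prefix)


print(parse_si("780k"), parse_si("53p"))
print(format_si(6e14), format_si(5.3e-11))  # 600T 53p
```

Every program that wants to accept or print such numbers has to carry
something like this itself, or depend on a third-party package that does.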

-Ken