Human readable number formatting

Mike Meyer mwm at mired.org
Tue Sep 27 20:34:45 EDT 2005


Alex Willmer <alex at moreati.org.uk> writes:

> When reporting file sizes to the user, it's nice to print '16.1 MB',
> rather than '16123270 B'. This is the behaviour the command 'df -h'
> implements. There's no python function that I could find to perform this
> formatting, so I've taken a stab at it:
>
> import math
> def human_readable(n, suffix='B', places=2):
>     '''Return a human friendly approximation of n, using SI prefixes'''
>     prefixes = ['','k','M','G','T']
>     base, step, limit = 10, 3, 100
>     
>     if n == 0:
>         magnitude = 0 #cannot take log(0)
>     else:
>         magnitude = math.log(n, base)
>     
>     order = int(round(magnitude)) // step
>     return '%.1f %s%s' % (float(n)/base**(order*step), \
>                           prefixes[order], suffix)
>
> Example usage
>>>> print [human_readable(x) for x in [0, 1, 23.5, 100, 1000/3, 500,
> 1000000, 12.345e9]]
> ['0.0 B', '1.0 B', '23.5 B', '100.0 B', '0.3 kB', '0.5 kB', '1.0 MB',
> '12.3 GB']
>
> I'd hoped to generalise this to base 2 (eg human_readable(1024, base=2)
> == '1 KiB') and to enforce at most 3 digits (ie human_readable(100) ==
> '0.1 KB' instead of '100 B'). However I can't get the right results
> adapting the above code.
>
> Here's where I'd like to ask for your help.
> Am I chasing the right target, in basing my function on log()?

I wouldn't have done it that way, but that's not worth very much. Can
you use the log() variation to change from proper scientific units
to the CS powers-of-two variation?
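
For what it's worth, here is a rough sketch of how a log()-based version
might handle both unit systems. The name human_readable_log and the
binary flag are made up for illustration and are not part of the code
quoted above. It assumes the prefix should change only once a full step
of 1000 (or 1024) is reached, so it takes the floor of the logarithm
rather than rounding it, then corrects for floating-point error near
exact powers.

```python
import math

def human_readable_log(n, suffix='B', binary=False):
    """Illustrative log-based formatter (hypothetical name and flag)."""
    if binary:
        step, prefixes = 1024.0, ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei']
    else:
        step, prefixes = 1000.0, ['', 'k', 'M', 'G', 'T', 'P', 'E']

    size = abs(float(n))
    if size < 1:
        order = 0  # log() is undefined at 0; values below 1 stay unprefixed
    else:
        # floor, not round: move to the next prefix only after a full step
        order = int(math.floor(math.log(size, step)))
        # log() can be off by a hair near exact powers, so re-check the result
        if size / step ** order >= step:
            order += 1
        elif size / step ** order < 1:
            order -= 1
        # never index past the largest known prefix
        order = min(order, len(prefixes) - 1)

    return '%.1f %s%s' % (n / step ** order, prefixes[order], suffix)

print(human_readable_log(16123270))           # decimal (SI) prefixes
print(human_readable_log(1024, binary=True))  # binary (IEC) prefixes
```

With floor() in place of round(), 500 comes out as '500.0 B' rather than
'0.5 kB', which may or may not be what you want.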

If not, I would do it this way:

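def human_readable(n, suffix='B', places=2):
    '''Scale n down by 1000s until it is no larger than 10**places.'''
    prefixes = ['', 'K', 'M', 'G', 'T', 'P', 'E']

    top = 10 ** places   # largest value shown before moving up a prefix
    index = 0
    n = float(n)
    # Each prefix is a factor of 1000, so divide by 1000 per step
    # (dividing by 10 would advance a prefix for every decimal digit).
    while abs(n) > top and index < len(prefixes) - 1:
        n /= 1000
        index += 1
    return '%.1f %s%s' % (n, prefixes[index], suffix)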
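
The same loop idea stretches to powers of two if the divisor and the
prefix list become parameters. This is only a sketch under that
assumption: human_readable_loop, step, prefixes and the two prefix
tuples are names invented here, and the loop switches prefix as soon as
a whole step is reached rather than at 10**places.

```python
SI_PREFIXES = ('', 'k', 'M', 'G', 'T', 'P', 'E')         # powers of 1000
IEC_PREFIXES = ('', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei')  # powers of 1024

def human_readable_loop(n, suffix='B', step=1000, prefixes=SI_PREFIXES):
    """Illustrative loop-based formatter (hypothetical name and parameters)."""
    n = float(n)
    index = 0
    # Divide by one full prefix step at a time, stopping at the last prefix.
    while abs(n) >= step and index < len(prefixes) - 1:
        n /= step
        index += 1
    return '%.1f %s%s' % (n, prefixes[index], suffix)

print(human_readable_loop(16123270))
print(human_readable_loop(1024, step=1024, prefixes=IEC_PREFIXES))
```

No logarithms are involved, so there is no special case for zero and no
rounding trouble at exact powers.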

> Does this function already exist in some python module?

humanize_number is a cross-platform C library function, about 150
lines of code. It uses the loop I gave above. It might be worthwhile
to swipe the code (it's BSD-licensed), wrap it, and submit a PR to add
it to the standard library - just so you get properly tested code.

     <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


