extracting numbers with decimal places from a string

Thomas 'PointedEars' Lahn PointedEars at web.de
Sun Jan 11 18:07:57 EST 2015


Store Makhzan wrote:

> I have this script which can calculate the total of numbers given in a
> string […]
> total = 0
> for c in '0123456789':
>    total += int(c)
> print total
> 
> […]
> How should I modify this script to find the total of if the numbers given
> in the string form have decimal places? That is, how do I need to modify
> this line: […]
> 
> for c in '1.32, 5.32, 4.4, 3.78':
> 
> […] to find the total of these given numbers.

The original script does not do what it advertises to begin with.  It 
iterates over the characters of the string, converts each one to an 
integer and sums the results.  That is _not_ “calculating the total of 
numbers given in a string”; it only looks that way because every character 
in '0123456789' happens to be a single digit.
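
To make the difference concrete, here is a minimal sketch (Python 3; the 
names chars, total and error are chosen only for illustration) of what 
happens when the same loop is pointed at a string with decimal places:

```python
s = '1.32, 5.32, 4.4, 3.78'

# Iterating over a string yields single characters, not numbers.
chars = list(s)[:5]
print(chars)

total = 0
error = None
try:
    for c in s:
        total += int(c)   # int('.') raises ValueError
except ValueError as e:
    error = e

print(total, error)
```

The loop gets no further than the first '.', because int() rejects it.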

A solution has been presented, but it is not very pythonic because the 
original code was not.  The original should have been written as

### Ahh, Gauß ;-)
print(sum(map(int, '0123456789')))
### --------------------------------------------------------------------

Also, that solution cannot handle non-numeric strings well.  Consider this 
instead:

### --------------------------------------------------------------------
from re import findall

s = '1.32, 5.32, 4.4, 3.78'
print(sum(map(float, findall(r'-?\d+\.\d+', s))))
### --------------------------------------------------------------------

But if you are sure that, apart from the comma separators, the string 
contains only numeric substrings, it is more efficient to use re.split() 
instead of re.findall() here.
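
A minimal sketch of the re.split() variant, assuming the separator is 
always a comma followed by optional whitespace (the names parts and total 
are only illustrative):

```python
from re import split

s = '1.32, 5.32, 4.4, 3.78'

# Split on a comma and any whitespace that follows it.
parts = split(r',\s*', s)

# Every part must be a valid float literal, otherwise float() raises
# ValueError; this approach does no filtering of its own.
total = sum(map(float, parts))
print(total)
```

Because float() ignores leading and trailing whitespace, s.split(',') 
would work for this input as well.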


Aside:

I thought I had more than a fair grasp of regular expressions, but I am 
puzzled by

| $ python3
| Python 3.4.2 (default, Dec 27 2014, 13:16:08) 
| [GCC 4.9.2] on linux
| >>> from re import findall
| >>> s = '1.32, 5.32, 4.4, 3.78'
| >>> findall(r'-?\d+(\.\d+)?', s)
| ['.32', '.32', '.4', '.78']

Why does this more flexible pattern not work as I expected in Python 3.x, 
when it does virtually everywhere else?

And why this?

| >>> findall(r'-?\d+\.\d+', s)
| ['1.32', '5.32', '4.4', '3.78']
| >>> findall(r'-?\d+(\.\d+)', s)
| ['.32', '.32', '.4', '.78']

Feature?  Bug?
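
For reference, the documentation of re.findall() says that if the pattern 
contains a group, the list holds the text of that group rather than the 
whole match.  Assuming that is what happens here, a non-capturing group 
(?:...) should return the complete numbers.  A minimal sketch (the names 
capturing and non_capturing are only illustrative):

```python
from re import findall

s = '1.32, 5.32, 4.4, 3.78'

# One capturing group: findall() returns the group contents only.
capturing = findall(r'-?\d+(\.\d+)?', s)

# Non-capturing group: findall() returns the whole match.
non_capturing = findall(r'-?\d+(?:\.\d+)?', s)

print(capturing)
print(non_capturing)
```

Unlike r'-?\d+\.\d+', the non-capturing pattern also matches integers 
without a decimal part.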

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


