extracting numbers with decimal places from a string

Mark Lawrence breamoreboy at yahoo.co.uk
Sun Jan 11 19:04:49 EST 2015


On 11/01/2015 23:07, Thomas 'PointedEars' Lahn wrote:
> Store Makhzan wrote:
>
>> I have this script which can calculate the total of numbers given in a
>> string […]
>> total = 0
>> for c in '0123456789':
>>     total += int(c)
>> print total
>>
>> […]
>> How should I modify this script to find the total if the numbers given in
>> the string have decimal places? That is, how do I need to modify this
>> line: […]
>>
>> for c in '1.32, 5.32, 4.4, 3.78':
>>
>> […] to find the total of these given numbers.
>
> The original script already does not do what it advertises.  Instead, it
> iterates over the characters of the string, attempts to convert each to an
> integer and then computes the sum.  That is _not_ “calculate the total of
> numbers given in a string”.
>
> A solution has been presented, but it is not very pythonic because the
> original code was not; that should have been
>
> ### Ahh, Gauß ;-)
> print(sum(map(int, '0123456789')))
> ### --------------------------------------------------------------------
>
> Also, it cannot handle non-numeric strings well.  Consider this instead:
>
> ### --------------------------------------------------------------------
> from re import findall
>
> s = '1.32, 5.32, 4.4, 3.78'
> print(sum(map(float, findall(r'-?\d+\.\d+', s))))
> ### --------------------------------------------------------------------
>
> But if you are sure that except for the comma separator there are only
> numeric strings, it is more efficient to use re.split() instead of
> re.findall() here.
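
A minimal sketch of the re.split() route mentioned above, assuming the input
really is nothing but comma-separated numbers (anything else makes float()
raise ValueError).  The names parts, total and total_plain are only for
illustration:

```python
import re

s = '1.32, 5.32, 4.4, 3.78'

# Split on a comma plus any surrounding whitespace, then convert each piece.
parts = re.split(r'\s*,\s*', s)
total = sum(map(float, parts))

# float() ignores leading and trailing whitespace, so str.split(',') is
# enough here and avoids the regex altogether.
total_plain = sum(float(p) for p in s.split(','))

print(round(total, 2))
```

Both variants should give the same total; the plain str.split() one is
probably the simpler choice when the separator is known to be a comma.
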
>
>
> Aside:
>
> I thought I had more than a fair grasp of regular expressions, but I am
> puzzled by
>
> | $ python3
> | Python 3.4.2 (default, Dec 27 2014, 13:16:08)
> | [GCC 4.9.2] on linux
> | >>> from re import findall
> | >>> s = '1.32, 5.32, 4.4, 3.78'
> | >>> findall(r'-?\d+(\.\d+)?', s)
> | ['.32', '.32', '.4', '.78']
>
> Why does this more flexible pattern not work as I expected in Python 3.x,
> but virtually everywhere else?
>
> And why this?
>
> | >>> findall(r'-?\d+\.\d+', s)
> | ['1.32', '5.32', '4.4', '3.78']
> | >>> findall(r'-?\d+(\.\d+)', s)
> | ['.32', '.32', '.4', '.78']
>
> Feature?  Bug?
>

I can't tell you, as I avoid regexes like I avoid the plague.  Having
said that, I do know that there are loads of old bugs on the bug tracker,
many of which are fixed in the "new" regex module that's available here
https://pypi.python.org/pypi/regex/
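
That said, as far as I can tell from the re documentation, the output quoted
above looks like documented behaviour rather than a bug: when the pattern
contains exactly one capturing group, re.findall() returns what that group
matched instead of the whole match.  A rough sketch of two ways around it,
using a non-capturing group (?:...) or re.finditer(); the variable names are
only for illustration:

```python
import re

s = '1.32, 5.32, 4.4, 3.78'

# One capturing group: findall returns the group's text, not the whole match.
with_group = re.findall(r'-?\d+(\.\d+)?', s)

# Non-capturing group (?:...): findall returns the whole match again.
without_group = re.findall(r'-?\d+(?:\.\d+)?', s)

# finditer yields match objects; group(0) is always the whole match,
# whether or not the pattern contains capturing groups.
via_finditer = [m.group(0) for m in re.finditer(r'-?\d+(\.\d+)?', s)]

print(with_group)
print(without_group)
print(via_finditer)
```

The first list reproduces the ['.32', '.32', '.4', '.78'] result from the
quoted session; the other two give the full numbers.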

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



