howto remove the thousand separator
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sun Apr 14 21:14:25 EDT 2013
On Sun, 14 Apr 2013 17:44:28 -0700, Mark Janssen wrote:
> On Sun, Apr 14, 2013 at 5:29 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Sun, 14 Apr 2013 12:06:12 -0700, Mark Janssen wrote:
>>
>>> cleaned=''
>>> for c in myStringNumber:
>>> if c != ',':
>>> cleaned+=c
>>> int(cleaned)
>>
>> ....due to being an O(N**2) algorithm.
>
> What on earth makes you think that is an O(n**2) algorithm and not O(n)?
Strings are immutable. Consider building up a single string from four
substrings:
s = ''
s += 'fe'
s += 'fi'
s += 'fo'
s += 'fum'
Python *might* optimize the first concatenation, '' + 'fe', to just reuse
'fe', (but it might not). Let's assume it does, so that no copying is
needed. Then it gets to the second concatenation, and now it has to copy
characters, because strings are immutable and cannot be modified in
place. Showing the *running* total of characters copied:
'fe' + 'fi' => 'fefi' # four characters copied
'fefi' + 'fo' => 'fefifo' # 4 + 6 = ten characters copied
'fefifo' + 'fum' => 'fefifofum' # 10 + 9 = nineteen characters copied
Notice how each intermediate substring gets copied repeatedly? In order
to build up a string of length 9, we've had to copy at least 19
characters. With only four substrings, it's not terribly obvious how
badly this performs. So let's add some more substrings, and see how the
running total increases:
'fefifofum' + 'foo' => 'fefifofumfoo' # 19 + 12 = 31
'fefifofumfoo' + 'bar' => 'fefifofumfoobar' # 31 + 15 = 46
'fefifofumfoobar' + 'baz' => 'fefifofumfoobarbaz' # 46 + 18 = 64
'fefifofumfoobarbaz' + 'spam' => 'fefifofumfoobarbazspam' # 64 + 22 = 86
To build up a string of length 22, we've had to copy, and re-copy, and re-
re-copy, 86 characters in total. And the string gets bigger, the
inefficiency gets worse. Each substring (except the very last one) gets
copied multiple times; the number of times it gets copied is proportional
to the number of substrings.
If the substrings are individual characters, then each character is
copied a number of times proportional to the number of characters N;
since there are N characters, each being copied (proportional to) N
times, that makes N*N or N**2.
--
Steven
More information about the Python-list
mailing list