The Running Time of += on Char Strings?
Edg Bamyasi
support.services.complaints at gmail.com
Thu Mar 24 04:28:42 EST 2005
This is a late cross-post from comp.lang.python. It seems the mystery
is deeper than I expected.
What is the running time of concatenation on character strings?
i.e.

>>> joe = "123"
>>> joe += "99999999999999999"

Is it amortized constant time? I don't think it would be O((number of
chars)^2), but I really don't know.
Teach me how to fish: where would I find out more about the internal
representations of data types in Python (and guaranteed run times; I'm
thinking of something like sgi.com's info on the STL)? I have looked
through the docs, but I don't see these types of specifications.
thanks * 100
- Haz
P.S.
- I should note that I am familiar with timeit, but understanding the
underlying data structures and representations is an important thing to
know.
P.P.S.
This is a bit of what I think is relevant discourse I have been having
via an email responder to my usenet posting.
> Haz> I should have mentioned that I am familiar with the timeit
> Haz> function, but surely there must be a specification in the language
> Haz> of the running time (number of flops).
>
> Nope. I can't think of an instance where it would be appropriate to specify
> runtime properties of various algorithms in the language. For example, if
> you were to specify that sorting of lists was O(n log n) that would
> potentially preclude the choice of quicksort as an algorithm because its
> worst case behavior is O(n * n) even though it is generally faster than most
> other sorting algorithms.
The answer here is to use Omega(n log n), or to specify average and worst
cases. I truly do think that there can be a complexity specification
for the language. I mean, all algorithms have a complexity, and surely
data structures are chosen with the size/speed tradeoffs in mind.
For instance, the STL has a sorting algorithm and it specifies a
running time in the latter way (the reason they can assure this is by
coding with concepts, but I think in the base language it could be
simpler because the data structures are known). i.e., it's known what
data types += works on, and thus it should be known what algorithms are
to be used (that is, the highly optimal ones).
From SGI STL Page:
sort
<snip>
Complexity
O(N log(N)) comparisons (both average and worst-case), where N is last
- first. [2]
<snip>
source : http://www.sgi.com/tech/stl/sort.html
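And in Python itself it is indeed known what data types += works on;
the behavior already differs per type in observable ways (a small
sketch of facts one can verify at the prompt):

```python
# On lists, += extends the same object in place (amortized O(1)
# per appended element), so other names bound to it see the change.
items = [1, 2]
alias = items
items += [3]
assert alias is items
assert alias == [1, 2, 3]

# On strings, += must build a new object, because strings are
# immutable; names bound to the old string are unaffected.
s = "12"
t = s
s += "3"
assert t == "12"
assert s == "123"
assert s is not t
```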
> Haz> Without knowing these specifications its hard to optimize.
>
> No, you still need to see where your program runs slow and figure out ways
> to make it run faster.
Well, basically my point is that it is hard to know why a code section
is running slow unless you understand the underlying data
representations and algorithms.
For instance, MATLAB code:

A = [];
for i = 1:N
    A = [A; 'a'];
end

is an O(N^2) operation.
C++ code:

vector<char> A;
for (int i = 0; i < N; i++) {
    A.push_back('a');
}

is amortized O(N) [note that this isn't exactly the correct wording:
each push_back is amortized O(1), so the loop is O(N) overall].
Python code:

A = ""
for i in range(N):
    A += 'a'

running time: ???
So if you are looking through your code in MATLAB or C++ and see this
concatenation loop, you know that it is a problem in the former and not
in the latter. But in Python???
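Whatever the answer for +=, the usual way to write this loop without
relying on it is to collect the pieces in a list and join once at the
end; that is O(N) in the total length regardless of how string += is
implemented (a sketch):

```python
# list.append is amortized O(1), and "".join makes a single pass
# over the collected pieces, so the whole build is O(N).
def build(n):
    parts = []
    for _ in range(n):
        parts.append('a')
    return "".join(parts)

print(build(5))  # -> aaaaa
```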