Python and STL efficiency
Maric Michaud
maric at aristote.info
Tue Aug 22 16:49:34 EDT 2006
Le mardi 22 août 2006 12:55, Mc Osten a écrit :
> In fact Python here is faster. Suppose it has a really optimized set
> class...
Maybe I'm missing something but the posted c++codes are not equivalent IMO to
what python is doing. I discarded the "slow" version, and tried to get the
equivalent in c++ of :
"""
#!/usr/bin/env python
size = 1000000
def f():
a = []
for i in range(size):
a.append('What do you know')
a.append('so long...')
a.append('chicken crosses road')
a.append('fool')
b = set(a)
for s in b:
print s
import time
from time import clock
f_start = clock()
f()
f_end = clock()
print "Elapsed: %f seconds" % (f_end - f_start)
"""
I came at first with the following, which is still slower than the python
version :
"""
void print_occurence_of_unique_strings(){
vector<string> a;
const string& s1 = "What do you know?" ;
const string& s2 = "so long..." ;
const string& s3 = "chicken crosses road";
const string& s4 = "fool" ;
for (long int i=0; i<SIZE ; ++i){
a.push_back(s1);
a.push_back(s2);
a.push_back(s3);
a.push_back(s4);
}
set<string> b(a.begin(), a.end());
copy(b.begin(), b.end(),
ostream_iterator<string>(cout, "\n"));
}
"""
Here, all strings, while passed by reference to the vector, are copied one by
one.
Then, I tried this, it just overcome the performance of python code, but not
in the proportion I expected :
"""
void print_occurence_of_unique_strings_compare_by_adresses(){
vector<string*> a;
string s1 = "What do you know?";
string s2 = "so long...";
string s3 = "chicken crosses road";
string s4 = "fool";
for (long int i=0; i<SIZE ; ++i){
a.push_back(&s1);
a.push_back(&s2);
a.push_back(&s3);
a.push_back(&s4);
}
set<string> b;
for (vector<string*>::iterator it=a.begin(); it!=a.end(); it++)
b.insert(**it);
copy(b.begin(), b.end(), ostream_iterator<string>(cout, "\n"));
}
"""
The problem here, is that the strings in the set are compared by value, which
is not optimal, and I guess python compare them by adress ("s*n is s*n" has
the same complexity than "s*n == s*n" in CPython, right ?).
so, finally, the code in c++, about ten times faster than the equivalent in
python, must be :
"""
void print_occurence_of_unique_strings_compared_by_address(){
cout << "print_occurence_of_unique_strings_compared_by_address" << endl;
vector<string*> a;
string s1 = "What do you know?";
string s2 = "so long...";
string s3 = "chicken crosses road";
string s4 = "fool";
for (long int i=0; i<SIZE ; ++i){
a.push_back(&s1);
a.push_back(&s2);
a.push_back(&s3);
a.push_back(&s4);
}
set<string*> b(a.begin(), a.end());
set<string> c; // well ordered set (b is ordered by address)
for (set<string*>::iterator it=b.begin(); it!=b.end(); it++)
c.insert(**it);
copy(c.begin(), c.end(), ostream_iterator<string>(cout, "\n"));
}
"""
the result on my box is :
maric at redflag2 mar aoû 22 22:24:23:~$ g++ -O3 -o testcpp testcpp.cpp
maric at redflag2 mar aoû 22 22:24:29:~$ ./testcpp
print_occurence_of_strings
What do you know?
chicken crosses road
fool
so long...
print_occurence_of_unique_strings
What do you know?
chicken crosses road
fool
so long...
print_occurence_of_unique_strings_compared_by_address
What do you know?
chicken crosses road
fool
so long...
strings : 1.89
unique strings : 0.87
compared by address : 0.18
maric at redflag2 mar aoû 22 22:24:38:~$ python2.4 testpython.py
so long...
What do you know
fool
chicken crosses road
Elapsed: 1.680000 seconds
maric at redflag2 mar aoû 22 22:24:51:~$ g++ -v
Using built-in specs.
Target: i486-linux-gnu
Configured
with: ../src/configure -v --enable-languages=c,c++,java,fortran,objc,obj-c++,ada,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.1-1.4.2.0/jre --enable-mpfr --with-tune=i686 --enable-checking=release
i486-linux-gnu
Thread model: posix
gcc version 4.1.2 20060613 (prerelease) (Debian 4.1.1-5)
I've joined the full c++ file as an attachment.
--
_____________
Maric Michaud
_____________
Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testcpp.cpp
Type: text/x-c++src
Size: 2764 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20060822/be507338/attachment.cpp>
More information about the Python-list
mailing list