Python(2.5) reads an input file FASTER than pure C(Mingw)
hdante
hdante at gmail.com
Sun Apr 27 21:17:51 EDT 2008
On Apr 27, 4:54 pm, n00m <n... at narod.ru> wrote:
> Another PC, another OS (Linux) and another compiler C++ (g++ 4.0.0-8)
>
> Compare my 2 latest submissions: http://www.spoj.pl/status/SBANK,zzz/
>
> times: 1.32s and 0.60s
>
> Submitted codes:
>
> import sys
> z=sys.stdin.readlines()
> print z[5]
>
> #include <cstdio>
> #include <cstdlib>
> #include <vector>
> #include <string>
>
> using namespace std;
>
> vector<string> vs;
>
> int main() {
>     while (true) {
>         char line[50];
>         if (!fgets(line,50,stdin)) break;
>         vs.push_back(line);
>     }
>     return 0;
> }
>
> If it proves nothing then white is black and good is evil
It seems that the "push_back" line takes most of the running time of
the code. Remove it and the execution time will drop to 0.25s.
Python's readlines uses fread instead of fgets:
http://svn.python.org/view/python/tags/r251/Objects/fileobject.c?rev=54864&view=markup
(see the file_readlines function)
If you write code that does nothing but an fread loop, the execution
time will drop to 0.01s.
This C code takes 0.25s. Almost all of the time is spent on string
manipulation.
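#include <stdio.h>
#include <string.h>

#define B 8192        /* read buffer size */
#define NLINES 100000 /* slots in the line table (reused cyclically) */
#define LINELEN 40    /* bytes per slot, including the terminating NUL */

char vs[NLINES][LINELEN];
char buffer[B];

int main(void) {
    size_t count, len, kept = 0; /* kept: bytes of a partial line carried over */
    char *begin, *end, *limit;
    int i = 0;

    while (1) {
        /* Read after any partial line left over from the previous pass. */
        count = fread(buffer + kept, 1, B - kept, stdin);
        if (count == 0) break;
        limit = buffer + kept + count;
        begin = buffer;
        while (1) {
            /* Search only the bytes actually read, not the whole buffer. */
            end = (char *)memchr(begin, '\n', limit - begin);
            if (end == NULL) {
                /* Move the partial line to the front of the buffer. */
                kept = limit - begin;
                memmove(buffer, begin, kept);
                break;
            }
            len = end - begin;
            if (len >= LINELEN) len = LINELEN - 1; /* truncate long lines */
            memcpy(vs[i], begin, len);
            vs[i][len] = '\0';
            i = (i + 1) % NLINES;
            begin = end + 1;
        }
    }
    return 0;
}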
The difference, 0.60s - 0.25s = 0.35s, is probably mostly Python's
memory management (which seems to be much more efficient than the
default behavior of std::vector).
Very interesting post. :-) I had no idea how well optimized the
builtin library was.