When Python outruns C++ ...

Alex Martelli aleax at aleax.it
Tue Apr 1 04:07:52 EST 2003


Jacek Generowicz wrote:
   ...
> Will you make this code public ?

Sure, it's nothing special -- just a tiny, simple example.

Here's the naivest Python I could come up with for the
task (Python 2.3 given the use of enumerate, but that's
easy to remove if you want, of course):


# build a word -> line numbers mapping
import sys
idx = {}
for n,line in enumerate(sys.stdin):
    for word in line.split():
        idx.setdefault(word,[]).append(n)

# display by alphabetically-sorted word
words = idx.keys(); words.sort()
for word in words:
    print "%s:" % word,
    for n in idx[word]: print n,
    print
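
For anyone who wants to drop enumerate, as mentioned above, a manual counter does the same job. This is only a sketch: build_index is a hypothetical helper name that is not in the code above, and it takes any iterable of lines rather than reading sys.stdin directly.

```python
# Sketch: the same indexing loop without enumerate.
# build_index is a hypothetical name used only for illustration.
def build_index(lines):
    idx = {}
    n = 0  # manual line counter replacing enumerate
    for line in lines:
        for word in line.split():
            idx.setdefault(word, []).append(n)
        n += 1
    return idx

index = build_index(["a b", "b c"])
```

Passing sys.stdin as the argument should give the same mapping as the loop above.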


Note that "words" are defined in the naivest way (just
the way the >> input operator will build them in C++),
as whitespace-separated sequences of non-whitespace.
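
To make that definition of "words" concrete, here is a small sketch of what split() with no arguments does. The sample strings are made up for illustration.

```python
# split() with no arguments splits on runs of whitespace and
# discards leading and trailing whitespace, including the newline.
line = "  the quick\tbrown   fox \n"
words = line.split()

# Punctuation is not treated specially: it stays attached to the word.
punct = "Hello, world!".split()
```

So "Hello," and "Hello" would be indexed as two different words under this definition.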

Here's the closest simple translation I could easily give 
for this in C++:


#include <string>
#include <iostream>
#include <sstream>
#include <map>
#include <vector>

int main()
{
    typedef std::map<std::string, std::vector<int> > index;
    index idx;

    std::string line;
    int n = 0;
    while(getline(std::cin, line)) {
        std::istringstream sline(line);
        std::string word;
        while(sline >> word) {
            idx[word].push_back(n);
        }
        n += 1;
    }

    for(index::iterator i = idx.begin(); i != idx.end(); ++i) {
        std::cout << i->first << ":";
        for(std::vector<int>::iterator j = i->second.begin();
                j != i->second.end(); ++j) {
            std::cout << ' ' << *j;
        }
        std::cout << "\n";
    }

    return 0;
}


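And here is a slightly optimized Python version. It binds
idx.setdefault and sys.stdout.writelines to local names,
converts each line number to a string once per line, and
writes each output line with a single call: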


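def main():
    import sys
    idx = {}
    # bind the method once to avoid repeated attribute lookups in the loop
    listofword = idx.setdefault

    for n,line in enumerate(sys.stdin):
        # convert the line number to a string once per line
        n = str(n)
        for word in line.split():
            listofword(word,[]).append(n)

    words = idx.keys()
    words.sort()
    emit = sys.stdout.writelines
    for word in words:
        # ': ' keeps the output format the same as the naive version's
        emit((word,': ',' '.join(idx[word]),'\n'))

main()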
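
The code in this post is Python 2. As a rough sketch only, the same approach might look like this on Python 3. Here format_index is a hypothetical name, the io.StringIO sample stands in for sys.stdin, and the "word: n n" output format follows the naive version.

```python
import io

# Sketch of a Python 3 port; format_index is a hypothetical name.
def format_index(lines):
    """Return the index as a list of output lines."""
    idx = {}
    listofword = idx.setdefault   # cache the bound method in a local name
    for n, line in enumerate(lines):
        n = str(n)                # convert once per line, not once per word
        for word in line.split():
            listofword(word, []).append(n)
    # sorted() replaces the Python 2 idiom of keys() followed by sort()
    return ["%s: %s\n" % (word, " ".join(idx[word])) for word in sorted(idx)]

sample = io.StringIO("b a\na c\n")
result = format_index(sample)
print("".join(result), end="")
```

Calling sys.stdout.writelines(format_index(sys.stdin)) would be the closest equivalent of main() above.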


Alex




