"Zope-certified Python Engineers" [was: Java and Python]

Fri Mar 22 13:33:13 EST 2002

>On Thu, 21 Mar 2002, John J. Lee wrote:
>
>In general, and for a Python archive in particular, I guess there are two
>problems: which metadata, and which software.  So, what's wrong with the
>web + Google (not saying there isn't anything wrong, just interested in
>how you would improve on it)?

The problem is more with web sites than with Google, though Google has
created its own little problem.  Most people only visit the top 3 or 5 hits
that Google finds.  So, if you make a query, and web-page 7 is the best
of the lot, you will still tend to read the top 3.  There isn't any
way to say 'this site was useless for me, consider me one unhappy customer'.
Now consider the case where site number 15,899 is your best bet.  Unless
you can come up with the sort of search terms that make this site come
up higher -- you are never going to see it.  This is a neat problem.

One day's reading of your week's spam will yield you many offers of
programs and advice on how to make your website rise in the Google
ranking.  This reflects the reality that once you have made it to the
top, you basically stay there, so an effort really pays off.  These
companies are, of course, fuelling an arms race, but then some of
them are advising metadata, so I do not mind all that much.

The deeper problem is one of the web itself.  A few months ago, I
wasted 3 days trying to find either a) a bug in my Haskell program, 
and when that looked very unlikely b) a bug in the floating point of
my C libraries.  That looked real promising.  Alas, the real problem
was in the formula I wanted to turn into an algorithm.  I had taken it
from the top-ranked Google hit for that formula, and it had a + sign
where a minus sign belonged.

I can't report this; I can't get it fixed; and right now some other
poor soul may have done exactly what I did.  This is worrisome.  We
now have too much information at our disposal, vastly too much, when
throughout history we were more likely to have too little.  The problem
now isn't finding stuff -- it is knowing how trustworthy the stuff is.

eBay has one sort of approach.  So does Advogato (see http://advogato.org ).
It is a real problem and one that interests me a lot.

>[OT: A question I've asked many times and got no answer to is 'why has
>nobody done self-consistent ranking for academic papers'?  I fear the
>answer is simply that most of the data is owned by a small number of
>companies who, as a result, have no real incentive to advance the state of
>the art.  Still, maybe somebody out there is trying to do it...]

Oh yes! Many people have been trying to do this.  See
http://www.isinet.com/isi/news/2001/productnews/8117876/index.html
for one attempt.

>
>
>John