THE IMPORTANCE OF MAKING THE GOOGLE INDEX DOWNLOADABLE

gen_tricomi ogahejini at yahoo.com
Wed Jun 7 10:11:22 EDT 2006


I write to make a request on behalf of all the programmers on earth
who have used or intend to use the Google web search API, whether for
research purposes or for the development of real-world applications:
that Google make their indexes downloadable.

Currently, application programmers using the Google web search API are
limited to 1000 queries a day. On the one hand this is a reasonable
decision by Google, because limiting the queries protects the Google
system from harm by unnecessary automated queries. On the other hand it
limits us programmers severely. The query limit restricts the usefulness
of whatever applications we decide to build, and even limits our
imagination of what is possible with a handful of indexes.

First, I commend Google for opening their precious crawled indexes.
This is a great service to humanity, and especially to the band of
programmers who are interested in epistemology and are using the Google
web search API to help them achieve their goals.

Google would be doing another great service for us if they would make
their indexes downloadable to programmers with a good interface for
programmatically accessing the indexes.

The advantages of the above approach would be:

1.	Decentralizing the Google system.
2.	Reducing the overhead of programmers' queries on Google.
3.	Enabling programmers to build applications that run on their
local systems (only requiring an internet connection when a web page is
needed, since the links returned by a query are the most important part
of the result set), thus giving them an unlimited number of queries
should these applications go public.
4.	Giving Google the competitive edge in search engine technology and
user satisfaction by gaining programmer loyalty.
5.	Encouraging the global adoption and use of the API + INDEXES
provided by Google.
6.	Google may also benefit if the downloaded INDEXES + API include
mechanisms that enable programmers to update the indexes from the web.
An agreement could then be made that Google has unlimited access to
the indexes whenever the user's computer is online and IDLE, so that
Google updates its own indexes from the ones on various programmers'
local machines, thereby building a truly distributed global crawler.
This can be achieved using grid technologies, possibly cutting down
the 300-year range for crawling the world's crawlable information.
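
To make point 3 concrete, here is a minimal sketch of what querying a
locally stored index could look like. Everything in it is an
assumption made for illustration: the LocalIndex class, its add() and
search() methods, and the example.org URLs are hypothetical and are not
part of any Google API. It only shows that a query can return links
with no network access at all.

```python
from collections import defaultdict


class LocalIndex:
    """Toy inverted index standing in for a downloaded index (hypothetical)."""

    def __init__(self):
        # Maps each term to the set of URLs whose text contains it.
        self._postings = defaultdict(set)

    def add(self, url, text):
        """Index a page's text under its URL."""
        for term in text.lower().split():
            self._postings[term].add(url)

    def search(self, query):
        """Return the URLs that contain every query term, sorted."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self._postings.get(term, set())
        return sorted(hits)


index = LocalIndex()
index.add("http://example.org/a", "python web search api")
index.add("http://example.org/b", "java web crawler")
print(index.search("web search"))  # only the links; no connection needed
```

An application would go online only when the user actually opens one of
the returned links.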


Google may still enforce their terms of service by requiring some kind
of authentication for use of the index already residing on the
programmer's local machine. It need not require that the programmer be
on the internet every time he/she wishes to access the system, since
the programmer may wish to tinker with the API and indexes locally
without an internet connection. Online authentication may instead be
required whenever the user gets online. The non-commercial status of
the indexes must be emphasized through several schemes.
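
One possible reading of this policy, sketched in Python. The
LicenseState record, the may_query() function and the verify callable
are all hypothetical names; verify merely stands in for whatever check
Google would actually run, which this post does not specify.

```python
from dataclasses import dataclass


@dataclass
class LicenseState:
    """Hypothetical record kept next to the downloaded index."""
    key: str
    verified_online: bool = False


def may_query(state, online, verify):
    """Decide whether a local query is allowed.

    Offline: always allowed, so the programmer can tinker locally.
    Online: the key must pass `verify`, an assumed callable that
    represents the (unspecified) server-side check.
    """
    if not online:
        return True
    state.verified_online = bool(verify(state.key))
    return state.verified_online


state = LicenseState(key="demo-key")
print(may_query(state, online=False, verify=lambda k: False))
print(may_query(state, online=True, verify=lambda k: k == "demo-key"))
```

The design choice here is that being offline never blocks local work,
while any moment of connectivity is an opportunity to re-check the key.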


The Google API can be a tool for epistemological engineers to craft
future Infowares (Information Applications). The most important things
in the indexes are the links to resources that are returned by queries.

Two versions of the API + INDEXES can be made available.
1. One without cached pages attached, so that querying the API on the
local machine with the locally stored indexes gives results like those
in the regular internet API result set.

2. One with the cached pages. This one is optional, as it will be
large in size.
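
As a rough sketch of how the two versions could relate, the record
below carries an optional cached page, and the small version is simply
the large one with that field dropped. The Result fields and the
strip_cache() helper are assumptions made for illustration, not the
layout of any real Google result set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical search result record."""
    url: str
    title: str
    snippet: str
    cached_page: Optional[str] = None  # present only in the large version


def strip_cache(results):
    """Derive version 1 (no cached pages) from version 2."""
    return [Result(r.url, r.title, r.snippet) for r in results]


full = [Result("http://example.org/a", "A", "about a",
               cached_page="<html>a</html>")]
small = strip_cache(full)
print(small[0].url, small[0].cached_page)
```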

If you were good enough to release your APIs publicly, then I trust
you would also consider this request.

It would be good if the API + INDEX download is accessible by
programmers who program in the following languages:
(a)	Python
(b)	Java
(c)	Perl
(d)	Ruby

Alternatively, some language-independent mechanism can be formulated so
that programmers in various languages can access the API + INDEX
download.

PageRank may or may not be included in the package, depending on
decisions at Google.

The package may also be closed source, open source, or partial source
(part open, part closed).

This will be a great service to humanity and to programmers especially.

Thanks,

Ogah Ejini,

Nigeria, West Africa.

Mobile: +234 802 601 5061




More information about the Python-list mailing list