[Distutils] Distutils integration

T-Muthy Meddleten as544@freenet.toronto.on.ca
Mon, 06 Dec 1999 08:02:50 -0500


Thu, 2 Dec 1999 11:51:04 -0500, Greg Ward <gward@cnri.reston.va.us> wrote:
>out of existence (or so it seemed to me when I was trying desperately to
>track down a tarball for Perl 5.001m, rather than the 5.001 tarball and

See, you needed a nice meta-archive to stitch all them damned
collections together! <-;

>Perl, go visit that site *now*.  From the top, you can browse by module

I have visited. You are right, it is very nice. Very clean, very fast.
Looks to be high quality stuff. Very well organised for the most part.
(My only quibble being that i find the search results a little
difficult to scan---and there seem to be no sorting options---i have
sort by date/label ascending/descending... and that's only because
i've been too lazy to implement the 'relevancy' system i've had in
mind, as of yet---the idea is to basically weight the various fields
in different ways, assuming matches in certain fields are more
indicative of what people actually are looking for... also i wanted to
put in custom sorting for the category listings---but, i go off on a
vapourware tangent).
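
A rough sketch of the field-weighting idea described above, in Python.
The field names and weights are invented for illustration and are not
taken from the actual Parnassus database; the only point is that a
match in one field can count for more than a match in another.

```python
# Hypothetical fields and weights; tune or replace to taste.
FIELD_WEIGHTS = {"title": 5.0, "keywords": 3.0, "description": 1.0}


def relevance(item, term, weights=FIELD_WEIGHTS):
    """Sum the weights of every field whose text contains the term."""
    term = term.lower()
    return sum(
        weight
        for field, weight in weights.items()
        if term in item.get(field, "").lower()
    )


def search(items, term):
    """Return the matching items, most relevant first."""
    scored = [(relevance(item, term), item) for item in items]
    hits = [(score, item) for score, item in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in hits]
```

With these assumed weights, an item matching in its title sorts ahead
of one that only matches in its description.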

(Also i've got boolean capabilities --- surprisingly search.cpan
doesn't. They have a simplified regex --- actually my Postgres search
is completely regex based ... but no one knows it (which can lead to
problems, like when someone searched for "c++" yesterday... boom).
Heh. But that's more by accident than by design... one may argue
that these things are generally overkill for this type of searching,
of course---i have many ideas for improving the search engine... just
haven't got around to fooling around with them all yet).
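
The "c++" failure mentioned above is the usual hazard of passing user
input straight to a regex engine: depending on the engine, an
unescaped "c++" is either rejected or read as something other than the
literal text. A minimal sketch of one common remedy, shown with
Python's re module rather than the database's own regex dialect (the
function name is hypothetical):

```python
import re


def literal_pattern(user_text):
    """Compile user input as literal text, not as a regex.

    re.escape() backslash-escapes metacharacters such as '+',
    so "c++" matches the three characters c, +, +.
    """
    return re.compile(re.escape(user_text), re.IGNORECASE)


pattern = literal_pattern("c++")
print(bool(pattern.search("C++ bindings for Python")))
print(bool(pattern.search("a plain c compiler")))
```

Users who actually want regex searching would need a separate opt-in,
since escaping everything turns that capability off.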

>by author name (or ID -- every CPAN contributor has a user ID), module

Every "owner" has an ID in my database too, rest assured. They just
don't know it yet. <-; Every "owner" in my database also has a
randomly generated password. They don't know that yet either. Heh. No
need for them to know, it doesn't do anything; but i put it in because
it seemed like it might be useful for something sometime...

>name, or distribution name.  You can search the text of all the
>documentation for every module uploaded to CPAN.

Ah, now you've given me more devious ideas... mmmm. When i'm coding
the link checker, why not suck back all the pages and index them while
i'm at it. (-: Interesting. Heh. Yes i know, not the same as
documentation search. Perhaps i should add yet another URL field to
each item: URL to documentation. (But then probably no one would fill
it out---just like so far no one has filled out the "download" url
when making submissions---they have no real incentive to. They might
fill out the download url however if they had a 'distutils' compatible
package and Parnassus flagged it as such. Likewise if Parnassus
implemented "documentation" search people might be more motivated to
enter a "documentation" url in order to be a part of the database...
one of them chicken and egg things... needs to be implemented before
anyone will support it... if they would at all, that is).

I shall now say yet another something you will disagree with: i hate
POD! However, alas, i shall concede that POD is certainly a hell of a
lot better than what python has---virtually nothing. Pythondoc is
quite nice, but it has no real standards for formatting, and has been
dormant a long time.

Certainly I can't deny that the documentation search there is very
nice. But it's not really possible for python in the current state of
things---even if we had a central archive. The central archive would
still tend to be a mess. <-; Witness ftp.python.org. This is not
really an archive vs. meta-archive issue.

>trying to remember why I use Python.  Oh yeah, it has actual syntactic
>support for OO programming, that's why.  ;-)

Painful when such a crappy language has such a nice resource eh. <g>
I sympathise with your pain.

>solutions: scan the database regularly and rigorously, bitching at

This is my plan. Though i'm still thinking about the details. For
instance, another complication of a meta archive is that one doesn't
know the state of all the various servers that resources are on...
perhaps one goes down for a day or two, but comes back up. My idea is
to have 'degrees' of brokenness. Sort of. First give a broken link
maybe a day or two to see if it comes back (maybe). Then squirrel away
the broken links and re-check them periodically.
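
One way the 'degrees of brokenness' idea could be sketched in Python.
The state names, the two-day grace period, and the class itself are
assumptions for illustration; the actual reachability check (an HTTP
request, say) is left to the caller so the bookkeeping stays separate.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRACE = timedelta(days=2)  # assumed grace period before giving up


@dataclass
class LinkStatus:
    url: str
    first_failure: Optional[datetime] = None  # None while healthy

    def record(self, reachable: bool, now: datetime) -> str:
        """Record one check result and return the link's state."""
        if reachable:
            self.first_failure = None  # server came back; forget it
            return "ok"
        if self.first_failure is None:
            self.first_failure = now   # first time we saw it fail
        if now - self.first_failure < GRACE:
            return "suspect"           # might just be a short outage
        return "broken"                # set aside, re-check later
```

A link that fails is only "suspect" at first; it becomes "broken" once
it has stayed down past the grace period, and any successful check
resets it to "ok".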

Then there's also the automated spam---a great way to make new
friends!

>downloads stuff from developer's pages and puts it somewhere whose links
>*won't* break.

You know this would be pretty easy to do also really, for the most
part. If one had the server space. Mirror the category structure, and
just automatically go fetch things and drop them in the directory.
Something to think about perhaps.
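
A minimal sketch of that mirroring idea in Python, assuming each item
carries a category path and a download URL. The function names, the
example category, and the URL are hypothetical; fetching uses the
standard library's urllib.request.urlretrieve.

```python
import posixpath
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit


def mirror_path(root, category, url):
    """Map a category path and a download URL to a local file path."""
    filename = posixpath.basename(urlsplit(url).path) or "index.html"
    return Path(root).joinpath(*category, filename)


def fetch(root, category, url):
    """Download url into the directory that mirrors its category."""
    dest = mirror_path(root, category, url)
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(dest))
    return dest


print(mirror_path("mirror", ["Networking", "Mail"],
                  "http://example.invalid/pub/foo-1.0.tar.gz"))
```

A real mirror would also have to cope with name clashes, failed
downloads, and licences that forbid redistribution, none of which this
sketch attempts.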

>trying to find some free Java classes (because Sun somehow forgot that
>"the programming language for the Internet" might need to do MIME

ARGH. Finding java stuff is hell! I agree. I haven't done it for a
while, but i still recall the trauma of the last time i tried. Seems
SUN mysteriously left out all the good stuff! That's one of the things
that really attracted me to python: it seemed like it included all the
good stuff! Or a very large subset of it, at least.

>decoding -- oops).  There is a big meta-archive at gamelan.com, but
>for everything I found -- bam, broken links everywhere.

Yes, don't they publish a book or something, so people can try and
find stuff in there? <-;

>careful link-policing!

Ah, linking policy. On that note: yes, in my previous message i
advocated the meta-style archive on the basis that, in my opinion,
people feel freer to add stuff---it's less official, less
constraining. But of course this does have the downside of generally
less quality control. I'm sure CPAN has fairly high quality
submissions on average (without imposing any external controls). I
mean, anyone who takes the trouble to package up their stuff for CPAN
is probably not going to do it frivolously. Whereas anyone might
submit anything to Parnassus on a whim, in any state, and half the
time i have no idea what it is.

I had thought of trying to institute some sort of rating system---yes,
another field i threw in the database, "just in case"; but probably
it wouldn't be used much. But who knows... maybe in the future. (I
think freshmeat has something like this? For now the rating system is
the number of "clicks". <g>)

>I will continue to argue for just the right amount of bondage,
>discipline, structure, and bureaucracy to achieve that level of
>automation.

Oh, i agree completely.

>The trick is using the *bare minimum* of bureaucracy that

And i agree there too even more. (-:

>listen to people like you and Greg Stein arguing for anarchy.  ;-)

I wouldn't say i'm arguing for anarchy. After all Vaults of Parnassus
is a totalitarian regime as it stands. Though hopefully not fascist.

Anyhow, interesting (if not provoking---which is even better)
thoughts. Thanks for your brutal criticism. (-: VoP's only been up for
a matter of weeks. And as I was saying to Martijn recently, there were
around 700 click throughs for resources (these are clicks going off
the site, not views of Parnassus pages) during the week after i reset
the stats (over a thousand now --- hasn't quite been two weeks yet)...
that's a thousand python resources this month that people might not
have seen, bothered to look for, or in some cases (dare i say many?)
even known to have existed. Even if the site shut down tomorrow, i
think some good has been done at least.

-- 
 ..,.,.,,..,...,.,.,.,.,.,.,,,,.,.,.,...,.,.,..,.,,.,..,..,.,..,.,.,..
  Tim Middletin =-=-= not all who wander are lost =-=-= x @ veX . net
 `'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~`'~
   * Dec 6th * International Day of Rotting Cabbage