Proposal: Python Info Collective

T. Middletan T.Middletan at news.vex.net
Mon Nov 15 12:00:24 EST 1999


[ Hmmm, i replied to this message already earlier via the python-list
eGroups web interface... but it has not shown up in usenet, and also
is not listed in the eGroups new message list. So something seems
to have messed up somewhere. Just as well, it was a somewhat
annoyingly windy message, and terribly self-deprecatory, so just
as well it didn't make it and i'll edit it a bit and try again.
Oh dear, starting out committing the same crimes again already!
Oh well! Incorrigible! ]

>The issue of indexing modules is an old one, and I'm still astounded
>that we've never converged on a solution, though everyone always
>complains about this.

Yes, i'm astounded by it too. I've been skulking around c.l.py for
some time, and I've seen the occasional flare-ups, and proposals, and
tries, and abandoned projects...

One day when i was looking desperately for a resource that I remembered
seeing posted in the newsgroup, and was unable to find it, i
snapped... and maniacally (if not furtively) thought i'd see if i
could put something together. I figured it'd be useful for my own
archival purposes if nothing else! And the result is Parnassus, such
as it is.

I resisted announcing it for many reasons. I wasn't sure i'd be able
to pull it off is one of them. Another is that I didn't want to be
another of the voices crying in the wilderness over this, and not
get anything done. And the rest of the reasons I'll skip, lest this
message become too windy and self-deprecatory!

So that is the story of that.

>and the Locator-SIG was started in 1995.  Looking through the SIG

Hmm, i missed that SIG. Will have to have a look.

>The backbone of Trove (http://www.tuxedo.org/~esr/trove/) is partially

I'd never heard of this until Jules mentioned it either. Very
interesting. I shall say more shortly.

>The problem is getting acceptance from the community; people need to

Yep. Which is another reason my project has an initial advantage in
that it was designed "for my own use, if nothing else"... i'm not
waiting or relying on acceptance. I have been stocking and updating
the database myself (so if nothing else, at least *I* can find things).
Before today only one other person had ever submitted anything. (Today
there was a small flood which i wasn't quite prepared for! But i've
managed to cope with it, it would seem <G>).

>always, *always* release things on CPAN, and follow CPAN's naming scheme.

Originally i had a longer more vituperative rant here, but let's skip
it this time! Oh what the hell. Everyone skip the next paragraph
please. Thanks.

<personal rant> I hate CPAN. I have always hated CPAN. I will always
hate CPAN! All right, okay, in truth back when i was a PERL hacker i
thought CPAN was the greatest, very convenient, it's true. But after
the great light of Python descended upon me (actually it was kind of
thrust upon me as a "strong suggestion", at a time when i had no
interest in Python whatsoever --- thank D'Arcy of PyGreSQL fame!) my
perspective changed. I realised PERL was amazingly painful to code,
and even more painful to maintain --- why didn't i realise this
before? And as the scales fell from my eyes i realised that CPAN was
incredibly ugly and tedious to wade through, now that i thought about
it. A sea of README files, and ugly filenames.

My apologies to CPAN maintainers. Opinions may differ.

>We just need to push the Python community very hard to enter their data.

The current volume of python releases isn't that huge. At the moment
I don't mind (and have not minded) passively watching and snagging
everything manually. Though there is of course a fairly decent (though
could be improved) submission form at Parnassus ready and waiting for
anyone, which would make things even easier---although not quite
automated---at this end.

>seeding the initial database, and that's important, because it makes the
>index immediately useful.

My feelings exactly. That's the goal. To be useful. Immediately is the
best way to be useful.

>Yet I hadn't heard about it until now; is the
>site still beta?  Did I miss a c.l.py.announce article about it?

Oops, i guess i forgot to announce it. <-;

It's been more or less running for 2 or 3 weeks, maybe a month. With
no real announcement. Seems fairly stable. Had only gotten about 40
hits in that time. But since yesterday around noon i've gotten about
200. (Uh, this is up to about 300 now, since my first writing of this
message. Whee).

There are a lot more ideas and things i would like to add/change about
the site. Nothing is set-in-stone, so to speak. But i think it's
fairly stable and useable as it is now.

>There are various issues with it -- I don't know how scalable the
>underlying code is,

Well, yes, there are issues. But i'm not sure which exact issues you
are referring to. So rather than slagging my own site (as i'm very
tempted to do, and did in my first post!), i'll wait for complaints.
(-: Or suggestions.

One of the issues is (i must moderate my language here) that I'm
not a "real" programmer. I hope no one holds this against me. (-:
I'm not in any way trained in computer sciences or database
management. I'm just a self-taught hobbyist. It makes for somewhat
messy code, and lots of trial-and-error approaches.

What i'm saying is that i don't know exactly what you mean when you
say "scalable".

I'll just mention a few details, perhaps you can glean something from them.

The data is contained in a relational database (being served currently
by PostgreSQL). The design structure is (i think) extremely flexible.
The "tree" system can be easily modified, and can be infinitely
deep (though of course this is not necessary). All objects can be
nested within any number of other objects in the database.
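
A minimal sketch of that kind of layout, not the actual Parnassus schema:
the table, column and category names below are all invented for
illustration, and Python's sqlite3 module stands in for PostgreSQL so the
example is self-contained. Nesting lives in its own link table, so one
object can sit under any number of parents and the tree can be any depth.

```python
import sqlite3

# Hypothetical schema: every name below is invented for illustration.
conn = sqlite3.connect(":memory:")  # sqlite3 stands in for PostgreSQL here
conn.executescript("""
    CREATE TABLE node (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per parent/child link, so a node may have many parents.
    CREATE TABLE nesting (
        parent_id INTEGER NOT NULL REFERENCES node(id),
        child_id  INTEGER NOT NULL REFERENCES node(id),
        PRIMARY KEY (parent_id, child_id)
    );
""")

def add_node(name):
    """Insert a category or item and return its id."""
    return conn.execute("INSERT INTO node (name) VALUES (?)", (name,)).lastrowid

def nest(parent_id, child_id):
    """Place child under parent; a child can be nested under many parents."""
    conn.execute("INSERT INTO nesting VALUES (?, ?)", (parent_id, child_id))

def children(parent_id):
    """Names of everything directly under the given node, sorted."""
    rows = conn.execute(
        "SELECT node.name FROM nesting JOIN node ON node.id = nesting.child_id "
        "WHERE nesting.parent_id = ? ORDER BY node.name",
        (parent_id,),
    )
    return [name for (name,) in rows]

def descendants(root_id):
    """Names of everything below root_id at any depth (recursive query)."""
    rows = conn.execute(
        "WITH RECURSIVE sub(id) AS ("
        "  SELECT child_id FROM nesting WHERE parent_id = ?"
        "  UNION"
        "  SELECT nesting.child_id FROM nesting"
        "  JOIN sub ON nesting.parent_id = sub.id"
        ") SELECT DISTINCT node.name FROM sub JOIN node ON node.id = sub.id "
        "ORDER BY node.name",
        (root_id,),
    )
    return [name for (name,) in rows]

# Made-up sample data: one item filed under two separate branches.
database = add_node("Database")
networking = add_node("Networking")
sql = add_node("SQL")
module = add_node("example-module")
nest(database, sql)
nest(sql, module)
nest(networking, module)
```

Changing the classification then means inserting or deleting rows in the
link table; the objects themselves are left alone.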

In fact when i went and looked at the Trove proposal after hearing
about it from Jules i was surprised to see that in many ways my
database is very similar to that proposal. I think I could move
in the direction of compliance with it over time without too many
migraine headaches.

The code itself is much sloppier, but reasonably "modular" and
flexible (much of the main display code is based on a template system
of my own contrivance, for example). It can be expanded and modified
without insurmountable difficulty at least, in my estimation.
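
For a rough idea of what template-driven display code can look like, here
is a small sketch. It is only an assumption about the general shape, not
the Parnassus template system: it leans on the string.Template class from
present-day Python, and the field names (title, description, url) and the
render_item helper are hypothetical.

```python
from html import escape
from string import Template

# Hypothetical page fragment; $name placeholders are filled in per record.
ITEM_TEMPLATE = Template(
    "<h2>$title</h2>\n"
    "<p>$description</p>\n"
    '<a href="$url">$url</a>\n'
)

def render_item(record):
    """Fill the template from one record, HTML-escaping every value."""
    safe = {key: escape(str(value)) for key, value in record.items()}
    return ITEM_TEMPLATE.substitute(safe)

page = render_item({
    "title": "A & B",
    "description": "demo entry",
    "url": "http://example.invalid/demo",
})
```

Keeping the markup in the template and the data in the record is what
lets the look of the pages change without touching the lookup code.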

However, yes, as you say above, and i already agree, there are
issues... interface issues, design issues, all sorts of issues.

>the graphic design could be simpler,

Heh! I will say only that my main goal in design was to make it as
fast and easy to browse and search as possible. My secondary goal
was, in reaction to the sterile environment of CPAN, to create some
"atmosphere". I may have gone a little overboard with that. But i
don't apologise... i like it. But alas, it may not look exactly
"professional", i admit. But it's functional despite that!

Of course it doesn't look very fabulous in Lynx. But even there it is
still fully functional... just a bit hard on the eyeballs.

>vex.net's reliability and bandwidth are unknown,

Yes, this is a concern to me also. It's not my server. I'm just a poor
fellow sitting in a basement with his old p166. I just have a lowly
shell account on vex.net. I have bandwidth quotas. About 100 megs a
week. I have absolutely no clue how much traffic Parnassus might
generate if it caught on... this is another reason i have been
hesitant to announce it. Perhaps someone might give me an estimate for
this?
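
A back-of-envelope way to turn the quota into page views, as a sketch
only. The 100 megs a week is the figure above; the average page size is
a pure assumption (no real page weights are given here), so plug in a
measured number before trusting the result.

```python
WEEKLY_QUOTA_KB = 100 * 1024   # the "about 100 megs a week" quota
ASSUMED_PAGE_KB = 20           # assumption: average page, images included

def pages_per_week(quota_kb=WEEKLY_QUOTA_KB, page_kb=ASSUMED_PAGE_KB):
    """Whole page views that fit inside one week's quota."""
    return quota_kb // page_kb

def pages_per_day(quota_kb=WEEKLY_QUOTA_KB, page_kb=ASSUMED_PAGE_KB):
    """The same budget spread evenly over seven days."""
    return pages_per_week(quota_kb, page_kb) // 7

print(pages_per_week())   # 5120 page views a week at 20 KB each
print(pages_per_day())    # 731 page views a day
```

Under that assumed page size the quota covers roughly 5000 page views a
week; halve the page size and the budget doubles.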

As for reliability of the server itself, bandwidth aside... i don't
have too many fears there. D'Arcy is always making improvements. But
then again, i really have no real conception as to the potential load.

Here's what we are running at Vex. http://www.vex.net/tech.py
(though i think some of this page needs updating... it'll be
something close to that, i imagine). 200MHz AMD K6 system with 128M
RAM. Running Apache with PyApache at the moment. No persistent
database connection.

Maybe D'Arcy will notice the vex.net logo i put at the bottom of all
the pages and give me a break on the bandwidth and beef up the system
a little (he has been looking into some other alternatives to PyApache
lately I believe). (-: [ <mini-plug> Vex.net is a nice little server
and does offer relatively inexpensive shell accounts with Python (and
now Zope) installed! ]

>and we can argue about the classification hierarchy *forever* --
>but it's still valuable.

Oh, please, please ... argue it with me! Heh. Seriously, if anyone
wants to suggest better categorisations i'm *very* interested. As
mentioned above, i'm not all that educated in many aspects of
programming --- i've done the best I could (starting off with copying
the categories on the python.org Contrib page)... but i'm sure it
could be much better. The database is very flexible and changing
things around is not a problem in this respect.

I have no ego when it comes to this stuff! No one need worry about
offending my categorisation sensibilities. I actually explicitly
placed the "suggest categories" link beside the details view for
each item to cover up for these sins of ignorance of mine. I'm hoping
if people who know more see things in the wrong places they will be
swift to take advantage of the convenient form and let me know.

The one thing which has kept me from worrying about the
classifications *TOO* much is that i figure most people would rather
"search" for what they want than "browse" for it. So hopefully
people can find what they are after that way, even if the
classification is a bit dodgy. The search engine could be beefed up
quite a bit too. I have many ideas of how to make it more efficient
and accurate... maybe in time.

>and has the incalculable advantage of being implemented.

Yes, that's my feeling as well. Perhaps Parnassus's only advantage.
Although "incalculable" may be a bit overstated! I'm certainly open to
anyone else's ideas though. I'm just taking things as they come, and
seeing how it all goes.

>Is everything you've done in the index?  If not, add the missing items.

Yep, everything is pretty much up to date, as far as data. What you
see is what you get. (Though i still do have a few hundred of the
oldest c.l.py.ann messages i occasionally browse through... but these
are 6 months or more old...) I think i've managed to enter just about
every reference which has been mentioned in c.l.py in the last few
months... and c.l.py.ann for longer than that. Of course i ripped
everything from the python.org Contrib page long ago. (-:

>Can we prominently link to it from python.org, so people begin to use it?

You could do that. Though i find the prospect slightly frightening.
And we may run into bandwidth issues. I'll be keeping an eye on things
fairly closely. If they get too out of hand i may have to ... i don't
know what! Open the vault only on Sundays. <-; Or perhaps some other
solution can be found.

These then have been some of the whys and wherefores.

Your Crawling Parnassus Slave (whose true goal in life is merely to
one glorious day be able to sport such resplendent white whiskers as
those of the ever glowing R.D.---a few of whose novels i have indeed
myself read---as seen in his image off on A.M.K's site!),

T.





