[Catalog-sig] metadata

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Wed, 5 Sep 2001 09:01:21 +0200


> I've started work on a fourth catalog implementation.
> I've had a few questions/comments, i wanted to send to
> the list to resolve regarding metadata.

While this is a laudable goal, I have a few procedural concerns with
your message.

> the key for interoperability among the three is having package
> metadata.

More specifically, the key for interoperability is having *standard*
metadata.

> looking over pep 241, i can note several deficiencies
> that i would like to address. While the use of rfc822
> for metadata definition does lower the author burden
> is unextensible and creates the opportunity for
> ambiguity in the metadata, i'd like to change this to
> an xml based format. 

This is where my procedural concerns start. PEP 241, as-is, is already
implemented in distutils and currently available to users through
Python 2.2aX. Any change at this point in time needs a *very* good
reason for the change, such as the PEP being unimplementable, not
achieving the goal it is intended to achieve, etc.

IOW, changing the format of the metadata now will significantly slow
down progress on producing a catalog implementation, and getting
packages registered with it. Thus, we might get the perfect system on
paper; I'd rather have an incomplete system in reality.

As for XML specifically: What problem does merely switching to XML
solve? I believe your claim that the current format is unextensible
is incorrect: The Metadata-Version was put in precisely to allow
future extensions. I'd strongly discourage "proprietary" extensions at
this time, so not being able to put in those is a good thing: Any
extensions used ought to be published and documented, in a revision of
PEP 241.
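
To illustrate how Metadata-Version can gate extensions, here is a
minimal sketch of a reader that parses the header-style metadata and
refuses versions it does not know. The sample package data, the
SUPPORTED_VERSIONS set and the read_metadata function are all made up
for illustration; this is not how distutils or any catalog does it.

```python
from email import message_from_string

# Hypothetical PKG-INFO content in the PEP 241 header format.
PKG_INFO = """\
Metadata-Version: 1.0
Name: example-package
Version: 0.1
Summary: An illustrative package
Author: A. N. Author
"""

# Versions this (hypothetical) catalog knows how to interpret.
SUPPORTED_VERSIONS = {"1.0"}


def read_metadata(text):
    """Parse RFC 822 style headers and reject unknown metadata versions."""
    msg = message_from_string(text)
    version = msg["Metadata-Version"]
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported Metadata-Version: %r" % (version,))
    # Collapse the headers into a plain dict (assumes no repeated fields).
    return dict(msg.items())


metadata = read_metadata(PKG_INFO)
```

A revised PEP would then add its new version number to the supported
set, together with whatever documented fields it introduces.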

> probably the biggest problem with adoption of pep241
> is the lack of dependency info. Dependency info should
> be both version specific and capable of being os
> dependent. 

Because package dependency is really hard, I believe it was
deliberately left out of version 1.0 of the metadata. That means
that any package author requiring prerequisite packages should put the
prerequisite list into the Description, with the user of the catalog
being responsible for fulfilling the prerequisites.

So lack of dependency info is IMO a key to success, rather than a
problem.

> there is also an assumption within the pep241 and 243
> i'd like to address. namely that the author of a
> package will be the person to upload a package. at
> least initially this is likely to be unlikely,
> especially during an initial rush to fill up the
> repository via some semi-automated extraction from the
> vaults.

That's a good point. Should we support a Packager field in addition to
the Author field (which, of course, requires a new Metadata-Version)?
Alternatively, we could encourage uploaders to put their name into
the Author field, and put the "true" author into the Description.  I
doubt the true author would be happy to receive complaints about the
packaging when she didn't even know somebody uploaded the package.
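
As a rough sketch of the first option: a catalog could direct
packaging complaints to the Packager when one is given, and to the
Author otherwise. Note that "Packager" is not a PEP 241 field, and the
"1.1" version number, the sample data and the function name below are
assumptions made only for this example.

```python
from email import message_from_string

# Hypothetical metadata using an assumed future version with a
# "Packager" field. Neither "1.1" nor "Packager" is defined by PEP 241.
WITH_PACKAGER = """\
Metadata-Version: 1.1
Name: example-package
Version: 0.1
Author: A. N. Author
Packager: U. P. Loader
"""

# Plain version 1.0 metadata, which has no Packager field.
WITHOUT_PACKAGER = """\
Metadata-Version: 1.0
Name: example-package
Version: 0.1
Author: A. N. Author
"""


def packaging_contact(text):
    """Return the Packager if present, otherwise the Author."""
    msg = message_from_string(text)
    packager = msg.get("Packager")
    if packager is not None:
        return packager
    return msg.get("Author")


contact_new = packaging_contact(WITH_PACKAGER)
contact_old = packaging_contact(WITHOUT_PACKAGER)
```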

That also relates to the question of how package uploads get approved;
that is something that whoever operates the catalog needs to find a
policy for. E.g. some uploaders could get permission to upload
packages they didn't author (you can find out the signer of a package
from the signature, right?)

> i'll try and write up an xml schema which defines this
> package-metadata xml format.

Before you do so, I'd like to hear more about what problems you expect
to be solved by an XML format over the PEP 241 format.

Regards,
Martin