[XML-SIG] Content Syndication

Mark Nottingham mnot@pobox.com
Fri, 2 Jul 1999 19:08:40 +1000


> HTML is a display-oriented format.  It usually is not even well-formed in
> the xml sense.  Further, it has great potential to break the layout of your
> page, for example, if the publisher embeds a </TABLE> tag.  It is possible
> to watch for this, and avoid it, but its not exactly 'simple'.  The real
> problem is that it gives the publisher control over the display of the
> content. In a syndicated system, I think what you really want is to be able
> to publish *data*, and let the receiver format it however they choose, so
> long as they can understand it.

I'm in complete agreement. It doesn't make sense at all to have HTML in a
channel. However, some people already do it; for instance, passing <i>, <P>
and <BR> to format their text.

Some people will want to put formatting into the channels, but IMHO that
level of detail belongs in the original, cited content, not a short 'teaser'
to the link. This sort of stuff should be spelled out clearly to potential
content providers, and enforced by aggregation/presentation engines.
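
As a rough illustration of what that enforcement might look like on the
aggregator side, here is a minimal Python sketch that reduces a teaser to
plain text before display. The names TeaserTextExtractor and strip_markup
are hypothetical, and it assumes the policy is to discard every tag rather
than whitelist a few; it is not taken from any existing aggregator.

```python
from html.parser import HTMLParser


class TeaserTextExtractor(HTMLParser):
    """Hypothetical helper: keeps text content and discards every tag."""

    # Tags that normally separate words visually; replaced by a space
    # so that "news<BR>today" does not collapse into "newstoday".
    BREAKING_TAGS = {"br", "p"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in self.BREAKING_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.BREAKING_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(teaser):
    """Return the teaser with all tags removed and whitespace collapsed."""
    parser = TeaserTextExtractor()
    parser.feed(teaser)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())
```

With this approach a stray </TABLE> in a description is simply dropped, so
it cannot close a table in the page that displays the channel.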


> On second thought, one thing that might be a cool compromise is if we had
> an optional tag indicating which style-sheet(s) the publisher thinks should
> be used.

Don't know how that would be incorporated, but it's interesting to think
about.


> If you have a real need to transfer HTML documents, then what you need is
> something like ICE that takes care of the packaging and tagging of
> documents. Otherwise, what we provide as a subset will never be fully html
> compliant -- eg some tags won't work, and will also be problematic from a
> validation point of view, since HTML is generally not "valid" in the xml
> sense.

Assuming you're not just responding to my question rhetorically, when do you
think such a capability would be necessary/desirable?


> I'd like to hear people's thoughts on this topic.  I'm going to be gone
> till Tues though, so if I'm quiet, that's why.  My own feeling, having
> worked a little with RDF and metadata previously, is that the goal should
> be to transfer data about resources.  We should not try to dictate how that
> data will be presented, if at all, which is what happens when embedded HTML
> is allowed. Reader's comments and related links could certainly be
> construed as "metadata" about a given resource, so it would be nice to be
> able to transmit these as well.

Hmm. A commenting/annotation capability is intriguing. My initial thought is
that perhaps there is a need for a separate annotation format that can
optionally be used in conjunction with this one. As I see it, this format is
primarily about resource discovery; annotation is another domain which may
interoperate with it, but the overlap is minimal. Have you seen
http://www.thirdvoice.com/ ?