Re: Content type for /site-meta (or HTTP header fragment format)


Phil Archer-3


I've made a couple of minor comments on this proposal which in general I
like as it does seem to be the well known location to end all well known
locations (which I reckon is about the only justification there could be
for a new well known location).

Eran Hammer-Lahav wrote:

> Context
> The /site-meta proposal (a known-location solution for site metadata) [1]
> includes a simple XML format for representing site metadata directly or via
> links. In discussing the proposal and the appropriate format for the list of
> meta resources, John Panzer suggested using a simpler text format [2]
> directly based on the content of the Link header [3].
> While I see the value of an XML format for this data, and was the main
> supporter of it, I now strongly support the idea of using a super-simple
> text-based document. Partially because it fits better with the current
> use-cases, and partially because I am an editor of a "competing" XML format
> which covers this use case (XRDS/XRD) but is too complex to be positioned as
> the default form.
> I would like /site-meta to list a single text-based format with a clear
> Content-type associated with it. I also want the spec to explicitly allow
> user-agents to request other representations of the /site-meta resource with
> the default being the super-simple-text-based version. One such
> representation (I expect to be widely supported) will be
> application/xrd+xml.
> Some Questions (and answers)
> - Should the /site-meta text format be restricted to a set of links or
> provide an easy path for extensions of some other kinds of records?
> While I can't come up with compelling use cases for /site-meta to
> directly include other metadata, it is likely someone else will in the
> future.

I fully understand the desire for extensibility and for not imposing
restrictions unnecessarily. However, I do think it would be a big
mistake to allow a /site-meta file to include anything other than links
to data. Let's imagine you allowed, say, Dublin Core and Creative
Commons to be encoded in a /site-meta file directly. Why not? They're
well-defined, well used metadata systems that can often be applied to a
whole site.

Why make people put this in a separate file when it could, surely, go in
the /site-meta file? Well, you could allow it, and any other metadata -
and hey presto you've just reinvented a WKL for POWDER, XRD and whatever
comes next.

No... if /site-meta is the WKL to end all WKLs then it has to be just a
set of pointers to where the 'real data' actually is. So I would say
that there is a case for deliberately limiting the extensibility. As you
go on to point out, if it supports an HTTP Link-like structure, that's
already flexible and it meets the need. When extensibility leads to
mission creep, things will go wrong.

> By replacing each record in John's proposal:
> ---
> /robots.txt rel="robots"
> /p3p.xml rel="privacy"
> rel=""
> ---
> with actual Link headers:
> ---
> Link: </robots.txt>; rel="robots"
> Link: </p3p.xml>; rel="privacy"
> Link: <>; rel=""
> ---
> other record types can be added in the future.

Indeed. Here are two that come to mind:

Link: </styles.css>; rel="stylesheet"; type="text/css"
Link: </powder.xml>; rel="describedby"; type="text/powder+xml"

The mobile world would probably like something like

Link: <>;

Link: <>;

(I'm basing this on the metaTXT work just getting going [PA1])

Oops... I'm straying into mission creep there, aren't I? I mean, are
those URIs links or metadata? I hope it doesn't matter - I've used URIs
where URIs are allowed.

One thing I have done in my first two examples is to include the type
attribute, which is allowed if we're following the HTTP Link format and,
IMHO, should be encouraged!
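
To make the point concrete, here is a minimal sketch of how a client might
read records in this Link-style format, including the optional type
attribute. It assumes one record per line and quoted parameter values only;
the function name parse_site_meta and the sample document are hypothetical,
and the full HTTP Link grammar is not covered.

```python
import re

# One record per line, in the form: Link: <target>; name="value"; ...
_LINK_LINE = re.compile(r'^Link:\s*<([^>]*)>\s*(.*)$', re.IGNORECASE)
# Only quoted parameter values are handled in this sketch.
_PARAM = re.compile(r';\s*([A-Za-z][\w-]*)\s*=\s*"([^"]*)"')


def parse_site_meta(text):
    """Return a list of (target, params) tuples from Link-style lines."""
    links = []
    for line in text.splitlines():
        match = _LINK_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a Link record
        target, rest = match.groups()
        params = {name.lower(): value for name, value in _PARAM.findall(rest)}
        links.append((target, params))
    return links


# Hypothetical /site-meta body, reusing records from this thread.
sample = (
    'Link: </robots.txt>; rel="robots"\n'
    'Link: </styles.css>; rel="stylesheet"; type="text/css"\n'
)

for target, params in parse_site_meta(sample):
    print(target, params.get("rel"), params.get("type"))
```

A client that only understands certain content types could filter on
params.get("type") before fetching anything, which is one reason to encourage
the attribute.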

> This also means the same code
> used to read Link headers (or HTTP headers in general) can be used for this
> format. This also plays nicely with the idea of equating links in /site-meta
> to Links in individual resources' HTTP response headers.
> - Should /site-meta define its own content type, use an existing content
> type, or define a new generic content type?
> If we take the route of using an HTTP-header-like format for /site-meta, is
> there value in making this format generally available for other resources.
> RFC 2616 offers a similar construct in the form of message/http. It seems
> that as long as the document can be considered a valid HTTP request or
> response, we can use this content type.
> So /site-meta can be considered a body-less HTTP response with Link headers.
> The question is, is such a header-fragment allowed in a message/http
> document? It is not clear if in this use-case, the Date header may be
> omitted, which is otherwise required for a valid response header. The Date
> header makes little sense in this context and should be omitted. Note that
> the HTTP header for GET /site-meta must still include Date.
> In Conclusion
> 1. The idea of allowing multiple representations for /site-meta resources
> suggests the use of a more generic content type for the default (and the
> only required) representation than application/site-meta.

I'd stick with one format. Choice can be overrated and leads to
confusion (and you thought I was a dripping wet liberal? Only when it
suits me ;-) )

> 2. There is value in using a single mechanism for metadata discovery, either
> for an individual resource (via HTTP Link header or HTML/ATOM Link element)
> or for a domain authority (via /site-meta list of links). Using the exact
> same semantics between HTTP Link and /site-meta links seems productive.

Agreed. And this further supports the one-format point.
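
As a rough illustration of the shared-semantics point, the sketch below uses
Python's standard email parser (which reads RFC 822-style header blocks) to
pull Link values out of both an ordinary set of response headers and a
/site-meta body written in the Link header format. The helper name
link_values and both sample strings are assumptions made for illustration,
not part of the proposal.

```python
from email import message_from_string


def link_values(header_block):
    """Return the raw Link field values from a header-style block of text."""
    message = message_from_string(header_block)
    # get_all() returns None when the field is absent, so fall back to [].
    return message.get_all("Link") or []


# Headers of a hypothetical HTTP response for an individual resource.
response_headers = (
    "Content-Type: text/html\n"
    'Link: </p3p.xml>; rel="privacy"\n'
)

# Body of a hypothetical /site-meta document in the Link header format.
site_meta_body = (
    'Link: </robots.txt>; rel="robots"\n'
    'Link: </p3p.xml>; rel="privacy"\n'
)

print(link_values(response_headers))
print(link_values(site_meta_body))
```

The same function handles both inputs unchanged, which is the practical
benefit of keeping one format.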

> 3. Preparing for some unknown need for extending /site-meta while not
> increasing complexity (assuming Link header structure is simple enough)
> seems like a good idea.

Yes - but the flexibility is in the relationship and content types.
Signposts can point to towns, multi-lane highways, country dirt tracks and
little ol' houses on the prairie, but they're still signposts and
that's what, for me, /site-meta is about. Enough with the flexibility.

Actually, at least three use cases - robots.txt, P3P and POWDER - all have
their own method of defining which sections of Web sites they refer to.
If there is an argument for making /site-meta more complex or flexible,
I'd say it would be in the area of defining a common method of doing
that - but that means re-writing those specs so let's not go there.

> Action Items
> * Change /site-meta draft to use the Link header format instead of the
> current XML proposal.


> * If allowed, use message/http as the default content type for /site-meta.
> If not, register a new content type, preferably something like
> application/http-header-fragment, or just application/site-meta.

Why application? I'd say text was more appropriate. Application suggests
something really complicated that needs a lot of processing. This is
just a bunch of links and a little syntactic sugar.

> * Clarify that the content of /site-meta does not describe any actual
> resource or URI, but the abstract concept of 'web site' or 'domain
> authority', expressed as an HTTP header. In practice, it is still just a
> registry for resource locations to avoid more known-location solutions.




Phil Archer