Comments on the draft W3C Annotatea specification

Stephen Crawley-2

These comments are based on the draft specification on the W3C
website at:
http://www.w3.org/2002/12/AnnoteaProtocol-20021219

[There is a more recent draft on the www.annotea.org website at:
http://www.annotea.org/Annotea/User/AnnoteaProtocol-20051226.html
which includes a whole new section on Bookmarks and Topics.  However, I
don't believe it has any official standing.  If it does ... or
should ... have standing, then it would be a good idea if someone at the
W3C added it to the W3C website!!]

0)  The structure and language of the current draft are not really
appropriate for a specification. This is understandable given the history
of the document, but it would need to be addressed if Annotea is to
progress to a W3C recommendation.  (IMO, this work needs to be done
anyway.)

1)  The relationship between the Annotea Protocol and Annotea RDF
schemas should be spelled out more clearly.  I would expect to see a
reference in the Annotea Protocol document's overview, and a later
section that lists and describes the core schema properties, and states
whether they are mandatory, recommended, optional, single-valued,
multi-valued, whatever in the context of the protocol.  (I acknowledge
that making properties mandatory goes against the grain with RDF, but an
Annotea client or server-side implementation needs a clear "contract"
which spells out what is required for the protocol to "work".  For
example, the "annotates" property needs to be mandatory in some
contexts.  This is noted in the spec at some points ... but are there
other properties like this?)

2)  Some aspects of the schemas may need to be more tightly specified.
For instance, (IMO) the Annotea Protocol needs to REQUIRE that
'modified' and 'created' properties (if provided) are well-formed XML
dateTime values.  (The Annotea Schema "recommends" use of XML dateTime
syntax, but the examples in the Annotea Protocol document ignore this!!)

3)  The examples in the Annotea Protocol document should probably not
use the 'date' property, as this has no clear meaning.

4)  When POST-ing or PUT-ing an Annotation, a) is the Client allowed to
include arbitrary properties in the RDF, and b) is the Server allowed to
or required to store these properties and return them in subsequent GET
responses?  I think the answers are a) YES and b) ALLOWED not REQUIRED,
but this is not specified.

5)  Which of the listed header fields in the various Annotea requests
and replies are MANDATORY and which ones are just recommended?

6)  When I POST or PUT an annotation with an embedded body, the body is
assigned a URI.  The examples show URLs for the body taking the form
"/Annotations/body/XYZZY", where "/Annotations/XYZZY" was the url for
the annotation.  Is the string "body" required to be present, or could
some other string be used?  Similarly, could the unique part of the body
URI ("XYZZY") be different to the corresponding part of the annotation
URI?

7)  If I PUT or DELETE an annotation that was previously POSTed with an
embedded body, what should happen to the old embedded body?  Bear in
mind that the old body would have been given a URL, and that it could be
the 'object' of some other annotation property.

8)  When I PUT an annotation and the new version leaves out a property
that was present in the old one, should the server delete the property?
I think the answer should be YES, but I don't think that the spec says
this clearly.

9)  Suppose I POST or PUT an annotation containing a non-standard property
whose value is a blank node, and the Server decides to accept it.
Should the Server assign a URI to the blank node in the same way that
it assigns one to an embedded body?

10)  The "Optimizing query requests" section says that a Server may
refuse to honor a combined request.  What HTTP status code should it set
in the response?  (The Client needs to know this if it is going to retry
the requests individually.)
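
As an illustration of point 2) above, here is a minimal sketch of the kind of check a server could apply to 'created' and 'modified' values. It assumes a simplified reading of the XML Schema dateTime lexical form (it ignores negative years, years with more than four digits, and the 24:00:00 special case), and the function name is my own invention, not something from the Annotea documents.

```python
import re
from datetime import datetime

# Simplified pattern for the xsd:dateTime lexical form (an assumption, see above):
#   YYYY-MM-DDThh:mm:ss[.fraction][Z | (+|-)hh:mm]
_XSD_DATETIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"(Z|[+-](\d{2}):(\d{2}))?"
)


def is_xsd_datetime(value: str) -> bool:
    """Return True if value looks like a well-formed xsd:dateTime."""
    m = _XSD_DATETIME.fullmatch(value)
    if m is None:
        return False
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    try:
        # Reuse the standard library to reject impossible dates and times.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    if m.group(9) is not None:
        # Timezone offsets are limited to the range -14:00 .. +14:00.
        tz_hours, tz_minutes = int(m.group(9)), int(m.group(10))
        if tz_hours > 14 or tz_minutes > 59 or (tz_hours == 14 and tz_minutes != 0):
            return False
    return True
```

A value that omits the seconds field, or that uses an HTTP-style date, would be rejected by this check.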

Finally, here are a couple of areas where I think that the Annotea
Protocol specification needs to be extended:

1) Now that SPARQL is the 'standard' way to express RDF queries, the
Annotea Protocol spec should consider specifying how to express SPARQL-based
annotation queries.  (The Algae protocol material could potentially be
deprecated or excised.)

2) Consider extending the Annotea Protocol with a mechanism for saying
what annotation properties are required / allowed by a given server.
For instance the Annotea server that I am currently developing will
require a 'context' property, and may require others.  (I would
envisage defining a bunch of annotation schemas as RDF schemas, and
performing some server-side validation to ensure that annotations
conform with the schemas.)
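
To make the second suggestion more concrete, here is a rough sketch of the sort of server-side check I have in mind. It assumes the annotation has already been parsed into a mapping from property name to a list of values; the policy table and the function name are hypothetical, and a real implementation would presumably derive the policy from RDF schemas rather than hard-code it.

```python
from typing import Dict, List, Tuple

# Hypothetical per-server policy: property name -> (required?, single-valued?)
POLICY: Dict[str, Tuple[bool, bool]] = {
    "annotates": (True, True),
    "context": (True, True),
    "body": (False, True),
    "created": (False, True),
}


def validate(annotation: Dict[str, List[str]], allow_extra: bool = True) -> List[str]:
    """Return a list of problems; an empty list means the annotation conforms."""
    problems = []
    for name, (required, single) in POLICY.items():
        values = annotation.get(name, [])
        if required and not values:
            problems.append(f"missing required property: {name}")
        if single and len(values) > 1:
            problems.append(f"property must be single-valued: {name}")
    if not allow_extra:
        # Optionally reject properties that the policy does not mention.
        for name in annotation:
            if name not in POLICY:
                problems.append(f"property not allowed: {name}")
    return problems
```

The point of the proposed protocol extension is that a client could discover something equivalent to this policy before POSTing, rather than finding out by having requests rejected.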




Re: Comments on the draft W3C Annotatea specification

Stephen Crawley-2

Here are a couple of clarifications / followups to my previous post:

On Tue, 2008-11-11 at 13:27 +1000, Stephen Crawley wrote:
> 6)  When I POST or PUT an annotation with an embedded body, the body is
> assigned a URI.  The examples show URLs for the body taking the form
> "/Annotations/body/XYZZY", where "/Annotations/XYZZY" was the url for
> the annotation.  Is the string "body" required to be present, or could
> some other string be used?  Similarly, could the unique part of the body
> URI ("XYZZY") be different to the corresponding part of the annotation
> URI?

6a)  If the embedded body has a URI, should this URI be preserved or
should the body be stored with a new URI?

> 7)  If I PUT or DELETE an annotation that was previously POSTed with an
> embedded body, what should happen to the old embedded body?  Bear in
> mind that the old body would have been given a URL, and that it could be
> the 'object' of some other annotation property.

7a)  The spec clearly states that a DELETE should cause an embedded
body originating from an earlier POST or PUT to be deleted.  (My
mistake ...)  But it does not say whether a PUT causes an old embedded body
to be deleted immediately.

11)  The spec does not say what should happen if you POST two
Annotations with embedded bodies that have the same preallocated URI,
and then you DELETE one of them.  Specifically, what should happen to
the "shared" body?  [Clearly, there are ways to avoid this difficult
case.  One way is to answer 6a) above by saying that the embedded body
should always be given a new URI.  That would mean that the body cannot
be shared.]

12)  What should happen if you try to use DELETE to delete an embedded
body; i.e. by passing it "/Annotations/body/XYZZY"?  What about using
PUT to update an embedded body?  (Maybe the best answer to these cases
is to say that they are beyond the scope of the spec.)

13)  The Annotea Schema appears to allow properties like 'annotates' and
'body' to be repeated in an Annotation.  Should this be forbidden?  What
about the case of an Annotation with no body property?  (An Annotation
with no 'annotates' property is already forbidden.)




Re: Comments on the draft W3C Annotatea specification

Stephen Crawley-2

Here are yet more comments:

14)  If a query (e.g. GET ?w3c_annotates=... or GET ?w3c_reply_tree=...)
matches nothing, should the server respond with a 404 status code, a
200 status code with no content, or a 200 status code and an empty RDF
element?  (It appears that annotea.w3.org does the last of these ... and
certain clients assume this is what will happen.)

15)  In general, the Annotea spec should clearly specify what HTTP
response statuses and contents a client should expect, in normal
and (Annotea related) error cases.  This would cover 14) and related
points.

16)  In a Reply, are the 'root' and 'inReplyTo' properties mandatory?  I
think that the answer should be YES, but this is not clearly stated.

17)  The specification's discussion of orphan discussion threads in 3.1
and 3.5 seems to suggest that a server MAY wish to prevent them
happening by blocking certain deletions.  But supposing that the server
doesn't do this, the w3c_reply_tree query (as defined by the spec) will
return a tree with nodes missing.  The spec should make it clear that a
client should expect this, and that a fully functional client UI is
expected to be able to display the orphaned subtrees.

18)  A related issue to 17) is that a server may implement access
control extensions;  e.g. Vannotea allows an annotation or reply to be
tagged with an access control policy.  You could get a situation where a
reply R1 is not visible to user Fred, but R2 which is a reply to R1 is
visible.  If Fred issues a w3c_reply_tree query, the response may show
R2 as orphaned.  This is another reason why a client should be able to
cope with this.  [It might be more sensible to include the R1 reply in
the response with various properties removed.  However, the corporate
access control policies may mandate that all traces of R1 ... including
its identity ... are hidden from Fred.]

19)  In the introduction to section 3, it says the following about
't:root':  "whose value is the URI of the resource naming the start of a
discussion (in this case, the annotation that was first replied to)."
Perhaps I am being pedantic, but that >>could<< be read as allowing the
't:root' property to be the URI of a Reply ... thereby starting "a new
discussion thread" whose subject is the Reply.  IMO, it would be a good
thing if the text of the spec made it clear that this was not allowed;
e.g. by saying that the 't:root' property MUST be the URI of an
annotation.

20)  I couldn't find anything in the Annotea Protocol spec that says
that a reply tree must be a strict tree.  Should a server ensure that
this is the case by testing for cycles, etc. in POST / PUT requests?  Are
the 'root' and 'inReplyTo' properties required to be single-valued?  If
the server is not required to check these things, is a client expected
to be able to cope with cycles and shared nodes in a response to a
w3c_reply_tree query?  IMO, the server should be required to prevent cycles
and shared subtrees, or at least to filter them out of a w3c_reply_tree
response.
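
For points 17) and 20), the checks involved are not complicated. The following is a rough sketch, assuming that 'inReplyTo' is single-valued and that the replies in a thread are available as a mapping from reply URI to the URI it replies to. The function names and the short identifiers (A, R1, R2, ...) are made up for illustration.

```python
from typing import Dict, List, Optional


def find_orphans(root: str, in_reply_to: Dict[str, str]) -> List[str]:
    """Replies whose parent is neither the root annotation nor a known reply."""
    known = set(in_reply_to) | {root}
    return sorted(r for r, parent in in_reply_to.items() if parent not in known)


def creates_cycle(in_reply_to: Dict[str, str], reply: str, parent: str) -> bool:
    """Would recording 'reply inReplyTo parent' introduce (or hit) a cycle?"""
    seen = set()
    node: Optional[str] = parent
    # Walk up the chain of parents until it ends or repeats.
    while node is not None and node not in seen:
        if node == reply:
            return True
        seen.add(node)
        node = in_reply_to.get(node)
    # Stopping on an already-seen node means the stored data has a cycle.
    return node is not None
```

A client could use something like find_orphans to decide which subtrees to display separately, and a server could run something like creates_cycle before accepting a POST or PUT of a Reply.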







Re: Comments on the draft W3C Annotatea specification

Stephen Crawley-2

And a couple more:

21)  In Section 2.1.2, the spec talks about using the HTTP Message
schema to encode XML-based formats such as XHTML, MATHML and SVG as
annotation bodies.  However, it is not clear whether this use of the
HTTP Message schema is mandatory or optional.  (I assume that it is
optional.)

22)  Section 2.3 does not clearly state how the server should respond to
a GET request for a 'body' URL.  If the original embedded annotation
body conforms to the HTTP Message schema, should the server respond to a
GET by decoding and sending the HTTP body?  What should it do if the
original body does NOT conform to the HTTP Message schema?  What
should it do if the original body contains other RDF properties?

23)  Section 2.3 should include examples showing how the server
responds to GET requests.

24)  Section 2.1.2 contains the words "[t]he authors have not decided
whether to move to using that mechanism.".

25)  There should be a way for a client to TELL the server to respond to a
GET request for a 'body' URI by sending the body as RDF ... notwithstanding
the use of the HTTP Message schema.
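
One candidate mechanism for 25) would be ordinary HTTP content negotiation, with the client sending an Accept header that names the RDF/XML media type. The sketch below only builds such a request (it does not send it). The Annotea spec does not define this behaviour, so it is a suggestion rather than a description of the protocol; the host name and the helper function are hypothetical.

```python
from urllib.request import Request

# Hypothetical body URL, following the "/Annotations/body/XYZZY" pattern
# used in the spec's examples.
BODY_URL = "http://annotations.example.org/Annotations/body/XYZZY"


def build_body_request(url: str, want_rdf: bool) -> Request:
    """Build a GET request for a body, optionally asking for the RDF form."""
    accept = "application/rdf+xml" if want_rdf else "*/*"
    return Request(url, headers={"Accept": accept}, method="GET")


request = build_body_request(BODY_URL, want_rdf=True)
```

A server that honoured this would return the stored RDF description of the body for the first kind of request, and the decoded HTTP Message content otherwise.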